101
Meta-imputation of transcriptome from genotypes across multiple datasets by leveraging publicly available summary-level data. PLoS Genet 2022; 18:e1009571. [PMID: 35100255 PMCID: PMC8830793 DOI: 10.1371/journal.pgen.1009571] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/26/2021] [Revised: 02/10/2022] [Accepted: 01/07/2022] [Indexed: 11/22/2022]
Abstract
Transcriptome-wide association studies (TWAS) are a powerful method for identifying and interpreting the biological mechanisms underlying GWAS signals by associating gene expression levels with phenotypes. In TWAS, gene expression is often imputed from individual-level genotypes of regulatory variants identified from external resources, such as the Genotype-Tissue Expression (GTEx) Project. In this setting, a straightforward approach to imputing expression levels of a specific tissue is to use the model trained on the same tissue type. When multiple tissues are available for the same subjects, it has been demonstrated that training imputation models on multiple tissue types improves accuracy because of eQTLs shared between tissues and an increase in effective sample size. However, existing joint-tissue methods require access to genotype and expression data across all tissues. Moreover, they cannot leverage the abundance of expression datasets across various tissues for non-overlapping individuals. Here, we explore the optimal way to combine imputed levels across training models from multiple tissues and datasets in a flexible manner using summary-level data. Our proposed method (SWAM) combines an arbitrary number of transcriptome imputation models to linearly optimize the imputation accuracy for a given target tissue. By integrating models across tissues and/or individuals, SWAM can improve the accuracy of transcriptome imputation or the power of TWAS while requiring individual-level data from only a single reference cohort. To evaluate the accuracy of SWAM, we combined 49 tissue-specific gene expression imputation models from the GTEx Project as well as from the Depression Susceptibility Genes and Networks (DGN) Project, a large eQTL study, and tested imputation accuracy in GEUVADIS lymphoblastoid cell line samples. We also extend our meta-imputation method to meta-TWAS to leverage multiple tissues in TWAS analysis with summary-level statistics. Our results underscore the importance of integrating multiple tissues to unravel regulatory impacts of genetic variants on complex traits.
The gene expression levels within a cell are affected by various factors, including DNA variation, cell type, cellular microenvironment, disease status, and other environmental factors surrounding the individual. The genetic component of gene expression is known to explain a substantial fraction of transcriptional variation among individuals and can be imputed from genotypes in a tissue-specific manner, by training on population-scale transcriptomic profiles designed to identify expression quantitative trait loci (eQTLs). Imputing gene expression levels has been shown to help in understanding the genetic basis of human disease through transcriptome-wide association analysis (TWAS) and Mendelian randomization (MR). However, it has been unclear how to integrate multiple imputation models trained on individual datasets to maximize their accuracy without having to access individual genotypes and expression levels, which are often protected because of privacy concerns. We developed SWAM (Smartly Weighted Averaging across Multiple datasets), a meta-imputation framework that can accurately impute gene expression levels from genotypes by integrating multiple imputation models without requiring individual-level data. Our method examines the similarities and differences between resources and borrows the information most relevant to the tissue of interest. We demonstrate that SWAM outperforms existing single-tissue and multi-tissue imputation models and continues to increase in accuracy as additional imputation models are integrated.
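The idea of linearly combining several imputation models for one target tissue can be sketched as follows. This is a minimal illustration, not the SWAM implementation: it assumes plain unregularized least squares without an intercept, and the function names (`solve_linear`, `fit_weights`, `meta_impute`) and the toy numbers are hypothetical. The abstract states only that models are combined linearly to optimize accuracy for the target tissue.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weights(predictions, measured):
    """Least-squares weights for K models.

    predictions: K lists, each holding one model's imputed expression
                 for the same N reference individuals (one gene).
    measured:    N measured expression values in the target tissue.
    """
    k = len(predictions)
    gram = [[sum(p * q for p, q in zip(predictions[i], predictions[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * y for p, y in zip(predictions[i], measured)) for i in range(k)]
    return solve_linear(gram, rhs)  # normal equations: (P^T P) w = P^T y


def meta_impute(predictions, weights):
    """Weighted average of the K model predictions for each individual."""
    n = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions)) for i in range(n)]


# Toy example: two models, five reference individuals (made-up values).
model_a = [0.1, -0.4, 0.8, 1.2, -0.9]
model_b = [0.5, 0.2, -0.3, 0.9, -1.1]
measured = [0.7 * a + 0.3 * b for a, b in zip(model_a, model_b)]
weights = fit_weights([model_a, model_b], measured)
combined = meta_impute([model_a, model_b], weights)
```

In this sketch only the reference cohort needs measured expression; the component models enter solely through their predictions, which mirrors the summary-level setting described above.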
102
Liu D, Zhu J, Zhou D, Nikas EG, Mitanis NT, Sun Y, Wu C, Mancuso N, Cox NJ, Wang L, Freedland SJ, Haiman CA, Gamazon ER, Nikas JB, Wu L. A transcriptome-wide association study identifies novel candidate susceptibility genes for prostate cancer risk. Int J Cancer 2022; 150:80-90. [PMID: 34520569 PMCID: PMC8595764 DOI: 10.1002/ijc.33808] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/07/2021] [Revised: 08/20/2021] [Accepted: 08/30/2021] [Indexed: 01/03/2023]
Abstract
A large proportion of heritability for prostate cancer risk remains unknown. Transcriptome-wide association study combined with validation comparing overall levels will help to identify candidate genes potentially playing a role in prostate cancer development. Using data from the Genotype-Tissue Expression Project, we built genetic models to predict normal prostate tissue gene expression using the statistical framework PrediXcan, a modified version of the unified test for molecular signatures and Joint-Tissue Imputation. We applied these prediction models to the genetic data of 79 194 prostate cancer cases and 61 112 controls to investigate the associations of genetically determined gene expression with prostate cancer risk. Focusing on associated genes, we compared their expression in prostate tumor vs normal prostate tissue, compared methylation of CpG sites located at these loci in prostate tumor vs normal tissue, and assessed the correlations between the differentiated genes' expression and the methylation of corresponding CpG sites, by analyzing The Cancer Genome Atlas (TCGA) data. We identified 573 genes showing an association with prostate cancer risk at a false discovery rate (FDR) ≤ 0.05, including 451 novel genes and 122 previously reported genes. Of the 573 genes, 152 showed differential expression in prostate tumor vs normal tissue samples. At loci of 57 genes, 151 CpG sites showed differential methylation in prostate tumor vs normal tissue samples. Of these, 20 CpG sites were correlated with expression of 11 corresponding genes. In this TWAS, we identified novel candidate susceptibility genes for prostate cancer risk, providing new insights into prostate cancer genetics and biology.
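The FDR ≤ 0.05 threshold mentioned above can be illustrated with a short sketch of the Benjamini-Hochberg step-up adjustment. The abstract does not say which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up gene-level p-values; genes with adjusted value <= 0.05 pass.
p = [0.001, 0.008, 0.039, 0.041, 0.6]
q = benjamini_hochberg(p)
significant = [i for i, value in enumerate(q) if value <= 0.05]
```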
Affiliation(s)
- Duo Liu
- Department of Pharmacy, Harbin Medical University Cancer Hospital, Harbin, China
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA
- Jingjing Zhu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA
- Dan Zhou
- Vanderbilt Genetics Institute and Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Emily G Nikas
- School of Mathematics, University of Minnesota, Minneapolis, MN, USA
- Nikos T Mitanis
- Department of Mathematics, University of the Aegean, Samos, Greece
- Yanfa Sun
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA
- College of Life Science, Longyan University, Longyan, Fujian, P. R. China
- Fujian Provincial Key Laboratory for the Prevention and Control of Animal Infectious Diseases and Biotechnology, Longyan, Fujian, 364012, P.R. China
- Key Laboratory of Preventive Veterinary Medicine and Biotechnology (Longyan University), Fujian Province University, Longyan, Fujian, 364012, P.R. China
- Chong Wu
- Department of Statistics, Florida State University, Tallahassee, FL, USA
- Nicholas Mancuso
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA; Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA
- Nancy J Cox
- Vanderbilt Genetics Institute and Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Liang Wang
- Department of Tumor Biology, H. Lee Moffitt Cancer Center, Tampa, FL, USA
- Stephen J Freedland
- Center for Integrated Research in Cancer and Lifestyle, Cedars-Sinai Medical Center, Los Angeles, CA
- Section of Urology, Durham VA Medical Center, Durham, NC, USA
- Christopher A Haiman
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA; Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA
- Eric R Gamazon
- Vanderbilt Genetics Institute and Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Clare Hall, University of Cambridge, Cambridge, UK
- MRC Epidemiology Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Jason B Nikas
- Research & Development, Genomix Inc., Minneapolis, MN, USA
- Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA
103
Shuey MM, Xiang RR, Moss ME, Carvajal BV, Wang Y, Camarda N, Fabbri D, Rahman P, Ramsey J, Stepanian A, Sebastiani P, Wells QS, Beckman JA, Jaffe IZ. Systems Approach to Integrating Preclinical Apolipoprotein E-Knockout Investigations Reveals Novel Etiologic Pathways and Master Atherosclerosis Network in Humans. Arterioscler Thromb Vasc Biol 2022; 42:35-48. [PMID: 34758633 PMCID: PMC8887835 DOI: 10.1161/atvbaha.121.317071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
OBJECTIVE: Animal models of atherosclerosis are used extensively to interrogate molecular mechanisms in serial fashion. We tested whether a novel systems biology approach to integration of preclinical data identifies novel pathways and regulators in human disease. APPROACH AND RESULTS: Of 716 articles published in ATVB from 1995 to 2019 using the apolipoprotein E knockout mouse to study atherosclerosis, data were extracted from 360 unique studies in which a gene was experimentally perturbed to impact plaque size or composition and analyzed using Ingenuity Pathway Analysis software. TREM1 (triggering receptor expressed on myeloid cells) signaling and LXR/RXR (liver X receptor/retinoid X receptor) activation were identified as the top atherosclerosis-associated pathways in mice (both P<1.93×10⁻⁴, TREM1 implicated in early and LXR/RXR in late atherogenesis). The top upstream regulatory network in mice (sc-58125, a COX2 inhibitor) linked 64.0% of the genes into a single network. The pathways and networks identified in mice were interrogated by testing for associations between the genetically predicted gene expression of each mouse pathway-identified human homolog and clinical atherosclerosis in a cohort of 88 660 human subjects. Homologous human pathways and networks were significantly enriched for gene-atherosclerosis associations (empirical P<0.01 for TREM1 and LXR/RXR pathways and COX2 network). This included 12 (60.0%) TREM1 pathway genes, 15 (53.6%) LXR/RXR pathway genes, and 67 (49.3%) COX2 network genes. Mouse analyses predicted, and human study validated, the strong association of COX2 expression (PTGS2) with increased likelihood of atherosclerosis (odds ratio, 1.68 per SD of genetically predicted gene expression; P=1.07×10⁻⁶). CONCLUSIONS: PRESCIANT (Preclinical Science Integration and Translation) leverages published preclinical investigations to identify high-confidence pathways, networks, and regulators of human disease.
Affiliation(s)
- M. Elizabeth Moss
- Department of Medicine (M.M.S., J.R., Q.S.W., J.A.B.) and Department of Biomedical Informatics (D.F., P.R.), Vanderbilt University Medical Center, Nashville, TN. Molecular Cardiology Research Institute (R.R.X., M.E.M., B.V.C., Y.W., N.C., A.S., I.Z.J.) and Institute for Clinical Research and Health Policy Studies (P.S.), Tufts Medical Center, Boston, MA
- Brigett V. Carvajal
- Department of Medicine (M.M.S., J.R., Q.S.W., J.A.B.) and Department of Biomedical Informatics (D.F., P.R.), Vanderbilt University Medical Center, Nashville, TN. Molecular Cardiology Research Institute (R.R.X., M.E.M., B.V.C., Y.W., N.C., A.S., I.Z.J.) and Institute for Clinical Research and Health Policy Studies (P.S.), Tufts Medical Center, Boston, MA
- Yihua Wang
- Department of Medicine (M.M.S., J.R., Q.S.W., J.A.B.) and Department of Biomedical Informatics (D.F., P.R.), Vanderbilt University Medical Center, Nashville, TN. Molecular Cardiology Research Institute (R.R.X., M.E.M., B.V.C., Y.W., N.C., A.S., I.Z.J.) and Institute for Clinical Research and Health Policy Studies (P.S.), Tufts Medical Center, Boston, MA
- Nicholas Camarda
- Department of Medicine (M.M.S., J.R., Q.S.W., J.A.B.) and Department of Biomedical Informatics (D.F., P.R.), Vanderbilt University Medical Center, Nashville, TN. Molecular Cardiology Research Institute (R.R.X., M.E.M., B.V.C., Y.W., N.C., A.S., I.Z.J.) and Institute for Clinical Research and Health Policy Studies (P.S.), Tufts Medical Center, Boston, MA
- Daniel Fabbri
- Department of Medicine (M.M.S., J.R., Q.S.W., J.A.B.) and Department of Biomedical Informatics (D.F., P.R.), Vanderbilt University Medical Center, Nashville, TN. Molecular Cardiology Research Institute (R.R.X., M.E.M., B.V.C., Y.W., N.C., A.S., I.Z.J.) and Institute for Clinical Research and Health Policy Studies (P.S.), Tufts Medical Center, Boston, MA
- Protiva Rahman
- Department of Medicine (M.M.S., J.R., Q.S.W., J.A.B.) and Department of Biomedical Informatics (D.F., P.R.), Vanderbilt University Medical Center, Nashville, TN. Molecular Cardiology Research Institute (R.R.X., M.E.M., B.V.C., Y.W., N.C., A.S., I.Z.J.) and Institute for Clinical Research and Health Policy Studies (P.S.), Tufts Medical Center, Boston, MA
- Jacob Ramsey
- Department of Medicine (M.M.S., J.R., Q.S.W., J.A.B.) and Department of Biomedical Informatics (D.F., P.R.), Vanderbilt University Medical Center, Nashville, TN. Molecular Cardiology Research Institute (R.R.X., M.E.M., B.V.C., Y.W., N.C., A.S., I.Z.J.) and Institute for Clinical Research and Health Policy Studies (P.S.), Tufts Medical Center, Boston, MA
- Alec Stepanian
- Department of Medicine (M.M.S., J.R., Q.S.W., J.A.B.) and Department of Biomedical Informatics (D.F., P.R.), Vanderbilt University Medical Center, Nashville, TN. Molecular Cardiology Research Institute (R.R.X., M.E.M., B.V.C., Y.W., N.C., A.S., I.Z.J.) and Institute for Clinical Research and Health Policy Studies (P.S.), Tufts Medical Center, Boston, MA
- Paola Sebastiani
- Department of Medicine (M.M.S., J.R., Q.S.W., J.A.B.) and Department of Biomedical Informatics (D.F., P.R.), Vanderbilt University Medical Center, Nashville, TN. Molecular Cardiology Research Institute (R.R.X., M.E.M., B.V.C., Y.W., N.C., A.S., I.Z.J.) and Institute for Clinical Research and Health Policy Studies (P.S.), Tufts Medical Center, Boston, MA
104
Ngwa JS, Yanek LR, Kammers K, Kanchan K, Taub MA, Scharpf RB, Faraday N, Becker LC, Mathias RA, Ruczinski I. Secondary analyses for genome-wide association studies using expression quantitative trait loci. Genet Epidemiol 2022; 46:170-181. [PMID: 35312098 PMCID: PMC9086181 DOI: 10.1002/gepi.22448] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/22/2021] [Revised: 11/19/2021] [Accepted: 01/20/2022] [Indexed: 01/01/2023]
Abstract
Genome-wide association studies (GWAS) have successfully identified thousands of single nucleotide polymorphisms (SNPs) associated with complex traits; however, the identified SNPs account for a fraction of trait heritability, and identifying the functional elements through which genetic variants exert their effects remains a challenge. Recent evidence suggests that SNPs associated with complex traits are more likely to be expression quantitative trait loci (eQTL). Thus, incorporating eQTL information can potentially improve power to detect causal variants missed by traditional GWAS approaches. Using genomic, transcriptomic, and platelet phenotype data from the Genetic Study of Atherosclerosis Risk family-based study, we investigated the potential to detect novel genomic risk loci by incorporating information from eQTL in the relevant target tissues (i.e., platelets and megakaryocytes) using established statistical principles in a novel way. Permutation analyses were performed to obtain family-wise error rates for eQTL associations, substantially lowering the genome-wide significance threshold for SNP-phenotype associations. In addition to confirming the well known association between PEAR1 and platelet aggregation, our eQTL-focused approach identified a novel locus (rs1354034) and gene (ARHGEF3) not previously identified in a GWAS of platelet aggregation phenotypes. A colocalization analysis showed strong evidence for a functional role of this eQTL.
Affiliation(s)
- Julius S. Ngwa
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Lisa R. Yanek
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Kai Kammers
- Department of Oncology, Johns Hopkins University, School of Medicine, Baltimore, Maryland, USA
- Kanika Kanchan
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Margaret A. Taub
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Robert B. Scharpf
- Department of Oncology, Johns Hopkins University, School of Medicine, Baltimore, Maryland, USA
- Nauder Faraday
- Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Lewis C. Becker
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Rasika A. Mathias
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Ingo Ruczinski
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
105
Mahoney E, Janve V, Hohman TJ, Dumitrescu L. Evaluation of Sex-Aware PrediXcan Models for Predicting Gene Expression. Pacific Symposium on Biocomputing 2022; 27:361-372. [PMID: 34890163 PMCID: PMC8924937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Gene-based methods such as PrediXcan use expression quantitative trait loci to build tissue-specific gene expression models when only genetic data is available. There are known sex differences in tissue-specific gene expression and in the genetic architecture of gene expression, but such differences have not been incorporated into predicted gene expression models to date. We built sex-aware PrediXcan models using whole blood transcriptomic data from the Genotype-Tissue Expression (GTEx) project (195 females and 371 males) and evaluated their performance in an independent dataset. Specifically, PrediXcan models were built following the method described in Gamazon et al. 2015, but we included both whole-sample and sex-specific models. Validation was evaluated leveraging lymphoblast RNA sequencing data from the EUR cohort of the 1000 Genomes Project (178 females and 171 males). Correlations (R²) between observed and predicted expression were evaluated in 5,283 autosomal genes to determine performance of models. In sum, we successfully predicted 1,149 genes in males and 623 in females, while 3,511 genes appeared not to be sex-specific. Of the sex-specific genes, 15% (189 genes in males and 73 genes in females) exhibited higher R² in sex-specific models compared to whole-sample models, although the overall gain in predictive power was generally minimal and well within measurement error. Nevertheless, two female-specific genes and six male-specific genes showed significantly better prediction when using the sex-specific weights versus the whole-sample weights; furthermore, several of these genes play a role in mitochondrial metabolism, which is known to be influenced by sex hormones. Taken together, these results support previous reports of the small contribution of genetic architecture to sex-specific expression. Still, sex-aware PrediXcan models were able to provide robust sex-specific prediction signals. Future studies exploring the contribution of the X chromosome and tissue specificity on sex-specific genetically regulated expression will clarify the utility of this method.
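The R² performance metric between observed and predicted expression can be sketched as below. This assumes R² means the squared Pearson correlation, which is a common convention but is not spelled out in the abstract; the function name and the toy values are hypothetical.

```python
def squared_correlation(observed, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Made-up expression values for one gene under two sets of weights.
observed = [1.0, 2.0, 3.0]
whole_sample_pred = [1.0, 3.0, 2.0]
sex_specific_pred = [2.0, 4.0, 6.0]
r2_whole = squared_correlation(observed, whole_sample_pred)
r2_sex = squared_correlation(observed, sex_specific_pred)
gain = r2_sex - r2_whole  # positive when the sex-specific weights predict better
```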
Affiliation(s)
- Emily Mahoney
- Vanderbilt Memory and Alzheimer’s Center, Vanderbilt University Medical Center, Nashville, TN 37212, USA
- Vaibhav Janve
- Vanderbilt Memory and Alzheimer’s Center, Vanderbilt University Medical Center, Nashville, TN 37212, USA
- Timothy J. Hohman
- Vanderbilt Memory and Alzheimer’s Center, Vanderbilt University Medical Center, Nashville, TN 37212, USA; Vanderbilt Genetics Institute, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37212, USA
- Logan Dumitrescu
- Vanderbilt Memory and Alzheimer’s Center, Vanderbilt University Medical Center, Nashville, TN 37212, USA; Vanderbilt Genetics Institute, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37212, USA
106
Ke X, Tian X, Yao S, Wu H, Duan YY, Wang NN, Shi W, Yang TL, Dong SS, Huang D, Guo Y. Transcriptome-wide association study identifies multiple genes and pathways associated with thyroid function. Hum Mol Genet 2021; 31:1871-1883. [PMID: 34962261 DOI: 10.1093/hmg/ddab371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2021] [Revised: 12/03/2021] [Accepted: 12/20/2021] [Indexed: 11/12/2022]
Abstract
Thyroid dysfunction is a common endocrine disease measured by thyroid-stimulating hormone (TSH) level. Although more than 70 genetic loci associated with TSH have been reported through genome-wide association studies (GWASs), the variants can only explain a small fraction of the thyroid function heritability. To identify novel candidate genes for thyroid function, we conducted the first large-scale transcriptome-wide association study (TWAS) for thyroid function using GWAS summary data for TSH levels in up to 119 715 individuals combined with pre-computed gene expression weights of six panels from four tissue types. The candidate genes identified by TWAS were further validated by TWAS replication and gene expression profiles. We identified 74 conditionally independent genes significantly associated with thyroid function, such as PDE8B (P = 1.67 × 10⁻²⁸²), PDE10A (P = 7.61 × 10⁻¹¹⁹), NR3C2 (P = 1.50 × 10⁻⁹²), and CAPZB (P = 3.13 × 10⁻⁷⁹). After TWAS replication using UKBB datasets, 26 genes were replicated for significant associations with thyroid-relevant diseases/traits. Among them, 16 genes were causal for their associations with thyroid-relevant diseases/traits and were further validated in differential expression analyses, including two novel genes (MFSD6 and RBM47) that were not implicated in previous GWASs. Enrichment analyses detected several pathways associated with thyroid function, such as the cAMP signaling pathway (P = 7.27 × 10⁻⁴), hemostasis (P = 3.74 × 10⁻⁴), and platelet activation, signaling, and aggregation (P = 9.98 × 10⁻⁴). Our study identified multiple candidate genes and pathways associated with thyroid function, providing novel clues for revealing the genetic mechanisms of thyroid function and disease.
Affiliation(s)
- Xin Ke
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710049
- Xin Tian
- Department of Orthopaedics, Honghui Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China
- Shi Yao
- National and Local Joint Engineering Research Center of Biodiagnosis and Biotherapy, The Second Affiliated Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710004
- Hao Wu
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710049
- Yuan-Yuan Duan
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710049
- Nai-Ning Wang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710049
- Wei Shi
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710049
- Tie-Lin Yang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710049; National and Local Joint Engineering Research Center of Biodiagnosis and Biotherapy, The Second Affiliated Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710004
- Shan-Shan Dong
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710049; Research Institute of Xi'an Jiaotong University, Hangzhou, Zhejiang, P. R. China
- Dageng Huang
- Department of Orthopaedics, Honghui Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China
- Yan Guo
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China, 710049; Department of Orthopaedics, Honghui Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, P. R. China
107
Sun Y, Zhou D, Rahman MR, Zhu J, Ghoneim D, Cox NJ, Beach TG, Wu C, Gamazon ER, Wu L. A transcriptome-wide association study identifies novel blood-based gene biomarker candidates for Alzheimer's disease risk. Hum Mol Genet 2021; 31:289-299. [PMID: 34387340 PMCID: PMC8831284 DOI: 10.1093/hmg/ddab229] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/01/2021] [Revised: 07/12/2021] [Accepted: 07/23/2021] [Indexed: 11/12/2022]
Abstract
Alzheimer's disease (AD) adversely affects the health, quality of life and independence of patients. There is a critical need to identify novel blood gene biomarkers for AD risk assessment. We performed a transcriptome-wide association study to identify biomarker candidates for AD risk. We leveraged two sets of gene expression prediction models of blood developed using different reference panels and modeling strategies. By applying the prediction models to a meta-GWAS including 71 880 (proxy) cases and 383 378 (proxy) controls, we identified significant associations of genetically determined expression of 108 genes in blood with AD risk. Of these, 15 genes were differentially expressed between AD patients and controls with concordant directions in measured expression data. With evidence from the analyses based on both genetic instruments and directly measured expression levels, this study identifies 15 genes with strong support as biomarkers in blood for AD risk, which may enhance AD risk assessment and mechanism-focused studies.
Affiliation(s)
- Yanfa Sun
- Department of Animal Science and Veterinary Medicine, College of Life Science, Longyan University, Longyan, Fujian, 364012, P.R. China
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI 96813, USA
- Fujian Provincial Key Laboratory for the Prevention and Control of Animal Infectious Diseases and Biotechnology, Longyan, Fujian 364012, P.R. China
- Fujian Province Universities Key Laboratory of Preventive Veterinary Medicine and Biotechnology (Longyan University), Longyan, Fujian, 364012, P.R. China
- Dan Zhou
- Vanderbilt Genetics Institute and Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Md Rezanur Rahman
- Queensland Brain Institute, The University of Queensland, Brisbane, Qld 4072, Australia
- Jingjing Zhu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI 96813, USA
- Dalia Ghoneim
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI 96813, USA
- Nancy J Cox
- Vanderbilt Genetics Institute and Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Thomas G Beach
- Banner Sun Health Research Institute, Sun City, AZ 85351, USA
- Chong Wu
- Department of Statistics, Florida State University, Tallahassee, FL 32306, USA
- Eric R Gamazon
- Vanderbilt Genetics Institute and Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Clare Hall, University of Cambridge, Cambridge CB3 9AL, UK
- MRC Epidemiology Unit, School of Clinical Medicine, University of Cambridge, Cambridge CB2 0SL, UK
- Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI 96813, USA
108
Bae YE, Wu L, Wu C. InTACT: An adaptive and powerful framework for joint-tissue transcriptome-wide association studies. Genet Epidemiol 2021; 45:848-859. [PMID: 34255882 PMCID: PMC8604767 DOI: 10.1002/gepi.22425] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/14/2021] [Revised: 06/22/2021] [Accepted: 06/24/2021] [Indexed: 11/05/2022]
Abstract
Transcriptome-wide association studies (TWAS) that integrate transcriptomic reference data and genome-wide association studies (GWAS) have successfully enhanced the discovery of candidate genes for many complex traits. However, existing methods may suffer from substantial power loss because they fail to effectively consider that expression of many genes tends to be consistent across tissues. Here we propose a computationally efficient testing method, referred to as Integrative Test for Associations via Cauchy Transformation (InTACT), that effectively combines information across multiple tissues and thus improves the power of identifying associated genes. Through simulation studies, we show that InTACT maintains high power while properly controlling Type 1 error rates. We applied InTACT to the largest GWAS of Alzheimer's disease (AD) to date and identified 227 genome-wide significant genes, of which 130 were not identified by the benchmark methods TWAS and MultiXcan. Importantly, InTACT identified five novel loci for AD. We implemented InTACT in publicly available software, "InTACT."
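The Cauchy transformation named above can be illustrated with a generic Cauchy combination of p-values. This is a sketch of the general technique only, not the InTACT software: which per-tissue tests InTACT combines and how it weights them are not given in the abstract, so the equal default weights, the function name, and the example p-values are assumptions.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy transformation.

    Each p-value is mapped to a standard Cauchy variate via tan((0.5 - p) * pi);
    the weighted mean of those variates is mapped back to a p-value.
    """
    k = len(p_values)
    if weights is None:
        weights = [1.0 / k] * k  # equal weights (assumption)
    total = sum(weights)
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values)) / total
    return 0.5 - math.atan(stat) / math.pi


# Made-up per-tissue p-values for one gene.
tissue_p = [0.0004, 0.03, 0.41, 0.76]
combined_p = cauchy_combine(tissue_p)
```

For very small p-values the tangent becomes numerically unstable, so a production implementation would need a separate approximation in that regime; this sketch does not handle it.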
Affiliation(s)
- Ye Eun Bae
- Department of Statistics, Florida State University
- Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa
- Chong Wu
- Department of Statistics, Florida State University
109
Cao C, Kossinna P, Kwok D, Li Q, He J, Su L, Guo X, Zhang Q, Long Q. Disentangling genetic feature selection and aggregation in transcriptome-wide association studies. Genetics 2021; 220:6444993. [PMID: 34849857 DOI: 10.1093/genetics/iyab216] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/01/2021] [Accepted: 11/04/2021] [Indexed: 12/14/2022] Open
Abstract
The success of transcriptome-wide association studies (TWAS) has led to substantial research towards improving the predictive accuracy of its core component, Genetically Regulated eXpression (GReX). GReX links expression information with genotype and phenotype by playing two roles simultaneously: it acts as both the outcome of the genotype-based predictive models (for predicting expressions) and the linear combination of genotypes (as the predicted expressions) for association tests. From the perspective of machine learning (considering SNPs as features), these are actually two separable steps (feature selection and feature aggregation) that can be conducted independently. In this work, we show that the single approach of GReX limits the adaptability of TWAS methodology and practice. By conducting simulations and real data analysis, we demonstrate that disentangled protocols adapting straightforward approaches for feature selection (e.g., simple marker test) and aggregation (e.g., kernel machines) outperform the standard TWAS protocols that rely on GReX. Our development provides more powerful novel tools for conducting TWAS. More importantly, our characterization of the exact nature of TWAS suggests that, instead of questionably binding two distinct steps into the same statistical form (GReX), methodological research focusing on optimal combinations of feature selection and aggregation approaches will bring higher power to TWAS protocols.
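To make the second role of GReX concrete, here is a minimal sketch of predicted expression as a linear combination of genotypes, followed by a simple association measure. The dosages, SNP weights, and phenotype values are invented toy numbers, and the helper names `predict_expression` and `pearson_r` are hypothetical; the correlation stands in for whatever association test a given TWAS protocol actually uses.

```python
import math


def predict_expression(genotypes, weights):
    """GReX per individual: weighted sum of allele dosages (0, 1, or 2)."""
    return [sum(g * w for g, w in zip(row, weights)) for row in genotypes]


def pearson_r(x, y):
    """Pearson correlation, used here as a stand-in association measure."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data (assumed): 4 individuals, 3 SNPs already chosen by feature selection.
genotypes = [[0, 1, 2], [2, 0, 1], [1, 1, 0], [2, 2, 1]]
weights = [0.5, -0.2, 0.1]          # assumed SNP weights from an expression model
phenotype = [0.1, 1.0, 0.2, 0.9]    # assumed trait values

grex = predict_expression(genotypes, weights)   # feature aggregation step
r = pearson_r(grex, phenotype)                  # association step
```

In this framing, choosing which SNPs receive a weight is the selection step and the weighted sum is the aggregation step, so either could in principle be swapped for another method.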
Affiliation(s)
- Chen Cao
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Pathum Kossinna
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Devin Kwok
- Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
- Qing Li
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Jingni He
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Liya Su
- Department of Pathology, Anatomy and Cell Biology, Thomas Jefferson University, Philadelphia, PA 19107, USA
- Xingyi Guo
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN 37203, USA
- Qingrun Zhang
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada; Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
- Quan Long
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada; Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada; Department of Medical Genetics, University of Calgary, Calgary, AB T2N 4N1, Canada; Hotchkiss Brain Institute, O'Brien Institute for Public Health, University of Calgary, Calgary, AB T2N 4N1, Canada
110
Schmiedel BJ, Rocha J, Gonzalez-Colin C, Bhattacharyya S, Madrigal A, Ottensmeier CH, Ay F, Chandra V, Vijayanand P. COVID-19 genetic risk variants are associated with expression of multiple genes in diverse immune cell types. Nat Commun 2021; 12:6760. [PMID: 34799557 PMCID: PMC8604964 DOI: 10.1038/s41467-021-26888-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 05/26/2021] [Accepted: 10/18/2021] [Indexed: 12/20/2022] Open
Abstract
Common genetic polymorphisms associated with COVID-19 illness can be utilized for discovering molecular pathways and cell types driving disease pathogenesis. Given the importance of immune cells in the pathogenesis of COVID-19 illness, here we assessed the effects of COVID-19-risk variants on gene expression in a wide range of immune cell types. Transcriptome-wide association study and colocalization analysis revealed putative causal genes and the specific immune cell types where gene expression is most influenced by COVID-19-risk variants. Notable examples include OAS1 in non-classical monocytes, DTX1 in B cells, IL10RB in NK cells, CXCR6 in follicular helper T cells, CCR9 in regulatory T cells and ARL17A in TH2 cells. By analysis of transposase accessible chromatin and H3K27ac-based chromatin-interaction maps of immune cell types, we prioritized potentially functional COVID-19-risk variants. Our study highlights the potential of COVID-19 genetic risk variants to impact the function of diverse immune cell types and influence severe disease manifestations.
Affiliation(s)
- Job Rocha
- La Jolla Institute for Immunology, La Jolla, CA, USA
- Center for Genomic Sciences, National Autonomous University of Mexico, Cuernavaca, Morelos, Mexico
- Cristian Gonzalez-Colin
- La Jolla Institute for Immunology, La Jolla, CA, USA
- Center for Genomic Sciences, National Autonomous University of Mexico, Cuernavaca, Morelos, Mexico
- Christian H Ottensmeier
- La Jolla Institute for Immunology, La Jolla, CA, USA
- Liverpool Head and Neck Centre, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Ferhat Ay
- La Jolla Institute for Immunology, La Jolla, CA, USA
- Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
- Vivek Chandra
- La Jolla Institute for Immunology, La Jolla, CA, USA
- Pandurangan Vijayanand
- La Jolla Institute for Immunology, La Jolla, CA, USA
- Liverpool Head and Neck Centre, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
111
Colbran LL, Johnson MR, Mathieson I, Capra JA. Tracing the Evolution of Human Gene Regulation and Its Association with Shifts in Environment. Genome Biol Evol 2021; 13:evab237. [PMID: 34718543 PMCID: PMC8576593 DOI: 10.1093/gbe/evab237] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Accepted: 10/16/2021] [Indexed: 12/16/2022] Open
Abstract
As humans populated the world, they adapted to many varying environmental factors, including climate, diet, and pathogens. Because many of these adaptations were mediated by multiple noncoding variants with small effects on gene regulation, it has been difficult to link genomic signals of selection to specific genes, and to describe the regulatory response to selection. To overcome this challenge, we adapted PrediXcan, a machine learning method for imputing gene regulation from genotype data, to analyze low-coverage ancient human DNA (aDNA). First, we used simulated genomes to benchmark strategies for adapting PrediXcan to increase robustness to incomplete data. Applying the resulting models to 490 ancient Eurasians, we found that genes with the strongest divergent regulation among ancient populations with hunter-gatherer, pastoralist, and agricultural lifestyles are enriched for metabolic and immune functions. Next, we explored the contribution of divergent gene regulation to two traits with strong evidence of recent adaptation: dietary metabolism and skin pigmentation. We found enrichment for divergent regulation among genes proposed to be involved in diet-related local adaptation, and the predicted effects on regulation often suggest explanations for known signals of selection, for example, at FADS1, GPX1, and LEPR. In contrast, skin pigmentation genes show little regulatory change over a 38,000-year time series of 2,999 ancient Europeans, suggesting that adaptation mainly involved large-effect coding variants. This work demonstrates that combining aDNA with present-day genomes is informative about the biological differences among ancient populations, the role of gene regulation in adaptation, and the relationship between genetic diversity and complex traits.
Affiliation(s)
- Laura L Colbran
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, USA
- Maya R Johnson
- School for Science and Math at Vanderbilt, Vanderbilt University, USA
- Department of Computer Science, Bryn Mawr College, Pennsylvania, USA
- Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, USA
- John A Capra
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, USA
- Department of Biological Sciences, Vanderbilt University, USA
- Department of Biomedical Informatics, Vanderbilt University, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, USA
112
Wang Y, Yen FS, Zhu XG, Timson RC, Weber R, Xing C, Liu Y, Allwein B, Luo H, Yeh HW, Heissel S, Unlu G, Gamazon ER, Kharas MG, Hite R, Birsoy K. SLC25A39 is necessary for mitochondrial glutathione import in mammalian cells. Nature 2021; 599:136-140. [PMID: 34707288 PMCID: PMC10981497 DOI: 10.1038/s41586-021-04025-w] [Citation(s) in RCA: 87] [Impact Index Per Article: 29.0] [Received: 03/26/2021] [Accepted: 09/09/2021] [Indexed: 01/20/2023]
Abstract
Glutathione (GSH) is a small-molecule thiol that is abundant in all eukaryotes and has key roles in oxidative metabolism [1]. Mitochondria, as the major site of oxidative reactions, must maintain sufficient levels of GSH to perform protective and biosynthetic functions [2]. GSH is synthesized exclusively in the cytosol, yet the molecular machinery involved in mitochondrial GSH import remains unknown. Here, using organellar proteomics and metabolomics approaches, we identify SLC25A39, a mitochondrial membrane carrier of unknown function, as a regulator of GSH transport into mitochondria. Loss of SLC25A39 reduces mitochondrial GSH import and abundance without affecting cellular GSH levels. Cells lacking both SLC25A39 and its paralogue SLC25A40 exhibit defects in the activity and stability of proteins containing iron-sulfur clusters. We find that mitochondrial GSH import is necessary for cell proliferation in vitro and red blood cell development in mice. Heterologous expression of an engineered bifunctional bacterial GSH biosynthetic enzyme (GshF) in mitochondria enables mitochondrial GSH production and ameliorates the metabolic and proliferative defects caused by its depletion. Finally, GSH availability negatively regulates SLC25A39 protein abundance, coupling redox homeostasis to mitochondrial GSH import in mammalian cells. Our work identifies SLC25A39 as an essential and regulated component of the mitochondrial GSH-import machinery.
Affiliation(s)
- Ying Wang
- Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA
- Frederick S Yen
- Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA
- Xiphias Ge Zhu
- Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA
- Rebecca C Timson
- Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA
- Ross Weber
- Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA
- Changrui Xing
- Structural Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Yuyang Liu
- Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA
- Benjamin Allwein
- Structural Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Hanzhi Luo
- Molecular Pharmacology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Hsi-Wen Yeh
- Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA
- Søren Heissel
- The Proteomics Resource Center, The Rockefeller University, New York, NY, USA
- Gokhan Unlu
- Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA
- Eric R Gamazon
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Clare Hall and MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
- Michael G Kharas
- Molecular Pharmacology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Richard Hite
- Structural Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Kıvanç Birsoy
- Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA
113
Vysotskiy M, Zhong X, Miller-Fleming TW, Zhou D, Cox NJ, Weiss LA. Integration of genetic, transcriptomic, and clinical data provides insight into 16p11.2 and 22q11.2 CNV genes. Genome Med 2021; 13:172. [PMID: 34715901 PMCID: PMC8557010 DOI: 10.1186/s13073-021-00972-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/12/2021] [Accepted: 09/16/2021] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Deletions and duplications of the multigenic 16p11.2 and 22q11.2 copy number variant (CNV) regions are associated with brain-related disorders including schizophrenia, intellectual disability, obesity, bipolar disorder, and autism spectrum disorder (ASD). The contribution of individual CNV genes to each of these identified phenotypes is unknown, as well as the contribution of these CNV genes to other potentially subtler health implications for carriers. Hypothesizing that DNA copy number exerts most effects via impacts on RNA expression, we attempted a novel in silico fine-mapping approach in non-CNV carriers using both GWAS and biobank data. METHODS We first asked whether gene expression level in any individual gene in the CNV region alters risk for a known CNV-associated behavioral phenotype(s). Using transcriptomic imputation, we performed association testing for CNV genes within large genotyped cohorts for schizophrenia, IQ, BMI, bipolar disorder, and ASD. Second, we used a biobank containing electronic health data to compare the medical phenome of CNV carriers to controls within 700,000 individuals in order to investigate the full spectrum of health effects of the CNVs. Third, we used genotypes for over 48,000 individuals within the biobank to perform phenome-wide association studies between imputed expressions of individual 16p11.2 and 22q11.2 genes and over 1500 health traits. RESULTS Using large genotyped cohorts, we found individual genes within 16p11.2 associated with schizophrenia (TMEM219, INO80E, YPEL3), BMI (TMEM219, SPN, TAOK2, INO80E), and IQ (SPN), using conditional analysis to identify upregulation of INO80E as the driver of schizophrenia, and downregulation of SPN and INO80E as increasing BMI. We identified both novel and previously observed over-represented traits within the electronic health records of 16p11.2 and 22q11.2 CNV carriers. 
In the phenome-wide association study, we found seventeen significant gene-trait pairs, including psychosis (NPIPB11, SLX1B) and mood disorders (SCARF2), and overall enrichment of mental traits. CONCLUSIONS Our results demonstrate how integration of genetic and clinical data aids in understanding CNV gene function and implicates pleiotropy and multigenicity in CNV biology.
Affiliation(s)
- Mikhail Vysotskiy
- Department of Psychiatry and Behavioral Sciences, University of California San Francisco, 513 Parnassus Ave., Health Sciences East 9th floor HSE901E, San Francisco, CA, 94143, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, 94143, USA
- Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, 94143, USA
- Pharmaceutical Sciences and Pharmacogenomics Graduate Program, University of California San Francisco, San Francisco, CA, 94143, USA
- Xue Zhong
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Vanderbilt Genetics Institute, Nashville, TN, 37232, USA
- Tyne W Miller-Fleming
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Vanderbilt Genetics Institute, Nashville, TN, 37232, USA
- Dan Zhou
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Vanderbilt Genetics Institute, Nashville, TN, 37232, USA
- Nancy J Cox
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Vanderbilt Genetics Institute, Nashville, TN, 37232, USA
- Lauren A Weiss
- Department of Psychiatry and Behavioral Sciences, University of California San Francisco, 513 Parnassus Ave., Health Sciences East 9th floor HSE901E, San Francisco, CA, 94143, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, 94143, USA
- Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, 94143, USA
114
Multi-tissue transcriptome-wide association study identifies eight candidate genes and tissue-specific gene expression underlying endometrial cancer susceptibility. Commun Biol 2021; 4:1211. [PMID: 34675350 PMCID: PMC8531339 DOI: 10.1038/s42003-021-02745-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/15/2020] [Accepted: 09/27/2021] [Indexed: 11/24/2022] Open
Abstract
Genome-wide association studies (GWAS) have revealed sixteen risk loci for endometrial cancer but the identification of candidate susceptibility genes remains challenging. Here, we perform transcriptome-wide association study (TWAS) analyses using the largest endometrial cancer GWAS and gene expression from six relevant tissues, prioritizing eight candidate endometrial cancer susceptibility genes, one of which (EEFSEC) is located at a potentially novel endometrial cancer risk locus. We also show evidence of biologically relevant tissue-specific expression associations for CYP19A1 (adipose), HEY2 (ovary) and SKAP1 (whole blood). A phenome-wide association study demonstrates associations of candidate susceptibility genes with anthropometric, cardiovascular, diabetes, bone health and sex hormone traits that are related to endometrial cancer risk factors. Lastly, analysis of TWAS data highlights candidate compounds for endometrial cancer repurposing. In summary, this study reveals endometrial cancer susceptibility genes, including those with evidence of tissue specificity, providing insights into endometrial cancer aetiology and avenues for therapeutic development. Pik Fang Kho et al. conduct multi-tissue transcriptome-wide association studies of endometrial cancer risk. Their results identify potential susceptibility genes for endometrial cancer, and provide avenues for the development of future treatments for this disease.
115
Cao C, Wang J, Kwok D, Cui F, Zhang Z, Zhao D, Li MJ, Zou Q. webTWAS: a resource for disease candidate susceptibility genes identified by transcriptome-wide association study. Nucleic Acids Res 2021; 50:D1123-D1130. [PMID: 34669946 PMCID: PMC8728162 DOI: 10.1093/nar/gkab957] [Citation(s) in RCA: 110] [Impact Index Per Article: 36.7] [Received: 08/15/2021] [Revised: 09/24/2021] [Accepted: 10/05/2021] [Indexed: 12/20/2022] Open
Abstract
The development of transcriptome-wide association studies (TWAS) has enabled researchers to better identify and interpret causal genes in many diseases. However, there are currently no resources providing a comprehensive listing of gene-disease associations discovered by TWAS from published GWAS summary statistics. TWAS analyses are also difficult to conduct due to the complexity of TWAS software pipelines. To address these issues, we introduce a new resource called webTWAS, which integrates a database of the most comprehensive disease GWAS datasets currently available with credible sets of potential causal genes identified by multiple TWAS software packages. Specifically, a total of 235 064 gene-disease associations for a wide range of human diseases are prioritized from 1298 high-quality downloadable European GWAS summary statistics. Associations are calculated with seven different statistical models based on three popular and representative TWAS software packages. Users can explore associations at the gene or disease level, and easily search for related studies or diseases using the MeSH disease tree. Since the effects of diseases are highly tissue-specific, webTWAS applies tissue-specific enrichment analysis to identify significant tissues. A user-friendly web server is also available to run custom TWAS analyses on user-provided GWAS summary statistics data. webTWAS is freely available at http://www.webtwas.net.
Affiliation(s)
- Chen Cao
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China; Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Canada
- Jianhua Wang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Devin Kwok
- School of Computer Science, McGill University, Montreal, Canada
- Feifei Cui
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Zilong Zhang
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Da Zhao
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Mulin Jun Li
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Quan Zou
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
116
Li B, Ritchie MD. From GWAS to Gene: Transcriptome-Wide Association Studies and Other Methods to Functionally Understand GWAS Discoveries. Front Genet 2021; 12:713230. [PMID: 34659337 PMCID: PMC8515949 DOI: 10.3389/fgene.2021.713230] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 05/22/2021] [Accepted: 07/27/2021] [Indexed: 12/12/2022] Open
Abstract
Since their inception, genome-wide association studies (GWAS) have identified more than a hundred thousand single nucleotide polymorphism (SNP) loci that are associated with various complex human diseases or traits. The majority of GWAS discoveries are located in non-coding regions of the human genome and have unknown functions. The valley between non-coding GWAS discoveries and downstream affected genes hinders the investigation of complex disease mechanisms and the utilization of human genetics for the improvement of clinical care. Meanwhile, advances in high-throughput sequencing technologies reveal important genomic regulatory roles that non-coding regions play in the transcriptional activities of genes. In this review, we focus on data integrative bioinformatics methods that combine GWAS with functional genomics knowledge to identify genetically regulated genes. We categorize and describe two types of data integrative methods. First, we describe fine-mapping methods. Fine-mapping is an exploratory approach that calibrates likely causal variants underneath GWAS signals. Fine-mapping methods connect GWAS signals to potentially causal genes through statistical methods and/or functional annotations. Second, we discuss gene-prioritization methods. These are hypothesis-generating approaches that evaluate whether genetic variants regulate genes via certain genetic regulatory mechanisms to influence complex traits, including colocalization, Mendelian randomization, and the transcriptome-wide association study (TWAS). TWAS is a gene-based association approach that investigates associations between genetically regulated gene expression and complex diseases or traits. TWAS has gained popularity over the years due to its ability to reduce multiple testing burden in comparison to other variant-based analytic approaches. Multiple types of TWAS methods have been developed with varied methodological designs and biological hypotheses over the past 5 years.
We discuss how TWAS methods differ in many aspects and the challenges that different TWAS methods face. Overall, TWAS is a powerful tool for identifying complex trait-associated genes. With the advent of single-cell sequencing, chromosome conformation capture, gene editing technologies, and multiplexing reporter assays, we expect a more comprehensive understanding of genomic regulation and genetically regulated genes underlying complex human diseases and traits in the future.
Affiliation(s)
- Binglan Li
- Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
- Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, United States; Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, United States
117
Wu Y, Yu XL, Xiao X, Li M, Li Y. Joint-Tissue Integrative Analysis Identified Hundreds of Schizophrenia Risk Genes. Mol Neurobiol 2021; 59:107-116. [PMID: 34628600 DOI: 10.1007/s12035-021-02572-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/14/2021] [Accepted: 09/16/2021] [Indexed: 12/14/2022]
Abstract
Genome-wide association studies (GWAS) have identified a large number of schizophrenia risk variants, and most of them are mapped to noncoding regions. By leveraging multiple joint-tissue gene expression data and GWAS data, we herein performed a transcriptome-wide association study (TWAS) and Mendelian randomization (MR) analysis and identified 144 genes whose mRNA levels were related to genetic risk of schizophrenia. Most of these genes exhibited diametrically opposite trends of expression in prenatal and postnatal brain tissues, even though their expression levels in dorsolateral prefrontal cortex (DLPFC) tissues did not significantly differ between schizophrenics and healthy controls. We then found significant enrichment of these genes in dopamine-related pathways that were repeatedly implicated in schizophrenia pathogenesis and in the action of antipsychotic drugs. Gene expression analysis using single cell RNA-sequencing (scRNA-seq) data of mid-gestation fetal brains further revealed enrichment of these genes in glutamatergic excitatory neurons and cycling progenitors. These lines of evidence, consistent with previous findings, confirmed the polygenic nature of schizophrenia and highlighted the involvement of early neurodevelopmental aberrations in this disorder. Further investigations using advanced algorithms in both bulk brain tissues and in single cells and at different developmental stages are necessary to characterize transcriptomic features of schizophrenia pathogenesis along brain development.
Affiliation(s)
- Yong Wu
- Affiliated Wuhan Mental Health Center, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Xiao-Lin Yu
- Affiliated Wuhan Mental Health Center, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Xiao Xiao
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences & Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650201, Yunnan, China
- Ming Li
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences & Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650201, Yunnan, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China
- Yi Li
- Affiliated Wuhan Mental Health Center, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China; Research Center for Psychological and Health Sciences, China University of Geosciences, Wuhan, 430074, Hubei, China
118
Tin A, Köttgen A. Mendelian Randomization Analysis as a Tool to Gain Insights into Causes of Diseases: A Primer. J Am Soc Nephrol 2021; 32:2400-2407. [PMID: 34135084 PMCID: PMC8722812 DOI: 10.1681/asn.2020121760] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 12/18/2020] [Accepted: 05/24/2021] [Indexed: 02/04/2023] Open
Abstract
Many Mendelian randomization (MR) studies have been published recently, with inferences on the causal relationships between risk factors and diseases that have potential implications for clinical research. In nephrology, MR methods have been applied to investigate potential causal relationships of traditional risk factors, lifestyle factors, and biomarkers from omics technologies with kidney function or CKD. This primer summarizes the basic concepts of MR studies, highlighting methods used in recent applications, and emphasizes key elements in conducting and reporting of MR studies that are important for interpreting the results.
Affiliation(s)
- Adrienne Tin
- Department of Medicine, University of Mississippi Medical Center, Jackson, Mississippi
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
| | - Anna Köttgen
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- Department of Data Driven Medicine, Institute of Genetic Epidemiology, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
| |
|
119
|
Wu C, Zhu J, King A, Tong X, Lu Q, Park JY, Wang L, Gao G, Deng HW, Yang Y, Knudsen KE, Rebbeck TR, Long J, Zheng W, Pan W, Conti DV, Haiman CA, Wu L. Novel strategy for disease risk prediction incorporating predicted gene expression and DNA methylation data: a multi-phased study of prostate cancer. Cancer Commun (Lond) 2021; 41:1387-1397. [PMID: 34520132 PMCID: PMC8696216 DOI: 10.1002/cac2.12205] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/03/2021] [Revised: 06/10/2021] [Accepted: 07/26/2021] [Indexed: 12/15/2022] Open
Abstract
Background: DNA methylation and gene expression are known to play important roles in the etiology of human diseases such as prostate cancer (PCa). However, it has not yet been possible to incorporate information on DNA methylation and gene expression into polygenic risk scores (PRSs). Here, we aimed to develop and validate an improved PRS for PCa risk by incorporating genetically predicted gene expression and DNA methylation, and other genomic information using an integrative method. Methods: Using data from the PRACTICAL consortium, we derived multiple sets of genetic scores, including those based on available single‐nucleotide polymorphisms through widely used methods of pruning and thresholding, LDpred, LDpred‐funct, AnnoPred, and EBPRS, as well as PRS constructed using the genetically predicted gene expression and DNA methylation through a revised pruning and thresholding strategy. In the tuning step, using the UK Biobank data (1458 prevalent cases and 1467 controls), we selected PRSs with the best performance. Using an independent set of data from the UK Biobank, we developed an integrative PRS combining information from individual scores. Furthermore, in the testing step, we tested the performance of the integrative PRS in another independent set of UK Biobank data of incident cases and controls. Results: Our constructed PRS had improved performance (C statistics: 76.1%) over PRSs constructed by individual benchmark methods (from 69.6% to 74.7%). Furthermore, our new PRS had much higher risk assessment power than family history. The overall net reclassification improvement was 69.0% by adding PRS to the baseline model compared with 12.5% by adding family history. Conclusions: We developed and validated a new PRS which may improve the utility in predicting the risk of developing PCa. Our innovative method can also be applied to other human diseases to improve risk prediction across multiple outcomes.
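To make the scoring and evaluation steps concrete, the sketch below shows the thresholding half of a generic pruning-and-thresholding PRS together with a C statistic computed by pairwise comparison. It is not the authors' integrative method: the LD-pruning step is omitted, and the function names and inputs are hypothetical.

```python
def polygenic_score(dosages, weights, p_values, p_threshold):
    """Weighted sum of allele dosages over variants passing a p-value cutoff.

    LD pruning is assumed to have been applied upstream (not shown here).
    """
    return sum(
        dosage * weight
        for dosage, weight, p in zip(dosages, weights, p_values)
        if p <= p_threshold
    )


def c_statistic(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties count as half a concordant pair.
    """
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))
```

A C statistic of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from controls, which is the scale on which the reported 76.1% should be read.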
Affiliation(s)
- Chong Wu
- Department of Statistics, Florida State University, Tallahassee, FL, 32304, USA
| | - Jingjing Zhu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, 96813, USA
| | - Austin King
- Department of Statistics, Florida State University, Tallahassee, FL, 32304, USA
| | - Xiaoran Tong
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, 48824, USA
| | - Qing Lu
- Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
| | - Jong Y Park
- Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Liang Wang
- Department of Tumor Biology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Guimin Gao
- Department of Public Health Sciences, University of Chicago, Chicago, IL, 60637, USA
| | - Hong-Wen Deng
- Center of Bioinformatics and Genomics, Department of Global Biostatistics and Data Science, Tulane University, New Orleans, LA, 70112, USA
| | - Yaohua Yang
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, 37203, USA
| | - Karen E Knudsen
- Department of Cancer Biology, Sidney Kimmel Cancer Center, Thomas Jefferson University, Philadelphia, PA, 19107, USA
| | - Timothy R Rebbeck
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, 02115, USA
| | - Jirong Long
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, 37203, USA
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, 37203, USA
| | - Wei Pan
- Division of Biostatistics, University of Minnesota, Minneapolis, MN, 55455, USA
| | - David V Conti
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California/Norris Comprehensive Cancer Center, Los Angeles, CA, 90033, USA
| | - Christopher A Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California/Norris Comprehensive Cancer Center, Los Angeles, CA, 90033, USA
| | - Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, 96813, USA
| |
|
120
|
A transcriptome-wide association study of Alzheimer's disease using prediction models of relevant tissues identifies novel candidate susceptibility genes. Genome Med 2021; 13:141. [PMID: 34470669 PMCID: PMC8408990 DOI: 10.1186/s13073-021-00959-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 04/07/2021] [Accepted: 08/25/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Genome-wide association studies (GWAS) have identified over 56 susceptibility loci associated with Alzheimer's disease (AD), but the genes responsible for these associations remain largely unknown. METHODS: We performed a large transcriptome-wide association study (TWAS) leveraging modified UTMOST (Unified Test for MOlecular SignaTures) prediction models of ten brain tissues that are potentially related to AD to discover novel AD genetic loci and putative target genes in 71,880 (proxy) cases and 383,378 (proxy) controls of European ancestry. RESULTS: We identified 53 genes with predicted expression associations with AD risk at Bonferroni correction threshold (P value < 3.38 × 10⁻⁶). Based on fine-mapping analyses, 21 genes at nine loci showed strong support for being causal. CONCLUSIONS: Our study provides new insights into the etiology and underlying genetic architecture of AD.
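The Bonferroni threshold quoted in the results can be related to the number of tests with a one-line calculation. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state, so the implied number of tested genes is an inference rather than a reported figure.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def implied_number_of_tests(alpha, threshold):
    """Invert the correction: how many tests a given threshold implies."""
    return alpha / threshold


# Assumed family-wise alpha of 0.05 (not stated in the abstract).
n_genes = round(implied_number_of_tests(0.05, 3.38e-6))
```

Under that assumption the threshold corresponds to roughly 14,800 gene-level tests.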
|
121
|
Pathak GA, Singh K, Miller-Fleming TW, Wendt FR, Ehsan N, Hou K, Johnson R, Lu Z, Gopalan S, Yengo L, Mohammadi P, Pasaniuc B, Polimanti R, Davis LK, Mancuso N. Integrative genomic analyses identify susceptibility genes underlying COVID-19 hospitalization. Nat Commun 2021; 12:4569. [PMID: 34315903 PMCID: PMC8316582 DOI: 10.1038/s41467-021-24824-z] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 03/20/2021] [Accepted: 07/07/2021] [Indexed: 12/11/2022] Open
Abstract
Despite rapid progress in characterizing the role of host genetics in SARS-CoV-2 infection, there is limited understanding of genes and pathways that contribute to COVID-19. Here, we integrate a genome-wide association study of COVID-19 hospitalization (7,885 cases and 961,804 controls from the COVID-19 Host Genetics Initiative) with mRNA expression, splicing, and protein levels (n = 18,502). We identify 27 genes related to inflammation and coagulation pathways whose genetically predicted expression was associated with COVID-19 hospitalization. We functionally characterize the 27 genes using phenome- and laboratory-wide association scans in Vanderbilt Biobank (n = 85,460) and identify coagulation-related clinical symptoms, immunologic, and blood-cell-related biomarkers. We replicate these findings across trans-ethnic studies and observe consistent effects in individuals of diverse ancestral backgrounds in Vanderbilt Biobank, pan-UK Biobank, and Biobank Japan. Our study highlights and reconfirms putative causal genes impacting COVID-19 severity and symptomology through the host inflammatory response.
Affiliation(s)
- Gita A Pathak
- Yale School of Medicine, Department of Psychiatry, Division of Human Genetics, New Haven, CT, USA
- Veteran Affairs Connecticut Healthcare System, West Haven, CT, USA
| | - Kritika Singh
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Tyne W Miller-Fleming
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Frank R Wendt
- Yale School of Medicine, Department of Psychiatry, Division of Human Genetics, New Haven, CT, USA
- Veteran Affairs Connecticut Healthcare System, West Haven, CT, USA
| | - Nava Ehsan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Kangcheng Hou
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
| | - Ruth Johnson
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
| | - Zeyun Lu
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Shyamalika Gopalan
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Loic Yengo
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
| | - Pejman Mohammadi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Scripps Translational Science Institute, The Scripps Research Institute, La Jolla, CA, USA
| | - Bogdan Pasaniuc
- Departments of Computational Medicine, Human Genetics, Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Renato Polimanti
- Yale School of Medicine, Department of Psychiatry, Division of Human Genetics, New Haven, CT, USA
- Veteran Affairs Connecticut Healthcare System, West Haven, CT, USA
| | - Lea K Davis
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Nicholas Mancuso
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
| |
|
122
|
Contextualizing genetic risk score for disease screening and rare variant discovery. Nat Commun 2021; 12:4418. [PMID: 34285202 PMCID: PMC8292385 DOI: 10.1038/s41467-021-24387-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/27/2020] [Accepted: 06/07/2021] [Indexed: 11/08/2022] Open
Abstract
Studies of the genetic basis of complex traits have demonstrated a substantial role for common, small-effect variant polygenic burden (PB) as well as large-effect variants (LEV, primarily rare). We identify sufficient conditions in which GWAS-derived PB may be used for well-powered rare pathogenic variant discovery or as a sample prioritization tool for whole-genome or exome sequencing. Through extensive simulations of genetic architectures and generative models of disease liability with parameters informed by empirical data, we quantify the power to detect, among cases, a lower PB in LEV carriers than in non-carriers. Furthermore, we uncover clinically useful conditions wherein the risk derived from the PB is comparable to the LEV-derived risk. The resulting summary-statistics-based methodology (with publicly available software, PB-LEV-SCAN) makes predictions on PB-based LEV screening for 36 complex traits, which we confirm in several disease datasets with available LEV information in the UK Biobank, with important implications on clinical decision-making.
|
123
|
Wu L, Zhu J, Liu D, Sun Y, Wu C. An integrative multiomics analysis identifies putative causal genes for COVID-19 severity. Genet Med 2021; 23:2076-2086. [PMID: 34183789 PMCID: PMC8237048 DOI: 10.1038/s41436-021-01243-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/28/2021] [Revised: 05/24/2021] [Accepted: 05/27/2021] [Indexed: 11/17/2022] Open
Abstract
Purpose: It is critical to identify putative causal targets for SARS coronavirus 2, which may guide drug repurposing options to reduce the public health burden of COVID-19. Methods: We applied complementary methods and a multiphased design to pinpoint the most likely causal genes for COVID-19 severity. First, we applied the cross-methylome omnibus (CMO) test and leveraged data from the COVID-19 Host Genetics Initiative (HGI) comparing 9,986 hospitalized COVID-19 patients and 1,877,672 population controls. Second, we evaluated associations using the complementary S-PrediXcan method and leveraging blood and lung tissue gene expression prediction models. Third, we assessed associations of the identified genes with another COVID-19 phenotype, comparing very severe respiratory confirmed COVID versus population controls. Finally, we applied a fine-mapping method, fine-mapping of gene sets (FOGS), to prioritize putative causal genes. Results: Through analyses of the COVID-19 HGI using complementary CMO and S-PrediXcan methods along with fine-mapping, XCR1, CCR2, SACM1L, OAS3, NSF, WNT3, NAPSA, and IFNAR2 are identified as putative causal genes for COVID-19 severity. Conclusion: We identified eight genes at five genomic loci as putative causal genes for COVID-19 severity.
Affiliation(s)
- Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA.
| | - Jingjing Zhu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA
| | - Duo Liu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA; Department of Pharmacy, Harbin Medical University Cancer Hospital, Harbin, China
| | - Yanfa Sun
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA; College of Life Science, Longyan University, Longyan, Fujian, P. R. China; Fujian Provincial Key Laboratory for the Prevention and Control of Animal Infectious Diseases and Biotechnology, Longyan, Fujian, P. R. China; Key Laboratory of Preventive Veterinary Medicine and Biotechnology (Longyan University), Fujian Province University, Longyan, Fujian, P. R. China
| | - Chong Wu
- Department of Statistics, Florida State University, Tallahassee, FL, USA.
| |
|
124
|
Multilayer modelling of the human transcriptome and biological mechanisms of complex diseases and traits. NPJ Syst Biol Appl 2021; 7:24. [PMID: 34045472 PMCID: PMC8160250 DOI: 10.1038/s41540-021-00186-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/12/2020] [Accepted: 04/28/2021] [Indexed: 01/03/2023] Open
Abstract
Here, we performed a comprehensive intra-tissue and inter-tissue multilayer network analysis of the human transcriptome. We generated an atlas of communities in gene co-expression networks in 49 tissues (GTEx v8), evaluated their tissue specificity, and investigated their methodological implications. UMAP embeddings of gene expression from the communities (representing nearly 18% of all genes) robustly identified biologically-meaningful clusters. Notably, new gene expression data can be embedded into our algorithmically derived models to accelerate discoveries in high-dimensional molecular datasets and downstream diagnostic or prognostic applications. We demonstrate the generalisability of our approach through systematic testing in external genomic and transcriptomic datasets. Methodologically, prioritisation of the communities in a transcriptome-wide association study of the biomarker C-reactive protein (CRP) in 361,194 individuals in the UK Biobank identified genetically-determined expression changes associated with CRP and led to considerably improved performance. Furthermore, a deep learning framework applied to the communities in nearly 11,000 tumors profiled by The Cancer Genome Atlas across 33 different cancer types learned biologically-meaningful latent spaces, representing metastasis (p < 2.2 × 10⁻¹⁶) and stemness (p < 2.2 × 10⁻¹⁶). Our study provides a rich genomic resource to catalyse research into inter-tissue regulatory mechanisms, and their downstream consequences on human disease.
|
125
|
Viñas R, Azevedo T, Gamazon ER, Liò P. Deep Learning Enables Fast and Accurate Imputation of Gene Expression. Front Genet 2021; 12:624128. [PMID: 33927746 PMCID: PMC8076954 DOI: 10.3389/fgene.2021.624128] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/30/2020] [Accepted: 03/12/2021] [Indexed: 11/26/2022] Open
Abstract
A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we propose two novel deep learning methods, PMI and GAIN-GTEx, for gene expression imputation. In order to increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We show that our approaches compare favorably to several standard and state-of-the-art imputation methods in terms of predictive performance and runtime in two case studies and two imputation scenarios. In a comparison conducted on protein-coding genes, PMI attains the highest performance in inductive imputation, whereas GAIN-GTEx outperforms the other methods in in-place imputation. Furthermore, our results indicate strong generalization on RNA-Seq data from 3 cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.
Affiliation(s)
- Ramon Viñas
- Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
| | - Tiago Azevedo
- Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
| | - Eric R Gamazon
- Vanderbilt Genetics Institute and Data Science Institute, VUMC, Nashville, TN, United States; MRC Epidemiology Unit, University of Cambridge, Cambridge, United Kingdom; Clare Hall, University of Cambridge, Cambridge, United Kingdom
| | - Pietro Liò
- Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
| |
|
126
|
Feng H, Mancuso N, Gusev A, Majumdar A, Major M, Pasaniuc B, Kraft P. Leveraging expression from multiple tissues using sparse canonical correlation analysis and aggregate tests improves the power of transcriptome-wide association studies. PLoS Genet 2021; 17:e1008973. [PMID: 33831007 PMCID: PMC8057593 DOI: 10.1371/journal.pgen.1008973] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/29/2020] [Revised: 04/20/2021] [Accepted: 03/16/2021] [Indexed: 12/17/2022] Open
Abstract
Transcriptome-wide association studies (TWAS) test the association between traits and genetically predicted gene expression levels. The power of a TWAS depends in part on the strength of the correlation between a genetic predictor of gene expression and the causally relevant gene expression values. Consequently, TWAS power can be low when expression quantitative trait locus (eQTL) data used to train the genetic predictors have small sample sizes, or when data from causally relevant tissues are not available. Here, we propose to address these issues by integrating multiple tissues in the TWAS using sparse canonical correlation analysis (sCCA). We show that sCCA-TWAS combined with single-tissue TWAS using an aggregate Cauchy association test (ACAT) outperforms traditional single-tissue TWAS. In empirically motivated simulations, the sCCA+ACAT approach yielded the highest power to detect a gene associated with phenotype, even when expression in the causal tissue was not directly measured, while controlling the Type I error when there is no association between gene expression and phenotype. For example, when gene expression explains 2% of the variability in outcome, and the GWAS sample size is 20,000, the average power difference between the ACAT combined test of sCCA features and single-tissue, versus single-tissue combined with Generalized Berk-Jones (GBJ) method, single-tissue combined with S-MultiXcan, UTMOST, or summarizing cross-tissue expression patterns using Principal Component Analysis (PCA) approaches was 5%, 8%, 5% and 38%, respectively. The gain in power is likely due to sCCA cross-tissue features being more likely to be detectably heritable. When applied to publicly available summary statistics from 10 complex traits, the sCCA+ACAT test was able to increase the number of testable genes and identify on average an additional 400 gene-trait associations that single-trait TWAS missed. Our results suggest that aggregating eQTL data across multiple tissues using sCCA can improve the sensitivity of TWAS while controlling for the false positive rate.
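The aggregate Cauchy association test (ACAT) named in this abstract combines p-values by mapping each to a standard Cauchy variate, averaging, and mapping back. The sketch below shows that basic combination rule only; the function name, the default equal weights, and the restriction to p-values strictly between 0 and 1 are assumptions of this illustration, and it omits the approximation usually needed for extremely small p-values.

```python
import math


def acat(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Each p-value must lie strictly between 0 and 1. Weights default to
    equal weighting and are normalised by their sum.
    """
    if weights is None:
        weights = [1.0] * len(p_values)
    if any(p <= 0.0 or p >= 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    total_weight = sum(weights)
    # Weighted average of Cauchy-transformed p-values.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    ) / total_weight
    # Upper tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(statistic) / math.pi
```

In the setting described above, the inputs would be the per-gene p-values from the sCCA features and the single-tissue tests; a single input is returned unchanged, which is a convenient sanity check on the transform.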
Affiliation(s)
- Helian Feng
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Nicholas Mancuso
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Division of Biostatistics, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
| | - Alexander Gusev
- Department of Medical Oncology, Dana-Farber Cancer Institute & Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham & Women’s Hospital, Boston, MA, United States of America
- Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
| | - Arunabha Majumdar
- Department of Human Genetics, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Pathology and Laboratory Medicine, University of California Los Angeles, Los Angeles, California, United States of America
| | - Megan Major
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California, United States of America
| | - Bogdan Pasaniuc
- Department of Human Genetics, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Pathology and Laboratory Medicine, University of California Los Angeles, Los Angeles, California, United States of America
| | - Peter Kraft
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
| |
|
127
|
Li X, Kulkarni AS, Liu X, Gao WQ, Huang L, Hu Z, Qian K. Metal-Organic Framework Hybrids Aid Metabolic Profiling for Colorectal Cancer. SMALL METHODS 2021; 5:e2001001. [PMID: 34927854 DOI: 10.1002/smtd.202001001] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 10/21/2020] [Revised: 12/05/2020] [Indexed: 06/14/2023]
Abstract
Colorectal cancer (CRC) is the third most common fatal cancer worldwide, accounting for ≈10% of cancer-related mortality. Metabolic shift occurs from the very early stage during the development of CRC, which is of significant etiological and diagnostic importance toward precision medicine. Here, an advanced molecular tool to characterize the metabolic alterations in CRC, based on metal-organic framework (MOF) hybrids is reported. Consuming only 500 nL of plasma without any sample pretreatment, MOF hybrids yield direct metabolic fingerprints by laser desorption/ionization mass spectrometry in seconds. A diagnostic prediction model by a machine learning algorithm is constructed, to discriminate CRC patients from normal controls with an average area under the curve of 0.947 for the discovery cohort and 0.912 for the independent validation cohort. In addition, CRC-specific metabolic signature consisting of 34 potential biomarkers, based on the aforementioned diagnostic model is identified. The results advance the design of nanomaterial-based platforms for metabolic analysis and establish a new liquid biopsy tool for CRC screening compatible with the current clinical workflow in practice.
Affiliation(s)
- Xinxing Li
- Department of Gastrointestinal Surgery, Tongji Hospital, Medical College of Tongji University, Shanghai, 200065, P. R. China
- Department of General Surgery, Changzheng Hospital, Naval Medical University, Shanghai, 200003, P. R. China
| | - Anuja Shreeram Kulkarni
- State Key Laboratory for Oncogenes and Related Genes, Division of Cardiology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- School of Biomedical Engineering, and Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
| | - Xun Liu
- State Key Laboratory for Oncogenes and Related Genes, Division of Cardiology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- School of Biomedical Engineering, and Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
| | - Wei-Qiang Gao
- State Key Laboratory for Oncogenes and Related Genes, Division of Cardiology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- School of Biomedical Engineering, and Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
| | - Lin Huang
- Stem Cell Research Center, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
| | - Zhiqian Hu
- Department of Gastrointestinal Surgery, Tongji Hospital, Medical College of Tongji University, Shanghai, 200065, P. R. China
- Department of General Surgery, Changzheng Hospital, Naval Medical University, Shanghai, 200003, P. R. China
| | - Kun Qian
- State Key Laboratory for Oncogenes and Related Genes, Division of Cardiology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 160 Pujian Road, Shanghai, 200127, P. R. China
- School of Biomedical Engineering, and Med-X Research Institute, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
| |
|
128
|
Gerring ZF. Dissecting Genetically Regulated Gene Expression in Major Depression. Biol Psychiatry 2021; 89:e31-e33. [PMID: 33594984 DOI: 10.1016/j.biopsych.2020.12.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/14/2020] [Accepted: 12/14/2020] [Indexed: 12/23/2022]
Affiliation(s)
- Zachary F Gerring
- Translational Neurogenomics, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia.
| |
|