1
Jiang L, Shen J, Darst BF, Haiman CA, Mancuso N, Conti DV. Hierarchical joint analysis of marginal summary statistics-Part II: High-dimensional instrumental analysis of omics data. Genet Epidemiol 2024. [PMID: 38887957 DOI: 10.1002/gepi.22577] [Received: 09/27/2023] [Revised: 04/04/2024] [Accepted: 05/15/2024] [Indexed: 06/20/2024]
Abstract
Instrumental variable (IV) analysis has been widely applied in epidemiology to infer causal relationships using observational data. Genetic variants can also be viewed as valid IVs in Mendelian randomization and transcriptome-wide association studies. However, most multivariate IV approaches cannot scale to high-throughput experimental data. Here, we extend our previous work, a hierarchical model that jointly analyzes marginal summary statistics (hJAM), into a scalable framework (SHA-JAM) that can be applied to a large number of intermediates and a large number of correlated genetic variants, situations often encountered in modern experiments leveraging omic technologies. SHA-JAM aims to estimate the conditional effect of high-dimensional risk factors on an outcome by incorporating estimates from association analyses of single-nucleotide polymorphism (SNP)-intermediate or SNP-gene expression as prior information in a hierarchical model. Results from extensive simulation studies demonstrate that SHA-JAM yields a higher area under the receiver operating characteristic curve (AUC), a lower mean-squared error of the estimates, and a much faster computation speed, compared to an existing approach for similar analyses. In two applied examples for prostate cancer, we investigated metabolite and transcriptome associations, respectively, using summary statistics from a GWAS for prostate cancer with more than 140,000 men and high-dimensional publicly available summary data for metabolites and transcriptomes.
Affiliation(s)
- Lai Jiang
- Department of Population and Public Health Sciences, Division of Biostatistics, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Jiayi Shen
- Department of Population and Public Health Sciences, Division of Biostatistics, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Burcu F Darst
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, Washington, USA
- Christopher A Haiman
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA
- Nicholas Mancuso
- Department of Population and Public Health Sciences, Division of Biostatistics, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA
- David V Conti
- Department of Population and Public Health Sciences, Division of Biostatistics, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA
2
Guo S, Yang J. Bayesian genome-wide TWAS with reference transcriptomic data of brain and blood tissues identified 141 risk genes for Alzheimer's disease dementia. Alzheimers Res Ther 2024; 16:120. [PMID: 38824563 PMCID: PMC11144322 DOI: 10.1186/s13195-024-01488-7] [Received: 07/24/2023] [Accepted: 05/27/2024] [Indexed: 06/03/2024]
Abstract
BACKGROUND Transcriptome-wide association study (TWAS) is an influential tool for identifying genes associated with complex diseases whose genetic effects are likely mediated through the transcriptome. TWAS utilizes reference genetic and transcriptomic data to estimate effect sizes of genetic variants on gene expression (i.e., effect sizes of a broad sense of expression quantitative trait loci, eQTL). These estimated effect sizes are employed as variant weights in gene-based association tests, facilitating the mapping of risk genes with genome-wide association study (GWAS) data. However, most existing TWAS of Alzheimer's disease (AD) dementia are limited to studying only cis-eQTL proximal to the test gene. To overcome this limitation, we applied the Bayesian Genome-wide TWAS (BGW-TWAS) method to leverage both cis- and trans-eQTL of brain and blood tissues, in order to enhance mapping of risk genes for AD dementia. METHODS We first applied BGW-TWAS to the Genotype-Tissue Expression (GTEx) V8 dataset to estimate cis- and trans-eQTL effect sizes of the prefrontal cortex, cortex, and whole blood tissues. Estimated eQTL effect sizes were integrated with the summary data of the most recent GWAS of AD dementia to obtain BGW-TWAS (i.e., gene-based association test) p-values of AD dementia per gene per tissue type. Then we used the aggregated Cauchy association test to combine TWAS p-values across three tissues to obtain omnibus TWAS p-values per gene. RESULTS We identified 85 genes in prefrontal cortex, 82 in cortex, and 76 in whole blood that were significantly associated with AD dementia. By combining BGW-TWAS p-values across these three tissues, we obtained 141 significant risk genes, including 34 genes primarily due to trans-eQTL and 35 mapped risk genes in GWAS Catalog. With these 141 significant risk genes, we detected functional clusters comprising both known mapped GWAS risk genes of AD in GWAS Catalog and our identified TWAS risk genes by protein-protein interaction network analysis, as well as several enriched phenotypes related to AD. CONCLUSION We applied BGW-TWAS and aggregated Cauchy test methods to integrate both cis- and trans-eQTL data of brain and blood tissues with GWAS summary data, identifying 141 TWAS risk genes of AD dementia. These identified risk genes provide novel insights into the underlying biological mechanisms of AD dementia and potential gene targets for therapeutics development.
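The aggregated Cauchy association test named in this abstract combines p-values by mapping each one to a standard Cauchy variate and taking a weighted average. The following is a minimal sketch of that general transform, not the authors' implementation; the function name `acat`, the equal default weights, and the requirement that inputs lie strictly between 0 and 1 are assumptions made for illustration.

```python
import math


def acat(pvalues, weights=None):
    """Combine p-values with the Cauchy combination rule (illustrative sketch).

    pvalues: p-values strictly between 0 and 1 (e.g., one per tissue).
    weights: optional non-negative weights; equal weights are assumed by default.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p >= 1.0 for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(pvalues)
    total = sum(weights)
    # Each p-value becomes a Cauchy variate tan((0.5 - p) * pi);
    # the test statistic is their weighted mean.
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, pvalues)) / total
    # The upper tail of the standard Cauchy distribution gives the omnibus p-value.
    return 0.5 - math.atan(stat) / math.pi


# Hypothetical per-tissue p-values for one gene (made-up numbers).
omnibus_p = acat([0.01, 0.2, 0.6])
```

With a single input the rule returns that p-value unchanged, which is a quick sanity check for any implementation.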
Affiliation(s)
- Shuyi Guo
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA.
3
Parrish RL, Buchman AS, Tasaki S, Wang Y, Avey D, Xu J, De Jager PL, Bennett DA, Epstein MP, Yang J. SR-TWAS: Leveraging Multiple Reference Panels to Improve TWAS Power by Ensemble Machine Learning. medRxiv 2024:2023.06.20.23291605. [PMID: 37425698 PMCID: PMC10327185 DOI: 10.1101/2023.06.20.23291605] [Indexed: 07/11/2023]
Abstract
Multiple reference panels of a given tissue or multiple tissues often exist, and multiple regression methods could be used for training gene expression imputation models for TWAS. To leverage expression imputation models (i.e., base models) trained with multiple reference panels, regression methods, and tissues, we develop a Stacked Regression based TWAS (SR-TWAS) tool which can obtain optimal linear combinations of base models for a given validation transcriptomic dataset. Both simulation and real studies showed that SR-TWAS improved power, due to increased effective training sample sizes and borrowed strength across multiple regression methods and tissues. Leveraging base models across multiple reference panels, tissues, and regression methods, our real application studies identified 6 independent significant risk genes for Alzheimer's disease (AD) dementia for supplementary motor area tissue and 9 independent significant risk genes for Parkinson's disease (PD) for substantia nigra tissue. Relevant biological interpretations were found for these significant risk genes.
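The idea of an "optimal linear combination of base models" on a validation dataset can be illustrated with the simplest case of two base models. The sketch below assumes a convex combination (weights non-negative and summing to one) chosen to minimize squared error; the abstract does not state SR-TWAS's exact objective or constraints, so both are assumptions, and the function names are hypothetical.

```python
def stack_two(y, pred_a, pred_b):
    """Return the weight on model A (model B gets 1 - weight).

    Minimizes sum((y - (w * a + (1 - w) * b)) ** 2) over w in [0, 1].
    Because the objective is a convex quadratic in w, clipping the
    unconstrained least-squares solution to [0, 1] gives the optimum.
    """
    diff = [a - b for a, b in zip(pred_a, pred_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Identical predictions: any weight fits equally well.
        return 0.5
    num = sum((yi - b) * d for yi, b, d in zip(y, pred_b, diff))
    return min(1.0, max(0.0, num / denom))


def stacked_prediction(weight_a, pred_a, pred_b):
    """Blend the two base-model predictions with the stacking weight."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Made-up validation expression values and two base-model predictions.
observed = [1.0, 2.0, 3.0]
model_a = [2.0, 4.0, 6.0]
model_b = [0.0, 0.0, 0.0]
w_a = stack_two(observed, model_a, model_b)
blended = stacked_prediction(w_a, model_a, model_b)
```

With more than two base models the same objective becomes a small constrained least-squares problem, which would normally be handed to a numerical solver.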
4
Mews MA, Naj AC, Griswold AJ, Below JE, Bush WS. Brain and Blood Transcriptome-Wide Association Studies Identify Five Novel Genes Associated with Alzheimer's Disease. medRxiv 2024:2024.04.17.24305737. [PMID: 38699333 PMCID: PMC11065015 DOI: 10.1101/2024.04.17.24305737] [Indexed: 05/05/2024]
Abstract
INTRODUCTION Transcriptome-wide Association Studies (TWAS) extend genome-wide association studies (GWAS) by integrating genetically-regulated gene expression models. We performed the most powerful AD-TWAS to date, using summary statistics from cis-eQTL meta-analyses and the largest clinically-adjudicated Alzheimer's Disease (AD) GWAS. METHODS We implemented the OTTERS TWAS pipeline, leveraging cis-eQTL data from cortical brain tissue (MetaBrain; N=2,683) and blood (eQTLGen; N=31,684) to predict gene expression, then applied these models to AD-GWAS data (Cases=21,982; Controls=44,944). RESULTS We identified and validated five novel gene associations in cortical brain tissue (PRKAG1, C3orf62, LYSMD4, ZNF439, SLC11A2) and six genes proximal to known AD-related GWAS loci (Blood: MYBPC3; Brain: MTCH2, CYB561, MADD, PSMA5, ANXA11). Further, using causal eQTL fine-mapping, we generated sparse models that retained the strength of the AD-TWAS association for MTCH2, MADD, ZNF439, CYB561, and MYBPC3. DISCUSSION Our comprehensive AD-TWAS discovered new gene associations and provided insights into the functional relevance of previously associated variants.
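Applying trained expression weights to GWAS summary data is commonly done with a weighted burden statistic, Z = w'z / sqrt(w'Rw), where w holds the eQTL weights, z the per-variant GWAS Z-scores, and R the variant correlation (LD) matrix from a reference panel. The sketch below illustrates that generic form only; whether the pipeline named in this abstract computes exactly this statistic is an assumption, and the names `twas_z` and `two_sided_p` are hypothetical.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Gene-level Z-score from eQTL weights, GWAS Z-scores, and an LD matrix.

    weights: per-variant expression weights (length m).
    gwas_z:  per-variant GWAS Z-scores (length m), assumed standardized.
    ld:      m x m correlation matrix given as a list of lists.
    """
    m = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of the weighted sum under the null: w' R w.
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(m) for j in range(m))
    if variance <= 0.0:
        raise ValueError("w'Rw must be positive; check the weights and LD matrix")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided normal p-value, 2 * (1 - Phi(|z|)), computed via erfc."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Made-up example: two variants in partial LD.
z_gene = twas_z([0.4, 0.1], [3.0, 1.5], [[1.0, 0.3], [0.3, 1.0]])
p_gene = two_sided_p(z_gene)
```

When the LD matrix is the identity and only one variant has a non-zero positive weight, the gene-level Z reduces to that variant's GWAS Z-score.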
5
Li Q, Bian J, Qian Y, Kossinna P, Gau C, Gordon PMK, Zhou X, Guo X, Yan J, Wu J, Long Q. An expression-directed linear mixed model discovering low-effect genetic variants. Genetics 2024; 226:iyae018. [PMID: 38314848 DOI: 10.1093/genetics/iyae018] [Received: 11/29/2023] [Revised: 11/29/2023] [Accepted: 01/05/2024] [Indexed: 02/07/2024] Open
Abstract
Detecting genetic variants with low-effect sizes using a moderate sample size is difficult, hindering downstream efforts to learn pathology and estimate heritability. In this work, by utilizing informative weights learned from training genetically predicted gene expression models, we formed an alternative approach to estimate the polygenic term in a linear mixed model. Our linear mixed model estimates the genetic background by incorporating the variants' relevance to gene expression. Our protocol, expression-directed linear mixed model, enables the discovery of subtle signals of low-effect variants using a moderate sample size. By applying expression-directed linear mixed model to cohorts of around 5,000 individuals with either binary (WTCCC) or quantitative (NFBC1966) traits, we demonstrated its power gain at the low-effect end of the genetic etiology spectrum. In aggregate, the additional low-effect variants detected by expression-directed linear mixed model substantially improved estimation of missing heritability. Expression-directed linear mixed model moves precision medicine forward by accurately detecting the contribution of low-effect genetic variants to human diseases.
Affiliation(s)
- Qing Li
- Department of Biochemistry & Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Jiayi Bian
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Yanzhao Qian
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Pathum Kossinna
- Department of Biochemistry & Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Cooper Gau
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Paul M K Gordon
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Xiang Zhou
- School of Public Health, University of Michigan, Ann Arbor 48109, USA
- Xingyi Guo
- Department of Medicine & Biomedical Informatics, Vanderbilt University Medical Center, Nashville 37203, USA
- Jun Yan
- Physiology and Pharmacology, University of Calgary, Calgary T2N 1N4, Canada
- Hotchkiss Brain Institute, University of Calgary, Calgary T2N 1N4, Canada
- Jingjing Wu
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Quan Long
- Department of Biochemistry & Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Hotchkiss Brain Institute, University of Calgary, Calgary T2N 1N4, Canada
- Department of Medical Genetics, University of Calgary, Calgary T2N 1N4, Canada
6
Hu T, Dai Q, Epstein MP, Yang J. Proteome-wide association studies using summary proteomic data identified 23 risk genes of Alzheimer's disease. medRxiv 2024:2024.03.28.24305044. [PMID: 38585769 PMCID: PMC10996749 DOI: 10.1101/2024.03.28.24305044] [Indexed: 04/09/2024]
Abstract
Characterizing the genetic mechanisms underlying Alzheimer's disease (AD) dementia is crucial for developing new therapeutics. Proteome-wide association study (PWAS), which integrates proteomics data with genome-wide association study (GWAS) summary data, has been shown to be a powerful tool for detecting risk genes. The identified PWAS risk genes can be interpreted as having genetic effects mediated through the genetically regulated protein abundances. Existing PWAS analyses of AD often rely on the availability of individual-level proteomics and genetics data of a reference cohort. Leveraging summary-level protein quantitative trait loci (pQTL) reference data of multiple relevant tissues is expected to improve PWAS findings for studying AD. Here, we applied our recently developed OTTERS tool to conduct PWAS of AD dementia, by leveraging summary-level pQTL data of brain, cerebrospinal fluid (CSF), and plasma tissues, and multiple statistical methods. For each target protein, imputation models of the protein abundance with genetic predictors were trained from summary-level pQTL data, estimating a set of pQTL weights for the considered genetic predictors. PWAS p-values were obtained by integrating GWAS summary data of AD dementia with estimated pQTL weights. PWAS p-values from multiple statistical methods were combined by the aggregated Cauchy association test to yield one omnibus PWAS p-value for the target protein. We identified significant PWAS risk genes through omnibus PWAS p-values and analyzed their protein-protein interactions using STRING. Their potential causal effects were assessed by the probabilistic Mendelian randomization (PMR-Egger). As a result, we identified a total of 23 significant PWAS risk genes for AD dementia in brain, CSF, and plasma tissues, including 7 novel findings. We showed that 15 of these risk genes were interconnected within a protein-protein interaction network involving the well-known AD risk gene APOE and 5 novel findings, and enriched in immune functions and lipid pathways including positive regulation of immune system process, positive regulation of macrophage proliferation, humoral immune response, and high-density lipoprotein particle clearance. Existing biological evidence was found to relate our novel findings to AD. We validated the mediated causal effects of 14 risk genes (60.8%). In conclusion, we identified both known and novel PWAS risk genes, providing novel insights into the genetic mechanisms in brain, CSF, and plasma tissues, and targeted therapeutics development of AD dementia. Our study also demonstrated the effectiveness of integrating publicly available summary-level pQTL data with GWAS summary data for mapping risk genes of complex human diseases.
Affiliation(s)
- Tingyang Hu
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Division of Biostatistics and Bioinformatics, Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Qile Dai
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA, 30322, USA
- Michael P. Epstein
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
7
He J, Li Q, Zhang Q. rvTWAS: identifying gene-trait association using sequences by utilizing transcriptome-directed feature selection. Genetics 2024; 226:iyad204. [PMID: 38001381 DOI: 10.1093/genetics/iyad204] [Received: 10/20/2023] [Revised: 11/14/2023] [Accepted: 11/16/2023] [Indexed: 11/26/2023] Open
Abstract
Toward the identification of the genetic basis of complex traits, transcriptome-wide association study (TWAS) is successful in integrating transcriptome data. However, TWAS is only applicable for common variants, excluding rare variants in exome or whole-genome sequences. This is partly because of the inherent limitation of TWAS protocols that rely on predicting gene expressions. Our previous research has revealed an insight into TWAS: the 2 steps in TWAS, building and applying the expression prediction models, are essentially genetic feature selection and aggregation that do not have to involve predictions. Based on this insight disentangling TWAS, rare variants' inability to predict expression traits is no longer an obstacle. Herein, we developed "rare variant TWAS," or rvTWAS, that first uses a Bayesian model to conduct expression-directed feature selection and then uses a kernel machine to carry out feature aggregation, forming a model leveraging expressions for association mapping including rare variants. We demonstrated the performance of rvTWAS by thorough simulations and real data analysis in 3 psychiatric disorders, namely schizophrenia, bipolar disorder, and autism spectrum disorder. We confirmed that rvTWAS outperforms existing TWAS protocols and revealed additional genes underlying psychiatric disorders. Particularly, we formed a hypothetical mechanism in which zinc finger genes impact all 3 disorders through transcriptional regulations. rvTWAS will open a door for sequence-based association mappings integrating gene expressions.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Qing Li
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Qingrun Zhang
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Arnie Charbonneau Cancer Institute, University of Calgary, Calgary T2N 1N4, Canada
8
Liu L, Yan R, Guo P, Ji J, Gong W, Xue F, Yuan Z, Zhou X. Conditional transcriptome-wide association study for fine-mapping candidate causal genes. Nat Genet 2024; 56:348-356. [PMID: 38279040 DOI: 10.1038/s41588-023-01645-y] [Received: 01/18/2023] [Accepted: 12/08/2023] [Indexed: 01/28/2024]
Abstract
Transcriptome-wide association studies (TWASs) aim to integrate genome-wide association studies with expression-mapping studies to identify genes with genetically predicted expression (GReX) associated with a complex trait. In the present report, we develop a method, GIFT (gene-based integrative fine-mapping through conditional TWAS), that performs conditional TWAS analysis by explicitly controlling for GReX of all other genes residing in a local region to fine-map putatively causal genes. GIFT is frequentist in nature, explicitly models both expression correlation and cis-single nucleotide polymorphism linkage disequilibrium across multiple genes and uses a likelihood framework to account for expression prediction uncertainty. As a result, GIFT produces calibrated P values and is effective for fine-mapping. We apply GIFT to analyze six traits in the UK Biobank, where GIFT narrows down the set size of putatively causal genes by 32.16-91.32% compared with existing TWAS fine-mapping approaches. The genes identified by GIFT highlight the importance of vessel regulation in determining blood pressures and lipid metabolism for regulating lipid levels.
Affiliation(s)
- Lu Liu
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Ran Yan
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Ping Guo
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Jiadong Ji
- Institute for Financial Studies, Shandong University, Jinan, China
- Weiming Gong
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Fuzhong Xue
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Zhongshang Yuan
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China.
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China.
- Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA.
9
Shi JJ, Mao CY, Guo YZ, Fan Y, Hao XY, Li SJ, Tian J, Hu ZW, Li MJ, Li JD, Ma DR, Guo MN, Zuo CY, Liang YY, Xu YM, Yang J, Shi CH. Joint analysis of proteome, transcriptome, and multi-trait analysis to identify novel Parkinson's disease risk genes. Aging (Albany NY) 2024; 16:1555-1580. [PMID: 38240717 PMCID: PMC10866412 DOI: 10.18632/aging.205444] [Received: 02/20/2023] [Accepted: 12/04/2023] [Indexed: 02/06/2024]
Abstract
Genome-wide association studies (GWAS) have identified multiple risk variants for Parkinson's disease (PD). Nevertheless, how the risk variants confer the risk of PD remains largely unknown. We conducted a proteome-wide association study (PWAS) and summary-data-based Mendelian randomization (SMR) analysis by integrating PD GWAS with proteome and protein quantitative trait loci (pQTL) data from human brain, plasma, and CSF. We also performed a large transcriptome-wide association study (TWAS) and fine-mapping of causal gene sets (FOCUS), leveraging joint-tissue imputation (JTI) prediction models of 22 tissues to identify and prioritize putatively causal genes. To boost statistical power and identify additional PD risk genes, we further conducted PWAS, SMR, TWAS, and FOCUS using a multi-trait analysis of GWAS (MTAG). In this large-scale study, we identified 16 genes whose genetically regulated protein abundance levels were associated with Parkinson's disease risk. We undertook a large-scale analysis of PD and correlated traits, through TWAS and FOCUS studies, and discovered 26 causal genes related to PD that had not been reported in previous TWAS. Five genes (CD38, GPNMB, RAB29, TMEM175, TTC19) showed significant associations with PD at both the proteome-wide and transcriptome-wide levels. Our study provides new insights into the etiology and underlying genetic architecture of PD.
Affiliation(s)
- Jing-Jing Shi
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Cheng-Yuan Mao
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Ya-Zhou Guo
- School of Life Sciences, Westlake University, Hangzhou 310024, Zhejiang, China
- Yu Fan
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Xiao-Yan Hao
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Shuang-Jie Li
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Jie Tian
- Zhengzhou Railway Vocational and Technical College, Zhengzhou 450000, Henan, China
- Zheng-Wei Hu
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Meng-Jie Li
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Jia-Di Li
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Dong-Rui Ma
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Meng-Nan Guo
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Chun-Yan Zuo
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Yuan-Yuan Liang
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Yu-Ming Xu
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- NHC Key Laboratory of Prevention and Treatment of Cerebrovascular Diseases, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Henan Key Laboratory of Cerebrovascular Diseases, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Institute of Neuroscience, Zhengzhou University, Zhengzhou 450000, Henan, China
- Jian Yang
- School of Life Sciences, Westlake University, Hangzhou 310024, Zhejiang, China
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, Zhejiang, China
- Chang-He Shi
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- NHC Key Laboratory of Prevention and Treatment of Cerebrovascular Diseases, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Henan Key Laboratory of Cerebrovascular Diseases, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Institute of Neuroscience, Zhengzhou University, Zhengzhou 450000, Henan, China
10
He J, Antonyan L, Zhu H, Ardila K, Li Q, Enoma D, Zhang W, Liu A, Chekouo T, Cao B, MacDonald ME, Arnold PD, Long Q. A statistical method for image-mediated association studies discovers genes and pathways associated with four brain disorders. Am J Hum Genet 2024; 111:48-69. [PMID: 38118447 PMCID: PMC10806749 DOI: 10.1016/j.ajhg.2023.11.006] [Received: 07/03/2023] [Revised: 11/04/2023] [Accepted: 11/16/2023] [Indexed: 12/22/2023] Open
Abstract
Brain imaging and genomics are critical tools enabling characterization of the genetic basis of brain disorders. However, imaging large cohorts is expensive and may be unavailable for legacy datasets used for genome-wide association studies (GWASs). Using an integrated feature selection/aggregation model, we developed an image-mediated association study (IMAS), which utilizes borrowed imaging/genomics data to conduct association mapping in legacy GWAS cohorts. By leveraging the UK Biobank image-derived phenotypes (IDPs), the IMAS discovered genetic bases underlying four neuropsychiatric disorders and verified them by analyzing annotations, pathways, and expression quantitative trait loci (eQTLs). A cerebellar-mediated mechanism was identified to be common to the four disorders. Simulations show that, if the goal is identifying genetic risk, our IMAS is more powerful than a hypothetical protocol in which the imaging results were available in the GWAS dataset. This implies the feasibility of reanalyzing legacy GWAS datasets without conducting additional imaging, yielding cost savings for integrated analysis of genetics and imaging.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Lilit Antonyan
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Harold Zhu
- Department of Biological Sciences, Faculty of Science, University of Calgary, Calgary, AB, Canada
- Karen Ardila
- Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada
- Qing Li
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- David Enoma
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Andy Liu
- Sir Winston Churchill High School, Calgary, AB, Canada; College of Letters and Science, University of California, Los Angeles, Los Angeles, CA, USA
- Thierry Chekouo
- Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada; Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Bo Cao
- Department of Psychiatry, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, AB, Canada
- M Ethan MacDonald
- The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Electrical and Software Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Radiology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Paul D Arnold
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Psychiatry, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
- Quan Long
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada.
11
|
Chandrashekar PB, Alatkar S, Wang J, Hoffman GE, He C, Jin T, Khullar S, Bendl J, Fullard JF, Roussos P, Wang D. DeepGAMI: deep biologically guided auxiliary learning for multimodal integration and imputation to improve genotype-phenotype prediction. Genome Med 2023; 15:88. [PMID: 37904203 PMCID: PMC10617196 DOI: 10.1186/s13073-023-01248-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Accepted: 10/16/2023] [Indexed: 11/01/2023]
Abstract
BACKGROUND Genotypes are strongly associated with disease phenotypes, particularly in brain disorders. However, the molecular and cellular mechanisms behind this association remain elusive. With emerging multimodal data for these mechanisms, machine learning methods can be applied for phenotype prediction at different scales, but due to the black-box nature of machine learning, integrating these modalities and interpreting biological mechanisms can be challenging. Additionally, the partial availability of these multimodal data presents a challenge in developing these predictive models. METHOD To address these challenges, we developed DeepGAMI, an interpretable neural network model to improve genotype-phenotype prediction from multimodal data. DeepGAMI leverages functional genomic information, such as eQTLs and gene regulation, to guide neural network connections. Additionally, it includes an auxiliary learning layer for cross-modal imputation allowing the imputation of latent features of missing modalities and thus predicting phenotypes from a single modality. Finally, DeepGAMI uses integrated gradient to prioritize multimodal features for various phenotypes. RESULTS We applied DeepGAMI to several multimodal datasets including genotype and bulk and cell-type gene expression data in brain diseases, and gene expression and electrophysiology data of mouse neuronal cells. Using cross-validation and independent validation, DeepGAMI outperformed existing methods for classifying disease types, and cellular and clinical phenotypes, even using single modalities (e.g., AUC score of 0.79 for Schizophrenia and 0.73 for cognitive impairment in Alzheimer's disease). CONCLUSION We demonstrated that DeepGAMI improves phenotype prediction and prioritizes phenotypic features and networks in multiple multimodal datasets in complex brains and brain diseases. Also, it prioritized disease-associated variants, genes, and regulatory networks linked to different phenotypes, providing novel insights into the interpretation of gene regulatory mechanisms. DeepGAMI is open-source and available for general use.
Affiliation(s)
- Pramod Bharadwaj Chandrashekar
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Sayali Alatkar
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Jiebiao Wang
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, 15261, USA
- Gabriel E Hoffman
- Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Chenfeng He
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Ting Jin
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Saniya Khullar
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Jaroslav Bendl
- Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- John F Fullard
- Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Panos Roussos
- Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY, 10468, USA
- Center for Dementia Research, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, 10962, USA
- Daifeng Wang
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA.
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA.
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53076, USA.
12
|
Statsenko Y, Kuznetsov NV, Morozova D, Liaonchyk K, Simiyu GL, Smetanina D, Kashapov A, Meribout S, Gorkom KNV, Hamoudi R, Ismail F, Ansari SA, Emerald BS, Ljubisavljevic M. Reappraisal of the Concept of Accelerated Aging in Neurodegeneration and Beyond. Cells 2023; 12:2451. [PMID: 37887295 PMCID: PMC10605227 DOI: 10.3390/cells12202451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 09/01/2023] [Accepted: 09/06/2023] [Indexed: 10/28/2023]
Abstract
BACKGROUND Genetic and epigenetic changes, oxidative stress and inflammation influence the rate of aging, which diseases, lifestyle and environmental factors can further accelerate. In accelerated aging (AA), the biological age exceeds the chronological age. OBJECTIVE The objective of this study is to reappraise the AA concept critically, considering its weaknesses and limitations. METHODS We reviewed more than 300 recent articles dealing with the physiology of brain aging and neurodegeneration pathophysiology. RESULTS (1) Application of the AA concept to individual organs outside the brain is challenging as organs of different systems age at different rates. (2) There is a need to consider the deceleration of aging due to the potential use of the individual structure-functional reserves. The latter can be restored by pharmacological and/or cognitive therapy, environment, etc. (3) The AA concept lacks both standardised terminology and methodology. (4) Changes in specific molecular biomarkers (MBM) reflect aging-related processes; however, numerous MBM candidates should be validated to consolidate the AA theory. (5) The exact nature of many potential causal factors, biological outcomes and interactions between the former and the latter remain largely unclear. CONCLUSIONS Although AA is commonly recognised as a perspective theory, it still suffers from a number of gaps and limitations that assume the necessity for an updated AA concept.
Affiliation(s)
- Yauhen Statsenko
- Department of Radiology, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates; (Y.S.); (G.L.S.); (D.S.); (A.K.); (S.M.); (K.N.-V.G.)
- ASPIRE Precision Medicine Research Institute Abu Dhabi, United Arab Emirates University, Al Ain 27272, United Arab Emirates; (D.M.); (K.L.); (R.H.); (S.A.A.); (B.S.E.); (M.L.)
- Big Data Analytic Center, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Nik V. Kuznetsov
- ASPIRE Precision Medicine Research Institute Abu Dhabi, United Arab Emirates University, Al Ain 27272, United Arab Emirates; (D.M.); (K.L.); (R.H.); (S.A.A.); (B.S.E.); (M.L.)
- Daria Morozova
- ASPIRE Precision Medicine Research Institute Abu Dhabi, United Arab Emirates University, Al Ain 27272, United Arab Emirates; (D.M.); (K.L.); (R.H.); (S.A.A.); (B.S.E.); (M.L.)
- Katsiaryna Liaonchyk
- ASPIRE Precision Medicine Research Institute Abu Dhabi, United Arab Emirates University, Al Ain 27272, United Arab Emirates; (D.M.); (K.L.); (R.H.); (S.A.A.); (B.S.E.); (M.L.)
- Gillian Lylian Simiyu
- Department of Radiology, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates; (Y.S.); (G.L.S.); (D.S.); (A.K.); (S.M.); (K.N.-V.G.)
- Darya Smetanina
- Department of Radiology, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates; (Y.S.); (G.L.S.); (D.S.); (A.K.); (S.M.); (K.N.-V.G.)
- Aidar Kashapov
- Department of Radiology, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates; (Y.S.); (G.L.S.); (D.S.); (A.K.); (S.M.); (K.N.-V.G.)
- Sarah Meribout
- Department of Radiology, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates; (Y.S.); (G.L.S.); (D.S.); (A.K.); (S.M.); (K.N.-V.G.)
- Klaus Neidl-Van Gorkom
- Department of Radiology, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates; (Y.S.); (G.L.S.); (D.S.); (A.K.); (S.M.); (K.N.-V.G.)
- Rifat Hamoudi
- ASPIRE Precision Medicine Research Institute Abu Dhabi, United Arab Emirates University, Al Ain 27272, United Arab Emirates; (D.M.); (K.L.); (R.H.); (S.A.A.); (B.S.E.); (M.L.)
- Department of Clinical Sciences, College of Medicine, University of Sharjah, Sharjah 27272, United Arab Emirates
- Division of Surgery and Interventional Science, University College London, London NW3 2PS, UK
- Fatima Ismail
- Department of Pediatrics, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Suraiya Anjum Ansari
- ASPIRE Precision Medicine Research Institute Abu Dhabi, United Arab Emirates University, Al Ain 27272, United Arab Emirates; (D.M.); (K.L.); (R.H.); (S.A.A.); (B.S.E.); (M.L.)
- Department of Biochemistry and Molecular Biology, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Bright Starling Emerald
- ASPIRE Precision Medicine Research Institute Abu Dhabi, United Arab Emirates University, Al Ain 27272, United Arab Emirates; (D.M.); (K.L.); (R.H.); (S.A.A.); (B.S.E.); (M.L.)
- Department of Anatomy, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Milos Ljubisavljevic
- ASPIRE Precision Medicine Research Institute Abu Dhabi, United Arab Emirates University, Al Ain 27272, United Arab Emirates; (D.M.); (K.L.); (R.H.); (S.A.A.); (B.S.E.); (M.L.)
- Department of Physiology, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
13
|
He J, Wen W, Ping J, Li Q, Chen Z, Perera D, Shu X, Long J, Cai Q, Shu XO, Zheng W, Long Q, Guo X. Enhancing Disease Risk Gene Discovery by Integrating Transcription Factor-Linked Trans-located Variants into Transcriptome-Wide Association Analyses. medRxiv 2023:2023.10.10.23295443. [PMID: 37873299 PMCID: PMC10593059 DOI: 10.1101/2023.10.10.23295443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Transcriptome-wide association studies (TWAS) have been successful in identifying putative disease susceptibility genes by integrating gene expression predictions with genome-wide association studies (GWAS) data. However, current TWAS models only consider cis-located variants to predict gene expression. Here, we introduce transTF-TWAS, which includes transcription factor (TF)-linked trans-located variants for model building. Using data from the Genotype-Tissue Expression project, we predict alternative splicing and gene expression and applied these models to large GWAS datasets for breast, prostate, and lung cancers. Our analysis revealed 887 putative cancer susceptibility genes, including 465 in regions not yet reported by previous GWAS and 137 in known GWAS loci but not yet reported previously, at Bonferroni-corrected P < 0.05. We demonstrate that transTF-TWAS surpasses other approaches in both building gene prediction models and identifying disease-associated genes. These results have shed new light on several genetically driven key regulators and their associated regulatory networks underlying disease susceptibility.
14
|
Pividori M, Lu S, Li B, Su C, Johnson ME, Wei WQ, Feng Q, Namjou B, Kiryluk K, Kullo IJ, Luo Y, Sullivan BD, Voight BF, Skarke C, Ritchie MD, Grant SFA, Greene CS. Projecting genetic associations through gene expression patterns highlights disease etiology and drug mechanisms. Nat Commun 2023; 14:5562. [PMID: 37689782 PMCID: PMC10492839 DOI: 10.1038/s41467-023-41057-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/05/2021] [Accepted: 08/18/2023] [Indexed: 09/11/2023]
Abstract
Genes act in concert with each other in specific contexts to perform their functions. Determining how these genes influence complex traits requires a mechanistic understanding of expression regulation across different conditions. It has been shown that this insight is critical for developing new therapies. Transcriptome-wide association studies have helped uncover the role of individual genes in disease-relevant mechanisms. However, modern models of the architecture of complex traits predict that gene-gene interactions play a crucial role in disease origin and progression. Here we introduce PhenoPLIER, a computational approach that maps gene-trait associations and pharmacological perturbation data into a common latent representation for a joint analysis. This representation is based on modules of genes with similar expression patterns across the same conditions. We observe that diseases are significantly associated with gene modules expressed in relevant cell types, and our approach is accurate in predicting known drug-disease pairs and inferring mechanisms of action. Furthermore, using a CRISPR screen to analyze lipid regulation, we find that functionally important players lack associations but are prioritized in trait-associated modules by PhenoPLIER. By incorporating groups of co-expressed genes, PhenoPLIER can contextualize genetic associations and reveal potential targets missed by single-gene strategies.
Affiliation(s)
- Milton Pividori
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Sumei Lu
- Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Binglan Li
- Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Chun Su
- Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Matthew E Johnson
- Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Wei-Qi Wei
- Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Qiping Feng
- Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Bahram Namjou
- Cincinnati Children's Hospital Medical Center, Cincinnati, OH, 45229, USA
- Krzysztof Kiryluk
- Department of Medicine, Division of Nephrology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, 10032, USA
- Yuan Luo
- Northwestern University, Chicago, IL, 60611, USA
- Blair D Sullivan
- Kahlert School of Computing, University of Utah, Salt Lake City, UT, 84112, USA
- Benjamin F Voight
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Carsten Skarke
- Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Marylyn D Ritchie
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Struan F A Grant
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Division of Endocrinology and Diabetes, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Casey S Greene
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
- Center for Health AI, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
15
|
de Leeuw C, Werme J, Savage JE, Peyrot WJ, Posthuma D. On the interpretation of transcriptome-wide association studies. PLoS Genet 2023; 19:e1010921. [PMID: 37676898 PMCID: PMC10508613 DOI: 10.1371/journal.pgen.1010921] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/04/2023] [Revised: 09/19/2023] [Accepted: 08/15/2023] [Indexed: 09/09/2023]
Abstract
Transcriptome-wide association studies (TWAS) aim to detect relationships between gene expression and a phenotype, and are commonly used for secondary analysis of genome-wide association study (GWAS) results. Results from TWAS analyses are often interpreted as indicating a genetic relationship between gene expression and a phenotype, but this interpretation is not consistent with the null hypothesis that is evaluated in the traditional TWAS framework. In this study we provide a mathematical outline of this TWAS framework, and elucidate what interpretations are warranted given the null hypothesis it actually tests. We then use both simulations and real data analysis to assess the implications of misinterpreting TWAS results as indicative of a genetic relationship between gene expression and the phenotype. Our simulation results show considerably inflated type 1 error rates for TWAS when interpreted this way, with 41% of significant TWAS associations detected in the real data analysis found to have insufficient statistical evidence to infer such a relationship. This demonstrates that in current implementations, TWAS cannot reliably be used to investigate genetic relationships between gene expression and a phenotype, but that local genetic correlation analysis can serve as a potential alternative.
Affiliation(s)
- Christiaan de Leeuw
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Josefin Werme
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Jeanne E. Savage
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Wouter J. Peyrot
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Department of Psychiatry, Amsterdam UMC, location VUmc, Amsterdam, the Netherlands
- Danielle Posthuma
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Department of Child and Adolescent Psychology and Psychiatry, section Complex Trait Genetics, Amsterdam Neuroscience, VU University Medical Centre, Amsterdam, The Netherlands
16
|
Guo S, Yang J. Bayesian genome-wide TWAS with reference transcriptomic data of brain and blood tissues identified 93 risk genes for Alzheimer's disease dementia. medRxiv 2023:2023.07.06.23292336. [PMID: 37503151 PMCID: PMC10370241 DOI: 10.1101/2023.07.06.23292336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Background Transcriptome-wide association study (TWAS) is an influential tool for identifying novel genes associated with complex diseases, where their genetic effects may be mediated through transcriptome. TWAS utilizes reference genetic and transcriptomic data to estimate genetic effect sizes on expression quantitative traits of target genes (i.e., effect sizes of a broad sense of expression quantitative trait loci, eQTL). These estimated effect sizes are then employed as variant weights in burden gene-based association test statistics, facilitating the mapping of risk genes for complex diseases with genome-wide association study (GWAS) data. However, most existing TWAS of Alzheimer's disease (AD) dementia have primarily focused on cis-eQTL, disregarding potential trans-eQTL. To overcome this limitation, we applied the Bayesian Genome-wide TWAS (BGW-TWAS) method which incorporated both cis- and trans-eQTL of brain and blood tissues to enhance mapping risk genes for AD dementia. Methods We first applied BGW-TWAS to the Genotype-Tissue Expression (GTEx) V8 dataset to estimate cis- and trans-eQTL effect sizes of the prefrontal cortex, cortex, and whole blood tissues. Subsequently, estimated eQTL effect sizes were integrated with the summary data of the most recent GWAS of AD dementia to obtain BGW-TWAS (i.e., gene-based association test) p-values of AD dementia per tissue type. Finally, we used the aggregated Cauchy association test to combine TWAS p-values across three tissues to obtain omnibus TWAS p-values per gene. Results We identified 37 genes in prefrontal cortex, 55 in cortex, and 51 in whole blood that were significantly associated with AD dementia. By combining BGW-TWAS p-values across these three tissues, we obtained 93 significant risk genes including 29 genes primarily due to trans-eQTL and 50 novel genes. Utilizing protein-protein interaction network and phenotype enrichment analyses with these 93 significant risk genes, we detected 5 functional clusters comprised of both known and novel AD risk genes and 7 enriched phenotypes. Conclusion We applied BGW-TWAS and aggregated Cauchy test methods to integrate both cis- and trans-eQTL data of brain and blood tissues with GWAS summary data to identify risk genes of AD dementia. The risk genes we identified provide novel insights into the underlying biological pathways implicated in AD dementia.
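The step of combining per-tissue p-values with the aggregated Cauchy association test can be illustrated with a minimal sketch. It uses the standard Cauchy combination formula, T = Σ w_i tan((0.5 − p_i)π) with p ≈ 0.5 − arctan(T / Σ w_i)/π; the function name `acat_combine`, the equal default weights, and the example p-values are assumptions for illustration and are not taken from the BGW-TWAS software.

```python
import math


def acat_combine(pvalues, weights=None):
    """Combine p-values with the Cauchy combination (ACAT) formula.

    Hypothetical helper: equal weights are assumed when none are given.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and p-values must have the same length")

    total_weight = sum(weights)
    # Each p-value is mapped to a standard Cauchy variate; the weighted
    # average of these variates is again standard Cauchy under the null.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues)
    ) / total_weight
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(statistic) / math.pi


# Made-up p-values for one gene in three tissues (illustration only).
tissue_pvalues = [0.001, 0.5, 0.5]
omnibus_p = acat_combine(tissue_pvalues)
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the transformation.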
17
|
Wang YH, Luo PP, Geng AY, Li X, Liu TH, He YJ, Huang L, Tang YQ. Identification of highly reliable risk genes for Alzheimer's disease through joint-tissue integrative analysis. Front Aging Neurosci 2023; 15:1183119. [PMID: 37416324 PMCID: PMC10320295 DOI: 10.3389/fnagi.2023.1183119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Accepted: 05/30/2023] [Indexed: 07/08/2023]
Abstract
Numerous genetic variants associated with Alzheimer's disease (AD) have been identified through genome-wide association studies (GWAS), but their interpretation is hindered by the strong linkage disequilibrium (LD) among the variants, making it difficult to identify the causal variants directly. To address this issue, the transcriptome-wide association study (TWAS) was employed to infer the association between gene expression and a trait at the genetic level using expression quantitative trait locus (eQTL) cohorts. In this study, we applied the TWAS theory and utilized the improved Joint-Tissue Imputation (JTI) approach and Mendelian Randomization (MR) framework (MR-JTI) to identify potential AD-associated genes. By integrating LD score, GTEx eQTL data, and GWAS summary statistic data from a large cohort using MR-JTI, a total of 415 AD-associated genes were identified. Then, 2873 differentially expressed genes from 11 AD-related datasets were used for the Fisher test of these AD-associated genes. We finally obtained 36 highly reliable AD-associated genes, including APOC1, CR1, ERBB2, and RIN3. Moreover, the GO and KEGG enrichment analysis revealed that these genes are primarily involved in antigen processing and presentation, amyloid-beta formation, tau protein binding, and response to oxidative stress. The identification of these potential AD-associated genes not only provides insights into the pathogenesis of AD but also offers biomarkers for early diagnosis of the disease.
Affiliation(s)
- Yong Heng Wang
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
- Joint International Research Laboratory of Reproduction and Development, Chongqing Medical University, Chongqing, China
- Pan Pan Luo
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
- Ao Yi Geng
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
- Xinwei Li
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
- Tai-Hang Liu
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
- Joint International Research Laboratory of Reproduction and Development, Chongqing Medical University, Chongqing, China
- Yi Jie He
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
- Lin Huang
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
- Ya Qin Tang
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
18
|
Dai Q, Zhou G, Zhao H, Võsa U, Franke L, Battle A, Teumer A, Lehtimäki T, Raitakari OT, Esko T, Epstein MP, Yang J. OTTERS: a powerful TWAS framework leveraging summary-level reference data. Nat Commun 2023; 14:1271. [PMID: 36882394 PMCID: PMC9992663 DOI: 10.1038/s41467-023-36862-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/07/2022] [Accepted: 02/20/2023] [Indexed: 03/09/2023]
Abstract
Most existing TWAS tools require individual-level eQTL reference data and thus are not applicable to summary-level reference eQTL datasets. The development of TWAS methods that can harness summary-level reference data is valuable to enable TWAS in broader settings and enhance power due to increased reference sample size. Thus, we develop a TWAS framework called OTTERS (Omnibus Transcriptome Test using Expression Reference Summary data) that adapts multiple polygenic risk score (PRS) methods to estimate eQTL weights from summary-level eQTL reference data and conducts an omnibus TWAS. We show that OTTERS is a practical and powerful TWAS tool by both simulations and application studies.
Affiliation(s)
- Qile Dai
- Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA, 30322, USA
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Geyu Zhou
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
- Hongyu Zhao
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, 06520, USA
- Urmo Võsa
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 50090, Tartu, Estonia
- Lude Franke
- Department of Genetics, University of Groningen, University Medical Center Groningen, 9700 RB, Groningen, The Netherlands
- Oncode Institute, 3521 AL, Utrecht, The Netherlands
| | - Alexis Battle
- Department of Computer Science, and Departments of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
| | - Alexander Teumer
- Institute for Community Medicine, University Medicine Greifswald, 17489, Greifswald, Germany
| | - Terho Lehtimäki
- Department of Clinical Chemistry, Fimlab Laboratories and Finnish Centre for Cardiovascular Disease Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, 33520, Finland
| | - Olli T Raitakari
- Centre for Population Health Research, University of Turku and Turku University Hospital, 20520, Turku, Finland
- Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, 20520, Turku, Finland
- Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, 20521, Turku, Finland
| | - Tõnu Esko
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 50090, Tartu, Estonia
| | - Michael P Epstein
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA.
| | - Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA.
| |
Collapse
|
19
|
CoNet: Efficient Network Regression for Survival Analysis in Transcriptome-Wide Association Studies—With Applications to Studies of Breast Cancer. Genes (Basel) 2023; 14:genes14030586. [PMID: 36980857 PMCID: PMC10048118 DOI: 10.3390/genes14030586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2022] [Revised: 02/23/2023] [Accepted: 02/23/2023] [Indexed: 03/02/2023]
Abstract
Transcriptome-wide association studies (TWASs) aim to detect associations between genetically predicted gene expression and complex diseases or traits through integrating genome-wide association studies (GWASs) and expression quantitative trait loci (eQTL) mapping studies. Most current TWAS methods analyze one gene at a time, ignoring the correlations between multiple genes. Few of the existing TWAS methods focus on survival outcomes. Here, we propose a novel method, namely a COx proportional hazards model for NEtwork regression in TWAS (CoNet), that is applicable for identifying the association between one given network and the survival time. CoNet considers the general relationship among the predicted gene expression as edges of the network and quantifies it through pointwise mutual information (PMI), which is under a two-stage TWAS. Extensive simulation studies illustrate that CoNet can not only achieve type I error calibration control in testing both the node effect and edge effect, but it can also gain more power compared with currently available methods. In addition, it demonstrates superior performance in real data application, namely utilizing the breast cancer survival data of UK Biobank. CoNet effectively accounts for network structure and can simultaneously identify the potential effecting nodes and edges that are related to survival outcomes in TWAS.
|
20
|
Chen F, Wang X, Jang SK, Quach BC, Weissenkampen JD, Khunsriraksakul C, Yang L, Sauteraud R, Albert CM, Allred NDD, Arnett DK, Ashley-Koch AE, Barnes KC, Barr RG, Becker DM, Bielak LF, Bis JC, Blangero J, Boorgula MP, Chasman DI, Chavan S, Chen YDI, Chuang LM, Correa A, Curran JE, David SP, Fuentes LDL, Deka R, Duggirala R, Faul JD, Garrett ME, Gharib SA, Guo X, Hall ME, Hawley NL, He J, Hobbs BD, Hokanson JE, Hsiung CA, Hwang SJ, Hyde TM, Irvin MR, Jaffe AE, Johnson EO, Kaplan R, Kardia SLR, Kaufman JD, Kelly TN, Kleinman JE, Kooperberg C, Lee IT, Levy D, Lutz SM, Manichaikul AW, Martin LW, Marx O, McGarvey ST, Minster RL, Moll M, Moussa KA, Naseri T, North KE, Oelsner EC, Peralta JM, Peyser PA, Psaty BM, Rafaels N, Raffield LM, Reupena MS, Rich SS, Rotter JI, Schwartz DA, Shadyab AH, Sheu WHH, Sims M, Smith JA, Sun X, Taylor KD, Telen MJ, Watson H, Weeks DE, Weir DR, Yanek LR, Young KA, Young KL, Zhao W, Hancock DB, Jiang B, Vrieze S, Liu DJ. Multi-ancestry transcriptome-wide association analyses yield insights into tobacco use biology and drug repurposing. Nat Genet 2023; 55:291-300. [PMID: 36702996 PMCID: PMC9925385 DOI: 10.1038/s41588-022-01282-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 10/25/2021] [Accepted: 12/08/2022] [Indexed: 01/27/2023]
Abstract
Most transcriptome-wide association studies (TWASs) so far focus on European ancestry and lack diversity. To overcome this limitation, we aggregated genome-wide association study (GWAS) summary statistics, whole-genome sequences and expression quantitative trait locus (eQTL) data from diverse ancestries. We developed a new approach, TESLA (multi-ancestry integrative study using an optimal linear combination of association statistics), to integrate an eQTL dataset with a multi-ancestry GWAS. By exploiting shared phenotypic effects between ancestries and accommodating potential effect heterogeneities, TESLA improves power over other TWAS methods. When applied to tobacco use phenotypes, TESLA identified 273 new genes, up to 55% more compared with alternative TWAS methods. These hits and subsequent fine mapping using TESLA point to target genes with biological relevance. In silico drug-repurposing analyses highlight several drugs with known efficacy, including dextromethorphan and galantamine, and new drugs such as muscle relaxants that may be repurposed for treating nicotine addiction.
Affiliation(s)
- Fang Chen
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Xingyan Wang
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Seon-Kyeong Jang
  - Department of Psychology, University of Minnesota, Minneapolis, MN, USA
- J Dylan Weissenkampen
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Psychology, Penn State College of Medicine, Hershey, PA, USA
- Lina Yang
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Renan Sauteraud
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Christine M Albert
  - Department of Cardiology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Donna K Arnett
  - College of Public Health, University of Kentucky, Lexington, KY, USA
- Allison E Ashley-Koch
  - Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC, USA
  - Department of Medicine, Duke University Medical Center, Durham, NC, USA
  - Duke Comprehensive Sickle Cell Center, Duke University Medical Center, Durham, NC, USA
- Kathleen C Barnes
  - Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Center, Aurora, CO, USA
- R Graham Barr
  - Department of Medicine, Columbia University Medical Center, New York, NY, USA
- Diane M Becker
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Lawrence F Bielak
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Joshua C Bis
  - Department of Medicine, Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
- John Blangero
  - Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- Meher Preethi Boorgula
  - Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Center, Aurora, CO, USA
- Daniel I Chasman
  - Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Sameer Chavan
  - Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Center, Aurora, CO, USA
- Yii-Der I Chen
  - Department of Pediatrics, Institute for Translational Genomics and Population Sciences, Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Lee-Ming Chuang
  - Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan
- Adolfo Correa
  - Department of Medicine, Jackson Heart Study, University of Mississippi Medical Center, Jackson, MS, USA
- Joanne E Curran
  - Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- Sean P David
  - University of Chicago, Chicago, IL, USA
  - NorthShore University Health System, Evanston, IL, USA
- Lisa de Las Fuentes
  - Department of Medicine, Division of Biostatistics and Cardiovascular Division, Washington University School of Medicine, St. Louis, MO, USA
- Ranjan Deka
  - Department of Environmental and Public Health Sciences, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
- Ravindranath Duggirala
  - Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- Jessica D Faul
  - Institute for Social Research, Survey Research Center, University of Michigan, Ann Arbor, MI, USA
- Melanie E Garrett
  - Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC, USA
  - Department of Medicine, Duke University Medical Center, Durham, NC, USA
- Sina A Gharib
  - Department of Medicine, Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
  - Computational Medicine Core at Center for Lung Biology, Division of Pulmonary, Critical Care and Sleep Medicine, University of Washington, Seattle, WA, USA
- Xiuqing Guo
  - Department of Pediatrics, Institute for Translational Genomics and Population Sciences, Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Michael E Hall
  - Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
- Nicola L Hawley
  - Department of Epidemiology (Chronic Disease), School of Public Health, Yale University, New Haven, CT, USA
- Jiang He
  - Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Brian D Hobbs
  - Harvard Medical School, Boston, MA, USA
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- John E Hokanson
  - Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Chao A Hsiung
  - Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Taiwan
- Shih-Jen Hwang
  - The Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
  - The Framingham Heart Study, Framingham, MA, USA
- Thomas M Hyde
  - Lieber Institute for Brain Development, Baltimore, MD, USA
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Marguerite R Irvin
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
- Andrew E Jaffe
  - Lieber Institute for Brain Development, Baltimore, MD, USA
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Mental Health and Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
  - Department of Human Genetics and Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Robert Kaplan
  - Department of Epidemiology and Population Health, Albert Einstein College of Medicine, The Bronx, NY, USA
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Sharon L R Kardia
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Joel D Kaufman
  - Departments of Environmental & Occupational Health Sciences, Medicine, and Epidemiology, University of Washington Seattle, Seattle, WA, USA
- Tanika N Kelly
  - Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Joel E Kleinman
  - Lieber Institute for Brain Development, Baltimore, MD, USA
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- I-Te Lee
  - Department of Internal Medicine, Division of Endocrinology and Metabolism, Taichung Veterans General Hospital, Taichung, Taiwan
- Daniel Levy
  - The Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Sharon M Lutz
  - Department of Population Medicine, Harvard Pilgrim Health Care, Boston, MA, USA
- Ani W Manichaikul
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Lisa W Martin
  - Division of Cardiology, George Washington University School of Medicine and Health Sciences, Washington, DC, USA
- Olivia Marx
  - Department of Biomedical Sciences, Penn State College of Medicine, Hershey, PA, USA
- Stephen T McGarvey
  - Department of Epidemiology, International Health Institute, Brown University School of Public Health, Providence, RI, USA
- Ryan L Minster
  - Department of Human Genetics and Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Matthew Moll
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Karine A Moussa
  - Penn State Huck Institutes of Life Sciences, Penn State College of Medicine, University Park, PA, USA
- Take Naseri
  - Ministry of Health, Government of Samoa, Apia, Samoa
- Kari E North
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Elizabeth C Oelsner
  - Department of Medicine, Columbia University Medical Center, New York, NY, USA
- Juan M Peralta
  - Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, USA
- Patricia A Peyser
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Bruce M Psaty
  - Department of Medicine, Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
  - Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
- Nicholas Rafaels
  - Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Center, Aurora, CO, USA
- Laura M Raffield
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Stephen S Rich
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Jerome I Rotter
  - Department of Pediatrics, Institute for Translational Genomics and Population Sciences, Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Aladdin H Shadyab
  - Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, USA
- Mario Sims
  - Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
- Jennifer A Smith
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
  - Institute for Social Research, Survey Research Center, University of Michigan, Ann Arbor, MI, USA
- Xiao Sun
  - Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Kent D Taylor
  - Department of Pediatrics, Institute for Translational Genomics and Population Sciences, Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Marilyn J Telen
  - Department of Medicine, Duke University Medical Center, Durham, NC, USA
- Harold Watson
  - Faculty of Medical Sciences, University of the West Indies, Cave Hill Campus, Barbados
- Daniel E Weeks
  - Department of Human Genetics and Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- David R Weir
  - Institute for Social Research, Survey Research Center, University of Michigan, Ann Arbor, MI, USA
- Lisa R Yanek
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kendra A Young
  - Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Kristin L Young
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Wei Zhao
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
  - Institute for Social Research, Survey Research Center, University of Michigan, Ann Arbor, MI, USA
- Bibo Jiang
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA.
- Scott Vrieze
  - Department of Psychology, University of Minnesota, Minneapolis, MN, USA.
- Dajiang J Liu
  - Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA.
|
21
|
Patel RS, Lui A, Hudson C, Moss L, Sparks RP, Hill SE, Shi Y, Cai J, Blair LJ, Bickford PC, Patel NA. Small molecule targeting long noncoding RNA GAS5 administered intranasally improves neuronal insulin signaling and decreases neuroinflammation in an aged mouse model. Sci Rep 2023; 13:317. [PMID: 36609440 PMCID: PMC9822944 DOI: 10.1038/s41598-022-27126-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/20/2022] [Accepted: 12/26/2022] [Indexed: 01/09/2023]
Abstract
Shifts in normal aging set the stage for neurodegeneration and dementia, affecting 1 in 10 adults. The study demonstrates that lncRNA GAS5 is decreased in aged and Alzheimer's disease brain. The role and targets of lncRNA GAS5 in the aging brain were elucidated using a GAS5-targeting small molecule NPC86, a frontier in lncRNA-targeting therapeutics. Robust techniques such as molecular dynamics simulation of NPC86 binding to GAS5 and in vitro functional assays demonstrating that GAS5 regulates insulin signaling, neuronal survival, phosphorylation of tau, and neuroinflammation via toll-like receptors support the role of GAS5 in maintaining healthy neurons. The study demonstrates the safety and efficacy of intranasal NPC86 treatment in aged mice to improve cellular functions with transcriptomic analysis in response to NPC86. In summary, the study demonstrates that GAS5 contributes to pathways associated with neurodegeneration and NPC86 has tremendous therapeutic potential to prevent the advent of neurodegenerative diseases and dementias.
Affiliation(s)
- Rekha S. Patel
  - James A. Haley Veterans Hospital, Research Service, 13000 Bruce B. Downs Blvd., Tampa, FL 33612 USA (GRID grid.281075.9; ISNI 0000 0001 0624 9286)
- Ashley Lui
  - Department of Molecular Medicine, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
- Charles Hudson
  - James A. Haley Veterans Hospital, Research Service, 13000 Bruce B. Downs Blvd., Tampa, FL 33612 USA (GRID grid.281075.9; ISNI 0000 0001 0624 9286)
- Lauren Moss
  - Department of Neurosurgery and Brain Repair, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
- Robert P. Sparks
  - Present Address: UMass Chan Medical School, Worcester, MA 01655 USA
- Shannon E. Hill
  - Department of Molecular Medicine, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
  - USF Health Byrd Institute, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
- Yan Shi
  - Department of Chemistry, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
- Jianfeng Cai
  - Department of Chemistry, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
- Laura J. Blair
  - James A. Haley Veterans Hospital, Research Service, 13000 Bruce B. Downs Blvd., Tampa, FL 33612 USA (GRID grid.281075.9; ISNI 0000 0001 0624 9286)
  - Department of Molecular Medicine, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
  - USF Health Byrd Institute, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
- Paula C. Bickford
  - James A. Haley Veterans Hospital, Research Service, 13000 Bruce B. Downs Blvd., Tampa, FL 33612 USA (GRID grid.281075.9; ISNI 0000 0001 0624 9286)
  - Department of Neurosurgery and Brain Repair, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
- Niketa A. Patel
  - James A. Haley Veterans Hospital, Research Service, 13000 Bruce B. Downs Blvd., Tampa, FL 33612 USA (GRID grid.281075.9; ISNI 0000 0001 0624 9286)
  - Department of Molecular Medicine, University of South Florida, Tampa, FL 33612 USA (GRID grid.170693.a; ISNI 0000 0001 2353 285X)
|
22
|
Gedik H, Peterson RE, Riley BP, Vladimirov VI, Bacanu SA. Integrative Post-Genome-Wide Association Study Analyses Relevant to Psychiatric Disorders: Imputing Transcriptome and Proteome Signals. Complex Psychiatry 2023; 9:130-144. [PMID: 37588130 PMCID: PMC10425719 DOI: 10.1159/000530223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 03/09/2023] [Indexed: 08/18/2023]
Abstract
Background: The genome-wide association study (GWAS) is a common tool to identify genetic variants associated with complex traits, including psychiatric disorders (PDs). However, post-GWAS analyses are needed to extend the statistical inference to biologically relevant entities, e.g., genes, proteins, and pathways. To achieve this goal, researchers developed methods that incorporate biologically relevant intermediate molecular phenotypes, such as gene expression and protein abundance, which are posited to mediate the variant-trait association. Transcriptome-wide association study (TWAS) and proteome-wide association study (PWAS) are commonly used methods to test the association between these molecular mediators and the trait. Summary: In this review, we discuss the most recent developments in TWAS and PWAS. These methods integrate existing "omic" information with the GWAS summary statistics for trait(s) of interest. Specifically, they impute transcript/protein data and test the association between imputed gene expression/protein level with phenotype of interest by using (i) GWAS summary statistics and (ii) reference transcriptomic/proteomic/genomic datasets. TWAS and PWAS are suitable as analysis tools for (i) primary association scan and (ii) fine-mapping to identify potentially causal genes for PDs. Key Messages: As post-GWAS analyses, TWAS and PWAS have the potential to highlight causal genes for PDs. These prioritized genes could indicate targets for the development of novel drug therapies. For researchers attempting such analyses, we recommend Mendelian randomization tools that use GWAS statistics for both trait and reference datasets, e.g., summary Mendelian randomization (SMR).
We base our recommendation on (i) being able to use the same tool for both TWAS and PWAS, (ii) not requiring the pre-computed weights (and thus easier to update for larger reference datasets), and (iii) most larger transcriptome reference datasets are publicly available and easy to transform into a compatible format for SMR analysis.
Affiliation(s)
- Huseyin Gedik
  - Integrative Life Sciences, Virginia Institute of Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, USA
- Roseann E. Peterson
  - Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
- Brien P. Riley
  - Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
- Vladimir I. Vladimirov
  - Department of Psychiatry, College of Medicine-Phoenix, University of Arizona, Phoenix, AZ, USA
- Silviu-Alin Bacanu
  - Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
|
23
|
He J, Wen W, Beeghly A, Chen Z, Cao C, Shu XO, Zheng W, Long Q, Guo X. Integrating transcription factor occupancy with transcriptome-wide association analysis identifies susceptibility genes in human cancers. Nat Commun 2022; 13:7118. [PMID: 36402776 PMCID: PMC9675749 DOI: 10.1038/s41467-022-34888-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/10/2022] [Accepted: 11/10/2022] [Indexed: 11/21/2022]
Abstract
Transcriptome-wide association studies (TWAS) have successfully discovered many putative disease susceptibility genes. However, TWAS may suffer from inaccuracy of gene expression predictions due to inclusion of non-regulatory variants. By integrating prior knowledge of susceptible transcription factor occupied elements, we develop sTF-TWAS and demonstrate that it outperforms existing TWAS approaches in both simulation and real data analyses. Under the sTF-TWAS framework, we build genetic models to predict alternative splicing and gene expression in normal breast, prostate and lung tissues from the Genotype-Tissue Expression project and apply these models to data from large genome-wide association studies (GWAS) conducted among European-ancestry populations. At Bonferroni-corrected P < 0.05, we identify 354 putative susceptibility genes for these cancers, including 189 previously unreported in GWAS loci and 45 in loci unreported by GWAS. These findings provide additional insight into the genetic susceptibility of human cancers. Additionally, we show the generalizability of the sTF-TWAS on non-cancer diseases.
Affiliation(s)
- Jingni He
  - Department of Biochemistry & Molecular Biology, University of Calgary, Calgary, Canada (GRID grid.22072.35; ISNI 0000 0004 1936 7697)
  - Department of Oncology, Xiangya Hospital, Central South University, Changsha, Hunan China (GRID grid.452223.0; ISNI 0000 0004 1757 7615)
- Wanqing Wen
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN USA (GRID grid.152326.1; ISNI 0000 0001 2264 7217)
- Alicia Beeghly
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN USA (GRID grid.152326.1; ISNI 0000 0001 2264 7217)
- Zhishan Chen
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN USA (GRID grid.152326.1; ISNI 0000 0001 2264 7217)
- Chen Cao
  - Department of Biochemistry & Molecular Biology, University of Calgary, Calgary, Canada (GRID grid.22072.35; ISNI 0000 0004 1936 7697)
- Xiao-Ou Shu
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN USA (GRID grid.152326.1; ISNI 0000 0001 2264 7217)
- Wei Zheng
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN USA (GRID grid.152326.1; ISNI 0000 0001 2264 7217)
- Quan Long
  - Department of Biochemistry & Molecular Biology, University of Calgary, Calgary, Canada (GRID grid.22072.35; ISNI 0000 0004 1936 7697)
  - Department of Medical Genetics, University of Calgary, Calgary, Canada (GRID grid.22072.35; ISNI 0000 0004 1936 7697)
  - Department of Mathematics & Statistics, University of Calgary, Calgary, Canada (GRID grid.22072.35; ISNI 0000 0004 1936 7697)
  - Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada (GRID grid.22072.35; ISNI 0000 0004 1936 7697)
  - Hotchkiss Brain Institute, University of Calgary, Calgary, Canada (GRID grid.22072.35; ISNI 0000 0004 1936 7697)
- Xingyi Guo
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN USA (GRID grid.152326.1; ISNI 0000 0001 2264 7217)
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN USA (GRID grid.152326.1; ISNI 0000 0001 2264 7217)
|
24
|
Yang J, Oveisgharan S, Liu X, Wilson RS, Bennett DA, Buchman AS. Risk Models Based on Non-Cognitive Measures May Identify Presymptomatic Alzheimer's Disease. J Alzheimers Dis 2022; 89:1249-1262. [PMID: 35988224 PMCID: PMC10083073 DOI: 10.3233/jad-220446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
BACKGROUND: Alzheimer's disease (AD) is a progressive disorder without a cure. Developing risk prediction models for detecting presymptomatic AD using non-cognitive measures is necessary to enable early interventions. OBJECTIVE: Examine if non-cognitive metrics alone can be used to construct risk models to identify adults at risk for AD dementia and cognitive impairment. METHODS: Clinical data from older adults without dementia from the Memory and Aging Project (MAP, n = 1,179) and Religious Orders Study (ROS, n = 1,103) were analyzed using Cox proportional hazard models to develop risk prediction models for AD dementia and cognitive impairment. Models using only non-cognitive covariates were compared to models that added cognitive covariates. All models were trained in MAP, tested in ROS, and evaluated by the AUC of the ROC curve. RESULTS: Models based on non-cognitive covariates alone achieved AUC (0.800,0.785) for predicting AD dementia (3.5) years from baseline. Including additional cognitive covariates improved AUC to (0.916,0.881). A model with a single covariate of composite cognition score achieved AUC (0.905,0.863). Models based on non-cognitive covariates alone achieved AUC (0.717,0.714) for predicting cognitive impairment (3.5) years from baseline. Including additional cognitive covariates improved AUC to (0.783,0.770). A model with a single covariate of composite cognition score achieved AUC (0.754,0.730). CONCLUSION: Risk models based on non-cognitive metrics predict both AD dementia and cognitive impairment. However, non-cognitive covariates do not provide incremental predictivity for models that include cognitive metrics in predicting AD dementia, but do in models predicting cognitive impairment. Further improved risk prediction models for cognitive impairment are needed.
Affiliation(s)
- Jingjing Yang
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
- Shahram Oveisgharan
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Xizhu Liu
  - Quantitative Theory and Methods Program, College of Arts and Sciences, Emory University, Atlanta, GA, USA
- Robert S Wilson
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- David A Bennett
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
- Aron S Buchman
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
25
|
Shao Z, Wang T, Qiao J, Zhang Y, Huang S, Zeng P. A comprehensive comparison of multilocus association methods with summary statistics in genome-wide association studies. BMC Bioinformatics 2022; 23:359. [PMID: 36042399 PMCID: PMC9429742 DOI: 10.1186/s12859-022-04897-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2022] [Accepted: 08/22/2022] [Indexed: 02/07/2023] Open
Abstract
Background: Multilocus analysis of a set of single nucleotide polymorphisms (SNPs) pre-assigned within a gene constitutes a valuable complement to single-marker analysis by aggregating data on complex traits in a biologically meaningful way. However, despite the existence of a wide variety of SNP-set methods, few comprehensive comparison studies have previously been performed to evaluate the effectiveness of these methods. Results: We sought to fill this knowledge gap by conducting a comprehensive empirical comparison of 22 commonly used summary-statistics-based SNP-set methods. We showed that only seven methods could effectively control the type I error, and that these well-calibrated approaches had varying power under the simulation scenarios. Overall, we confirmed that the burden test was generally underpowered and that score-based variance component tests (e.g., the sequence kernel association test) were much more powerful under the polygenic genetic architecture in both common and rare variant association analyses. We further revealed that two linkage-disequilibrium-free P value combination methods (the harmonic mean P value method and the aggregated Cauchy association test) behaved very well under the sparse genetic architecture in simulations and in real-data applications to common and rare variant association analyses, as well as in expression quantitative trait loci weighted integrative analysis. We also assessed the scalability of these approaches by recording computational time and found that all of these methods scale to biobank-scale data, although some might be relatively slow. Conclusion: We hope that our findings offer useful guidance on how to choose appropriate multilocus association analysis methods in the post-GWAS era. All the SNP-set methods are implemented in the R package called MCA, which is freely available at https://github.com/biostatpzeng/.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04897-3.
Affiliation(s)
- Zhonghe Shao
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Ting Wang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Jiahao Qiao
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Yuchen Zhang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Shuiping Huang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Ping Zeng
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
  - Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
26
|
Jin X, Zhang L, Ji J, Ju T, Zhao J, Yuan Z. Network regression analysis in transcriptome-wide association studies. BMC Genomics 2022; 23:562. [PMID: 35933330 PMCID: PMC9356418 DOI: 10.1186/s12864-022-08809-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/12/2022] [Accepted: 08/02/2022] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND: Transcriptome-wide association studies (TWASs) have shown great promise in interpreting the findings from genome-wide association studies (GWASs) and exploring disease mechanisms by integrating GWAS and eQTL mapping studies. Almost all TWAS methods focus on only one gene at a time. The exceptions are two published multiple-gene methods, which nevertheless fail to account for the inter-dependence and the network structure among multiple genes. This may lead to power loss in TWAS analysis, because complex diseases often arise from multiple genes that interact with each other as a biological network. We therefore developed a Network Regression method in a two-stage TWAS framework (NeRiT) to detect whether a given network is associated with the traits of interest. NeRiT adopts the flexible Bayesian Dirichlet process regression to obtain the gene expression prediction weights in the first stage, uses pointwise mutual information to represent the general between-node correlation in the second stage, and can effectively take the network structure among different gene nodes into account. RESULTS: Comprehensive and realistic simulations indicated that NeRiT had calibrated type I error control for testing both the node effect and the edge effect, and yielded higher power than existing methods, especially in testing the edge effect. The results were consistent regardless of the GWAS sample size, the gene expression prediction model in the first step of TWAS, the network structure, and the correlation pattern among different gene nodes. Real data applications analyzing systolic blood pressure and diastolic blood pressure from UK Biobank showed that NeRiT can simultaneously identify the trait-related nodes and the trait-related edges. CONCLUSIONS: NeRiT is a powerful and efficient network regression method in TWAS.
Affiliation(s)
- Xiuyuan Jin
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Shandong University, Jinan, 250003, Shandong, China
- Liye Zhang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Shandong University, Jinan, 250003, Shandong, China
- Jiadong Ji
  - Institute for Financial Studies, Shandong University, Jinan, 250100, Shandong, China
- Tao Ju
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Shandong University, Jinan, 250003, Shandong, China
- Jinghua Zhao
  - Department of Public Health and Primary Care, Cardiovascular Epidemiology Unit, University of Cambridge, Cambridge, UK
- Zhongshang Yuan
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Shandong University, Jinan, 250003, Shandong, China
27
|
Cao C, Kossinna P, Kwok D, Li Q, He J, Su L, Guo X, Zhang Q, Long Q. Disentangling genetic feature selection and aggregation in transcriptome-wide association studies. Genetics 2021; 220:6444993. [PMID: 34849857 DOI: 10.1093/genetics/iyab216] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/01/2021] [Accepted: 11/04/2021] [Indexed: 12/14/2022] Open
Abstract
The success of transcriptome-wide association studies (TWAS) has led to substantial research towards improving the predictive accuracy of its core component, Genetically Regulated eXpression (GReX). GReX links expression information with genotype and phenotype by playing two roles simultaneously: it acts as both the outcome of the genotype-based predictive models (for predicting expressions) and the linear combination of genotypes (as the predicted expressions) for association tests. From the perspective of machine learning (considering SNPs as features), these are actually two separable steps (feature selection and feature aggregation) that can be conducted independently. In this work, we show that the single approach of GReX limits the adaptability of TWAS methodology and practice. By conducting simulations and real data analysis, we demonstrate that disentangled protocols adopting straightforward approaches for feature selection (e.g., simple marker test) and aggregation (e.g., kernel machines) outperform the standard TWAS protocols that rely on GReX. Our development provides more powerful novel tools for conducting TWAS. More importantly, our characterization of the exact nature of TWAS suggests that, instead of questionably binding two distinct steps into the same statistical form (GReX), methodological research focusing on optimal combinations of feature selection and aggregation approaches will bring higher power to TWAS protocols.
Affiliation(s)
- Chen Cao
  - Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Pathum Kossinna
  - Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Devin Kwok
  - Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
- Qing Li
  - Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Jingni He
  - Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Liya Su
  - Department of Pathology, Anatomy and Cell Biology, Thomas Jefferson University, Philadelphia, PA 19107, USA
- Xingyi Guo
  - Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN 37203, USA
- Qingrun Zhang
  - Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
  - Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
- Quan Long
  - Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
  - Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
  - Department of Medical Genetics, University of Calgary, Calgary, AB T2N 4N1, Canada
  - Hotchkiss Brain Institute, O'Brien Institute for Public Health, University of Calgary, Calgary, AB T2N 4N1, Canada
28
|
Cao C, Wang J, Kwok D, Cui F, Zhang Z, Zhao D, Li MJ, Zou Q. webTWAS: a resource for disease candidate susceptibility genes identified by transcriptome-wide association study. Nucleic Acids Res 2021; 50:D1123-D1130. [PMID: 34669946 PMCID: PMC8728162 DOI: 10.1093/nar/gkab957] [Citation(s) in RCA: 94] [Impact Index Per Article: 31.3] [Received: 08/15/2021] [Revised: 09/24/2021] [Accepted: 10/05/2021] [Indexed: 12/20/2022] Open
Abstract
The development of transcriptome-wide association studies (TWAS) has enabled researchers to better identify and interpret causal genes in many diseases. However, there are currently no resources providing a comprehensive listing of gene-disease associations discovered by TWAS from published GWAS summary statistics. TWAS analyses are also difficult to conduct due to the complexity of TWAS software pipelines. To address these issues, we introduce a new resource called webTWAS, which integrates a database of the most comprehensive disease GWAS datasets currently available with credible sets of potential causal genes identified by multiple TWAS software packages. Specifically, a total of 235,064 gene-disease associations for a wide range of human diseases are prioritized from 1,298 high-quality downloadable European GWAS summary statistics. Associations are calculated with seven different statistical models based on three popular and representative TWAS software packages. Users can explore associations at the gene or disease level, and easily search for related studies or diseases using the MeSH disease tree. Since the effects of diseases are highly tissue-specific, webTWAS applies tissue-specific enrichment analysis to identify significant tissues. A user-friendly web server is also available to run custom TWAS analyses on user-provided GWAS summary statistics data. webTWAS is freely available at http://www.webtwas.net.
Affiliation(s)
- Chen Cao
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Canada
- Jianhua Wang
  - Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Devin Kwok
  - School of Computer Science, McGill University, Montreal, Canada
- Feifei Cui
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Zilong Zhang
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Da Zhao
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Mulin Jun Li
  - Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China