1
Luo L, Pang T, Zheng H, Liufu C, Chang S. xWAS analysis in neuropsychiatric disorders by integrating multi-molecular phenotype quantitative trait loci and GWAS summary data. J Transl Med 2024; 22:387. [PMID: 38664746 PMCID: PMC11044291 DOI: 10.1186/s12967-024-05065-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 03/05/2024] [Indexed: 04/29/2024] [Open Access]
Abstract
BACKGROUND: Integrating quantitative trait loci (QTL) data for molecular phenotypes with genome-wide association study (GWAS) data is an important post-GWAS strategy for identifying disease-associated molecular features. Various types of molecular phenotypes have been investigated in neuropsychiatric disorders. However, findings for distinct molecular features are often reported independently of each other, which makes it difficult to obtain an overview of the mapped genes.
METHODS: In this study, we comprehensively summarized published analyses focusing on four types of risk-related molecular features (gene expression, splicing transcriptome, protein abundance, and DNA methylation) across five common neuropsychiatric disorders. Subsequently, we conducted supplementary analyses with the latest GWAS datasets and the molecular phenotypes that were missing from the published analyses, using Functional Summary-based Imputation (FUSION) and summary data-based Mendelian randomization (SMR). Based on the curated and supplemented results, novel reliable genes and their functions were explored.
RESULTS: Our findings revealed that eQTL were better at prioritizing risk genes than the other QTL types, followed by sQTL. Approximately half of the genes associated with splicing transcriptome, protein abundance, and DNA methylation were replicated by eQTL-associated genes across all five disorders. Furthermore, we identified 436 novel reliable genes, which were enriched in pathways related to neurotransmitter transport, such as synapse, dendrite, vesicle, and axon, and which showed correlations with other neuropsychiatric disorders. Finally, we identified ten multiple molecular involved regulation patterns (MMRP), which may provide valuable insights into the contribution of the molecular regulation network targeting these disease-associated genes.
CONCLUSIONS: The analyses prioritized novel and reliable gene sets related to five molecular features, based on published and supplementary results for five common neuropsychiatric disorders, that were missed in the original GWAS analyses. In addition, the MMRP behind these genes could be prioritized in future studies to elucidate the pathogenic molecular mechanisms underlying neuropsychiatric disorders.
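The abstract names SMR without stating its test. As a rough illustration only, the sketch below computes the approximate SMR statistic as it is commonly described, T = z_gwas^2 * z_qtl^2 / (z_gwas^2 + z_qtl^2), and refers it to a chi-square distribution with one degree of freedom. The function names and the example z-scores are hypothetical; this is not the authors' pipeline, and the heterogeneity (HEIDI) test that usually accompanies SMR is omitted.

```python
import math


def smr_statistic(z_gwas: float, z_qtl: float) -> float:
    """Approximate SMR chi-square statistic (1 df) from two z-scores.

    z_gwas: z-score of the top cis-QTL SNP in the GWAS (hypothetical input).
    z_qtl:  z-score of the same SNP in the QTL study (hypothetical input).
    """
    denominator = z_gwas ** 2 + z_qtl ** 2
    if denominator == 0:
        raise ValueError("at least one z-score must be non-zero")
    return (z_gwas ** 2) * (z_qtl ** 2) / denominator


def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail probability P(X > stat) for a chi-square(1) variable."""
    if stat < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    # For X = Z^2 with Z standard normal, P(X > t) = erfc(sqrt(t / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    t = smr_statistic(z_gwas=5.0, z_qtl=10.0)
    print(t, chi2_1df_pvalue(t))
```

Because the statistic is bounded by the smaller of the two squared z-scores, a strong QTL signal cannot compensate for a weak GWAS signal, which is why the QTL instrument is usually required to be highly significant before the test is applied.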
Affiliation(s)
- Lingxue Luo
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), 51 Huayuan Bei Road, Beijing, 100191, China
- Tao Pang
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), 51 Huayuan Bei Road, Beijing, 100191, China
- Haohao Zheng
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), 51 Huayuan Bei Road, Beijing, 100191, China
- Chao Liufu
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), 51 Huayuan Bei Road, Beijing, 100191, China
- Suhua Chang
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), 51 Huayuan Bei Road, Beijing, 100191, China
  - Research Units of Diagnosis and Treatment of Mood Cognitive Disorder, Chinese Academy of Medical Sciences, Beijing, 100191, China
2
Wu M, Wang F, Ge Y, Ma S, Li Y. Bi-level structured functional analysis for genome-wide association studies. Biometrics 2023; 79:3359-3373. [PMID: 37098961 DOI: 10.1111/biom.13871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Accepted: 04/19/2023] [Indexed: 04/27/2023]
Abstract
Genome-wide association studies (GWAS) have led to great successes in identifying genotype-phenotype associations for complex human diseases. In such studies, the high dimensionality of single nucleotide polymorphisms (SNPs) often makes analysis difficult. Functional analysis, which interprets SNPs densely distributed in a chromosomal region as a continuous process rather than discrete observations, has emerged as a promising avenue for overcoming the high dimensionality challenges. However, the majority of the existing functional studies continue to be individual SNP based and are unable to sufficiently account for the intricate underpinning structures of SNP data. SNPs are often found in groups (e.g., genes or pathways) and have a natural group structure. Additionally, these SNP groups can be highly correlated with coordinated biological functions and interact in a network. Motivated by these unique characteristics of SNP data, we develop a novel bi-level structured functional analysis method and investigate disease-associated genetic variants at the SNP level and SNP group level simultaneously. The penalization technique is adopted for bi-level selection and also to accommodate the group-level network structure. Both the estimation and selection consistency properties are rigorously established. The superiority of the proposed method over alternatives is shown through extensive simulation studies. A type 2 diabetes SNP data application yields some biologically intriguing results.
Affiliation(s)
- Mengyun Wu
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Fan Wang
  - Center for Applied Statistics, School of Statistics, and Statistical Consulting Center, Renmin University of China, Beijing, China
- Yeheng Ge
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Shuangge Ma
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
- Yang Li
  - Center for Applied Statistics, School of Statistics, and Statistical Consulting Center, Renmin University of China, Beijing, China
3
de Leeuw C, Werme J, Savage JE, Peyrot WJ, Posthuma D. On the interpretation of transcriptome-wide association studies. PLoS Genet 2023; 19:e1010921. [PMID: 37676898 PMCID: PMC10508613 DOI: 10.1371/journal.pgen.1010921] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/04/2023] [Revised: 09/19/2023] [Accepted: 08/15/2023] [Indexed: 09/09/2023] [Open Access]
Abstract
Transcriptome-wide association studies (TWAS) aim to detect relationships between gene expression and a phenotype, and are commonly used for secondary analysis of genome-wide association study (GWAS) results. Results from TWAS analyses are often interpreted as indicating a genetic relationship between gene expression and a phenotype, but this interpretation is not consistent with the null hypothesis that is evaluated in the traditional TWAS framework. In this study we provide a mathematical outline of this TWAS framework, and elucidate what interpretations are warranted given the null hypothesis it actually tests. We then use both simulations and real data analysis to assess the implications of misinterpreting TWAS results as indicative of a genetic relationship between gene expression and the phenotype. Our simulation results show considerably inflated type 1 error rates for TWAS when interpreted this way, with 41% of significant TWAS associations detected in the real data analysis found to have insufficient statistical evidence to infer such a relationship. This demonstrates that in current implementations, TWAS cannot reliably be used to investigate genetic relationships between gene expression and a phenotype, but that local genetic correlation analysis can serve as a potential alternative.
Affiliation(s)
- Christiaan de Leeuw
  - Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Josefin Werme
  - Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Jeanne E. Savage
  - Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Wouter J. Peyrot
  - Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
  - Department of Psychiatry, Amsterdam UMC, location VUmc, Amsterdam, The Netherlands
- Danielle Posthuma
  - Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
  - Department of Child and Adolescent Psychology and Psychiatry, section Complex Trait Genetics, Amsterdam Neuroscience, VU University Medical Centre, Amsterdam, The Netherlands
4
Zhang J, Liang X, Gonzales S, Liu J, Gao XR, Wang X. A gene based combination test using GWAS summary data. BMC Bioinformatics 2023; 24:2. [PMID: 36597047 PMCID: PMC9811798 DOI: 10.1186/s12859-022-05114-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Accepted: 12/13/2022] [Indexed: 01/05/2023] [Open Access]
Abstract
BACKGROUND: Gene-based association tests provide a useful alternative and complement to the usual single-marker association tests, especially in genome-wide association studies (GWAS). How the variants in a gene are weighted plays an important role in the power of a gene-based association test. Appropriate weights can boost statistical power, especially when detecting genetic variants with weak effects on a trait. One major limitation of existing gene-based association tests is that they use weights that are predetermined biologically or empirically, which often attenuates the power of a test. On the other hand, the effect sizes and directions of causal genetic variants in real data are usually unknown, which drives a need for a flexible yet robust methodology for gene-based association tests. Furthermore, access to individual-level data is often limited, while thousands of GWAS summary datasets are publicly and freely available.
RESULTS: To resolve these limitations, we propose a combination test named OWC, which is based on summary statistics from GWAS data. Several traditional methods, including the burden test, the weighted sum of squared score test (SSU), the weighted sum statistic (WSS), the SNP-set Kernel Association Test (SKAT), and the score test, are special cases of OWC. To evaluate the performance of OWC, we perform extensive simulation studies. The simulation results demonstrate that OWC outperforms several existing popular methods. We further show that OWC outperforms comparison methods in real-world data analyses using schizophrenia GWAS summary data and fasting glucose GWAS meta-analysis data. The proposed method is implemented in an R package available at https://github.com/Xuexia-Wang/OWC-R-package
CONCLUSIONS: We propose a novel gene-based association test that incorporates four different weighting schemes (two constant weights and two weights proportional to the normal statistic Z) and includes several popular methods as its special cases. Results of the simulation studies and real data analyses illustrate that the proposed test, OWC, outperforms comparable methods in most scenarios. These results demonstrate that OWC is a useful tool that adapts to the underlying biological model for a disease by weighting genetic variants appropriately and combining well-known gene-based tests.
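To make the role of variant weights concrete, here is a minimal sketch of one of the special cases the abstract lists, the burden test, computed from summary statistics. It assumes that under the null the per-SNP z-scores are multivariate normal with covariance equal to the LD correlation matrix, so the weighted sum w'Z has variance w'Rw. The function name, inputs, and example values are hypothetical; this is not the OWC test itself and not the API of the authors' R package.

```python
import math
from typing import Sequence, Tuple


def burden_test(
    z: Sequence[float],
    ld: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Burden statistic and two-sided p-value from GWAS z-scores.

    z:       per-SNP z-scores for one gene.
    ld:      LD correlation matrix between those SNPs (square, symmetric).
    weights: per-SNP weights; constant weights give the classic burden test.
    """
    n = len(z)
    if len(weights) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("z, weights and ld must have matching dimensions")
    # Weighted sum of z-scores: w'Z.
    weighted_sum = sum(w * zi for w, zi in zip(weights, z))
    # Null variance of the weighted sum: w'Rw.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("w'Rw must be positive")
    stat = weighted_sum / math.sqrt(variance)
    # Two-sided normal p-value: P(|N(0,1)| > |stat|).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


if __name__ == "__main__":
    independent = [[1.0, 0.0], [0.0, 1.0]]
    perfect_ld = [[1.0, 1.0], [1.0, 1.0]]
    print(burden_test([2.0, 2.0], independent, [1.0, 1.0]))
    print(burden_test([2.0, 2.0], perfect_ld, [1.0, 1.0]))
```

The two calls illustrate why LD matters: two independent SNPs with z = 2 combine into stronger evidence, while two SNPs in perfect LD carry no more information than one.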
Affiliation(s)
- Jianjun Zhang
  - Department of Mathematics, University of North Texas, 225 Avenue E, Denton, TX 76201 USA
- Xiaoyu Liang
  - Department of Epidemiology and Biostatistics, Michigan State University, 909 Wilson Rd Room B601, East Lansing, MI 48824 USA
- Samantha Gonzales
  - Department of Mathematics, University of North Texas, 225 Avenue E, Denton, TX 76201 USA
- Jianguo Liu
  - Department of Mathematics, University of North Texas, 225 Avenue E, Denton, TX 76201 USA
- Xiaoyi Raymond Gao
  - Department of Ophthalmology and Visual Science, Department of Biomedical Informatics, Division of Human Genetics, Ohio State University, 915 Olentangy River Road, Columbus, OH 43212 USA
- Xuexia Wang
  - Department of Biostatistics, Robert Stempel College of Public Health and Social Work, Florida International University, 11200 SW 8th Street, Miami, FL 33174 USA
5
Identification of Human Brain Proteins for Bitter-Sweet Taste Perception: A Joint Proteome-Wide and Transcriptome-Wide Association Study. Nutrients 2022; 14:nu14102177. [PMID: 35631318 PMCID: PMC9143225 DOI: 10.3390/nu14102177] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/20/2022] [Revised: 05/17/2022] [Accepted: 05/20/2022] [Indexed: 02/04/2023] [Open Access]
Abstract
Objective: Bitter or sweet beverage perception is associated with alterations in brain structure and function. Our aim is to analyze the genetic association between bitter or sweet beverage perception and human brain proteins.
Materials and methods: In our study, 8356 and 11,518 proteins were first collected from two reference datasets of human brain proteomes, ROS/MAP and Banner. Proteome-wide association studies (PWAS) of bitter or sweet beverage perception were then conducted by integrating recent genome-wide association study (GWAS) data (n = 422,300) on taste perception with the human brain proteomes. Human brain gene expression profiles were collected from two reference datasets, brain RNA-seq (CBR) and brain RNA-seq splicing (CBRS). Finally, transcriptome-wide association studies (TWAS) of taste perception were performed by integrating the same GWAS data with the human brain gene expression profiles to validate the PWAS findings.
Results: In PWAS, four statistically significant proteins were identified using ROS/MAP and then replicated using the Banner reference dataset (all permuted p < 0.05): ABCG2 for total bitter beverages and tea, CPNE1 for total bitter beverages, ACTR1B for artificially sweetened beverages, and FLOT2 for alcoholic bitter beverages and total sweet beverages. In TWAS, six statistically significant genes were detected with CBR and confirmed with the CBRS reference dataset (all permuted p < 0.05): PIGG for total bitter beverages and non-alcoholic bitter beverages, C3orf18 for total bitter beverages, ZSWIM7 for non-alcoholic bitter beverages, PEX7 for coffee, PKP4 for tea, and RPLP2 for grape juice. Further comparison of the PWAS and TWAS results found three common statistically significant proteins/genes identified from the Banner and CBR reference datasets: THBS4 for total bitter beverages, CA4 for non-alcoholic bitter beverages, and LIAS for non-grape juices.
Conclusions: Our results support the potential effect of bitter or sweet beverage perception on brain function and identify several candidate brain proteins for bitter or sweet beverage perception.
6
Cao X, Wang X, Zhang S, Sha Q. Gene-based association tests using GWAS summary statistics and incorporating eQTL. Sci Rep 2022; 12:3553. [PMID: 35241742 PMCID: PMC8894384 DOI: 10.1038/s41598-022-07465-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/10/2021] [Accepted: 02/11/2022] [Indexed: 01/29/2023] [Open Access]
Abstract
Although genome-wide association studies (GWAS) have been successfully applied to a variety of complex diseases and have identified many underlying genetic variants via single-marker tests, a considerable share of the heritability of complex diseases remains unexplained by GWAS. One alternative approach to overcome the missing heritability caused by genetic heterogeneity is gene-based analysis, which considers the aggregate effects of multiple genetic variants in a single test. Another alternative approach is the transcriptome-wide association study (TWAS). TWAS aggregates genomic information into functionally relevant units that map to genes and their expression. TWAS is not only powerful, but can also increase the biological interpretability of identified trait-associated genes. In this study, we propose a powerful and computationally efficient gene-based association test, called Overall. Using the extended Simes procedure, Overall aggregates information from three types of traditional gene-based association tests and also incorporates expression quantitative trait locus (eQTL) information into a gene-based association test using GWAS summary statistics. We show that, after a small number of replications to estimate the correlation among the integrated gene-based tests, the p values of Overall can be calculated analytically. Simulation studies show that Overall controls type I error rates very well and has higher power than the tests it was compared with. We also apply Overall to two schizophrenia GWAS summary datasets and two lipids GWAS summary datasets. The results show that this newly developed method can identify more significant genes than the other methods it was compared with.
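For readers unfamiliar with the Simes procedure mentioned above, the following sketch shows the standard (unextended) Simes combination of several p-values: sort them and take the minimum of n * p_(i) / i over ranks i. The extended procedure used by Overall additionally accounts for correlation among the combined tests, which is not reproduced here. The function name and example values are hypothetical.

```python
from typing import Sequence


def simes_pvalue(pvalues: Sequence[float]) -> float:
    """Combine p-values with the standard Simes procedure.

    Returns min over i of n * p_(i) / i, where p_(1) <= ... <= p_(n)
    are the sorted input p-values.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(p < 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in [0, 1]")
    ordered = sorted(pvalues)
    n = len(ordered)
    # rank starts at 1 so that the smallest p-value is scaled by n / 1.
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


if __name__ == "__main__":
    # Three hypothetical gene-based test p-values for the same gene.
    print(simes_pvalue([0.01, 0.04, 0.03]))
```

The combined value can never exceed the largest input p-value, because the last term of the minimum is exactly p_(n).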
Affiliation(s)
- Xuewei Cao
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, 49931, USA
- Xuexia Wang
  - Department of Mathematics, University of North Texas, Denton, TX, USA
- Shuanglin Zhang
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, 49931, USA
- Qiuying Sha
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, 49931, USA
7
Lin Z, Xue H, Malakhov MM, Knutson KA, Pan W. Accounting for nonlinear effects of gene expression identifies additional associated genes in transcriptome-wide association studies. Hum Mol Genet 2022; 31:2462-2470. [PMID: 35043938 PMCID: PMC9307319 DOI: 10.1093/hmg/ddac015] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/16/2021] [Revised: 01/08/2022] [Accepted: 01/10/2022] [Indexed: 01/21/2023] [Open Access]
Abstract
Transcriptome-wide association studies (TWAS) integrate genome-wide association study (GWAS) data with gene expression (GE) data to identify (putative) causal genes for complex traits. There are two stages in TWAS: in Stage 1, a model is built to impute gene expression from genotypes, and in Stage 2, gene-trait association is tested using imputed gene expression. Despite many successes with TWAS, in the current practice, one only assumes a linear relationship between GE and the trait, which however may not hold, leading to loss of power. In this study, we extend the standard TWAS by considering a quadratic effect of GE, in addition to the usual linear effect. We train imputation models for both linear and quadratic gene expression levels in Stage 1, then include both the imputed linear and quadratic expression levels in Stage 2. We applied both the standard TWAS and our approach first to the ADNI gene expression data and the IGAP Alzheimer's disease GWAS summary data, then to the GTEx (V8) gene expression data and the UK Biobank individual-level GWAS data for lipids, followed by validation with different GWAS data, suitable model checking and more robust TWAS methods. In all these applications, the new TWAS approach was able to identify additional genes associated with Alzheimer's disease, LDL and HDL cholesterol levels, suggesting its likely power gains and thus the need to account for potentially nonlinear effects of gene expression on complex traits.
Affiliation(s)
- Zhaotong Lin
  - Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA
- Haoran Xue
  - Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA
- Mykhaylo M Malakhov
  - Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA
- Katherine A Knutson
  - Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA
- Wei Pan
  - To whom correspondence should be addressed at: A460 Mayo Building, 420 Delaware St SE, Minneapolis, MN 55455, USA. Tel: (612)626-2705; Fax: (612)626-0660
8
Lu H, Zhang J, Jiang Z, Zhang M, Wang T, Zhao H, Zeng P. Detection of Genetic Overlap Between Rheumatoid Arthritis and Systemic Lupus Erythematosus Using GWAS Summary Statistics. Front Genet 2021; 12:656545. [PMID: 33815486 PMCID: PMC8012913 DOI: 10.3389/fgene.2021.656545] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/22/2021] [Accepted: 03/01/2021] [Indexed: 01/04/2023] [Open Access]
Abstract
Background: Clinical and epidemiological studies have suggested that systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA) are comorbid and that common genetic etiologies can partly explain their coexistence. However, the shared genetic determinants underlying the two diseases remain largely unknown.
Methods: Our analysis relied on summary statistics available from genome-wide association studies of SLE (N = 23,210) and RA (N = 58,284). We first evaluated the genetic correlation between RA and SLE through linkage disequilibrium score regression (LDSC). Then, we performed a multiple-tissue eQTL (expression quantitative trait loci) weighted integrative analysis for each of the two diseases and aggregated association evidence across these tissues via the recently proposed harmonic mean P-value (HMP) combination strategy, which can produce a single well-calibrated P-value for correlated test statistics. Afterwards, we conducted a pleiotropy-informed association analysis using the conjunction conditional FDR (ccFDR) to identify potential pleiotropic genes associated with both RA and SLE.
Results: LDSC showed a significant positive genetic correlation between RA and SLE (rg = 0.404, P = 6.01E-10). Based on the multiple-tissue eQTL weighted integrative analysis and the HMP combination across tissues, we discovered 14 potential pleiotropic genes by ccFDR, among which four were likely novel (i.e., INPP5B, OR5K2, RP11-2C24.5, and CTD-3105H18.4). The SNP effect sizes of these pleiotropic genes were typically positively dependent, with an average correlation of 0.579. Functionally, these genes were implicated in multiple autoimmune-relevant pathways such as inositol phosphate metabolic process, membrane, and glucagon signaling pathway.
Conclusion: This study reveals common genetic components between RA and SLE and provides candidate associated loci for understanding the molecular mechanism underlying the comorbidity of the two diseases.
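The harmonic mean P-value used to aggregate evidence across tissues can be illustrated with a short sketch. It computes the raw weighted harmonic mean, sum(w_i) / sum(w_i / p_i), with equal weights by default. This is an assumption-laden illustration: the function name and example values are hypothetical, and the additional calibration step that turns the raw harmonic mean into an exact p-value in the published HMP procedure is not shown.

```python
from typing import Optional, Sequence


def harmonic_mean_p(
    pvalues: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Raw weighted harmonic mean of p-values.

    pvalues: per-tissue p-values for one gene, each in (0, 1].
    weights: optional non-negative weights; equal weights if omitted.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n or any(w < 0.0 for w in weights):
        raise ValueError("weights must be non-negative and match pvalues")
    total_weight = sum(weights)
    if total_weight <= 0.0:
        raise ValueError("weights must not all be zero")
    # Dividing by the total weight makes the result invariant to rescaling.
    return total_weight / sum(w / p for w, p in zip(weights, pvalues))


if __name__ == "__main__":
    # Two hypothetical tissue-level p-values for the same gene.
    print(harmonic_mean_p([0.01, 0.04]))
```

Because the harmonic mean is dominated by the smallest inputs, a gene with strong evidence in one tissue is not diluted by many uninformative tissues as it would be under an arithmetic mean.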
Affiliation(s)
- Haojie Lu
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Jinhui Zhang
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Zhou Jiang
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Meng Zhang
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Ting Wang
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
  - Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Huashuo Zhao
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
  - Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
- Ping Zeng
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
  - Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China