Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For:	[Subscribe] [Scholar Register]

Number

Cited by Other Article(s)

Liang Q, Abraham A, Capra JA, Kostka D. Disease-specific prioritization of non-coding GWAS variants based on chromatin accessibility. HGG ADVANCES 2024;5:100310. [PMID: 38773771 PMCID: PMC11259938 DOI: 10.1016/j.xhgg.2024.100310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2023] [Revised: 05/15/2024] [Accepted: 05/16/2024] [Indexed: 05/24/2024] Open

Abstract

Non-protein-coding genetic variants are a major driver of the genetic risk for human disease; however, identifying which non-coding variants contribute to diseases and their mechanisms remains challenging. In silico variant prioritization methods quantify a variant's severity, but for most methods, the specific phenotype and disease context of the prediction remain poorly defined. For example, many commonly used methods provide a single, organism-wide score for each variant, while other methods summarize a variant's impact in certain tissues and/or cell types. Here, we propose a complementary disease-specific variant prioritization scheme, which is motivated by the observation that variants contributing to disease often operate through specific biological mechanisms. We combine tissue/cell-type-specific variant scores (e.g., GenoSkyline, FitCons2, DNA accessibility) into disease-specific scores with a logistic regression approach and apply it to ∼25,000 non-coding variants spanning 111 diseases. We show that this disease-specific aggregation significantly improves the association of common non-coding genetic variants with disease (average precision: 0.151, baseline = 0.09), compared with organism-wide scores (GenoCanyon, LINSIGHT, GWAVA, Eigen, CADD; average precision: 0.129, baseline = 0.09). Further on, disease similarities based on data-driven aggregation weights highlight meaningful disease groups, and it provides information about tissues and cell types that drive these similarities. We also show that so-learned similarities are complementary to genetic similarities as quantified by genetic correlation. Overall, our approach demonstrates the strengths of disease-specific variant prioritization, leads to improvement in non-coding variant prioritization, and enables interpretable models that link variants to disease via specific tissues and/or cell types.

Collapse

Khatiwada A, Yilmaz AS, Wolf BJ, Pietrzak M, Chung D. multi-GPA-Tree: Statistical approach for pleiotropy informed and functional annotation tree guided prioritization of GWAS results. PLoS Comput Biol 2023;19:e1011686. [PMID: 38060592 PMCID: PMC10729974 DOI: 10.1371/journal.pcbi.1011686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2023] [Revised: 12/19/2023] [Accepted: 11/13/2023] [Indexed: 12/20/2023] Open

Abstract

Genome-wide association studies (GWAS) have successfully identified over two hundred thousand genotype-trait associations. Yet some challenges remain. First, complex traits are often associated with many single nucleotide polymorphisms (SNPs), most with small or moderate effect sizes, making them difficult to detect. Second, many complex traits share a common genetic basis due to 'pleiotropy' and and though few methods consider it, leveraging pleiotropy can improve statistical power to detect genotype-trait associations with weaker effect sizes. Third, currently available statistical methods are limited in explaining the functional mechanisms through which genetic variants are associated with specific or multiple traits. We propose multi-GPA-Tree to address these challenges. The multi-GPA-Tree approach can identify risk SNPs associated with single as well as multiple traits while also identifying the combinations of functional annotations that can explain the mechanisms through which risk-associated SNPs are linked with the traits. First, we implemented simulation studies to evaluate the proposed multi-GPA-Tree method and compared its performance with existing statistical approaches. The results indicate that multi-GPA-Tree outperforms existing statistical approaches in detecting risk-associated SNPs for multiple traits. Second, we applied multi-GPA-Tree to a systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA), and to a Crohn's disease (CD) and ulcertive colitis (UC) GWAS, and functional annotation data including GenoSkyline and GenoSkylinePlus. Our results demonstrate that multi-GPA-Tree can be a powerful tool that improves association mapping while facilitating understanding of the underlying genetic architecture of complex traits and potential mechanisms linking risk-associated SNPs with complex traits.

Collapse

Moore A, Marks JA, Quach BC, Guo Y, Bierut LJ, Gaddis NC, Hancock DB, Page GP, Johnson EO. Evaluating 17 methods incorporating biological function with GWAS summary statistics to accelerate discovery demonstrates a tradeoff between high sensitivity and high positive predictive value. Commun Biol 2023;6:1199. [PMID: 38001305 PMCID: PMC10673847 DOI: 10.1038/s42003-023-05413-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2022] [Accepted: 10/03/2023] [Indexed: 11/26/2023] Open

Cai M, Wang Z, Xiao J, Hu X, Chen G, Yang C. XMAP: Cross-population fine-mapping by leveraging genetic diversity and accounting for confounding bias. Nat Commun 2023;14:6870. [PMID: 37898663 PMCID: PMC10613261 DOI: 10.1038/s41467-023-42614-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2023] [Accepted: 10/17/2023] [Indexed: 10/30/2023] Open

Deng Q, Gupta A, Jeon H, Nam JH, Yilmaz AS, Chang W, Pietrzak M, Li L, Kim HJ, Chung D. graph-GPA 2.0: improving multi-disease genetic analysis with integration of functional annotation data. Front Genet 2023;14:1079198. [PMID: 37501720 PMCID: PMC10370274 DOI: 10.3389/fgene.2023.1079198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2022] [Accepted: 06/21/2023] [Indexed: 07/29/2023] Open

Abstract

Genome-wide association studies (GWAS) have successfully identified a large number of genetic variants associated with traits and diseases. However, it still remains challenging to fully understand the functional mechanisms underlying many associated variants. This is especially the case when we are interested in variants shared across multiple phenotypes. To address this challenge, we propose graph-GPA 2.0 (GGPA 2.0), a statistical framework to integrate GWAS datasets for multiple phenotypes and incorporate functional annotations within a unified framework. Our simulation studies showed that incorporating functional annotation data using GGPA 2.0 not only improves the detection of disease-associated variants, but also provides a more accurate estimation of relationships among diseases. Next, we analyzed five autoimmune diseases and five psychiatric disorders with the functional annotations derived from GenoSkyline and GenoSkyline-Plus, along with the prior disease graph generated by biomedical literature mining. For autoimmune diseases, GGPA 2.0 identified enrichment for blood-related epigenetic marks, especially B cells and regulatory T cells, across multiple diseases. Psychiatric disorders were enriched for brain-related epigenetic marks, especially the prefrontal cortex and the inferior temporal lobe for bipolar disorder and schizophrenia, respectively. In addition, the pleiotropy between bipolar disorder and schizophrenia was also detected. Finally, we found that GGPA 2.0 is robust to the use of irrelevant and/or incorrect functional annotations. These results demonstrate that GGPA 2.0 can be a powerful tool to identify genetic variants associated with each phenotype or those shared across multiple phenotypes, while also promoting an understanding of functional mechanisms underlying the associated variants.

Collapse

Alamin M, Sultana MH, Lou X, Jin W, Xu H. Dissecting Complex Traits Using Omics Data: A Review on the Linear Mixed Models and Their Application in GWAS. PLANTS (BASEL, SWITZERLAND) 2022;11:3277. [PMID: 36501317 PMCID: PMC9739826 DOI: 10.3390/plants11233277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/21/2022] [Revised: 11/23/2022] [Accepted: 11/25/2022] [Indexed: 06/17/2023]

Xiao J, Cai M, Yu X, Hu X, Chen G, Wan X, Yang C. Leveraging the local genetic structure for trans-ancestry association mapping. Am J Hum Genet 2022;109:1317-1337. [PMID: 35714612 PMCID: PMC9300880 DOI: 10.1016/j.ajhg.2022.05.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2022] [Accepted: 05/23/2022] [Indexed: 01/09/2023] Open

Khatiwada A, Wolf BJ, Yilmaz AS, Ramos PS, Pietrzak M, Lawson A, Hunt KJ, Kim HJ, Chung D. GPA-Tree: statistical approach for functional-annotation-tree-guided prioritization of GWAS results. Bioinformatics 2022;38:1067-1074. [PMID: 34849578 PMCID: PMC10060690 DOI: 10.1093/bioinformatics/btab802] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2021] [Revised: 10/09/2021] [Accepted: 11/23/2021] [Indexed: 02/03/2023] Open

Cai M, Xiao J, Zhang S, Wan X, Zhao H, Chen G, Yang C. A unified framework for cross-population trait prediction by leveraging the genetic correlation of polygenic traits. Am J Hum Genet 2021;108:632-655. [PMID: 33770506 PMCID: PMC8059341 DOI: 10.1016/j.ajhg.2021.03.002] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2020] [Accepted: 03/01/2021] [Indexed: 12/29/2022] Open

Ming J, Wang T, Yang C. LPM: a latent probit model to characterize the relationship among complex traits using summary statistics from multiple GWASs and functional annotations. Bioinformatics 2020;36:2506-2514. [PMID: 31860024 DOI: 10.1093/bioinformatics/btz947] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2019] [Revised: 12/13/2019] [Accepted: 12/18/2019] [Indexed: 12/21/2022] Open

The statistical practice of the GTEx Project: from single to multiple tissues. QUANTITATIVE BIOLOGY 2020. [DOI: 10.1007/s40484-020-0210-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Sun N, Zhao H. Statistical Methods in Genome-Wide Association Studies. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-030320-041026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Yang C, Wan X, Lin X, Chen M, Zhou X, Liu J. CoMM: a collaborative mixed model to dissecting genetic contributions to complex traits by leveraging regulatory information. Bioinformatics 2020;35:1644-1652. [PMID: 30295737 DOI: 10.1093/bioinformatics/bty865] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2018] [Revised: 09/15/2018] [Accepted: 10/05/2018] [Indexed: 12/12/2022] Open

Abstract

MOTIVATION

Genome-wide association studies (GWASs) have been successful in identifying many genetic variants associated with complex traits. However, the mechanistic links between these variants and complex traits remain elusive. A scientific hypothesis is that genetic variants influence complex traits at the organismal level via affecting cellular traits, such as regulating gene expression and altering protein abundance. Although earlier works have already presented some scientific insights about this hypothesis and their findings are very promising, statistical methods that effectively harness multilayered data (e.g. genetic variants, cellular traits and organismal traits) on a large scale for functional and mechanistic exploration are highly demanding.

RESULTS

In this study, we propose a collaborative mixed model (CoMM) to investigate the mechanistic role of associated variants in complex traits. The key idea is built upon the emerging scientific evidence that genetic effects at the cellular level are much stronger than those at the organismal level. Briefly, CoMM combines two models: the first model relating gene expression with genotype and the second model relating phenotype with predicted gene expression using the first model. The two models are fitted jointly in CoMM, such that the uncertainty in predicting gene expression has been fully accounted. To demonstrate the advantages of CoMM over existing methods, we conducted extensive simulation studies, and also applied CoMM to analyze 25 traits in NFBC1966 and Genetic Epidemiology Research on Aging (GERA) studies by integrating transcriptome information from the Genetic European in Health and Disease (GEUVADIS) Project. The results indicate that by leveraging regulatory information, CoMM can effectively improve the power of prioritizing risk variants. Regarding the computational efficiency, CoMM can complete the analysis of NFBC1966 dataset and GERA datasets in 2 and 18 min, respectively.

AVAILABILITY AND IMPLEMENTATION

The developed R package is available at https://github.com/gordonliu810822/CoMM.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Collapse