Jin J, Wang Y. T2-DAG: a powerful test for differentially expressed gene pathways via graph-informed structural equation modeling.
Bioinformatics 2022;
38:1005-1014. [PMID:
34755844 PMCID:
PMC8796375 DOI:
10.1093/bioinformatics/btab770]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2021] [Revised: 11/01/2021] [Accepted: 11/04/2021] [Indexed: 02/04/2023] Open
Abstract
MOTIVATION
A major task in genetic studies is to identify genes related to human diseases and traits to understand functional characteristics of genetic mutations and enhance patient diagnosis. Compared with marginal analyses of individual genes, identification of gene pathways, i.e. a set of genes with known interactions that collectively contribute to specific biological functions, can provide more biologically meaningful results. Such gene pathway analysis can be formulated into a high-dimensional two-sample testing problem. Given the typically limited sample size of gene expression datasets, most existing two-sample tests tend to have compromised powers because they ignore or only inefficiently incorporate the auxiliary pathway information on gene interactions.
RESULTS
We propose T2-DAG, a Hotelling's T2-type test for detecting differentially expressed gene pathways, which efficiently leverages the auxiliary pathway information on gene interactions from existing pathway databases through a linear structural equation model. We further establish its asymptotic distribution under pertinent assumptions. Simulation studies under various scenarios show that T2-DAG outperforms several representative existing methods with well-controlled type-I error rates and substantially improved powers, even with incomplete or inaccurate pathway information or unadjusted confounding effects. We also illustrate the performance of T2-DAG in an application to detect differentially expressed KEGG pathways between different stages of lung cancer.
AVAILABILITY AND IMPLEMENTATION
The R (R Development Core Team, 2021) package T2DAG which implements the proposed T2-DAG test is available on Github at https://github.com/Jin93/T2DAG.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Collapse