Zhang H, Wu Z. The generalized Fisher's combination and accurate p-value calculation under dependence.
Biometrics 2022. [PMID:
35178716 DOI:
10.1111/biom.13634]
[Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2021] [Accepted: 02/03/2022] [Indexed: 11/28/2022]
Abstract
Combining dependent tests of significance has broad applications but the related p-value calculation is challenging. For Fisher's combination test, current p-value calculation methods (e.g., Brown's approximation) tend to inflate the type I error rate when the desired significance level is substantially less than 0.05. The problem could lead to significant false discoveries in big data analyses. This paper provides two main contributions. First, it presents a general family of Fisher type statistics, referred to as the GFisher, which covers many classic statistics, such as Fisher's combination, Good's statistic, Lancaster's statistic, weighted Z-score combination, etc. The GFisher allows a flexible weighting scheme, as well as an omnibus procedure that automatically adapts proper weights and the statistic-defining parameters to a given data. Second, the paper presents several new p-value calculation methods based on two novel ideas: moment-ratio matching and joint-distribution surrogating. Systematic simulations show that the new calculation methods are more accurate under multivariate Gaussian, and more robust under the generalized linear model and the multivariate t-distribution. The applications of the GFisher and the new p-value calculation methods are demonstrated by a gene-based SNP-set association study. Relevant computation has been implemented to an R package GFisher available on the Comprehensive R Archive Network. This article is protected by copyright. All rights reserved.
Collapse