1
Levac B, Kumar S, Jalal A, Tamir JI. Accelerated motion correction with deep generative diffusion models. Magn Reson Med 2024; 92:853-868. [PMID: 38688874 DOI: 10.1002/mrm.30082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 01/02/2024] [Accepted: 02/23/2024] [Indexed: 05/02/2024]
Abstract
PURPOSE: The aim of this work is to develop a method to solve the ill-posed inverse problem of accelerated image reconstruction while correcting forward model imperfections in the context of subject motion during MRI examinations. METHODS: The proposed solution uses a Bayesian framework based on deep generative diffusion models to jointly estimate a motion-free image and rigid motion estimates from subsampled and motion-corrupt two-dimensional (2D) k-space data. RESULTS: We demonstrate the ability to reconstruct motion-free images from accelerated 2D Cartesian and non-Cartesian scans without any external reference signal. We show that our method improves over existing correction techniques on both simulated and prospectively accelerated data. CONCLUSION: We propose a flexible framework for retrospective motion correction of accelerated MRI based on deep generative diffusion models, with potential application to other forward model corruptions.
Affiliation(s)
- Brett Levac
  - Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas, USA
- Sidharth Kumar
  - Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas, USA
- Ajil Jalal
  - Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, California, USA
- Jonathan I Tamir
  - Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas, USA
2
Du Y, Li J, Raha S, Qu Y. A unified Bayesian framework for bias adjustment in multiple comparisons from clinical trials. Stat Med 2024; 43:2928-2943. [PMID: 38742595 DOI: 10.1002/sim.10064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 01/30/2024] [Accepted: 03/06/2024] [Indexed: 05/16/2024]
Abstract
In clinical trials, multiple comparisons arising from various treatments/doses, subgroups, or endpoints are common. Typically, trial teams focus on the comparison showing the largest observed treatment effect, often involving a specific treatment pair and endpoint within a subgroup. These findings frequently lead to follow-up pivotal studies, many of which do not confirm the initial positive results. Selection bias occurs when the most promising treatment, subgroup, or endpoint is chosen for further development, potentially skewing subsequent investigations. Such bias can be defined as the deviation in the observed treatment effects from the underlying truth. In this article, we propose a general and unified Bayesian framework to address selection bias in clinical trials with multiple comparisons. Our approach does not require a priori specification of a parametric distribution for the prior, offering a more flexible and generalized solution. The proposed method facilitates a more accurate interpretation of clinical trial results by adjusting for such selection bias. Through simulation studies, we compared several methods and demonstrated their superior performance over the normal shrinkage estimator. We recommended the use of Bayesian Model Averaging estimator averaging over Gaussian Mixture Models as the prior distribution based on its performance and flexibility. We applied the method to a multicenter, randomized, double-blind, placebo-controlled study investigating the cardiovascular effects of dulaglutide.
Affiliation(s)
- Yu Du
  - Global Statistical Sciences, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana
- Jianghao Li
  - Global Statistical Sciences, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana
- Sohini Raha
  - Global Statistical Sciences, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana
- Yongming Qu
  - Global Statistical Sciences, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana
3
Feng J, Chen J, Li X, Ren X, Chen J, Li Z, Wu Y, Zhang Z, Yang R, Li J, Lu Y, Liu Y. Mendelian randomization and Bayesian model averaging of autoimmune diseases and Long COVID. Front Genet 2024; 15:1383162. [PMID: 39005628 PMCID: PMC11240141 DOI: 10.3389/fgene.2024.1383162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Accepted: 05/27/2024] [Indexed: 07/16/2024] Open
Abstract
Background: Following COVID-19, reports suggest Long COVID and autoimmune diseases (AIDs) in infected individuals. However, bidirectional causal effects between Long COVID and AIDs, which may help to prevent diseases, have not been fully investigated. Methods: Summary-level data from genome-wide association studies (GWAS) of Long COVID (N = 52615) and AIDs including inflammatory bowel disease (IBD) (N = 377277), Crohn's disease (CD) (N = 361508), ulcerative colitis (UC) (N = 376564), etc. were employed. Bidirectional causal effects were gauged between AIDs and Long COVID by exploiting Mendelian randomization (MR) and Bayesian model averaging (BMA). Results: The evidence of causal effects of IBD (OR = 1.06, 95% CI = 1.00-1.11, p = 3.13E-02), CD (OR = 1.10, 95% CI = 1.01-1.19, p = 2.21E-02) and UC (OR = 1.08, 95% CI = 1.03-1.13, p = 2.35E-03) on Long COVID was found. In MR-BMA, UC was estimated as the highest-ranked causal factor (MIP = 0.488, MACE = 0.035), followed by IBD and CD. Conclusion: This MR study found that IBD, CD and UC had causal effects on Long COVID, which suggests a necessity to screen high-risk populations.
Affiliation(s)
- Jieni Feng
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
- Jiankun Chen
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
  - The Second Affiliated Hospital (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou University of Chinese Medicine, Guangzhou, China
  - Guangzhou Key Laboratory of Traditional Chinese Medicine for Prevention and Treatment of Emerging Infectious Diseases, Guangzhou, China
- Xiaoya Li
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
- Xiaolei Ren
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
- Junxu Chen
  - The First Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
- Zuming Li
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
- Yuan Wu
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
  - The Second Affiliated Hospital (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou University of Chinese Medicine, Guangzhou, China
- Zhongde Zhang
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
  - The Second Affiliated Hospital (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou University of Chinese Medicine, Guangzhou, China
  - State Key Laboratory of Traditional Chinese Medicine Syndrome, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Rongyuan Yang
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
  - The Second Affiliated Hospital (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou University of Chinese Medicine, Guangzhou, China
  - Guangzhou Key Laboratory of Traditional Chinese Medicine for Prevention and Treatment of Emerging Infectious Diseases, Guangzhou, China
  - State Key Laboratory of Traditional Chinese Medicine Syndrome, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Jiqiang Li
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
  - The Second Affiliated Hospital (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou University of Chinese Medicine, Guangzhou, China
  - Guangzhou Key Laboratory of Traditional Chinese Medicine for Prevention and Treatment of Emerging Infectious Diseases, Guangzhou, China
- Yue Lu
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
  - The Second Affiliated Hospital (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou University of Chinese Medicine, Guangzhou, China
- Yuntao Liu
  - The Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China
  - The Second Affiliated Hospital (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou University of Chinese Medicine, Guangzhou, China
  - Guangzhou Key Laboratory of Traditional Chinese Medicine for Prevention and Treatment of Emerging Infectious Diseases, Guangzhou, China
  - State Key Laboratory of Traditional Chinese Medicine Syndrome, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
4
Li S, Cheng L, Li J, Wang Z, Li J. Learning data distribution of three-dimensional ocean sound speed fields via diffusion models. The Journal of the Acoustical Society of America 2024; 155:3410-3425. [PMID: 38780198 DOI: 10.1121/10.0026026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 04/30/2024] [Indexed: 05/25/2024]
Abstract
The probability distribution of three-dimensional sound speed fields (3D SSFs) in an ocean region encapsulates vital information about their variations, serving as valuable data-driven priors for SSF inversion tasks. However, learning such a distribution is challenging due to the high dimensionality and complexity of 3D SSFs. To tackle this challenge, we propose employing the diffusion model, a cutting-edge deep generative model that has showcased remarkable performance in diverse domains, including image and audio processing. Nonetheless, applying this approach to 3D ocean SSFs encounters two primary hurdles. First, the lack of publicly available well-crafted 3D SSF datasets impedes training and evaluation. Second, 3D SSF data consist of multiple 2D layers with varying variances, which can lead to uneven denoising during the reverse process. To surmount these obstacles, we introduce a novel 3D SSF dataset called 3DSSF, specifically designed for training and evaluating deep generative models. In addition, we devise a high-capacity neural architecture for the diffusion model to effectively handle variations in 3D sound speeds. Furthermore, we employ state-of-the-art continuous-time-based optimization method and predictor-corrector scheme for high-performance training and sampling. Notably, this paper presents the first evaluation of the diffusion model's effectiveness in generating 3D SSF data. Numerical experiments validate the proposed method's strong ability to learn the underlying data distribution of 3D SSFs, and highlight its effectiveness in assisting SSF inversion tasks and subsequently characterizing the transmission loss of underwater acoustics.
Affiliation(s)
- Siyuan Li
  - College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
- Lei Cheng
  - College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
- Jun Li
  - China State Shipbuilding Corporation Systems Engineering Research Institute, Beijing 100094, China
- Zichen Wang
  - China State Shipbuilding Corporation Systems Engineering Research Institute, Beijing 100094, China
- Jianlong Li
  - College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
  - Hainan Institute, Zhejiang University, 572025, Sanya, China
5
So HC, Xue X, Ma Z, Sham PC. SumVg: Total Heritability Explained by All Variants in Genome-Wide Association Studies Based on Summary Statistics with Standard Error Estimates. Int J Mol Sci 2024; 25:1347. [PMID: 38279346 PMCID: PMC10816209 DOI: 10.3390/ijms25021347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 01/15/2024] [Accepted: 01/16/2024] [Indexed: 01/28/2024] Open
Abstract
Genome-wide association studies (GWAS) are commonly employed to study the genetic basis of complex traits/diseases, and a key question is how much heritability could be explained by all single nucleotide polymorphisms (SNPs) in GWAS. One widely used approach that relies on summary statistics only is linkage disequilibrium score regression (LDSC); however, this approach requires certain assumptions about the effects of SNPs (e.g., all SNPs contribute to heritability and each SNP contributes equal variance). More flexible modeling methods may be useful. We previously developed an approach recovering the "true" effect sizes from a set of observed z-statistics with an empirical Bayes approach, using only summary statistics. However, methods for standard error (SE) estimation are not available yet, limiting the interpretation of our results and the applicability of the approach. In this study, we developed several resampling-based approaches to estimate the SE of SNP-based heritability, including two jackknife and three parametric bootstrap methods. The resampling procedures are performed at the SNP level as it is most common to estimate heritability from GWAS summary statistics alone. Simulations showed that the delete-d-jackknife and parametric bootstrap approaches provide good estimates of the SE. In particular, the parametric bootstrap approaches yield the lowest root-mean-squared-error (RMSE) of the true SE. We also explored various methods for constructing confidence intervals (CIs). In addition, we applied our method to estimate the SNP-based heritability of 12 immune-related traits (levels of cytokines and growth factors) to shed light on their genetic architecture. We also implemented the methods to compute the sum of heritability explained and the corresponding SE in an R package SumVg. 
In conclusion, SumVg may provide a useful alternative tool for calculating SNP heritability and estimating SE/CI, which does not rely on distributional assumptions of SNP effects.
Affiliation(s)
- Hon-Cheong So
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China
  - KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology and The Chinese University of Hong Kong, Shatin, Hong Kong, China
  - Department of Psychiatry, The Chinese University of Hong Kong, Shatin, Hong Kong, China
  - CUHK Shenzhen Research Institute, Shenzhen 518057, China
  - Margaret K. L. Cheung Research Centre for Management of Parkinsonism, The Chinese University of Hong Kong, Shatin, Hong Kong, China
  - Hong Kong Branch of the Chinese Academy of Sciences Center for Excellence in Animal Evolution and Genetics, The Chinese University of Hong Kong, Shatin, Hong Kong, China
  - Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Xiao Xue
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Zhijie Ma
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Pak-Chung Sham
  - Department of Psychiatry, The University of Hong Kong, Pokfulam, Hong Kong, China
6
Forde A, Hemani G, Ferguson J. Review and further developments in statistical corrections for Winner's Curse in genetic association studies. PLoS Genet 2023; 19:e1010546. [PMID: 37721937 PMCID: PMC10538662 DOI: 10.1371/journal.pgen.1010546] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/28/2022] [Revised: 09/28/2023] [Accepted: 08/29/2023] [Indexed: 09/20/2023] Open
Abstract
Genome-wide association studies (GWAS) are commonly used to identify genomic variants that are associated with complex traits, and estimate the magnitude of this association for each variant. However, it has been widely observed that the association estimates of variants tend to be lower in a replication study than in the study that discovered those associations. A phenomenon known as Winner's Curse is responsible for this upward bias present in association estimates of significant variants in the discovery study. We review existing Winner's Curse correction methods which require only GWAS summary statistics in order to make adjustments. In addition, we propose modifications to improve existing methods and propose a novel approach which uses the parametric bootstrap. We evaluate and compare methods, first using a wide variety of simulated data sets and then, using real data sets for three different traits. The metric, estimated mean squared error (MSE) over significant SNPs, was primarily used for method assessment. Our results indicate that widely used conditional likelihood based methods tend to perform poorly. The other considered methods behave much more similarly, with our proposed bootstrap method demonstrating very competitive performance. To complement this review, we have developed an R package, 'winnerscurse' which can be used to implement these various Winner's Curse adjustment methods to GWAS summary statistics.
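The upward bias described in this abstract can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not one of the correction methods reviewed in the paper and not the `winnerscurse` package: every variant is assumed to share the same true effect, estimates are drawn with Gaussian noise, and the function name, parameter values, and significance threshold are all hypothetical.

```python
import random

def simulate_winners_curse(true_beta=0.01, se=0.01, z_threshold=1.96,
                           n_variants=50000, seed=0):
    """Return (mean estimate over all variants, mean estimate over significant ones).

    Assumption for illustration: all variants share one true effect `true_beta`
    and each estimate is true_beta plus N(0, se^2) noise.
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_beta, se) for _ in range(n_variants)]
    # Keep only the variants that pass the two-sided significance threshold.
    significant = [b for b in estimates if abs(b / se) > z_threshold]
    mean_all = sum(estimates) / len(estimates)
    mean_sig = sum(significant) / len(significant)
    return mean_all, mean_sig

mean_all, mean_sig = simulate_winners_curse()
print(f"true effect: 0.01, all variants: {mean_all:.4f}, significant only: {mean_sig:.4f}")
```

Averaged over all variants the estimates are close to the true effect, while the average over the significant subset is inflated, which is why a replication study tends to report smaller effects for the same variants.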
Affiliation(s)
- Amanda Forde
  - School of Mathematical and Statistical Sciences, University of Galway, Galway, Ireland
- Gibran Hemani
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, United Kingdom
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- John Ferguson
  - HRB Clinical Research Facility, NUI Galway, Galway, Ireland
7
Guo X, Wei W, Liu M, Cai T, Wu C, Wang J. Assessing the Most Vulnerable Subgroup to Type II Diabetes Associated with Statin Usage: Evidence from Electronic Health Record Data. J Am Stat Assoc 2023; 118:1488-1499. [PMID: 38223220 PMCID: PMC10786632 DOI: 10.1080/01621459.2022.2157727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Accepted: 11/21/2022] [Indexed: 12/23/2022]
Abstract
There have been increased concerns that the use of statins, one of the most commonly prescribed drugs for treating coronary artery disease, is potentially associated with the increased risk of new-onset Type II diabetes (T2D). Nevertheless, to date, there is no robust evidence supporting as to whether and what kind of populations are indeed vulnerable for developing T2D after taking statins. In this case study, leveraging the biobank and electronic health record data in the Partner Health System, we introduce a new data analysis pipeline and a novel statistical methodology that address existing limitations by (i) designing a rigorous causal framework that systematically examines the causal effects of statin usage on T2D risk in observational data, (ii) uncovering which patient subgroup is most vulnerable for developing T2D after taking statins, and (iii) assessing the replicability and statistical significance of the most vulnerable subgroup via a bootstrap calibration procedure. Our proposed approach delivers asymptotically sharp confidence intervals and debiased estimate for the treatment effect of the most vulnerable subgroup in the presence of high-dimensional covariates. With our proposed approach, we find that females with high T2D genetic risk are at the highest risk of developing T2D due to statin usage.
Affiliation(s)
- Xinzhou Guo
  - Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong, Hong Kong
- Waverly Wei
  - Division of Biostatistics, UC Berkeley, Berkeley, CA
- Molei Liu
  - Department of Biostatistics, Columbia Mailman School of Public Health, New York, NY
- Tianxi Cai
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Chong Wu
  - Department of Biostatistics, MD Anderson Cancer Center, Houston, TX
- Jingshen Wang
  - Division of Biostatistics, UC Berkeley, Berkeley, CA
8
Zhai J, Jiang H. Two-sample test with g-modeling and its applications. Stat Med 2023; 42:89-104. [PMID: 36412978 PMCID: PMC10099579 DOI: 10.1002/sim.9603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Revised: 07/31/2022] [Accepted: 10/31/2022] [Indexed: 11/23/2022]
Abstract
Many real data analyses involve two-sample comparisons in location or in distribution. Most existing methods focus on problems where observations are independently and identically distributed in each group. However, in some applications the observed data are not identically distributed but associated with some unobserved parameters which are identically distributed. To address this challenge, we propose a novel two-sample testing procedure as a combination of the g-modeling density estimation introduced by Efron and the two-sample Kolmogorov-Smirnov test. We also propose efficient bootstrap algorithms to estimate the statistical significance for such tests. We demonstrate the utility of the proposed approach with two biostatistical applications: the analysis of surgical nodes data with binomial model and differential expression analysis of single-cell RNA sequencing data with zero-inflated Poisson model.
Affiliation(s)
- Jingyi Zhai
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Hui Jiang
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
9
Image denoising in the deep learning era. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10305-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
10
Yang CH, Doss H, Vemuri BC. An Empirical Bayes Approach to Shrinkage Estimation on the Manifold of Symmetric Positive-Definite Matrices. J Am Stat Assoc 2022; 119:259-272. [PMID: 38590837 PMCID: PMC11000275 DOI: 10.1080/01621459.2022.2110877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2020] [Accepted: 08/02/2022] [Indexed: 10/15/2022]
Abstract
The James-Stein estimator is an estimator of the multivariate normal mean and dominates the maximum likelihood estimator (MLE) under squared error loss. The original work inspired great interest in developing shrinkage estimators for a variety of problems. Nonetheless, research on shrinkage estimation for manifold-valued data is scarce. In this article, we propose shrinkage estimators for the parameters of the Log-Normal distribution defined on the manifold of N × N symmetric positive-definite matrices. For this manifold, we choose the Log-Euclidean metric as its Riemannian metric since it is easy to compute and has been widely used in a variety of applications. By using the Log-Euclidean distance in the loss function, we derive a shrinkage estimator in an analytic form and show that it is asymptotically optimal within a large class of estimators that includes the MLE, which is the sample Fréchet mean of the data. We demonstrate the performance of the proposed shrinkage estimator via several simulated data experiments. Additionally, we apply the shrinkage estimator to perform statistical inference in both diffusion and functional magnetic resonance imaging problems.
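The dominance claim in the first sentence of this abstract can be checked numerically for the classical Euclidean case. The sketch below is a minimal illustration, assuming a single observation x ~ N(theta, I_p) with p >= 3 and shrinkage toward the origin; it is not the manifold-valued estimator proposed in the paper, and the function names and the choice of theta are hypothetical.

```python
import random

def james_stein(x):
    """Classical James-Stein estimate for one draw x ~ N(theta, I_p), p >= 3."""
    p = len(x)
    norm_sq = sum(v * v for v in x)
    factor = 1.0 - (p - 2) / norm_sq  # shrinkage factor toward the origin
    return [factor * v for v in x]

def squared_error(estimate, theta):
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))

def compare_risks(theta, n_trials=20000, seed=0):
    """Monte Carlo estimate of the squared-error risk of the MLE and of James-Stein."""
    rng = random.Random(seed)
    mle_total = 0.0
    js_total = 0.0
    for _ in range(n_trials):
        x = [rng.gauss(t, 1.0) for t in theta]  # the MLE of theta is x itself
        mle_total += squared_error(x, theta)
        js_total += squared_error(james_stein(x), theta)
    return mle_total / n_trials, js_total / n_trials

mle_risk, js_risk = compare_risks([0.5] * 10)
print(f"MLE risk: {mle_risk:.2f}, James-Stein risk: {js_risk:.2f}")
```

The MLE risk should sit near the dimension p, and the James-Stein risk should come out lower; the gap narrows as theta moves farther from the shrinkage target.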
Affiliation(s)
- Chun-Hao Yang
  - Institute of Applied Mathematical Sciences, National Taiwan University, Taipei, Taiwan
- Hani Doss
  - Department of Statistics, University of Florida, Gainesville, FL
- Baba C. Vemuri
  - Department of CISE, University of Florida, Gainesville, FL
11
Sun Y, Zhang YY, Sun J. The empirical Bayes estimators of the parameter of the uniform distribution with an inverse gamma prior under Stein’s loss function. Commun Stat Simul Comput 2022. [DOI: 10.1080/03610918.2022.2093904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Affiliation(s)
- Ya Sun
  - Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
- Ying-Ying Zhang
  - Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
  - Chongqing Key Laboratory of Analytic Mathematics and Applications, Chongqing University, Chongqing, China
  - Department of Statistics, School of Mathematics and Statistics, Yunnan University, Kunming, China
- Ji Sun
  - Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
12
Zhang YY, Zhang YY, Wang ZY, Sun Y, Sun J. The empirical Bayes estimators of the variance parameter of the normal distribution with a conjugate inverse gamma prior under Stein’s loss function. Commun Stat Theory Methods 2022. [DOI: 10.1080/03610926.2022.2076123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Affiliation(s)
- Ying-Ying Zhang
  - Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
  - Chongqing Key Laboratory of Analytic Mathematics and Applications, Chongqing University, Chongqing, China
- Yuan-Yu Zhang
  - Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
- Ze-Yu Wang
  - Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
- Ya Sun
  - Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
- Ji Sun
  - Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
13
Using reference models in variable selection. Comput Stat 2022. [DOI: 10.1007/s00180-022-01231-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
Variable selection, or more generally, model reduction is an important aspect of the statistical workflow aiming to provide insights from data. In this paper, we discuss and demonstrate the benefits of using a reference model in variable selection. A reference model acts as a noise-filter on the target variable by modeling its data generating mechanism. As a result, using the reference model predictions in the model selection procedure reduces the variability and improves stability, leading to improved model selection performance. Assuming that a Bayesian reference model describes the true distribution of future data well, the theoretically preferred usage of the reference model is to project its predictive distribution to a reduced model, leading to projection predictive variable selection approach. We analyse how much the great performance of the projection predictive variable is due to the use of reference model and show that other variable selection methods can also be greatly improved by using the reference model as target instead of the original data. In several numerical experiments, we investigate the performance of the projective prediction approach as well as alternative variable selection methods with and without reference models. Our results indicate that the use of reference models generally translates into better and more stable variable selection.
14
Du Y, Li Z, Chen X. Efficient empirical Bayes estimates for risk parameters of Pareto distributions. Commun Stat Theory Methods 2022. [DOI: 10.1080/03610926.2020.1766501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Yongmei Du
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, China
- Zhouping Li
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, China
- Xiaosong Chen
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, China
15
Empirical Bayes and Selective Inference. J Indian Inst Sci 2022. [DOI: 10.1007/s41745-022-00286-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
We review the empirical Bayes approach to large-scale inference. In the context of the problem of inference for a high-dimensional normal mean, empirical Bayes methods are advocated as they exhibit risk-reducing shrinkage, while establishing appropriate control of frequentist properties of the inference. We elucidate these frequentist properties and evaluate the protection that empirical Bayes provides against selection bias.
16
James GM, Radchenko P, Rava B. Irrational Exuberance: Correcting Bias in Probability Estimates. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2020.1787175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Gareth M. James
  - Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
- Bradley Rava
  - Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
17
Affiliation(s)
- Lilun Du
  - Department of Information Systems, Business Statistics and Operations Management, Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Inchi Hu
  - Department of Information Systems, Business Statistics and Operations Management, Hong Kong University of Science and Technology, Kowloon, Hong Kong
18
OUP accepted manuscript. Biometrika 2022. [DOI: 10.1093/biomet/asac019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
19
Affiliation(s)
- Stefan Wager
  - Stanford University, Graduate School of Business, Stanford, United States
20
|
Wang L, Gao B, Fan Y, Xue F, Zhou X. Mendelian randomization under the omnigenic architecture. Brief Bioinform 2021; 22:6347949. [PMID: 34379090 DOI: 10.1093/bib/bbab322] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/20/2021] [Revised: 07/22/2021] [Accepted: 07/24/2021] [Indexed: 11/15/2022]
Abstract
Mendelian randomization (MR) is a common analytic tool for exploring the causal relationship among complex traits. Existing MR methods require selecting a small set of single nucleotide polymorphisms (SNPs) to serve as instrument variables. However, selecting a small set of SNPs may not be ideal, as most complex traits have a polygenic or omnigenic architecture and are each influenced by thousands of SNPs. Here, motivated by the recent omnigenic hypothesis, we present an MR method that uses all genome-wide SNPs for causal inference. Our method uses summary statistics from genome-wide association studies as input, accommodates the commonly encountered horizontal pleiotropy effects and relies on a composite likelihood framework for scalable computation. We refer to our method as the omnigenic Mendelian randomization, or OMR. We examine the power and robustness of OMR through extensive simulations including those under various modeling misspecifications. We apply OMR to several real data applications, where we identify multiple complex traits that potentially causally influence coronary artery disease (CAD) and asthma. The identified new associations reveal important roles of blood lipids, blood pressure and immunity underlying CAD as well as important roles of immunity and obesity underlying asthma.
Affiliation(s)
- Lu Wang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Boran Gao
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Yue Fan
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; School of Public Health, Health Science Center of Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China
- Fuzhong Xue
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
|
21
|
Qu Y, Park SY, Wu Q, Shen W. A unified approach for evaluating the prediction of treatment effect across different types of endpoints. Pharm Stat 2021; 21:4-16. [PMID: 34268857 DOI: 10.1002/pst.2149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2020] [Revised: 03/18/2021] [Accepted: 06/01/2021] [Indexed: 11/06/2022]
Abstract
Phase 2 and 3 development failure is one of the key factors for high drug development cost. Robust prediction of a candidate drug's efficacy and safety profile could potentially improve the success rate of drug development. Therefore, systematic evaluation of the prediction is important for learning and continuous improvement of the prediction. In this article, we proposed a set of unified criteria that allow predictions to be evaluated across different endpoints, indications and development stages: standardized bias (SB), standardized mean squared errors (SMSE), and credibility of prediction. We applied the SB and SMSE to the predicted treatment effects for 54 comparisons in 5 compounds in immunology and diabetes.
Affiliation(s)
- Yongming Qu
- Global Statistical Sciences, Eli Lilly and Company, Indianapolis, Indiana, USA
- So Young Park
- Global Statistical Sciences, Eli Lilly and Company, Indianapolis, Indiana, USA
- Qiwei Wu
- Global Statistical Sciences, Eli Lilly and Company, Indianapolis, Indiana, USA
- Wei Shen
- Global Statistical Sciences, Eli Lilly and Company, Indianapolis, Indiana, USA
|
22
|
Alvo M. Empirical Bayes on a shoestring and other applications. COMMUN STAT-THEOR M 2021. [DOI: 10.1080/03610926.2021.1948061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Mayer Alvo
- Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada
|
23
|
Wiklund SJ, Burman CF. Selection bias, investment decisions and treatment effect distributions. Pharm Stat 2021; 20:1168-1182. [PMID: 34002467 PMCID: PMC9290610 DOI: 10.1002/pst.2132] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/29/2020] [Revised: 04/09/2021] [Accepted: 05/03/2021] [Indexed: 11/08/2022]
Abstract
When making decisions regarding the investment and design for a Phase 3 programme in the development of a new drug, the results from preceding Phase 2 trials are an important source of information. However, only projects in which the Phase 2 results show promising treatment effects will typically be considered for a Phase 3 investment decision. This implies that, for those projects where Phase 3 is pursued, the underlying Phase 2 estimates are subject to selection bias. We will in this article investigate the nature of this selection bias based on a selection of distributions for the treatment effect. We illustrate some properties of Bayesian estimates, providing shrinkage of the Phase 2 estimate to counteract the selection bias. We further give some empirical guidance regarding the choice of prior distribution and comment on the consequences for decision-making in investment and planning for Phase 3 programmes.
|
24
|
Zhang C, Yu P, Wang X. Statistical inference in EV linear model. COMMUN STAT-THEOR M 2021. [DOI: 10.1080/03610926.2021.1914096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Chunxiu Zhang
- College of Mathematics and Computer Science, Shanxi Normal University, Linfen, China
- Ping Yu
- College of Mathematics and Computer Science, Shanxi Normal University, Linfen, China
- Xiaofeng Wang
- College of Mathematics and Computer Science, Shanxi Normal University, Linfen, China
|
25
|
Burghgraeve E, De Neve J, Rosseel Y. Estimating Structural Equation Models Using James-Stein Type Shrinkage Estimators. PSYCHOMETRIKA 2021; 86:96-130. [PMID: 33738686 DOI: 10.1007/s11336-021-09749-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2019] [Revised: 01/30/2021] [Indexed: 06/12/2023]
Abstract
We propose a two-step procedure to estimate structural equation models (SEMs). In a first step, the latent variable is replaced by its conditional expectation given the observed data. This conditional expectation is estimated using a James-Stein type shrinkage estimator. The second step consists of regressing the dependent variables on this shrinkage estimator. In addition to linear SEMs, we also derive shrinkage estimators to estimate polynomials. We empirically demonstrate the feasibility of the proposed method via simulation and contrast the proposed estimator with ML and MIIV estimators under a limited number of simulation scenarios. We illustrate the method on a case study.
Affiliation(s)
- Elissa Burghgraeve
- Department of Data Analysis, Ghent University, Henri Dunantlaan 1, Ghent, Belgium
- Jan De Neve
- Department of Data Analysis, Ghent University, Henri Dunantlaan 1, Ghent, Belgium
- Yves Rosseel
- Department of Data Analysis, Ghent University, Henri Dunantlaan 1, Ghent, Belgium
|
26
|
Affiliation(s)
- Bradley Efron
- Department of Statistics, Stanford University, Stanford, CA
|
27
|
Sun J, Zhang YY, Sun Y. The empirical Bayes estimators of the rate parameter of the inverse gamma distribution with a conjugate inverse gamma prior under Stein's loss function. J STAT COMPUT SIM 2020. [DOI: 10.1080/00949655.2020.1858299] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/22/2022]
Affiliation(s)
- Ji Sun
- Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, People's Republic of China
- Ying-Ying Zhang
- Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, People's Republic of China
- Ya Sun
- Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, People's Republic of China
|
28
|
Fu L, Gang B, James GM, Sun W. Heteroscedasticity-Adjusted Ranking and Thresholding for Large-Scale Multiple Testing. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1840992] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Affiliation(s)
- Luella Fu
- Department of Mathematics, San Francisco State University, San Francisco, CA
- Bowen Gang
- Department of Statistics, Fudan University, Shanghai, China
- Gareth M. James
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
- Wenguang Sun
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
|
29
|
Affiliation(s)
- Zhi Ji
- School of Mathematics and Statistics, Lanzhou University, Lanzhou, China
- Yang Wei
- School of Mathematics and Statistics, Lanzhou University, Lanzhou, China
- Zhouping Li
- School of Mathematics and Statistics, Lanzhou University, Lanzhou, China
|
30
|
Zhao J, Ming J, Hu X, Chen G, Liu J, Yang C. Bayesian weighted Mendelian randomization for causal inference based on summary statistics. Bioinformatics 2020; 36:1501-1508. [PMID: 31593215 DOI: 10.1093/bioinformatics/btz749] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 04/25/2019] [Revised: 09/06/2019] [Accepted: 10/02/2019] [Indexed: 12/22/2022]
Abstract
MOTIVATION The results from Genome-Wide Association Studies (GWAS) on thousands of phenotypes provide an unprecedented opportunity to infer the causal effect of one phenotype (exposure) on another (outcome). Mendelian randomization (MR), an instrumental variable (IV) method, has been introduced for causal inference using GWAS data. Due to the polygenic architecture of complex traits/diseases and the ubiquity of pleiotropy, however, MR has many unique challenges compared to conventional IV methods. RESULTS We propose a Bayesian weighted Mendelian randomization (BWMR) for causal inference to address these challenges. In our BWMR model, the uncertainty of weak effects owing to polygenicity has been taken into account and the violation of IV assumption due to pleiotropy has been addressed through outlier detection by Bayesian weighting. To make the causal inference based on BWMR computationally stable and efficient, we developed a variational expectation-maximization (VEM) algorithm. Moreover, we have also derived an exact closed-form formula to correct the posterior covariance which is often underestimated in variational inference. Through comprehensive simulation studies, we evaluated the performance of BWMR, demonstrating the advantage of BWMR over its competitors. Then we applied BWMR to make causal inference between 130 metabolites and 93 complex human traits, uncovering novel causal relationship between exposure and outcome traits. AVAILABILITY AND IMPLEMENTATION The BWMR software is available at https://github.com/jiazhao97/BWMR. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jia Zhao
- Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR 999077
- School of Mathematical Sciences, Beijing Normal University, Beijing 100875
- Jingsi Ming
- Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR 999077
- Xianghong Hu
- Department of Mathematics, Hong Kong Baptist University, Hong Kong SAR 999077
- Department of Mathematics, Southern University of Science and Technology, Shenzhen 518055
- Gang Chen
- The WeGene Company, Shenzhen 518042, China
- Jin Liu
- Centre for Quantitative Medicine, Duke-NUS Medical School, 169857 Singapore
- Can Yang
- Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR 999077
|
31
|
Qu Y, Du Y, Zhang Y, Shen L. Understanding and adjusting for the selection bias from a proof-of-concept study to a more confirmatory study. Stat Med 2020; 39:4593-4604. [DOI: 10.1002/sim.8740] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/14/2019] [Revised: 05/02/2020] [Accepted: 08/08/2020] [Indexed: 01/16/2023]
Affiliation(s)
- Yongming Qu
- Department of Biometrics, Eli Lilly and Company, Indianapolis, Indiana, USA
- Yu Du
- Department of Biometrics, Eli Lilly and Company, Indianapolis, Indiana, USA
- Ying Zhang
- Department of Biometrics, Eli Lilly and Company, Indianapolis, Indiana, USA
- Lei Shen
- Department of Biometrics, Eli Lilly and Company, Indianapolis, Indiana, USA
|
32
|
Affiliation(s)
- Bradley Efron
- Department of Statistics, Stanford University, Stanford, CA
|
33
|
Saha S, Guntuboyina A. On the nonparametric maximum likelihood estimator for Gaussian location mixture densities with application to Gaussian denoising. Ann Stat 2020. [DOI: 10.1214/19-aos1817] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/19/2022]
|
34
|
Vsevolozhskaya OA, Zaykin DV. Quantifying posterior effect size distribution of susceptibility loci by common summary statistics. Genet Epidemiol 2020; 44:339-351. [PMID: 32100375 DOI: 10.1002/gepi.22286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2019] [Revised: 12/25/2019] [Accepted: 01/27/2020] [Indexed: 11/06/2022]
Abstract
Testing millions of single nucleotide polymorphisms (SNPs) in genetic association studies has become a standard routine for disease gene discovery. In light of recent re-evaluation of statistical practice, it has been suggested that p-values are unfit as summaries of statistical evidence. Despite this criticism, p-values contain information that can be utilized to address the concerns about their flaws. We present a new method for utilizing evidence summarized by p-values for estimating odds ratio (OR) based on its approximate posterior distribution. In our method, only p-values, sample size, and standard deviation for ln(OR) are needed as summaries of data, accompanied by a suitable prior distribution for ln(OR) that can assume any shape. The parameter of interest, ln(OR), is the only parameter with a specified prior distribution, hence our model is a mix of classical and Bayesian approaches. We show that our method retains the main advantages of the Bayesian approach: it yields direct probability statements about hypotheses for OR and is resistant to biases caused by selection of top-scoring SNPs. Our method enjoys greater flexibility than similarly inspired methods in the assumed distribution for the summary statistic and in the form of the prior for the parameter of interest. We illustrate our method by presenting interval estimates of effect size for reported genetic associations with lung cancer. Although we focus on OR, the method is not limited to this particular measure of effect size and can be used broadly for assessing reliability of findings in studies testing multiple predictors.
Affiliation(s)
- Dmitri V Zaykin
- Biostatistics and Computational Biology, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina
|
35
|
Everaert C, Volders PJ, Morlion A, Thas O, Mestdagh P. SPECS: a non-parametric method to identify tissue-specific molecular features for unbalanced sample groups. BMC Bioinformatics 2020; 21:58. [PMID: 32066370 PMCID: PMC7026976 DOI: 10.1186/s12859-020-3407-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/18/2019] [Accepted: 02/11/2020] [Indexed: 01/01/2023]
Abstract
BACKGROUND To understand biology and differences among various tissues or cell types, one typically searches for molecular features that display characteristic abundance patterns. Several specificity metrics have been introduced to identify tissue-specific molecular features, but these either require an equal number of replicates per tissue or they can't handle replicates at all. RESULTS We describe a non-parametric specificity score that is compatible with unequal sample group sizes. To demonstrate its usefulness, the specificity score was calculated on all GTEx samples, detecting known and novel tissue-specific genes. A webtool was developed to browse these results for genes or tissues of interest. An example python implementation of SPECS is available at https://github.com/celineeveraert/SPECS. The precalculated SPECS results on the GTEx data are available through a user-friendly browser at specs.cmgg.be. CONCLUSIONS SPECS is a non-parametric method that identifies known and novel specific-expressed genes. In addition, SPECS could be adopted for other features and applications.
Affiliation(s)
- Celine Everaert
- Center for Medical Genetics, Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent, Ghent, Belgium
- Pieter-Jan Volders
- Center for Medical Genetics, Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent, Ghent, Belgium
- Flemish Institute for Biotechnology, Ghent, Belgium
- Annelien Morlion
- Center for Medical Genetics, Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent, Ghent, Belgium
- Olivier Thas
- I-Biostat, Data Science Institute, Hasselt University, Hasselt, Belgium
- National Institute for Applied Statistics Australia (NIASRA), University of Wollongong, Wollongong, Australia
- Department of Data Analysis and Mathematical Modelling, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
- Pieter Mestdagh
- Center for Medical Genetics, Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent, Ghent, Belgium
|
36
|
Abstract
Summary
A two-stage normal hierarchical model called the Fay–Herriot model and the empirical Bayes estimator are widely used to obtain indirect and model-based estimates of means in small areas. However, the performance of the empirical Bayes estimator can be poor when the assumed normal distribution is misspecified. This article presents a simple modification that makes use of density power divergence and proposes a new robust empirical Bayes small area estimator. The mean squared error and estimated mean squared error of the proposed estimator are derived based on the asymptotic properties of the robust estimator of the model parameters. We investigate the numerical performance of the proposed method through simulations and an application to survey data.
Affiliation(s)
- S Sugasawa
- Center for Spatial Information Science, The University of Tokyo, Kashiwa, Chiba 272-8568, Japan
|
37
|
Jiang W. On general maximum likelihood empirical Bayes estimation of heteroscedastic IID normal means. Electron J Stat 2020. [DOI: 10.1214/20-ejs1717] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]
|
38
|
Davenport S, Nichols TE. Selective peak inference: Unbiased estimation of raw and standardized effect size at local maxima. Neuroimage 2019; 209:116375. [PMID: 31866164 DOI: 10.1016/j.neuroimage.2019.116375] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/30/2019] [Revised: 11/11/2019] [Accepted: 11/17/2019] [Indexed: 11/29/2022]
Abstract
The spatial signals in neuroimaging mass univariate analyses can be characterized in a number of ways, but one widely used approach is peak inference: the identification of peaks in the image. Peak locations and magnitudes provide a useful summary of activation and are routinely reported; however, the magnitudes reflect selection bias, as these points have both survived a threshold and are local maxima. In this paper we propose the use of resampling methods to estimate and correct this bias in order to estimate both the raw units change as well as standardized effect size measured with Cohen's d and partial R². We evaluate our method with a massive open dataset, and discuss how the corrected estimates can be used to perform power analyses. Keywords: fMRI, selective inference, winner's curse, regression to the mean, bias, bootstrap, local maxima, UK biobank, power analyses, massive linear modeling.
Affiliation(s)
- Samuel Davenport
- Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK
- Thomas E Nichols
- Oxford Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Population Health, University of Oxford, Oxford, OX3 7LF, UK; Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK; Department of Statistics, University of Warwick, Coventry, CV4 7AL, UK
|
39
|
Banerjee T, Mukherjee G, Sun W. Adaptive Sparse Estimation With Side Information. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2019.1679639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Affiliation(s)
- Trambak Banerjee
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
- Gourab Mukherjee
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
- Wenguang Sun
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
|
40
|
|
41
|
Zhang YY, Wang ZY, Duan ZM, Mi W. The empirical Bayes estimators of the parameter of the Poisson distribution with a conjugate gamma prior under Stein's loss function. J STAT COMPUT SIM 2019. [DOI: 10.1080/00949655.2019.1652606] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/26/2022]
Affiliation(s)
- Ying-Ying Zhang
- Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, People's Republic of China
- Ze-Yu Wang
- Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, People's Republic of China
- Zheng-Min Duan
- Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, People's Republic of China
- Wen Mi
- School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu, People's Republic of China
|
42
|
|
43
|
Efron B. Rejoinder: Bayes, Oracle Bayes, and Empirical Bayes. Stat Sci 2019. [DOI: 10.1214/19-sts674rej] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
44
|
|
45
|
Reehorst ET, Schniter P. Regularization by Denoising: Clarifications and New Interpretations. IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING 2019; 5:52-67. [PMID: 31633003 PMCID: PMC6801116 DOI: 10.1109/tci.2018.2880326] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 05/20/2023]
Abstract
Regularization by Denoising (RED), as recently proposed by Romano, Elad, and Milanfar, is a powerful image-recovery framework that aims to minimize an explicit regularization objective constructed from a plug-in image-denoising function. Experimental evidence suggests that the RED algorithms are state-of-the-art. We claim, however, that explicit regularization does not explain the RED algorithms. In particular, we show that many of the expressions in the paper by Romano et al. hold only when the denoiser has a symmetric Jacobian, and we demonstrate that such symmetry does not occur with practical denoisers such as non-local means, BM3D, TNRD, and DnCNN. To explain the RED algorithms, we propose a new framework called Score-Matching by Denoising (SMD), which aims to match a "score" (i.e., the gradient of a log-prior). We then show tight connections between SMD, kernel density estimation, and constrained minimum mean-squared error denoising. Furthermore, we interpret the RED algorithms from Romano et al. and propose new algorithms with acceleration and convergence guarantees. Finally, we show that the RED algorithms seek a consensus equilibrium solution, which facilitates a comparison to plug-and-play ADMM.
|
46
|
Morrison J, Simon N. Rank Conditional Coverage and Confidence Intervals in High-Dimensional Problems. J Comput Graph Stat 2019; 27:648-656. [PMID: 30740009 DOI: 10.1080/10618600.2017.1411270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Confidence interval procedures used in low dimensional settings are often inappropriate for high dimensional applications. When many parameters are estimated, marginal confidence intervals associated with the most significant estimates have very low coverage rates: They are too small and centered at biased estimates. The problem of forming confidence intervals in high dimensional settings has previously been studied through the lens of selection adjustment. In that framework, the goal is to control the proportion of non-covering intervals formed for selected parameters. In this paper we approach the problem by considering the relationship between rank and coverage probability. Marginal confidence intervals have very low coverage rates for the most significant parameters and high rates for parameters with more boring estimates. Many selection adjusted intervals have the same behavior despite controlling the coverage rate within a selected set. This relationship between rank and coverage rate means that the parameters most likely to be pursued further in follow-up or replication studies are the least likely to be covered by the constructed intervals. In this paper, we propose rank conditional coverage (RCC) as a new coverage criterion for confidence intervals in multiple testing/covering problems. The RCC is the expected coverage rate of an interval given the significance ranking for the associated estimator. We also propose two methods that use bootstrapping to construct confidence intervals that control the RCC. Because these methods make use of additional information captured by the ranks of the parameter estimates, they often produce smaller intervals than marginal or selection adjusted methods. These methods are implemented in R (R Core Team, 2017) in the package rcc available on CRAN at https://cran.r-project.org/web/packages/rcc/index.html.
Affiliation(s)
- Jean Morrison
- Department of Human Genetics, University of Chicago, Chicago, IL
- Noah Simon
- Department of Biostatistics, University of Washington, Seattle, WA
|
47
|
Zhang YY, Rong TZ, Li MM. The empirical Bayes estimators of the mean and variance parameters of the normal distribution with a conjugate normal-inverse-gamma prior by the moment method and the MLE method. COMMUN STAT-THEOR M 2019. [DOI: 10.1080/03610926.2018.1465081] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 10/27/2022]
Affiliation(s)
- Ying-Ying Zhang
- Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
- Teng-Zhong Rong
- Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
- Man-Man Li
- Department of Statistics and Actuarial Science, College of Mathematics and Statistics, Chongqing University, Chongqing, China
|
48
|
Ong F, Milanfar P, Getreuer P. Local Kernels that Approximate Bayesian Regularization and Proximal Operators. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:3007-3019. [PMID: 30640613 DOI: 10.1109/tip.2019.2893071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
In this work, we broadly connect kernel-based filtering (e.g. approaches such as the bilateral filter and nonlocal means, but also many more) with general variational formulations of Bayesian regularized least squares, and the related concept of proximal operators. Variational/Bayesian/proximal formulations often result in optimization problems that do not have closed-form solutions, and therefore typically require global iterative solutions. Our main contribution here is to establish how one can approximate the solution of the resulting global optimization problems using locally adaptive filters with specific kernels. Our results are valid for small regularization strength (i.e. weak noise) but the approach is powerful enough to be useful for a wide range of applications because we expose how to derive a "kernelized" solution to these problems that approximates the global solution in one shot, using only local operations. As another side benefit in the reverse direction, given a local data-adaptive filter constructed with a particular choice of kernel, we enable the interpretation of such filters in the variational/Bayesian/proximal framework.
|
49
|
Rosenkranz GK. Empirical Bayes estimators in hierarchical models with mixture priors. J Appl Stat 2018. [DOI: 10.1080/02664763.2018.1450364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Gerd K. Rosenkranz
- Institute for Medical Statistics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
|
50
|
Affiliation(s)
- Peter McCullagh
- Department of Statistics, University of Chicago, S. Ellis Avenue, Chicago, Illinois, USA
- Nicholas G Polson
- Booth School of Business, S. Woodlawn Avenue, Chicago, Illinois, USA
|