1
|
Kim R, Pourahmadi M, Garcia TP. Positive-definite thresholding estimators of covariance matrices with zeros. J MULTIVARIATE ANAL 2023. [DOI: 10.1016/j.jmva.2023.105186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
|
2
|
Liu Y, Zhou J, Chen Z, Zhang X. A general averaging method for count data with overdispersion and/or excess zeros in biomedicine. Stat Methods Med Res 2023:9622802231159213. [PMID: 36919477 DOI: 10.1177/09622802231159213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023]
Abstract
With the aim of providing better estimation for count data with overdispersion and/or excess zeros, we develop a novel estimation method, optimal weighting based on cross-validation, for the zero-inflated negative binomial model, where the Poisson, negative binomial, and zero-inflated Poisson models are all included as its special cases. To facilitate the selection of the optimal weight vector, a K-fold cross-validation technique is adopted. Unlike the jackknife model averaging discussed in Hansen and Racine (2012), the proposed method deletes one group of observations rather than only one observation to enhance the computational efficiency. Furthermore, we also theoretically prove the asymptotic optimality of the newly developed optimal weighting based on cross-validation method. Simulation studies and three empirical applications indicate the superiority of the presented optimal weighting based on cross-validation method when compared with the three commonly used information-based model selection methods and their model averaging counterparts.
Affiliation(s)
- Yin Liu: School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, P.R. China
- Jianghong Zhou: Department of Credit Management, Guangdong University of Finance, Guangzhou, P.R. China
- Zhanshou Chen: School of Mathematics and Statistics, Qinghai Normal University, Xining, P.R. China
- Xinyu Zhang: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, P.R. China
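The K-fold cross-validation weighting idea summarized in the abstract of entry 2 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the candidate models are plain log-link Poisson regressions rather than zero-inflated negative binomial models, the cross-validation criterion is the Poisson negative log-likelihood, only two candidates are averaged so that the weight can be found by a grid search on [0, 1], and the names `fit_poisson` and `cv_weights` are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, n_iter=25):
    """Fit a log-link Poisson regression by Newton's method (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(np.clip(X @ beta, -20, 20))      # clip to avoid overflow
        grad = X.T @ (y - mu)                        # score vector
        hess = X.T @ (X * mu[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def cv_weights(X_list, y, k=5, grid=101, seed=0):
    """Choose averaging weights for TWO candidate models by K-fold CV.

    Each fold (a group of observations) is left out in turn, every candidate
    is refitted on the remaining data, and out-of-fold means are stored.
    The weight minimizing the out-of-fold Poisson loss is returned.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    pred = np.zeros((n, len(X_list)))                # out-of-fold predicted means
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        for m, X in enumerate(X_list):
            beta = fit_poisson(X[train], y[train])
            pred[test_idx, m] = np.exp(np.clip(X[test_idx] @ beta, -20, 20))
    best_w, best_loss = 0.0, np.inf
    for w in np.linspace(0.0, 1.0, grid):            # weights live on the simplex
        mu = w * pred[:, 0] + (1.0 - w) * pred[:, 1]
        loss = np.sum(mu - y * np.log(mu))           # Poisson NLL up to a constant
        if loss < best_loss:
            best_w, best_loss = w, loss
    return np.array([best_w, 1.0 - best_w]), best_loss

# Small synthetic example: a one-covariate model and a two-covariate model.
rng = np.random.default_rng(42)
x = rng.normal(size=(200, 2))
y = rng.poisson(np.exp(0.5 + 0.3 * x[:, 0] + 0.2 * x[:, 1]))
ones = np.ones((200, 1))
weights, cv_loss = cv_weights([np.hstack([ones, x[:, :1]]), np.hstack([ones, x])], y)
print(weights, cv_loss)
```

Leaving out a group of observations per refit, rather than a single observation as in jackknife model averaging, means each candidate is fitted only K times, which is the computational saving the abstract refers to.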
|
3
|
Xiong W, Pan H, Wang J, Tian M. An efficient model-free approach to interaction screening for high dimensional data. Stat Med 2023; 42:1583-1605. [PMID: 36857779 DOI: 10.1002/sim.9688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 12/02/2022] [Accepted: 02/06/2023] [Indexed: 03/03/2023]
Abstract
An innovative model-free interaction screening procedure called the MCVIS is proposed for high dimensional data analysis. Specifically, we adopt the introduced MCV index for quantifying the importance of an interaction effect among predictors. Our proposed method is fully nonparametric and is capable of successfully selecting interactions even if the signal of parental main effects is weak. The MCVIS procedure has many distinctive features: (i) it can work with discrete, categorical and continuous covariates; (ii) it can deal with both categorical and continuous responses, and can even handle missing responses; (iii) it is robust to heavy-tailed distributions, and thus accommodates well the heterogeneity typically caused by high dimensionality; (iv) it enjoys the sure screening and ranking consistency properties, and therefore achieves dimension reduction without information loss. Computational feasibility is also a top concern in high dimensional data analysis; by transforming our MCV into several variants, the MCVIS procedure is made simple and fast to implement. Extensive numerical experiments and comparisons confirm the effectiveness and wide applicability of our MCVIS procedure. We further illustrate the proposed methodology by an empirical study of two real datasets. Supplementary materials are available online.
Affiliation(s)
- Wei Xiong: School of Statistics, University of International Business and Economics, Beijing, China
- Han Pan: School of Mathematical Sciences, Peking University, Beijing, China
- Jianrong Wang: School of Statistics, University of International Business and Economics, Beijing, China
- Maozai Tian: Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing, China
|
4
|
Chen J, Li Q, Chen HY. Testing generalized linear models with high-dimensional nuisance parameter. Biometrika 2023; 110:83-99. [PMID: 36816791 PMCID: PMC9933885 DOI: 10.1093/biomet/asac021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Generalized linear models often have high-dimensional nuisance parameters, as seen in applications such as testing gene-environment interactions or gene-gene interactions. In these scenarios, it is essential to test the significance of a high-dimensional sub-vector of the model's coefficients. Although some existing methods can tackle this problem, they often rely on the bootstrap to approximate the asymptotic distribution of the test statistic, and thus are computationally expensive. Here, we propose a computationally efficient test with a closed-form limiting distribution, which allows the parameter being tested to be either sparse or dense. We show that under certain regularity conditions, the type I error of the proposed method is asymptotically correct, and we establish its power under high-dimensional alternatives. Extensive simulations demonstrate the good performance of the proposed test and its robustness when certain sparsity assumptions are violated. We also apply the proposed method to Chinese famine sample data in order to show its performance when testing the significance of gene-environment interactions.
Affiliation(s)
- Jinsong Chen: College of Applied Health Sciences, University of Illinois at Chicago, 1919 W Taylor St, Chicago, Illinois 60612, U.S.A
- Quefeng Li: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
- Hua Yun Chen: School of Public Health, University of Illinois at Chicago, 2121 W Taylor St, Chicago, Illinois 60612, U.S.A
|
5
|
Song S, Lin Y, Zhou Y. A General M-estimation Theory in Semi-Supervised Framework. J Am Stat Assoc 2023. [DOI: 10.1080/01621459.2023.2169699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
Affiliation(s)
- Shanshan Song: Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China
- Yuanyuan Lin: Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China
- Yong Zhou: KLATASDS-MOE, School of Statistics and Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, China
|
6
|
Li T, Yu J, Meng C. Scalable model-free feature screening via sliced-Wasserstein dependency. J Comput Graph Stat 2023. [DOI: 10.1080/10618600.2023.2183213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2023]
Affiliation(s)
- Tao Li: Center for Applied Statistics, Institute of Statistics and Big Data, Renmin University of China
- Jun Yu: School of Mathematics and Statistics, Beijing Institute of Technology
- Cheng Meng: Center for Applied Statistics, Institute of Statistics and Big Data, Renmin University of China
|
7
|
Cheng W, Li X, Li X, Yan X. Model averaging for generalized linear models with missing at random covariates. STATISTICS-ABINGDON 2022. [DOI: 10.1080/02331888.2022.2161094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2022]
Affiliation(s)
- Weili Cheng: School of Mathematics and Statistics, North China University of Water Resources and Electric Power, Zhengzhou, People's Republic of China
- Xiaorui Li: School of Mathematics and Statistics, North China University of Water Resources and Electric Power, Zhengzhou, People's Republic of China
- Xiaoxia Li: School of Mathematics and Information Technology, Yuncheng University, Yuncheng, People's Republic of China
- Xiaodong Yan: Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, People's Republic of China; Shandong National Center for Applied Mathematics, Shandong Province Key Laboratory of Financial Risk, Shandong, People's Republic of China
|
8
|
Zhou J, Wan ATK, Yu D. Frequentist model averaging for zero-inflated Poisson regression models. Stat Anal Data Min 2022. [DOI: 10.1002/sam.11598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Jianhong Zhou: Department of Credit Management, Guangdong University of Finance, Guangzhou, China
- Alan T. K. Wan: Department of Management Sciences, City University of Hong Kong, Kowloon, Hong Kong, China
- Dalei Yu: Department of Statistics, Yunnan University of Finance and Economics, Kunming, China
|
9
|
Feature screening and FDR control with knockoff features for ultrahigh-dimensional right-censored data. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2022.107504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
10
|
Ma W, Xiao J, Yang Y, Ye F. Model-free feature screening for ultrahigh dimensional data via a Pearson chi-square based index. J STAT COMPUT SIM 2022. [DOI: 10.1080/00949655.2022.2062358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Weidong Ma: Department of Mathematical Sciences, Tsinghua University, Beijing, People's Republic of China
- Jingsong Xiao: Department of Mathematical Sciences, Tsinghua University, Beijing, People's Republic of China
- Ying Yang: Department of Mathematical Sciences, Tsinghua University, Beijing, People's Republic of China
- Fei Ye: School of Statistics, Capital University of Economics and Business, Beijing, People's Republic of China
|
11
|
Hung H, Huang SY, Ing CK. A generalized information criterion for high-dimensional PCA rank selection. Stat Pap (Berl) 2022. [DOI: 10.1007/s00362-021-01276-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
|
12
|
Affiliation(s)
- Lingyue Zhang: School of Mathematical Sciences, Dalian University of Technology, Dalian, China
- Dawei Lu: School of Mathematical Sciences, Dalian University of Technology, Dalian, China
- Xiaoguang Wang: School of Mathematical Sciences, Dalian University of Technology, Dalian, China
|
13
|
Okumura H. Bias reduction and model selection in misspecified models. COMMUN STAT-THEOR M 2021. [DOI: 10.1080/03610926.2021.1959613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Hidenori Okumura: Faculty of Nursing, Japanese Red Cross Hiroshima College of Nursing, Hatsukaichi, Japan
|
14
|
Hou E, Lawrence E, Hero AO. Penalized ensemble Kalman filters for high dimensional non-linear systems. PLoS One 2021; 16:e0248046. [PMID: 33735201 PMCID: PMC7971544 DOI: 10.1371/journal.pone.0248046] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/17/2020] [Accepted: 02/18/2021] [Indexed: 11/24/2022] Open
Abstract
The ensemble Kalman filter (EnKF) is a data assimilation technique that uses an ensemble of models, updated with data, to track the time evolution of a usually non-linear system. It does so by using an empirical approximation to the well-known Kalman filter. However, its performance can suffer when the ensemble size is smaller than the state space, as is often necessary for computationally burdensome models. This scenario means that the empirical estimate of the state covariance is not full rank and possibly quite noisy. To solve this problem in this high dimensional regime, we propose a computationally fast and easy to implement algorithm called the penalized ensemble Kalman filter (PEnKF). Under certain conditions, it can be theoretically proven that the PEnKF will be accurate (the estimation error will converge to zero) despite having fewer ensemble members than state dimensions. Further, as contrasted to localization methods, the proposed approach learns the covariance structure associated with the dynamical system. These theoretical results are supported with simulations of several non-linear and high dimensional systems.
Affiliation(s)
- Elizabeth Hou: EECS Department, University of Michigan, Ann Arbor, Michigan, United States of America
- Earl Lawrence: Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Alfred O. Hero: EECS Department, University of Michigan, Ann Arbor, Michigan, United States of America
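The rank problem mentioned in the abstract of entry 14 can be seen in a minimal sketch of a standard ensemble Kalman filter analysis step. This is the textbook stochastic (perturbed-observation) update, not the penalized variant (PEnKF) proposed in the paper; the function names `sample_covariance` and `enkf_update`, the linear observation operator `H`, and the toy dimensions are all assumptions made for illustration.

```python
import numpy as np

def sample_covariance(ensemble):
    """Empirical state covariance from an (n_members, n_state) ensemble."""
    anomalies = ensemble - ensemble.mean(axis=0)
    return anomalies.T @ anomalies / (ensemble.shape[0] - 1)

def enkf_update(ensemble, y, H, R, rng):
    """One stochastic EnKF analysis step with perturbed observations.

    ensemble: (n_members, n_state) forecast ensemble
    y:        (n_obs,) observation vector
    H:        (n_obs, n_state) linear observation operator (assumed linear here)
    R:        (n_obs, n_obs) observation error covariance
    """
    n_members = ensemble.shape[0]
    P = sample_covariance(ensemble)
    S = H @ P @ H.T + R                              # innovation covariance
    K = np.linalg.solve(S, H @ P).T                  # gain K = P H^T S^{-1}
    # Each member assimilates its own noisy copy of the observation.
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), R, size=n_members)
    return ensemble + (y_pert - ensemble @ H.T) @ K.T

# Toy setting with fewer ensemble members (5) than state dimensions (20).
rng = np.random.default_rng(0)
forecast = rng.normal(size=(5, 20))
P = sample_covariance(forecast)
print(np.linalg.matrix_rank(P))                      # at most n_members - 1

H = np.zeros((1, 20))
H[0, 0] = 1.0                                        # observe the first state only
analysis = enkf_update(forecast, np.array([3.0]), H, np.array([[0.1]]), rng)
print(analysis.shape)
```

With 5 members the empirical covariance of a 20-dimensional state has rank at most 4, which is the rank-deficient, noisy estimate that the abstract says motivates a penalized estimator.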
|
15
|
Demirkaya E, Feng Y, Basu P, Lv J. Large-scale model selection in misspecified generalized linear models. Biometrika 2021. [DOI: 10.1093/biomet/asab005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
Model selection is crucial both to high-dimensional learning and to inference for contemporary big data applications in pinpointing the best set of covariates among a sequence of candidate interpretable models. Most existing work implicitly assumes that the models are correctly specified or have fixed dimensionality, yet both model misspecification and high dimensionality are prevalent in practice. In this paper, we exploit the framework of model selection principles under the misspecified generalized linear models presented in Lv & Liu (2014), and investigate the asymptotic expansion of the posterior model probability in the setting of high-dimensional misspecified models. With a natural choice of prior probabilities that encourages interpretability and incorporates the Kullback–Leibler divergence, we suggest using the high-dimensional generalized Bayesian information criterion with prior probability for large-scale model selection with misspecification. Our new information criterion characterizes the impacts of both model misspecification and high dimensionality on model selection. We further establish the consistency of covariance contrast matrix estimation and the model selection consistency of the new information criterion in ultrahigh dimensions under some mild regularity conditions. Our numerical studies demonstrate that the proposed method enjoys improved model selection consistency over its main competitors.
|
16
|
Staerk C, Kateri M, Ntzoufras I. High-dimensional variable selection via low-dimensional adaptive learning. Electron J Stat 2021. [DOI: 10.1214/21-ejs1797] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]
|
17
|
Queiroz FF, Lemonte AJ. On hypothesis testing inference in location-scale models under model misspecification. J STAT COMPUT SIM 2020. [DOI: 10.1080/00949655.2020.1763996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Artur J. Lemonte: Departamento de Estatística, Universidade Federal do Rio Grande do Norte, Centro de Ciências Exatas e da Terra, Natal, Brazil
|
18
|
Liu W, Ke Y, Liu J, Li R. Model-Free Feature Screening and FDR Control With Knockoff Features. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1783274] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/24/2022]
Affiliation(s)
- Wanjun Liu: Department of Statistics, The Pennsylvania State University, University Park, PA
- Yuan Ke: Department of Statistics, University of Georgia, Athens, GA
- Jingyuan Liu: MOE Key Laboratory of Econometrics, Department of Statistics, School of Economics, Wang Yanan Institute for Studies in Economics, and Fujian Key Lab of Statistics, Xiamen University, Xiamen, China
- Runze Li: Department of Statistics, The Pennsylvania State University, University Park, PA
|
19
|
Abstract
Functional data are a common and important data type in econometrics and have become increasingly easy to collect in the big data era. To improve estimation accuracy and reduce forecast risks with functional data, in this paper, we propose a novel cross-validation model averaging method for the generalized functional linear model, where the scalar response variable is related to a random function predictor by a link function. We establish an asymptotic theoretical result on the optimality of the weights selected by our method when the true model is not in the candidate model set. Our simulations show that the proposed method often performs better than the commonly used model selection and averaging methods. We also apply the proposed method to Beijing second-hand house price data.
|
20
|
Bachoc F, Preinerstorfer D, Steinberger L. Uniformly valid confidence intervals post-model-selection. Ann Stat 2020. [DOI: 10.1214/19-aos1815] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/19/2022]
|
21
|
Scientific discovery in a model-centric framework: Reproducibility, innovation, and epistemic diversity. PLoS One 2019; 14:e0216125. [PMID: 31091251 PMCID: PMC6519896 DOI: 10.1371/journal.pone.0216125] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 07/02/2018] [Accepted: 04/16/2019] [Indexed: 01/16/2023] Open
Abstract
Consistent confirmations obtained independently of each other lend credibility to a scientific result. We refer to results satisfying this consistency as reproducible and assume that reproducibility is a desirable property of scientific discovery. Yet seemingly science also progresses despite irreproducible results, indicating that the relationship between reproducibility and other desirable properties of scientific discovery is not well understood. These properties include early discovery of truth, persistence on truth once it is discovered, and time spent on truth in a long-term scientific inquiry. We build a mathematical model of scientific discovery that presents a viable framework to study its desirable properties including reproducibility. In this framework, we assume that scientists adopt a model-centric approach to discover the true model generating data in a stochastic process of scientific discovery. We analyze the properties of this process using Markov chain theory, Monte Carlo methods, and agent-based modeling. We show that the scientific process may not converge to truth even if scientific results are reproducible and that irreproducible results do not necessarily imply untrue results. The proportion of different research strategies represented in the scientific population, scientists’ choice of methodology, the complexity of truth, and the strength of signal contribute to this counter-intuitive finding. Important insights include that innovative research speeds up the discovery of scientific truth by facilitating the exploration of model space and epistemic diversity optimizes across desirable properties of scientific discovery.
|
22
|
Li Y, Luo Y, Ferrari D, Hu X, Qin Y. Rejoinder to Discussions on: Model confidence bounds for variable selection. Biometrics 2019; 75:411-413. [PMID: 30955222 DOI: 10.1111/biom.13020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Yang Li: School of Statistics, Renmin University of China; Center for Applied Statistics, Renmin University of China
- Yuetian Luo: Department of Statistics, University of Wisconsin, Madison
- Davide Ferrari: Faculty of Economics and Management, Free University of Bozen-Bolzano
- Xiaonan Hu: Department of Biostatistics, Yale University; School of Mathematical Sciences, University of Chinese Academy of Sciences
- Yichen Qin: Department of Operations, Business Analytics, and Information Systems, University of Cincinnati
|
23
|
Hsu HL, Ing CK, Tong H. On model selection from a finite family of possibly misspecified time series models. Ann Stat 2019. [DOI: 10.1214/18-aos1706] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/19/2022]
|
24
|
Abstract
Bifactor and other hierarchical models have become central to representing and explaining observations in psychopathology, health, and other areas of clinical science, as well as in the behavioral sciences more broadly. This prominence comes after a relatively rapid period of rediscovery, however, and certain features remain poorly understood. Here, hierarchical models are compared and contrasted with other models of superordinate structure, with a focus on implications for model comparisons and interpretation. Issues pertaining to the specification and estimation of bifactor and other hierarchical models are reviewed in exploratory as well as confirmatory modeling scenarios, as are emerging findings about model fit and selection. Bifactor and other hierarchical models provide a powerful mechanism for parsing shared and unique components of variance, but care is required in specifying and making inferences about them.
Affiliation(s)
- Kristian E. Markon: Department of Psychological and Brain Sciences, University of Iowa, Iowa City, Iowa 52242, USA
|
25
|
Song A, Ma T, Lv S, Lin C. A model-free variable selection method for reducing the number of redundant variables. STATISTICS-ABINGDON 2018. [DOI: 10.1080/02331888.2018.1515949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
Affiliation(s)
- Anchao Song: School of Statistics, Southwestern University of Finance and Economics, ChengDu, China
- Tiefeng Ma: School of Statistics, Southwestern University of Finance and Economics, ChengDu, China
- Shaogao Lv: School of Statistics and Mathematics, Nanjing Audit University, Nanjing, China
- Changsheng Lin: School of Mathematics and Statistics, Yangtze Normal University, Chongqing, China
|
26
|
|
27
|
Yu D, Zhang X, Yau KKW. Asymptotic properties and information criteria for misspecified generalized linear mixed models. J R Stat Soc Series B Stat Methodol 2018. [DOI: 10.1111/rssb.12270] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Affiliation(s)
- Dalei Yu: Yunnan University of Finance and Economics, Kunming, People's Republic of China
- Xinyu Zhang: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, People's Republic of China
|
28
|
Liu Y, Wang P. Selection by partitioning the solution paths. Electron J Stat 2018. [DOI: 10.1214/18-ejs1434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
29
|
Miyata Y. Laplace approximations and Bayesian information criteria in possibly misspecified models. COMMUN STAT-THEOR M 2017. [DOI: 10.1080/03610926.2017.1295079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Yoichi Miyata: Faculty of Economics, Takasaki City University of Economics, Kaminamie, Takasaki, Gunma, Japan
|
30
|
Ando T, Li KC. A weight-relaxed model averaging approach for high-dimensional generalized linear models. Ann Stat 2017. [DOI: 10.1214/17-aos1538] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Indexed: 11/19/2022]
|
31
|
Kubkowski M, Mielniczuk J. Active sets of predictors for misspecified logistic regression. STATISTICS-ABINGDON 2017. [DOI: 10.1080/02331888.2017.1290096] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/20/2022]
Affiliation(s)
- M. Kubkowski: Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- J. Mielniczuk: Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
|
32
|
Zhang X, Yu D, Zou G, Liang H. Optimal Model Averaging Estimation for Generalized Linear Models and Generalized Linear Mixed-Effects Models. J Am Stat Assoc 2017. [DOI: 10.1080/01621459.2015.1115762] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Indexed: 10/22/2022]
Affiliation(s)
- Xinyu Zhang: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, and School of Mathematical Science, Beijing, China
- Dalei Yu: Statistics and Mathematics College, Yunnan University of Finance and Economics, Kunming, China
- Guohua Zou: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, and School of Mathematical Science, Beijing, China
- Hua Liang: Department of Statistics, George Washington University, Washington, DC, USA
|
33
|
Deledalle CA. Estimation of Kullback-Leibler losses for noisy recovery problems within the exponential family. Electron J Stat 2017. [DOI: 10.1214/17-ejs1321] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/19/2022]
|
34
|
Mukhopadhyay M, Samanta T. A mixture of g-priors for variable selection when the number of regressors grows with the sample size. TEST-SPAIN 2016. [DOI: 10.1007/s11749-016-0516-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/20/2022]
|
35
|
|
36
|
Brewer MJ, Butler A, Cooksley SL. The relative performance of AIC, AICC and BIC in the presence of unobserved heterogeneity. Methods Ecol Evol 2016. [DOI: 10.1111/2041-210x.12541] [Citation(s) in RCA: 139] [Impact Index Per Article: 17.4] [Indexed: 11/28/2022]
Affiliation(s)
- Mark J. Brewer: Biomathematics and Statistics Scotland, Craigiebuckler, Aberdeen AB15 8QH, UK
- Adam Butler: Biomathematics and Statistics Scotland, JCMB, The King's Buildings, Edinburgh EH9 3JZ, UK
|
37
|
|
38
|
Lin W, Feng R, Li H. Regularization Methods for High-Dimensional Instrumental Variables Regression With an Application to Genetical Genomics. J Am Stat Assoc 2015; 110:270-288. [PMID: 26392642 DOI: 10.1080/01621459.2014.908125] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 10/25/2022]
Abstract
In genetical genomics studies, it is important to jointly analyze gene expression data and genetic variants in exploring their associations with complex traits, where the dimensionality of gene expressions and genetic variants can both be much larger than the sample size. Motivated by such modern applications, we consider the problem of variable selection and estimation in high-dimensional sparse instrumental variables models. To overcome the difficulty of high dimensionality and unknown optimal instruments, we propose a two-stage regularization framework for identifying and estimating important covariate effects while selecting and estimating optimal instruments. The methodology extends the classical two-stage least squares estimator to high dimensions by exploiting sparsity using sparsity-inducing penalty functions in both stages. The resulting procedure is efficiently implemented by coordinate descent optimization. For the representative L1 regularization and a class of concave regularization methods, we establish estimation, prediction, and model selection properties of the two-stage regularized estimators in the high-dimensional setting where the dimensionality of covariates and instruments are both allowed to grow exponentially with the sample size. The practical performance of the proposed method is evaluated by simulation studies and its usefulness is illustrated by an analysis of mouse obesity data. Supplementary materials for this article are available online.
Affiliation(s)
- Wei Lin: Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Rui Feng: Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Hongzhe Li: Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
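The two-stage regularization idea in the abstract of entry 38 (an L1 penalty in both stages, fitted by coordinate descent) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the data are assumed centered so no intercept is fitted, the penalty levels `lam1` and `lam2` are supplied by the user rather than tuned, only the L1 case is shown, and the names `soft_threshold`, `lasso_cd`, and `two_stage_lasso` are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by L1 coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.asarray(y, dtype=float).copy()        # equals y - X @ beta for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]               # remove coordinate j's contribution
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]               # restore the full residual
    return beta

def two_stage_lasso(y, X, Z, lam1, lam2):
    """Stage 1: L1-penalized regression of each covariate on the instruments Z.
    Stage 2: L1-penalized regression of y on the stage-1 fitted covariates."""
    Gamma = np.column_stack([lasso_cd(Z, X[:, j], lam1) for j in range(X.shape[1])])
    X_hat = Z @ Gamma
    beta = lasso_cd(X_hat, y, lam2)
    return beta, Gamma

# Synthetic example with an unobserved confounder u that makes X endogenous.
rng = np.random.default_rng(0)
n, q, p = 300, 5, 3
Z = rng.normal(size=(n, q))
u = rng.normal(size=n)
X = Z @ rng.normal(size=(q, p)) + u[:, None] + rng.normal(size=(n, p))
y = X @ np.array([1.0, 0.0, 0.0]) + u + rng.normal(size=n)
beta_hat, Gamma_hat = two_stage_lasso(y, X, Z, lam1=0.05, lam2=0.05)
print(beta_hat)
```

Setting both penalties to zero reduces each stage to ordinary least squares, which recovers the classical two-stage least squares estimator that the abstract says the method extends.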
|
39
|
|