1
Xu G, Amei A, Wu W, Liu Y, Shen L, Oh EC, Wang Z. Retrospective varying coefficient association analysis of longitudinal binary traits: application to the identification of genetic loci associated with hypertension. Ann Appl Stat 2024; 18:487-505. [PMID: 38577266 PMCID: PMC10994004 DOI: 10.1214/23-aoas1798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/06/2024]
Abstract
Many genetic studies contain rich information on longitudinal phenotypes that require powerful analytical tools for optimal analysis. Genetic analysis of longitudinal data that incorporates temporal variation is important for understanding the genetic architecture and biological variation of complex diseases. Most of the existing methods assume that the contribution of genetic variants is constant over time and fail to capture the dynamic pattern of disease progression. However, the relative influence of genetic variants on complex traits fluctuates over time. In this study, we propose a retrospective varying coefficient mixed model association test, RVMMAT, to detect time-varying genetic effects on longitudinal binary traits. We model the dynamic genetic effect using smoothing splines, estimate model parameters by maximizing a double penalized quasi-likelihood function, design a joint test using a Cauchy combination method, and evaluate statistical significance via a retrospective approach to achieve robustness to model misspecification. Through simulations, we illustrated that the retrospective varying-coefficient test was robust to model misspecification under different ascertainment schemes and gained power over association methods that assume a constant genetic effect. We applied RVMMAT to a genome-wide association analysis of a longitudinal measure of hypertension in the Multi-Ethnic Study of Atherosclerosis. Pathway analysis identified two important pathways related to G-protein signaling and DNA damage. Our results demonstrated that RVMMAT could detect biologically relevant loci and pathways in a genome scan and provided insight into the genetic architecture of hypertension.
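The abstract mentions a joint test built with a Cauchy combination method. As a rough illustration of that general technique (not the RVMMAT implementation), the sketch below combines p-values by mapping each one to a standard Cauchy variate, averaging, and mapping back. The function name `cauchy_combination` and the default of equal weights are assumptions made here for illustration; how RVMMAT chooses its component tests and weights is not stated in the abstract.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Hypothetical helper for illustration; equal weights are assumed
    when none are given. Inputs are expected to lie strictly in (0, 1).
    """
    if weights is None:
        weights = [1.0] * len(p_values)
    total = sum(weights)
    # Map each p-value to a standard Cauchy variate: tan((0.5 - p) * pi).
    stat = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    ) / total
    # Convert the weighted average back to a p-value using the
    # standard Cauchy survival function.
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # One strong signal dominates the combined p-value.
    print(cauchy_combination([0.001, 0.8]))
```

One property worth noting: when all inputs are equal, the combined p-value equals that common value, so the rule does not manufacture significance from repeated copies of the same evidence.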
Affiliation(s)
- Gang Xu
  - Department of Mathematical Sciences, University of Nevada
- Amei Amei
  - Department of Mathematical Sciences, University of Nevada
- Weimiao Wu
  - Department of Biostatistics, Yale School of Public Health
- Yunqing Liu
  - Department of Biostatistics, Yale School of Public Health
- Linchuan Shen
  - Department of Mathematical Sciences, University of Nevada
- Edwin C. Oh
  - Department of Internal Medicine, University of Nevada School of Medicine
- Zuoheng Wang
  - Department of Biostatistics, Yale School of Public Health
2
Yang Y, Pan Z, Kang J, Brummett C, Li Y. Simultaneous selection and inference for varying coefficients with zero regions: a soft-thresholding approach. Biometrics 2023; 79:3388-3401. [PMID: 37459178 PMCID: PMC10792111 DOI: 10.1111/biom.13900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 06/05/2023] [Indexed: 08/02/2023]
Abstract
Varying coefficient models have been used to explore dynamic effects in many scientific areas, such as in medicine, finance, and epidemiology. As most existing models ignore the existence of zero regions, we propose a new soft-thresholded varying coefficient model, where the coefficient functions are piecewise smooth with zero regions. Our new modeling approach enables us to perform variable selection, detect the zero regions of selected variables, obtain point estimates of the varying coefficients with zero regions, and construct a new type of sparse confidence intervals that accommodate zero regions. We prove the asymptotic properties of the estimator, based on which we draw statistical inference. Our simulation study reveals that the proposed sparse confidence intervals achieve the desired coverage probability. We apply the proposed method to analyze a large-scale preoperative opioid study.
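To make the idea of a coefficient function with zero regions concrete, the sketch below applies the standard soft-thresholding operator pointwise to a smooth function. This is only an illustration of the operator itself; the names `soft_threshold` and `thresholded_coefficient`, the sine curve, and the threshold value are all assumptions chosen here, and the abstract does not say how the paper parameterizes or estimates its model.

```python
import math


def soft_threshold(x, lam):
    """Standard soft-thresholding: shrink x toward zero by lam, clipping at zero."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def thresholded_coefficient(theta, lam, grid):
    """Evaluate a soft-thresholded version of the smooth function theta on a grid.

    Wherever |theta(t)| <= lam the result is exactly zero, which produces
    the zero regions; elsewhere it varies continuously with theta.
    """
    return [soft_threshold(theta(t), lam) for t in grid]


if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]
    # Hypothetical smooth underlying function, for illustration only.
    values = thresholded_coefficient(lambda t: math.sin(2 * math.pi * t), 0.5, grid)
    for t, v in zip(grid, values):
        print(f"t={t:.2f}  beta(t)={v:+.3f}")
```

Because the operator is continuous, the resulting coefficient function stays continuous at the boundaries of its zero regions, which is one reason soft rather than hard thresholding is a natural choice for this kind of model.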
Affiliation(s)
- Ziyang Pan
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
- Chad Brummett
  - Department of Anesthesiology, University of Michigan, Ann Arbor, MI, U.S.A
- Yi Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A
3
Dai X, Lu X, Chekouo T. A Bayesian genomic selection approach incorporating prior feature ordering and population structures with application to coronary artery disease. Stat Methods Med Res 2023; 32:1616-1629. [PMID: 37376889 DOI: 10.1177/09622802231181231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2023]
Abstract
Coronary artery disease is one of the most common types of cardiovascular disease. Death from coronary heart disease is influenced by genetic factors in both women and men. In this article, we propose a novel Bayesian variable selection framework for the identification of important genetic variants associated with coronary artery disease status. Instead of treating each feature independently as in conventional Bayesian variable selection methods, we propose an innovative prior for the inclusion probabilities of genetic variants that accounts for their ordering structure. We assume that neighboring variants are more likely to be selected together as they tend to be highly correlated and have similar biological functions. Additionally, we propose to group participating subjects based on underlying population structure and fit separate regressions, so that the regression coefficients can better reflect different disease risks in different population groups. Our approach borrows strength across regression models through an innovative prior inspired by Markov random fields. The proposed framework can improve variable selection and prediction performance, as demonstrated in the simulation studies. We also apply the proposed framework to the CATHeterization GENetics data with binary coronary artery disease status.
Affiliation(s)
- Xiaotian Dai
  - Department of Mathematics and Statistics, University of Calgary, Calgary, Canada
- Xuewen Lu
  - Department of Mathematics and Statistics, University of Calgary, Calgary, Canada
- Thierry Chekouo
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
4
Liu Y, Li G. Sure Joint Screening for High Dimensional Cox's Proportional Hazards Model Under the Case-Cohort Design. J Comput Biol 2023; 30:663-677. [PMID: 37140454 PMCID: PMC10282795 DOI: 10.1089/cmb.2022.0416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2023]
Abstract
This study develops a sure joint feature screening method for the case-cohort design with ultrahigh-dimensional covariates. Our method is based on a sparsity-restricted Cox proportional hazards model. An iterative reweighted hard thresholding algorithm is proposed to approximate the sparsity-restricted, pseudo-partial likelihood estimator for joint screening. We rigorously show that our method possesses the sure screening property, with the probability of retaining all relevant covariates tending to 1 as the sample size goes to infinity. Our simulation results demonstrate that the proposed procedure has substantially improved screening performance over some existing feature screening methods for the case-cohort design, especially when some covariates are jointly correlated, but marginally uncorrelated, with the event time outcome. A real data illustration is provided using breast cancer data with high-dimensional genomic covariates. We have implemented the proposed method using MATLAB and made it available to readers through GitHub.
Affiliation(s)
- Yi Liu
  - Department of Mathematics, School of Mathematical Sciences, Ocean University of China, Qingdao, China
- Gang Li
  - Department of Biostatistics, University of California at Los Angeles, Los Angeles, California, USA
5
Xiong W, Chen Y, Ma S. Unified model-free interaction screening via CV-entropy filter. Comput Stat Data Anal 2023; 180:107684. [PMID: 36910335 PMCID: PMC9997997 DOI: 10.1016/j.csda.2022.107684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2022]
Abstract
For many practical high-dimensional problems, interactions have been increasingly found to play important roles beyond main effects. A representative example is gene-gene interaction. Joint analysis, which analyzes all interactions and main effects in a single model, can be seriously challenged by high dimensionality. For high-dimensional data analysis in general, marginal screening has been established as effective for reducing computational cost, increasing stability, and improving estimation/selection performance. Most of the existing marginal screening methods are designed for the analysis of main effects only. The existing screening methods for interaction analysis are often limited by making stringent model assumptions, lacking robustness, and/or requiring predictors to be continuous (and hence lacking flexibility). A unified marginal screening approach tailored to interaction analysis is developed, which can be applied to regression, classification, and survival analysis. Predictors are allowed to be continuous and discrete. The proposed approach is built on Coefficient of Variation (CV) filters based on information entropy. Statistical properties are rigorously established. It is shown that the CV filters are almost insensitive to the distribution tails of predictors, correlation structure among predictors, and sparsity level of signals. An efficient two-stage algorithm is developed to make the proposed approach scalable to ultrahigh-dimensional data. Simulations and the analysis of TCGA LUAD data further establish the practical superiority of the proposed approach.
Affiliation(s)
- Wei Xiong
  - School of Statistics, University of International Business and Economics, Beijing 100872, PR China
- Yaxian Chen
  - Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong
- Shuangge Ma
  - Department of Biostatistics, Yale School of Public Health, USA
6
Zhang L, Song X. Ultrahigh dimensional single index model estimation via refitted cross-validation. COMMUN STAT-THEOR M 2023. [DOI: 10.1080/03610926.2023.2179881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2023]
Affiliation(s)
- Lixia Zhang
  - School of Statistics, Beijing Normal University, Beijing, PR China
- Xuguang Song
  - School of Statistics, Beijing Normal University, Beijing, PR China
7
Li T, Yu J, Meng C. Scalable model-free feature screening via sliced-Wasserstein dependency. J Comput Graph Stat 2023. [DOI: 10.1080/10618600.2023.2183213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2023]
Affiliation(s)
- Tao Li
  - Center for Applied Statistics, Institute of Statistics and Big Data, Renmin University of China
- Jun Yu
  - School of Mathematics and Statistics, Beijing Institute of Technology
- Cheng Meng
  - Center for Applied Statistics, Institute of Statistics and Big Data, Renmin University of China
8
Fan J, Lou Z, Yu M. Are Latent Factor Regression and Sparse Regression Adequate? J Am Stat Assoc 2023; 119:1076-1088. [PMID: 39268549 PMCID: PMC11390100 DOI: 10.1080/01621459.2023.2169700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Accepted: 01/13/2023] [Indexed: 01/19/2023]
Abstract
We propose the Factor Augmented (sparse linear) Regression Model (FARM) that not only admits both the latent factor regression and sparse linear regression as special cases but also bridges dimension reduction and sparse regression. We provide theoretical guarantees for the estimation of our model in the presence of sub-Gaussian and heavy-tailed noises (with bounded (1 + ϑ)-th moment, for all ϑ > 0), respectively. In addition, the existing works on supervised learning often assume the latent factor regression or sparse linear regression is the true underlying model without justifying its adequacy. To fill this important gap in high-dimensional inference, we also leverage our model as the alternative model to test the sufficiency of the latent factor regression and the sparse linear regression models. To accomplish these goals, we propose the Factor-Adjusted deBiased Test (FabTest) and a two-stage ANOVA type test, respectively. We also conduct large-scale numerical experiments including both synthetic and FRED macroeconomics data to corroborate the theoretical properties of our methods. Numerical results illustrate the robustness and effectiveness of our model against latent factor regression and sparse linear regression models.
Affiliation(s)
- Jianqing Fan
  - Frederick L. Moore '18 Professor of Finance, Professor of Statistics, and Professor of Operations Research and Financial Engineering at Princeton University
- Zhipeng Lou
  - Department of Operations Research and Financial Engineering, Princeton University
- Mengxin Yu
  - Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, USA
9
Craig SJ, Kenney AM, Lin J, Paul IM, Birch LL, Savage JS, Marini ME, Chiaromonte F, Reimherr ML, Makova KD. Constructing a polygenic risk score for childhood obesity using functional data analysis. Econometrics and Statistics 2023; 25:66-86. [PMID: 36620476 PMCID: PMC9813976 DOI: 10.1016/j.ecosta.2021.10.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/17/2023]
Abstract
Obesity is a highly heritable condition that affects increasing numbers of adults and, concerningly, of children. However, only a small fraction of its heritability has been attributed to specific genetic variants. These variants are traditionally ascertained from genome-wide association studies (GWAS), which utilize samples with tens or hundreds of thousands of individuals for whom a single summary measurement (e.g., BMI) is collected. An alternative approach is to focus on a smaller, more deeply characterized sample in conjunction with advanced statistical models that leverage longitudinal phenotypes. Novel functional data analysis (FDA) techniques are used to capitalize on longitudinal growth information from a cohort of children between birth and three years of age. In an ultra-high dimensional setting, hundreds of thousands of single nucleotide polymorphisms (SNPs) are screened, and selected SNPs are used to construct two polygenic risk scores (PRS) for childhood obesity using a weighting approach that incorporates the dynamic and joint nature of SNP effects. These scores are significantly higher in children with (vs. without) rapid infant weight gain, a predictor of obesity later in life. Using two independent cohorts, it is shown that the genetic variants identified in very young children are also informative in older children and in adults, consistent with early childhood obesity being predictive of obesity later in life. In contrast, PRSs based on SNPs identified by adult obesity GWAS are not predictive of weight gain in the cohort of young children. This provides an example of a successful application of FDA to GWAS. This application is complemented with simulations establishing that a deeply characterized sample can be just as effective as, if not more effective than, a comparable study with a cross-sectional response.
Overall, it is demonstrated that a deep, statistically sophisticated characterization of a longitudinal phenotype can provide increased statistical power to studies with relatively small sample sizes, and it is shown how FDA approaches can be used as an alternative to the traditional GWAS.
Affiliation(s)
- Sarah J.C. Craig
  - Department of Biology, Penn State University, University Park
  - Center for Medical Genomics, Penn State University, University Park, PA
- Ana M. Kenney
  - Department of Statistics, Penn State University, University Park, PA
- Junli Lin
  - Department of Statistics, Penn State University, University Park, PA
- Ian M. Paul
  - Center for Medical Genomics, Penn State University, University Park, PA
  - Department of Pediatrics, Penn State College of Medicine, Hershey, PA
- Leann L. Birch
  - Department of Foods and Nutrition, University of Georgia, Athens, GA
- Jennifer S. Savage
  - Department of Nutritional Sciences, Penn State University, University Park, PA
  - Center for Childhood Obesity Research, Penn State University, University Park, PA
- Michele E. Marini
  - Center for Childhood Obesity Research, Penn State University, University Park, PA
- Francesca Chiaromonte
  - Center for Medical Genomics, Penn State University, University Park, PA
  - Department of Statistics, Penn State University, University Park, PA
  - EMbeDS, Sant’Anna School of Advanced Studies, Piazza Martiri della Libertà, Pisa, Italy
- Matthew L. Reimherr
  - Center for Medical Genomics, Penn State University, University Park, PA
  - Department of Statistics, Penn State University, University Park, PA
- Kateryna D. Makova
  - Department of Biology, Penn State University, University Park
  - Center for Medical Genomics, Penn State University, University Park, PA
10
Feature screening and FDR control with knockoff features for ultrahigh-dimensional right-censored data. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2022.107504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
11
Xiong W, Tian M, Tang M, Pan H. Robust and sparse learning of varying coefficient models with high-dimensional features. J Appl Stat 2022; 50:3312-3336. [PMID: 37969890 PMCID: PMC10637205 DOI: 10.1080/02664763.2022.2109129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2021] [Accepted: 07/28/2022] [Indexed: 10/15/2022]
Abstract
The varying coefficient model (VCM) is extensively used in various scientific fields due to its capability of capturing the changing structure of predictors. Classical mean regression analysis is often complicated by skewed, heterogeneous and heavy-tailed data. To address this, this work employs the idea of model averaging and introduces a novel comprehensive approach that incorporates quantile-adaptive weights across different quantile levels to further improve both least squares (LS) and quantile regression (QR) methods. The proposed procedure, which adaptively takes advantage of the heterogeneous and sparse nature of the input data, can gain efficiency and adapts well to extreme-event cases and high-dimensional settings. Motivated by these properties, we develop several robust methods to reveal the dynamic close-to-truth structure for VCM and consistently uncover the zero and nonzero patterns in high-dimensional scientific discoveries. We provide a new iterative algorithm that is proven to be asymptotically consistent and can attain the optimal nonparametric convergence rate under regularity conditions. The procedures are illustrated with extensive simulation examples and several real data analyses that show their stronger predictive power compared with LS, composite quantile regression (CQR) and QR methods.
Affiliation(s)
- Wei Xiong
  - School of Statistics, University of International Business and Economics, Beijing, People's Republic of China
- Maozai Tian
  - Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing, People's Republic of China
- Manlai Tang
  - Department of Mathematics, College of Engineering, Design and Physical Sciences, Brunel University London, London, UK
- Han Pan
  - School of Statistics, University of International Business and Economics, Beijing, People's Republic of China
12
Tian B, Liu Z, Wang H. Non-marginal feature screening for varying coefficient competing risks model. Stat Probab Lett 2022. [DOI: 10.1016/j.spl.2022.109648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
13
Qu L, Wang X, Sun L. Variable screening for varying coefficient models with ultrahigh-dimensional survival data. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2022.107498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
14
Sen S, Kundu D, Das K. Variable selection for categorical response: a comparative study. Comput Stat 2022. [DOI: 10.1007/s00180-022-01260-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
15
Zhao S, Fu G. Distribution-free and model-free multivariate feature screening via multivariate rank distance correlation. J MULTIVARIATE ANAL 2022. [DOI: 10.1016/j.jmva.2022.105081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
16
Tong Z, Cai Z, Yang S, Li R. Model-Free Conditional Feature Screening with FDR Control. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2063130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Zhaoxue Tong
  - Pennsylvania State University, University Park, PA
- Runze Li
  - Pennsylvania State University, University Park, PA
17
Ma W, Xiao J, Yang Y, Ye F. Model-free feature screening for ultrahigh dimensional data via a Pearson chi-square based index. J STAT COMPUT SIM 2022. [DOI: 10.1080/00949655.2022.2062358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Weidong Ma
  - Department of Mathematical Sciences, Tsinghua University, Beijing, People's Republic of China
- Jingsong Xiao
  - Department of Mathematical Sciences, Tsinghua University, Beijing, People's Republic of China
- Ying Yang
  - Department of Mathematical Sciences, Tsinghua University, Beijing, People's Republic of China
- Fei Ye
  - School of Statistics, Capital University of Economics and Business, Beijing, People's Republic of China
18
Li Y, Qiu Y, Xu Y. From multivariate to functional data analysis: fundamentals, recent developments, and emerging areas. J MULTIVARIATE ANAL 2022; 188:104806. [PMID: 39040141 PMCID: PMC11261241 DOI: 10.1016/j.jmva.2021.104806] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/21/2022]
Abstract
Functional data analysis (FDA), a branch of statistics concerned with modeling infinite-dimensional random vectors residing in functional spaces, has become a major research area for the Journal of Multivariate Analysis. We review some fundamental concepts of FDA, their origins in and connections to multivariate analysis, and some recent developments, including multi-level functional data analysis, high-dimensional functional regression, and dependent functional data analysis. We also discuss the impact of these new methodological developments on genetics, plant science, wearable device data analysis, image data analysis, and business analytics. Two real data examples are provided to motivate our discussions.
Affiliation(s)
- Yehua Li
  - University of California - Riverside, Riverside, CA 92521, USA
- Yumou Qiu
  - Iowa State University, Ames, IA 50011, USA
- Yuhang Xu
  - Bowling Green State University, Bowling Green, OH 43403, USA
19
Chakraborty S, Shojaie A. Nonparametric Causal Structure Learning in High Dimensions. Entropy (Basel, Switzerland) 2022; 24:351. [PMID: 35327862 PMCID: PMC8947566 DOI: 10.3390/e24030351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2022] [Revised: 02/21/2022] [Accepted: 02/25/2022] [Indexed: 12/10/2022]
Abstract
The PC and FCI algorithms are popular constraint-based methods for learning the structure of directed acyclic graphs (DAGs) in the absence and presence of latent and selection variables, respectively. These algorithms (and their order-independent variants, PC-stable and FCI-stable) have been shown to be consistent for learning sparse high-dimensional DAGs based on partial correlations. However, inferring conditional independences from partial correlations is valid if the data are jointly Gaussian or generated from a linear structural equation model, an assumption that may be violated in many applications. To broaden the scope of high-dimensional causal structure learning, we propose nonparametric variants of the PC-stable and FCI-stable algorithms that employ the conditional distance covariance (CdCov) to test for conditional independence relationships. As the key theoretical contribution, we prove that the high-dimensional consistency of the PC-stable and FCI-stable algorithms carries over to general distributions over DAGs when we implement CdCov-based nonparametric tests for conditional independence. Numerical studies demonstrate that our proposed algorithms perform nearly as well as PC-stable and FCI-stable for Gaussian distributions, and offer advantages in non-Gaussian graphical models.
Affiliation(s)
- Ali Shojaie
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
20
Interaction screening via canonical correlation. Comput Stat 2022. [DOI: 10.1007/s00180-022-01206-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
21
Epistasis Detection via the Joint Cumulant. Statistics in Biosciences 2022. [DOI: 10.1007/s12561-022-09336-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
22
Unified mean-variance feature screening for ultrahigh-dimensional regression. Comput Stat 2022. [DOI: 10.1007/s00180-021-01184-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
23
Nandy D, Chiaromonte F, Li R. Covariate Information Number for Feature Screening in Ultrahigh-Dimensional Supervised Problems. J Am Stat Assoc 2022; 117:1516-1529. [PMID: 36172297 PMCID: PMC9512254 DOI: 10.1080/01621459.2020.1864380] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/03/2023]
Abstract
Contemporary high-throughput experimental and surveying techniques give rise to ultrahigh-dimensional supervised problems with sparse signals; that is, a limited number of observations (n), each with a very large number of covariates (p >> n), only a small share of which is truly associated with the response. In these settings, major concerns on computational burden, algorithmic stability, and statistical accuracy call for substantially reducing the feature space by eliminating redundant covariates before the use of any sophisticated statistical analysis. Along the lines of Sure Independence Screening (Fan and Lv, 2008) and other model- and correlation-based feature screening methods, we propose a model-free procedure called Covariate Information Number - Sure Independence Screening (CIS). CIS uses a marginal utility connected to the notion of the traditional Fisher Information, possesses the sure screening property, and is applicable to any type of response (features) with continuous features (response). Simulations and an application to transcriptomic data on rats reveal the comparative strengths of CIS over some popular feature screening methods.
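For orientation, the correlation-based screening idea that this abstract builds on (Sure Independence Screening, Fan and Lv, 2008) can be sketched as ranking features by absolute marginal correlation with the response and keeping the top d. The sketch below shows only that baseline, not the CIS utility proposed in the paper, which the abstract describes as connected to Fisher Information. The function names `pearson` and `sis_screen` and the toy data are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(columns, y, d):
    """Return the (sorted) indices of the d features with the largest
    absolute marginal correlation with the response y.

    `columns` is a list of feature columns, each the same length as y.
    """
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


if __name__ == "__main__":
    y = [1, 2, 3, 4, 5]
    columns = [
        [1, 2, 3, 4, 5],    # strongly positively related to y
        [1, -1, 1, -1, 1],  # marginally uncorrelated with y
        [5, 4, 3, 2, 2],    # strongly negatively related to y
    ]
    print(sis_screen(columns, y, d=2))
```

Each feature is scored on its own, so the cost grows linearly in the number of features; the trade-off is that a feature that matters only jointly with others can be missed, which is the limitation several of the other abstracts in this list set out to address.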
Affiliation(s)
- Debmalya Nandy
  - Department of Biostatistics & Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA (corresponding author)
- Francesca Chiaromonte
  - Department of Statistics, Penn State University, University Park, PA 16802, USA
  - Institute of Economics and EMbeDS, Sant’Anna School of Advanced Studies, Piazza Martiri della Libertà 33, Pisa 56127, Italy
- Runze Li
  - Department of Statistics, Penn State University, University Park, PA 16802, USA
24
Yang B, Wu W, Yin X. On sufficient variable screening using log odds ratio filter. Electron J Stat 2022. [DOI: 10.1214/21-ejs1951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Baoying Yang
  - Department of Statistics, College of Mathematics, Southwest Jiaotong University, Chengdu, China
- Wenbo Wu
  - Department of Management Science and Statistics, The University of Texas at San Antonio, San Antonio, TX
- Xiangrong Yin
  - Department of Statistics, University of Kentucky, 319 Multidisciplinary Science Building, Lexington, KY 40536
25
|
Kazemi M, Shahsavani D, Arashi M, Rodrigues PC. Estimation in partial linear model with spline modal function. COMMUN STAT-SIMUL C 2021. [DOI: 10.1080/03610918.2019.1622716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Affiliation(s)
- M. Kazemi
- Department of Statistics, Faculty of Mathematical Sciences, Shahrood University of Technology, Shahrood, Iran
| | - D. Shahsavani
- Department of Statistics, Faculty of Mathematical Sciences, Shahrood University of Technology, Shahrood, Iran
| | - M. Arashi
- Department of Statistics, Faculty of Mathematical Sciences, Shahrood University of Technology, Shahrood, Iran
| | - P. C. Rodrigues
- CAST, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
| |
|
26
|
Li X, Wang L, Wang HJ. Sparse Learning and Structure Identification for Ultrahigh-Dimensional Image-on-Scalar Regression. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2020.1753523] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Affiliation(s)
- Xinyi Li
- Statistical and Applied Mathematical Sciences Institute (SAMSI), Durham, NC
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC
| | - Li Wang
- Department of Statistics, Iowa State University, Ames, IA
| | - Huixia Judy Wang
- Department of Statistics, George Washington University, Washington, DC
| | | |
|
27
|
Zhang J, Zhou H, Liu Y, Cai J. Conditional screening for ultrahigh-dimensional survival data in case-cohort studies. LIFETIME DATA ANALYSIS 2021; 27:632-661. [PMID: 34417679 PMCID: PMC8561435 DOI: 10.1007/s10985-021-09531-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/23/2020] [Accepted: 08/05/2021] [Indexed: 06/13/2023]
Abstract
The case-cohort design has been widely used to reduce the cost of covariate measurements in large cohort studies. In many such studies, the number of covariates is very large, and the goal of the research is to identify active covariates that have great influence on the response. Since the introduction of sure independence screening, screening procedures have achieved great success in effectively reducing the dimensionality and identifying active covariates. However, because commonly used screening methods are based on marginal correlation or its variants, they may fail to identify hidden active variables that are jointly important but only weakly correlated with the response. Moreover, these screening methods are mainly proposed for data under simple random sampling and cannot be directly applied to case-cohort data. In this paper, we consider ultrahigh-dimensional survival data under the case-cohort design and propose a conditional screening method that incorporates important prior information on active variables. This method can effectively detect hidden active variables. Furthermore, it possesses the sure screening property under some mild regularity conditions and does not require any complicated numerical optimization. We evaluate the finite sample performance of the proposed method via extensive simulation studies and further illustrate the new approach through a real data set from patients with breast cancer.
Affiliation(s)
- Jing Zhang
- School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, 430073, China
| | - Haibo Zhou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599-7420, USA
| | - Yanyan Liu
- School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, China
| | - Jianwen Cai
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599-7420, USA.
| |
|
28
|
Liao Y, Liu J, Coffman DL, Li R. Varying Coefficient Mediation Model and Application to Analysis of Behavioral Economics Data. JOURNAL OF BUSINESS & ECONOMIC STATISTICS : A PUBLICATION OF THE AMERICAN STATISTICAL ASSOCIATION 2021; 40:1759-1771. [PMID: 36330150 PMCID: PMC9624463 DOI: 10.1080/07350015.2021.1971089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
This article is concerned with causal mediation analysis with varying indirect and direct effects. We propose a varying coefficient mediation model, which can also be viewed as an extension of moderation analysis on a causal diagram. We develop a new estimation procedure for the direct and indirect effects based on B-splines. Under mild conditions, rates of convergence and asymptotic distributions of the resulting estimates are established. We further propose an F-type test for the direct effect. We conduct a simulation study to examine the finite sample performance of the proposed methodology, and apply the new procedures to an empirical analysis of behavioral economics data.
Affiliation(s)
- Yujie Liao
- Department of Statistics, Pennsylvania State University, University Park, PA
| | - Jingyuan Liu
- MOE Key Laboratory of Econometrics, Department of Statistics and Data Science, School of Economics, Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China
- Fujian Key Lab of Statistics, Xiamen University, Xiamen, China
| | - Donna L. Coffman
- Department of Epidemiology and Biostatistics, Temple University, Philadelphia, PA
| | - Runze Li
- Department of Statistics, Pennsylvania State University, University Park, PA
| |
|
29
|
Guo C, Lv J, Wu J. Composite quantile regression for ultra-high dimensional semiparametric model averaging. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2021.107231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
30
|
Chen J, Li D, Wei L, Zhang W. Nonparametric homogeneity pursuit in functional-coefficient models. J Nonparametr Stat 2021. [DOI: 10.1080/10485252.2021.1951265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Affiliation(s)
- Jia Chen
- Department of Economics and Related Studies, University of York, York, UK
| | - Degui Li
- Department of Mathematics, University of York, York, UK
| | - Lingling Wei
- Department of Mathematics, University of York, York, UK
| | - Wenyang Zhang
- Department of Mathematics, University of York, York, UK
| |
|
31
|
Li N, Peng X, Kawaguchi E, Suchard MA, Li G. A scalable surrogate L0 sparse regression method for generalized linear models with applications to large scale data. J Stat Plan Inference 2021. [DOI: 10.1016/j.jspi.2020.12.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
32
|
Zhong W, Wang J, Chen X. Censored mean variance sure independence screening for ultrahigh dimensional survival data. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2021.107206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]
|
33
|
Partition-based feature screening for categorical data via RKHS embeddings. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2021.107176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
34
|
Lu S, Chen X, Wang H. Conditional distance correlation sure independence screening for ultra-high dimensional survival data. COMMUN STAT-THEOR M 2021. [DOI: 10.1080/03610926.2019.1657454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Affiliation(s)
- Shuiyun Lu
- School of Statistics, Qufu Normal University, Qufu, China
| | - Xiaolin Chen
- School of Statistics, Qufu Normal University, Qufu, China
| | - Hong Wang
- School of Mathematics and Statistics, Central South University, Changsha, China
| |
|
35
|
Abstract
Network analysis has drawn great attention in recent years. It is applied to a wide range of disciplines, including but not limited to social science, finance and genetics. In practice, one typically collects abundant covariates along with the response variable. Since the network structure makes the responses at different nodes no longer independent, existing screening methods may not perform well for network data. We propose a network-based sure independence screening (NW-SIS) method. This approach explicitly takes the network structure into consideration. The strong screening consistency property of the NW-SIS is rigorously established. We further investigate the estimation of the network effect and establish the √n-consistency of the estimator. The finite sample performance of the proposed method is assessed by a simulation study and illustrated by an empirical analysis of a dataset from the Chinese stock market.
Affiliation(s)
| | - Xuening Zhu
- Fudan University
- Pennsylvania State University
| | | | | |
|
36
|
Wang Y, Li L, Wang K. An online operating performance evaluation approach using probabilistic fuzzy theory for chemical processes with uncertainties. Comput Chem Eng 2021. [DOI: 10.1016/j.compchemeng.2020.107156] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
37
|
Hu Q, Zhu L, Liu Y, Sun J, Srivastava DK, Robison LL. Nonparametric screening and feature selection for ultrahigh-dimensional Case II interval-censored failure time data. Biom J 2020; 62:1909-1925. [PMID: 32677168 PMCID: PMC7988961 DOI: 10.1002/bimj.201900154] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2019] [Revised: 05/16/2020] [Accepted: 05/18/2020] [Indexed: 11/07/2022]
Abstract
For the analysis of ultrahigh-dimensional data, the first step is often to perform screening and feature selection to effectively reduce the dimensionality while retaining all the active or relevant variables with high probability. Many methods have been developed for this purpose under various frameworks, but most of them apply only to complete data. In this paper, we consider an incomplete data situation, case II interval-censored failure time data, for which there seems to be no screening procedure. Based on the idea of the cumulative residual, a model-free or nonparametric method is developed and shown to have the sure independence screening property. In particular, the approach is shown to tend to rank the active variables above the inactive ones in terms of their association with the failure time of interest. A simulation study is conducted to demonstrate the usefulness of the proposed method and, in particular, indicates that it works well with general survival models and is capable of capturing nonlinear covariates with interactions. The approach is also applied to a childhood cancer survivor study that motivated this investigation.
Affiliation(s)
- Qiang Hu
- School of Statistics, Renmin University of China, Beijing, P. R. China
| | - Liang Zhu
- Division of Clinical and Translational Sciences, Department of Internal Medicine, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Yanyan Liu
- School of Mathematics and Statistics, Wuhan University, Wuhan, P. R. China
| | - Jianguo Sun
- Department of Statistics, University of Missouri, Columbia, MO, USA
| | - Deo Kumar Srivastava
- Biostatistics Department, St. Jude Children’s Research Hospital, Memphis, TN, USA
| | - Leslie L. Robison
- Epidemiology and Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN, USA
| |
|
38
|
Lv S, Fan Z, Lian H, Suzuki T, Fukumizu K. A reproducing kernel Hilbert space approach to high dimensional partially varying coefficient model. Comput Stat Data Anal 2020. [DOI: 10.1016/j.csda.2020.107039] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
39
|
Zhang J, Liu Y, Cui H. Model-free feature screening via distance correlation for ultrahigh dimensional survival data. Stat Pap (Berl) 2020. [DOI: 10.1007/s00362-020-01210-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
40
|
Liu Y, Xu J, Li G. Sure joint feature screening in nonparametric transformation model for right censored data. CAN J STAT 2020. [DOI: 10.1002/cjs.11575] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Affiliation(s)
- Yi Liu
- School of Mathematical Sciences, Ocean University of China, Qingdao, China
| | - Jinfeng Xu
- Department of Statistics & Actuarial Science, The University of Hong Kong, Hong Kong, China
| | - Gang Li
- Department of Biostatistics, University of California at Los Angeles, Los Angeles, CA, U.S.A.
| |
|
41
|
Zhang F, Li R, Lian H, Bandyopadhyay D. Sparse reduced-rank regression for multivariate varying-coefficient models. J STAT COMPUT SIM 2020. [DOI: 10.1080/00949655.2020.1829622] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Affiliation(s)
- Fode Zhang
- Center of Statistical Research and School of Statistics, Southwestern University of Finance and Economics, Chengdu, People's Republic of China
| | - Rui Li
- School of Statistics and Information, Shanghai University of International Business and Economics, Shanghai, People's Republic of China
| | - Heng Lian
- Department of Mathematics, City University of Hong Kong, Kowloon, Hong Kong
| | | |
|
42
|
Gao X, Liu Q. Sparsity identification in ultra-high dimensional quantile regression models with longitudinal data. COMMUN STAT-THEOR M 2020. [DOI: 10.1080/03610926.2019.1604966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Affiliation(s)
- Xianli Gao
- School of Statistics, Capital University of Economics and Business, Beijing, China
| | - Qiang Liu
- School of Statistics, Capital University of Economics and Business, Beijing, China
- Beijing Key Laboratory of Megaregions Sustainable Development Modelling, Beijing, China
| |
|
43
|
Lai P, Liang W, Wang F, Zhang Q. Feature screening of quadratic inference functions for ultrahigh dimensional longitudinal data. J STAT COMPUT SIM 2020. [DOI: 10.1080/00949655.2020.1783666] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Affiliation(s)
- Peng Lai
- School of Mathematics and Statistics, Nanjing University of Information Science & Technology, Nanjing, People's Republic of China
| | - Weijuan Liang
- School of Statistics, Renmin University of China, Beijing, People's Republic of China
| | - Fangjian Wang
- School of Mathematics and Statistics, Nanjing University of Information Science & Technology, Nanjing, People's Republic of China
| | - Qingzhao Zhang
- Department of Statistics, School of Economics, Xiamen University, Xiamen, People's Republic of China
- Key Laboratory of Econometrics, Ministry of Education, Xiamen University, Xiamen, People's Republic of China
- The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, People's Republic of China
| |
|
44
|
Li J, Lv J, Wan ATK, Liao J. AdaBoost Semiparametric Model Averaging Prediction for Multiple Categories. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1790375] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023]
Affiliation(s)
- Jialiang Li
- Department of Statistics & Applied Probability, National University of Singapore, Singapore, Singapore
| | - Jing Lv
- School of Mathematics and Statistics, Southwest University, Chongqing, China
| | - Alan T. K. Wan
- Department of Management Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong
| | - Jun Liao
- School of Statistics, Renmin University of China, Beijing, China
| |
|
45
|
Wang M, Kang X, Tian GL. Modified adaptive group lasso for high-dimensional varying coefficient models. COMMUN STAT-SIMUL C 2020. [DOI: 10.1080/03610918.2020.1804936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Affiliation(s)
- Mingqiu Wang
- School of Statistics, Qufu Normal University, Qufu, Shandong, P. R. China
| | - Xiaoning Kang
- International Business College and Institute of Supply Chain Analytics, Dongbei University of Finance and Economics, Dalian, Liaoning, P. R. China
| | - Guo-Liang Tian
- Department of Statistics and Data Science, Southern University of Science and Technology, Shenzhen, Guangdong, P. R. China
| |
|
46
|
Chu Y, Lin L. Conditional SIRS for nonparametric and semiparametric models by marginal empirical likelihood. Stat Pap (Berl) 2020. [DOI: 10.1007/s00362-018-0993-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/17/2022]
|
47
|
Xu K, Shen Z, Huang X, Cheng Q. Projection correlation between scalar and vector variables and its use in feature screening with multi-response data. J STAT COMPUT SIM 2020. [DOI: 10.1080/00949655.2020.1753057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Affiliation(s)
- Kai Xu
- School of Mathematics and Statistics, Anhui Normal University, Wuhu, People's Republic of China
| | - Zhiling Shen
- School of Mathematics and Statistics, Anhui Normal University, Wuhu, People's Republic of China
| | - Xudong Huang
- School of Mathematics and Statistics, Anhui Normal University, Wuhu, People's Republic of China
| | - Qing Cheng
- Center for Quantitative Medicine, Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
| |
|
48
|
Liu W, Ke Y, Liu J, Li R. Model-Free Feature Screening and FDR Control With Knockoff Features. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1783274] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Affiliation(s)
- Wanjun Liu
- Department of Statistics, The Pennsylvania State University, University Park, PA
| | - Yuan Ke
- Department of Statistics, University of Georgia, Athens, GA
| | - Jingyuan Liu
- MOE Key Laboratory of Econometrics, Department of Statistics, School of Economics, Wang Yanan Institute for Studies in Economics, and Fujian Key Lab of Statistics, Xiamen University, Xiamen, China
| | - Runze Li
- Department of Statistics, The Pennsylvania State University, University Park, PA
| |
|
49
|
Lu S, Chen X, Xu S, Liu C. Joint model-free feature screening for ultra-high dimensional semi-competing risks data. Comput Stat Data Anal 2020. [DOI: 10.1016/j.csda.2020.106942] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
|
50
|
Liu Z, Xiong Z. Non-marginal feature screening for additive hazard model with ultrahigh-dimensional covariates. COMMUN STAT-THEOR M 2020. [DOI: 10.1080/03610926.2020.1770288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Affiliation(s)
- Zili Liu
- School of Mathematics and Statistics, Central China Normal University, Wuhan, China
| | - Zikang Xiong
- School of Mathematics and Statistics, Central China Normal University, Wuhan, China
| |
|