1. Cai TT, Zhang AR, Zhou Y. Sparse Group Lasso: Optimal Sample Complexity, Convergence Rate, and Statistical Inference. IEEE Trans Inf Theory 2022;68:5975-6002. [PMID: 36865503] [PMCID: PMC9974176] [DOI: 10.1109/tit.2022.3175455]
Abstract
We study sparse group Lasso for high-dimensional double sparse linear regression, where the parameter of interest is simultaneously element-wise and group-wise sparse. This problem is an important instance of the simultaneously structured model, an actively studied topic in statistics and machine learning. In the noiseless case, matching upper and lower bounds on sample complexity are established for the exact recovery of sparse vectors and for the stable estimation of approximately sparse vectors. In the noisy case, upper and matching minimax lower bounds for the estimation error are obtained. We also consider the debiased sparse group Lasso and investigate its asymptotic properties for the purpose of statistical inference. Finally, numerical studies are provided to support the theoretical results.
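The double-sparse penalty in this abstract combines an element-wise l1 term with a group-wise l2 term. A minimal proximal-gradient sketch of a sparse group Lasso solver (numpy only; the step-size rule, penalty levels, and group partition below are illustrative choices, not the paper's algorithm):

```python
import numpy as np

def prox_sgl(beta, groups, lam1, lam2, step):
    """Prox of step*(lam1*||b||_1 + lam2*sum_g ||b_g||_2):
    element-wise soft-thresholding, then group-wise shrinkage."""
    b = np.sign(beta) * np.maximum(np.abs(beta) - step * lam1, 0.0)
    for g in groups:
        norm = np.linalg.norm(b[g])
        b[g] = 0.0 if norm == 0 else max(1.0 - step * lam2 / norm, 0.0) * b[g]
    return b

def sparse_group_lasso(X, y, groups, lam1=0.1, lam2=0.1, n_iter=500):
    """Proximal gradient descent on
    (1/2n)*||y - X b||^2 + lam1*||b||_1 + lam2*sum_g ||b_g||_2."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1/L for the smooth part
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = prox_sgl(b - step * grad, groups, lam1, lam2, step)
    return b
```

Applying the element-wise soft-threshold first and the group shrinkage second is the exact proximal operator of this combined penalty, and it is what produces estimates that are zero both at individual coordinates and over entire groups.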
Affiliation(s)
- T Tony Cai
- Department of Statistics & Data Science, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104
- Anru R Zhang
- Departments of Biostatistics & Bioinformatics, Computer Science, Mathematics, and Statistical Science, Duke University, Durham, NC 27710
- Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706
- Yuchen Zhou
- Department of Statistics & Data Science, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104
- Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706
2. Cai JF, Li J, Xia D. Generalized Low-rank plus Sparse Tensor Estimation by Fast Riemannian Optimization. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2063131]
Affiliation(s)
- Jian-Feng Cai
- Department of Mathematics, Hong Kong University of Science and Technology
- Jingyang Li
- Department of Mathematics, Hong Kong University of Science and Technology
- Dong Xia
- Department of Mathematics, Hong Kong University of Science and Technology
3. Ibriga HS, Sun WW. Covariate-assisted Sparse Tensor Completion. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2066537]
4. Xia D, Zhang AR, Zhou Y. Inference for low-rank tensors—no need to debias. Ann Stat 2022. [DOI: 10.1214/21-aos2146]
Affiliation(s)
- Dong Xia
- Department of Mathematics, Hong Kong University of Science and Technology
- Anru R. Zhang
- Departments of Biostatistics & Bioinformatics, Computer Science, Mathematics, and Statistical Science, Duke University
- Yuchen Zhou
- Department of Statistics and Data Science, The Wharton School, University of Pennsylvania
5. Li L, Zeng J, Zhang X. Generalized Liquid Association Analysis for Multimodal Data Integration. J Am Stat Assoc 2022;118:1984-1996. [PMID: 38099062] [PMCID: PMC10720690] [DOI: 10.1080/01621459.2021.2024437]
Abstract
Multimodal data are now prevalent in scientific research. One of the central questions in multimodal integrative analysis is to understand how two data modalities associate and interact with each other given another modality or demographic variables. The problem can be formulated as studying the associations among three sets of random variables, a question that has received relatively little attention in the literature. In this article, we propose a novel generalized liquid association analysis method, which offers a new angle on this important class of three-way association problems. We extend the notion of liquid association of Li (2002) from the univariate setting to the sparse, multivariate, and high-dimensional setting. We establish a population dimension reduction model, transform the problem to sparse Tucker decomposition of a three-way tensor, and develop a higher-order orthogonal iteration algorithm for parameter estimation. We derive the non-asymptotic error bound and asymptotic consistency of the proposed estimator, while allowing the variable dimensions to be larger than and diverge with the sample size. We demonstrate the efficacy of the method through both simulations and a multimodal neuroimaging application for Alzheimer's disease research.
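The estimation step above rests on Tucker decomposition via higher-order orthogonal iteration (HOOI). A bare-bones dense HOOI for a 3-way tensor (numpy only; this sketch omits the sparsity and all of the paper's specifics, keeping just the alternating subspace updates):

```python
import numpy as np

def unfold(T, mode):
    """Mode-k matricization of a 3-way tensor."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def multiply_mode(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

def hooi(T, ranks, n_iter=20):
    """Higher-order orthogonal iteration for a rank-(r1,r2,r3)
    Tucker decomposition: returns (core, [U1, U2, U3])."""
    # HOSVD initialization: leading singular subspaces of each unfolding
    U = [np.linalg.svd(unfold(T, k))[0][:, :ranks[k]] for k in range(3)]
    for _ in range(n_iter):
        for k in range(3):
            # project the other two modes, then refresh mode k's subspace
            M = T
            for j in range(3):
                if j != k:
                    M = multiply_mode(M, U[j].T, j)
            U[k] = np.linalg.svd(unfold(M, k))[0][:, :ranks[k]]
    core = T
    for j in range(3):
        core = multiply_mode(core, U[j].T, j)
    return core, U
```

On an exactly low-Tucker-rank tensor, multiplying the core back by the estimated factors reproduces the input up to numerical precision.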
Affiliation(s)
- Lexin Li
- University of California at Berkeley
6. Han R, Willett R, Zhang AR. An optimal statistical and computational framework for generalized tensor estimation. Ann Stat 2022. [DOI: 10.1214/21-aos2061]
Affiliation(s)
- Rungang Han
- Department of Statistics, University of Wisconsin-Madison
- Rebecca Willett
- Departments of Statistics and Computer Science, University of Chicago
- Anru R. Zhang
- Department of Statistics, University of Wisconsin-Madison
7. Hu J, Lee C, Wang M. Generalized Tensor Decomposition With Features on Multiple Modes. J Comput Graph Stat 2021. [DOI: 10.1080/10618600.2021.1978471]
Affiliation(s)
- Jiaxin Hu
- Department of Statistics, University of Wisconsin-Madison, Madison, WI
- Chanwoo Lee
- Department of Statistics, University of Wisconsin-Madison, Madison, WI
- Miaoyan Wang
- Department of Statistics, University of Wisconsin-Madison, Madison, WI
8. Zhou J, Sun WW, Zhang J, Li L. Partially Observed Dynamic Tensor Response Regression. J Am Stat Assoc 2021;118:424-439. [PMID: 37333062] [PMCID: PMC10274377] [DOI: 10.1080/01621459.2021.1938082]
Abstract
In modern data science, dynamic tensor data abound in numerous applications. An important task is to characterize the relationship between dynamic tensor datasets and external covariates. However, the tensor data are often only partially observed, rendering many existing methods inapplicable. In this article, we develop a regression model with a partially observed dynamic tensor as the response and external covariates as the predictor. We introduce the low-rankness, sparsity, and fusion structures on the regression coefficient tensor, and consider a loss function projected over the observed entries. We develop an efficient nonconvex alternating updating algorithm, and derive the finite-sample error bound of the actual estimator from each step of our optimization algorithm. Unobserved entries in the tensor response pose serious challenges; as a result, our proposal differs considerably from existing tensor completion and tensor response regression solutions in its estimation algorithm, regularity conditions, and theoretical properties. We illustrate the efficacy of our proposed method using simulations and two real applications, including a neuroimaging dementia study and a digital advertising study.
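The "loss function projected over the observed entries" can be made concrete with a small sketch. Here a matrix-valued response per sample plays the role of the dynamic tensor slice, and the plain squared loss with a 0/1 observation mask is an illustrative assumption (the paper's low-rankness, sparsity, and fusion structures are omitted):

```python
import numpy as np

def masked_loss_and_grad(B, X, Y, mask):
    """Squared loss projected over the observed entries of a tensor
    response, plus its gradient in the coefficient tensor B.
    Shapes: B (p, d1, d2), X (n, p), Y and mask (n, d1, d2)."""
    fit = np.tensordot(X, B, axes=(1, 0))   # predicted responses, (n, d1, d2)
    resid = (fit - Y) * mask                # unobserved entries contribute 0
    loss = 0.5 * np.sum(resid ** 2) / mask.sum()
    grad = np.tensordot(X.T, resid, axes=(1, 0)) / mask.sum()
    return loss, grad
```

Because the mask zeroes the residual at unobserved entries, both the loss and its gradient depend only on observed data, which is what lets an alternating-update algorithm run on partially observed responses.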
Affiliation(s)
- Jie Zhou
- Department of Management Science, University of Miami Herbert Business School, Miami, FL
- Will Wei Sun
- Krannert School of Management, Purdue University, West Lafayette, IN
- Jingfei Zhang
- Department of Management Science, University of Miami Herbert Business School, Miami, FL
- Lexin Li
- Division of Biostatistics, University of California, Berkeley, Berkeley, CA
9. Zhou Y, He K. An improved tensor regression model via location smoothing. Stat (Int Stat Inst) 2021. [DOI: 10.1002/sta4.377]
Affiliation(s)
- Ya Zhou
- Center for Applied Statistics and Institute of Statistics and Big Data, Renmin University of China, Beijing, China
- Kejun He
- Center for Applied Statistics and Institute of Statistics and Big Data, Renmin University of China, Beijing, China
10. Chen Y, Chi Y, Fan J, Ma C. Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval. Math Program 2019;176:5-37. [PMID: 33833473] [PMCID: PMC8025800] [DOI: 10.1007/s10107-019-01363-6]
Abstract
This paper considers the problem of solving systems of quadratic equations, namely, recovering an object of interest x♮ ∈ ℝⁿ from m quadratic equations/samples yᵢ = (aᵢ⊤x♮)², 1 ≤ i ≤ m. This problem, also dubbed phase retrieval, spans multiple domains including physical sciences and machine learning. We investigate the efficacy of gradient descent (or Wirtinger flow) designed for the nonconvex least squares problem. We prove that under Gaussian designs, gradient descent, when randomly initialized, yields an ε-accurate solution in O(log n + log(1/ε)) iterations given nearly minimal samples, thus achieving near-optimal computational and sample complexities at once. This provides the first global convergence guarantee concerning vanilla gradient descent for phase retrieval, without the need for (i) carefully designed initialization, (ii) sample splitting, or (iii) sophisticated saddle-point escaping schemes. All of this is achieved by exploiting the statistical models in analyzing optimization algorithms, via a leave-one-out approach that enables the decoupling of certain statistical dependency between the gradient descent iterates and the data.
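A minimal numpy sketch of the randomly initialized gradient descent (Wirtinger flow) analyzed here, for real-valued Gaussian designs (the step size, iteration count, and initialization scale are illustrative choices, not the paper's constants):

```python
import numpy as np

def wirtinger_flow(A, y, step=0.1, n_iter=3000, seed=0):
    """Vanilla gradient descent on the nonconvex least squares
    f(x) = (1/4m) * sum_i ((a_i^T x)^2 - y_i)^2, randomly initialized."""
    m, n = A.shape
    rng = np.random.default_rng(seed)
    # random direction, scaled to the signal energy implied by y
    # (for Gaussian a_i, E[y_i] = ||x||^2)
    x = rng.standard_normal(n)
    x *= np.sqrt(y.mean()) / np.linalg.norm(x)
    for _ in range(n_iter):
        Ax = A @ x
        grad = A.T @ ((Ax ** 2 - y) * Ax) / m  # gradient of f at x
        x -= step * grad
    return x
```

Since the measurements only determine x♮ up to a global sign, recovery is assessed by the distance to ±x♮.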
Affiliation(s)
- Yuxin Chen
- Department of Electrical Engineering, Princeton University, Princeton, NJ 08544, USA
- Yuejie Chi
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Jianqing Fan
- Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, USA
- Cong Ma
- Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, USA
11. Zhang A, Han R. Optimal Sparse Singular Value Decomposition for High-Dimensional High-Order Data. J Am Stat Assoc 2019;114:1708-1725. [PMID: 34290464] [PMCID: PMC8290930] [DOI: 10.1080/01621459.2018.1527227]
Abstract
In this article, we consider the sparse tensor singular value decomposition, which aims for dimension reduction on high-dimensional high-order data with certain sparsity structure. A method named sparse tensor alternating thresholding for singular value decomposition (STAT-SVD) is proposed. The proposed procedure features a novel double projection & thresholding scheme, which provides a sharp criterion for thresholding in each iteration. Compared with the regular tensor SVD model, STAT-SVD permits more robust estimation under weaker assumptions. Both upper and lower bounds for estimation accuracy are developed. The proposed procedure is shown to be minimax rate-optimal in a general class of situations. Simulation studies show that STAT-SVD performs well under a variety of configurations. We also illustrate the merits of the proposed procedure on a longitudinal tensor dataset on European country mortality rates. Supplementary materials for this article are available online.
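The thresholding idea behind sparse singular subspace estimation can be caricatured in a single matrix-mode update (numpy; this is a much-simplified sketch of row-norm thresholding before an SVD, not the paper's double projection & thresholding scheme, and the threshold value is an illustrative assumption):

```python
import numpy as np

def threshold_svd_step(M, r, tau):
    """One thresholding-then-SVD update: keep only rows of M whose
    l2 norm exceeds tau (the candidate support), then compute the
    leading-r left singular subspace restricted to that support."""
    row_norms = np.linalg.norm(M, axis=1)
    support = row_norms > tau
    U = np.zeros((M.shape[0], r))  # rows off the support stay zero
    if support.sum() >= r:
        U[support] = np.linalg.svd(M[support], full_matrices=False)[0][:, :r]
    return U, support
```

Thresholding by row norm before the SVD is what enforces sparsity of the singular subspace: rows screened out by tau are exactly zero in the returned basis.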
Affiliation(s)
- Anru Zhang
- Department of Statistics, University of Wisconsin-Madison, Madison, WI
- Rungang Han
- Department of Statistics, University of Wisconsin-Madison, Madison, WI