1
Sun J, Lee KY. Generalized functional linear model with a point process predictor. Stat Med 2024; 43:1564-1576. [PMID: 38332307 DOI: 10.1002/sim.10023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2023] [Revised: 12/17/2023] [Accepted: 01/15/2024] [Indexed: 02/10/2024]
Abstract
Point process data have become increasingly popular these days. For example, many of the data captured in electronic health records (EHR) are in the format of point process data. It is of great interest to study the association between a point process predictor and a scalar response using generalized functional linear regression models. Various generalized functional linear regression models have been developed under different settings in the past decades. However, existing methods can only deal with functional or longitudinal predictors, not point process predictors. In this article, we propose a novel generalized functional linear regression model for a point process predictor. Our proposed model is based on the joint modeling framework, where we adopt a log-Gaussian Cox process model for the point process predictor and a generalized linear regression model for the outcome. We also develop a new algorithm for fast model estimation based on the Gaussian variational approximation method. We conduct extensive simulation studies to evaluate the performance of our proposed method and compare it to competing methods. The performance of our proposed method is further demonstrated on an EHR dataset of patients admitted into the intensive care units of the Beth Israel Deaconess Medical Center between 2001 and 2008.
Affiliation(s)
- Jiehuan Sun
  - Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois Chicago, Chicago, Illinois, USA
- Kuang-Yao Lee
  - Department of Statistics, Operations, and Data Science, Temple University, Philadelphia, Pennsylvania, USA
2
Zhao Y, Wu B, Kang J. Bayesian interaction selection model for multimodal neuroimaging data analysis. Biometrics 2023; 79:655-668. [PMID: 35220581 PMCID: PMC9418386 DOI: 10.1111/biom.13648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2021] [Accepted: 02/21/2022] [Indexed: 11/27/2022]
Abstract
Multimodality or multiconstruct data arise increasingly in functional neuroimaging studies to characterize brain activity under different cognitive states. Relying on those high-resolution imaging collections, it is of great interest to identify predictive imaging markers and intermodality interactions with respect to behavior outcomes. Currently, most of the existing variable selection models do not consider predictive effects from interactions, and the desired higher-order terms can only be included in the predictive mechanism following a two-step procedure, suffering from potential misspecification. In this paper, we propose a unified Bayesian prior model to simultaneously identify main effect features and intermodality interactions within the same inference platform in the presence of high-dimensional data. To accommodate the brain topological information and correlation between modalities, our prior is designed by compiling the intermediate selection status of sequential partitions in light of the data structure and brain anatomical architecture, so that we can improve posterior inference and enhance biological plausibility. Through extensive simulations, we show the superiority of our approach in main and interaction effects selection, and prediction under multimodality data. Applying the method to the Adolescent Brain Cognitive Development (ABCD) study, we characterize the brain functional underpinnings with respect to general cognitive ability under different memory load conditions.
Affiliation(s)
- Yize Zhao
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, US
- Ben Wu
  - Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing, China
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, US
3
Jiang S, Cao J, Colditz GA. Identifying regions of interest in mammogram images. Stat Methods Med Res 2023; 32:895-903. [PMID: 36951095 PMCID: PMC10247406 DOI: 10.1177/09622802231160551] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/24/2023]
Abstract
Screening mammography is the primary preventive strategy for early detection of breast cancer and an essential input to breast cancer risk prediction and application of prevention/risk management guidelines. Identifying regions of interest within mammogram images that are associated with 5- or 10-year breast cancer risk is therefore clinically meaningful. The problem is complicated by the irregular boundary issue posed by the semi-circular domain of the breast area within mammograms. Accommodating the irregular domain is especially crucial when identifying regions of interest, as the true signal comes only from the semi-circular domain of the breast region, and noise elsewhere. We address these challenges by introducing a proportional hazards model with imaging predictors characterized by bivariate splines over triangulation. The model sparsity is enforced with the group lasso penalty function. We apply the proposed method to the motivating Joanne Knight Breast Health Cohort to illustrate important risk patterns and show that the proposed method is able to achieve higher discriminatory performance.
Affiliation(s)
- Shu Jiang
  - Division of Public Health Sciences, Washington University School of Medicine, St Louis, MO, USA
- Jiguo Cao
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
- Graham A. Colditz
  - Division of Public Health Sciences, Washington University School of Medicine, St Louis, MO, USA
4
Chen S, He K, He S, Ni Y, Wong RKW. Bayesian Nonlinear Tensor Regression with Functional Fused Elastic Net Prior. Technometrics 2023. [DOI: 10.1080/00401706.2023.2197471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/01/2023]
5
Yang Q, Wang C, He H, Zhou X, Song X. Additive hazards model with time-varying coefficients and imaging predictors. Stat Methods Med Res 2023; 32:353-372. [PMID: 36451621 DOI: 10.1177/09622802221137746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Conventional hazard regression analyses frequently assume constant regression coefficients and scalar covariates. However, some covariate effects may vary with time. Moreover, medical imaging has become an increasingly important tool in screening, diagnosis, and prognosis of various diseases, given its information visualization and quantitative assessment. This study considers an additive hazards model with time-varying coefficients and imaging predictors to examine the dynamic effects of potential scalar and imaging risk factors for the failure of interest. We develop a two-stage approach that comprises the high-dimensional functional principal component analysis technique in the first stage and the counting process-based estimating equation approach in the second stage. In addition, we construct the pointwise confidence intervals for the proposed estimators and provide a significance test for the effects of scalar and imaging covariates. Simulation studies demonstrate the satisfactory performance of the proposed method. An application to the Alzheimer's disease neuroimaging initiative study further illustrates the utility of the methodology.
Affiliation(s)
- Qi Yang
  - School of Management, Shandong University, Jinan, China
- Chuchu Wang
  - Department of Statistics, Chinese University of Hong Kong, Hong Kong, China
- Haijin He
  - College of Mathematics and Statistics, Shenzhen University, Shenzhen, China
- Xiaoxiao Zhou
  - Department of Statistics, Chinese University of Hong Kong, Hong Kong, China
- Xinyuan Song
  - Department of Statistics, Chinese University of Hong Kong, Hong Kong, China
6
Liu B, Zhang Q, Xue L, Song PXK, Kang J. Robust High-Dimensional Regression with Coefficient Thresholding and its Application to Imaging Data Analysis. J Am Stat Assoc 2022; 119:715-729. [PMID: 38818252 PMCID: PMC11136478 DOI: 10.1080/01621459.2022.2142590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2019] [Accepted: 10/18/2022] [Indexed: 11/06/2022]
Abstract
It is important to develop statistical techniques to analyze high-dimensional data in the presence of both complex dependence and possible heavy tails and outliers in real-world applications such as imaging data analyses. We propose a new robust high-dimensional regression with coefficient thresholding, in which an efficient nonconvex estimation procedure is proposed through a thresholding function and the robust Huber loss. The proposed regularization method accounts for complex dependence structures in predictors and is robust against heavy tails and outliers in outcomes. Theoretically, we rigorously analyze the landscape of the population and empirical risk functions for the proposed method. The fine landscape enables us to establish both statistical consistency and computational convergence under the high-dimensional setting. We also present an extension to incorporate spatial information into the proposed method. Finite-sample properties of the proposed methods are examined by extensive simulation studies. An application concerns a scalar-on-image regression analysis for an association of psychiatric disorder measured by the general factor of psychopathology with features extracted from the task functional MRI data in the Adolescent Brain Cognitive Development (ABCD) study.
Affiliation(s)
- Qi Zhang
  - The Pennsylvania State University
7
Morris EL, He K, Kang J. Scalar on network regression via boosting. Ann Appl Stat 2022; 16:2755-2773. [DOI: 10.1214/22-aoas1612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Kevin He
  - Department of Biostatistics, University of Michigan
- Jian Kang
  - Department of Biostatistics, University of Michigan
8
Ma X, Kundu S. Multi-task Learning with High-Dimensional Noisy Images. J Am Stat Assoc 2022; 119:650-663. [PMID: 38660581 PMCID: PMC11035991 DOI: 10.1080/01621459.2022.2140052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2021] [Accepted: 10/17/2022] [Indexed: 10/31/2022]
Abstract
Recent medical imaging studies have given rise to distinct but inter-related datasets corresponding to multiple experimental tasks or longitudinal visits. Standard scalar-on-image regression models that fit each dataset separately are not equipped to leverage information across inter-related images, and existing multi-task learning approaches are compromised by the inability to account for the noise that is often observed in images. We propose a novel joint scalar-on-image regression framework involving wavelet-based image representations with grouped penalties that are designed to pool information across inter-related images for joint learning, and which explicitly accounts for noise in high-dimensional images via a projection-based approach. In the presence of non-convexity arising due to noisy images, we derive non-asymptotic error bounds under non-convex as well as convex grouped penalties, even when the number of voxels increases exponentially with sample size. A projected gradient descent algorithm is used for computation, which is shown to approximate the optimal solution via well-defined non-asymptotic optimization error bounds under noisy images. Extensive simulations and application to a motivating longitudinal Alzheimer's disease study illustrate significantly improved predictive ability and greater power to detect true signals, that are simply missed by existing methods without noise correction due to the attenuation to null phenomenon.
Affiliation(s)
- Xin Ma
  - Department of Biostatistics and Bioinformatics, Emory University
- Suprateek Kundu
  - Department of Biostatistics, The University of Texas at MD Anderson Cancer Center
9
Kang K, Song X. Joint Modeling of Longitudinal Imaging and Survival Data. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2102027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Kai Kang
  - Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China
- Xinyuan Song
  - Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China
10
Sun Y, Wang Q. An adaptive group LASSO approach for domain selection in functional generalized linear models. J Stat Plan Inference 2022. [DOI: 10.1016/j.jspi.2021.11.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
11
Zhang Y, Shen W, Kong D. Covariance Estimation for Matrix-valued Data. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2068419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Yichi Zhang
  - Department of Statistics, North Carolina State University
- Weining Shen
  - Department of Statistics, University of California, Irvine
- Dehan Kong
  - Department of Statistical Sciences, University of Toronto
12
Zhu W, Xu S, Liu C, Li Y. Minimax Powerful Functional Analysis of Covariance Tests with Application to Longitudinal Genome-Wide Association Studies. Scand Stat Theory Appl 2022. [DOI: 10.1111/sjos.12583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
Affiliation(s)
- Sheng Xu
  - Global Statistics and Data Science, BeiGene Co., Ltd., China
- Catherine Liu
  - Department of Applied Mathematics, Hong Kong Polytechnic University, Hong Kong, China
- Yehua Li
  - Department of Statistics, University of California, Riverside, CA, USA
13
Li Y, Qiu Y, Xu Y. From multivariate to functional data analysis: fundamentals, recent developments, and emerging areas. J Multivar Anal 2022; 188:104806. [PMID: 39040141 PMCID: PMC11261241 DOI: 10.1016/j.jmva.2021.104806] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/21/2022]
Abstract
Functional data analysis (FDA), which is a branch of statistics on modeling infinite dimensional random vectors resided in functional spaces, has become a major research area for Journal of Multivariate Analysis. We review some fundamental concepts of FDA, their origins and connections from multivariate analysis, and some of its recent developments, including multi-level functional data analysis, high-dimensional functional regression, and dependent functional data analysis. We also discuss the impact of these new methodology developments on genetics, plant science, wearable device data analysis, image data analysis, and business analytics. Two real data examples are provided to motivate our discussions.
Affiliation(s)
- Yehua Li
  - University of California - Riverside, Riverside, CA 92521, USA
- Yumou Qiu
  - Iowa State University, Ames, IA 50011, USA
- Yuhang Xu
  - Bowling Green State University, Bowling Green, OH 43403, USA
14
Shi H, Yang Y, Wang L, Ma D, Beg MF, Pei J, Cao J. Two-Dimensional Functional Principal Component Analysis for Image Feature Extraction. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2035738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Haolun Shi
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, Canada, V5A1S6
- Yuping Yang
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, Canada, V5A1S6
- Liangliang Wang
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, Canada, V5A1S6
- Da Ma
  - School of Medicine, Wake Forest University, NC, United States
- Mirza Faisal Beg
  - School of Engineering, Simon Fraser University, Burnaby, Canada, V5A1S6
- Jian Pei
  - School of Computing Science, Simon Fraser University, Burnaby, Canada, V5A1S6
- Jiguo Cao
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, Canada, V5A1S6
15
Revisiting Convexity-Preserving Signal Recovery with the Linearly Involved GMC Penalty. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
16
Min K, Mai Q. A general framework for tensor screening through smoothing. Electron J Stat 2022. [DOI: 10.1214/21-ejs1954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Keqian Min
  - Department of Statistics, Florida State University, Tallahassee, Florida 32306, U.S.A
- Qing Mai
  - Department of Statistics, Florida State University, Tallahassee, Florida 32306, U.S.A
17
Feng L, Bi X, Zhang H. Brain Regions Identified as Being Associated with Verbal Reasoning through the Use of Imaging Regression via Internal Variation. J Am Stat Assoc 2021; 116:144-158. [PMID: 34955572 DOI: 10.1080/01621459.2020.1766468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Abstract
Brain-imaging data have been increasingly used to understand intellectual disabilities. Despite significant progress in biomedical research, the mechanisms for most of the intellectual disabilities remain unknown. Finding the underlying neurological mechanisms has been proved difficult, especially in children due to the rapid development of their brains. We investigate verbal reasoning, which is a reliable measure of individuals' general intellectual abilities, and develop a class of high-order imaging regression models to identify brain subregions which might be associated with this specific intellectual ability. A key novelty of our method is to take advantage of spatial brain structures, and specifically the piecewise smooth nature of most imaging coefficients in the form of high-order tensors. Our approach provides an effective and urgently needed method for identifying brain subregions potentially underlying certain intellectual disabilities. The idea behind our approach is a carefully constructed concept called Internal Variation (IV). The IV employs tensor decomposition and provides a computationally feasible substitution for Total Variation (TV), which has been considered in the literature to deal with similar problems but is problematic in high order tensor regression. Before applying our method to analyze the real data, we conduct comprehensive simulation studies to demonstrate the validity of our method in imaging signal identification. Then, we present our results from the analysis of a dataset based on the Philadelphia Neurodevelopmental Cohort for which we preprocessed the data including re-orienting, bias-field correcting, extracting, normalizing and registering the magnetic resonance images from 978 individuals. 
Our analysis identified a subregion across the cingulate cortex and the corpus callosum as being associated with individuals' verbal reasoning ability, which, to the best of our knowledge, is a novel region that has not been reported in the literature. This finding is useful in further investigation of functional mechanisms for verbal reasoning.
Affiliation(s)
- Long Feng
  - Department of Biostatistics, Yale University
- Xuan Bi
  - Information and Decision Sciences, Carlson School of Management, University of Minnesota
18
Lin Z, Müller HG. Total variation regularized Fréchet regression for metric-space valued data. Ann Stat 2021. [DOI: 10.1214/21-aos2095] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
Affiliation(s)
- Zhenhua Lin
  - Department of Statistics and Data Science, National University of Singapore
19
Li C, Zhang H. Tensor quantile regression with application to association between neuroimages and human intelligence. Ann Appl Stat 2021; 15:1455-1477. [PMID: 34567336 PMCID: PMC8462802 DOI: 10.1214/21-aoas1475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
Human intelligence is usually measured by well-established psychometric tests through a series of problem solving. The recorded cognitive scores are continuous but usually heavy-tailed with potential outliers and violating the normality assumption. Meanwhile, magnetic resonance imaging (MRI) provides an unparalleled opportunity to study brain structures and cognitive ability. Motivated by association studies between MRI images and human intelligence, we propose a tensor quantile regression model, which is a general and robust alternative to the commonly used scalar-on-image linear regression. Moreover, we take into account rich spatial information of brain structures, incorporating low-rankness and piece-wise smoothness of imaging coefficients into a regularized regression framework. We formulate the optimization problem as a sequence of penalized quantile regressions with a generalized Lasso penalty based on tensor decomposition, and develop a computationally efficient alternating direction method of multipliers algorithm (ADMM) to estimate the model components. Extensive numerical studies are conducted to examine the empirical performance of the proposed method and its competitors. Finally, we apply the proposed method to a large-scale important dataset: the Human Connectome Project. We find that the tensor quantile regression can serve as a prognostic tool to assess future risk of cognitive impairment progression. More importantly, with the proposed method, we are able to identify the most activated brain subregions associated with quantiles of human intelligence. The prefrontal and anterior cingulate cortex are found to be mostly associated with lower and upper quantile of fluid intelligence. The insular cortex associated with median of fluid intelligence is a rarely reported region.
Affiliation(s)
- Cai Li
  - Department of Biostatistics, Yale University
20
Zhou J, Sun WW, Zhang J, Li L. Partially Observed Dynamic Tensor Response Regression. J Am Stat Assoc 2021; 118:424-439. [PMID: 37333062 PMCID: PMC10274377 DOI: 10.1080/01621459.2021.1938082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/28/2020] [Revised: 05/22/2021] [Accepted: 05/25/2021] [Indexed: 10/21/2022]
Abstract
In modern data science, dynamic tensor data prevail in numerous applications. An important task is to characterize the relationship between dynamic tensor datasets and external covariates. However, the tensor data are often only partially observed, rendering many existing methods inapplicable. In this article, we develop a regression model with a partially observed dynamic tensor as the response and external covariates as the predictor. We introduce the low-rankness, sparsity, and fusion structures on the regression coefficient tensor, and consider a loss function projected over the observed entries. We develop an efficient nonconvex alternating updating algorithm, and derive the finite-sample error bound of the actual estimator from each step of our optimization algorithm. Unobserved entries in the tensor response have imposed serious challenges. As a result, our proposal differs considerably in terms of estimation algorithm, regularity conditions, as well as theoretical properties, compared to the existing tensor completion or tensor response regression solutions. We illustrate the efficacy of our proposed method using simulations and two real applications, including a neuroimaging dementia study and a digital advertising study.
Affiliation(s)
- Jie Zhou
  - Department of Management Science, University of Miami Herbert Business School, Miami, FL
- Will Wei Sun
  - Krannert School of Management, Purdue University, West Lafayette, IN
- Jingfei Zhang
  - Department of Management Science, University of Miami Herbert Business School, Miami, FL
- Lexin Li
  - Division of Biostatistics, University of California, Berkeley, Berkeley, CA
21
Roy A, Reich BJ, Guinness J, Shinohara RT, Staicu AM. Spatial Shrinkage Via the Product Independent Gaussian Process Prior. J Comput Graph Stat 2021. [DOI: 10.1080/10618600.2021.1923512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Arkaprava Roy
  - Department of Biostatistics, University of Florida, Gainesville, FL
- Brian J. Reich
  - Department of Statistics, North Carolina State University, Raleigh, NC
- Joseph Guinness
  - Department of Statistics and Data Science, Cornell University, Ithaca, NY
- Ana-Maria Staicu
  - Department of Statistics, North Carolina State University, Raleigh, NC
22
Mai Q, Zhang X, Pan Y, Deng K. A Doubly Enhanced EM Algorithm for Model-Based Tensor Clustering. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2021.1904959] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/21/2022]
Affiliation(s)
- Qing Mai
  - Department of Statistics, Florida State University, Tallahassee, FL
- Xin Zhang
  - Department of Statistics, Florida State University, Tallahassee, FL
- Yuqing Pan
  - Department of Statistics, Florida State University, Tallahassee, FL
- Kai Deng
  - Department of Statistics, Florida State University, Tallahassee, FL
23
Zhou Y, He K. An improved tensor regression model via location smoothing. Stat (Int Stat Inst) 2021. [DOI: 10.1002/sta4.377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Ya Zhou
  - Center for Applied Statistics and Institute of Statistics and Big Data, Renmin University of China, Beijing, China
- Kejun He
  - Center for Applied Statistics and Institute of Statistics and Big Data, Renmin University of China, Beijing, China
24
Zhang Z, Wang X, Kong L, Zhu H. High-Dimensional Spatial Quantile Function-on-Scalar Regression. J Am Stat Assoc 2021; 117:1563-1578. [PMID: 37008532 PMCID: PMC10065478 DOI: 10.1080/01621459.2020.1870984] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/22/2022]
Abstract
This article develops a novel spatial quantile function-on-scalar regression model, which studies the conditional spatial distribution of a high-dimensional functional response given scalar predictors. With the strength of both quantile regression and copula modeling, we are able to explicitly characterize the conditional distribution of the functional or image response on the whole spatial domain. Our method provides a comprehensive understanding of the effect of scalar covariates on functional responses across different quantile levels and also gives a practical way to generate new images for given covariate values. Theoretically, we establish the minimax rates of convergence for estimating coefficient functions under both fixed and random designs. We further develop an efficient primal-dual algorithm to handle high-dimensional image data. Simulations and real data analysis are conducted to examine the finite-sample performance.
Affiliation(s)
- Zhengwu Zhang
  - Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC
- Xiao Wang
  - Department of Statistics, Purdue University, West Lafayette, IN
- Linglong Kong
  - Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, AB, Canada
- Hongtu Zhu
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC
25
Guo C, Kang J, Johnson TD. A spatial Bayesian latent factor model for image-on-image regression. Biometrics 2020; 78:72-84. [PMID: 33368210 DOI: 10.1111/biom.13420] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/01/2019] [Revised: 12/03/2020] [Accepted: 12/10/2020] [Indexed: 11/30/2022]
Abstract
Image-on-image regression analysis, using images to predict images, is a challenging task, due to (1) the high dimensionality and (2) the complex spatial dependence structures in image predictors and image outcomes. In this work, we propose a novel image-on-image regression model, by extending a spatial Bayesian latent factor model to image data, where low-dimensional latent factors are adopted to make connections between high-dimensional image outcomes and image predictors. We assign Gaussian process priors to the spatially varying regression coefficients in the model, which can well capture the complex spatial dependence among image outcomes as well as that among the image predictors. We perform simulation studies to evaluate the out-of-sample prediction performance of our method compared with linear regression and voxel-wise regression methods for different scenarios. The proposed method achieves better prediction accuracy by effectively accounting for the spatial dependence and efficiently reduces image dimensions with latent factors. We apply the proposed method to analysis of multimodal image data in the Human Connectome Project where we predict task-related contrast maps using subcortical volumetric seed maps.
Affiliation(s)
- Cui Guo
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
| | - Jian Kang
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
| | - Timothy D Johnson
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
| |
|
26
|
Li T, Li T, Zhu Z, Zhu H. Regression Analysis of Asynchronous Longitudinal Functional and Scalar Data. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1844211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Ting Li
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
| | - Tengfei Li
- Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, Chapel Hill, NC
| | - Zhongyi Zhu
- Department of Statistics, Fudan University, Shanghai, China
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC
| |
|
27
|
Liu Y, Liu J, Zhu C. Low-Rank Tensor Train Coefficient Array Estimation for Tensor-on-Tensor Regression. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:5402-5411. [PMID: 32054589 DOI: 10.1109/tnnls.2020.2967022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/10/2023]
Abstract
Tensor-on-tensor regression predicts a tensor from a tensor, which generalizes most previous multilinear regression approaches, including methods that predict a scalar from a tensor and a tensor from a scalar. However, in this general setting the coefficient array can be much higher dimensional, because both the predictors and the responses are high order. Compared with the current low CANDECOMP/PARAFAC (CP) rank approximation-based method, the low tensor train (TT) approximation can further improve the stability and efficiency of the high or even ultrahigh-dimensional coefficient array estimation. In the proposed low TT rank coefficient array estimation for tensor-on-tensor regression, we adopt a TT rounding procedure to obtain adaptive ranks, instead of selecting ranks by experience. In addition, an l2 constraint is imposed to avoid overfitting. Hierarchical alternating least squares is used to solve the optimization problem. Numerical experiments on a synthetic data set and two real-life data sets demonstrate that the proposed method outperforms the state-of-the-art methods in terms of prediction accuracy with comparable computational complexity, and that the proposed method is more computationally efficient when the data are high dimensional with small size in each mode.
|
28
|
Hu W, Pan T, Kong D, Shen W. Nonparametric matrix response regression with application to brain imaging data analysis. Biometrics 2020; 77:1227-1240. [PMID: 32869275 DOI: 10.1111/biom.13362] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/29/2020] [Revised: 07/19/2020] [Accepted: 08/20/2020] [Indexed: 11/26/2022]
Abstract
With the rapid growth of neuroimaging technologies, a great effort has been dedicated recently to investigate the dynamic changes in brain activity. Examples include time course calcium imaging and dynamic brain functional connectivity. In this paper, we propose a novel nonparametric matrix response regression model to characterize the nonlinear association between 2D image outcomes and predictors such as time and patient information. Our estimation procedure can be formulated as a nuclear norm regularization problem, which can capture the underlying low-rank structure of the dynamic 2D images. We present a computationally efficient algorithm, derive the asymptotic theory, and show that the method outperforms other existing approaches in simulations. We then apply the proposed method to a calcium imaging study for estimating the change of fluorescent intensities of neurons, and an electroencephalography study for a comparison in the dynamic connectivity covariance matrices between alcoholic and control individuals. For both studies, the method leads to a substantial improvement in prediction error.
Affiliation(s)
- Wei Hu
- Department of Statistics, University of California, Irvine, California
| | - Tianyu Pan
- Department of Statistics, University of California, Irvine, California
| | - Dehan Kong
- Department of Statistical Sciences, University of Toronto, Canada
| | - Weining Shen
- Department of Statistics, University of California, Irvine, California
| |
|
29
|
Gao X, Shen W, Zhang L, Hu J, Fortin NJ, Frostig RD, Ombao H. Regularized matrix data clustering and its application to image analysis. Biometrics 2020; 77:890-902. [PMID: 32799339 DOI: 10.1111/biom.13354] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/14/2019] [Revised: 06/13/2020] [Accepted: 07/21/2020] [Indexed: 11/26/2022]
Abstract
We propose a novel regularized mixture model for clustering matrix-valued data. The proposed method assumes a separable covariance structure for each cluster and imposes a sparsity structure (eg, low rankness, spatial sparsity) for the mean signal of each cluster. We formulate the problem as a finite mixture model of matrix-normal distributions with regularization terms, and then develop an expectation maximization type of algorithm for efficient computation. In theory, we show that the proposed estimators are strongly consistent for various choices of penalty functions. Simulation and two applications on brain signal studies confirm the excellent performance of the proposed method including a better prediction accuracy than the competitors and the scientific interpretability of the solution.
Affiliation(s)
- Xu Gao
- Department of Statistics, University of California, Irvine, California
| | - Weining Shen
- Department of Statistics, University of California, Irvine, California
| | - Liwen Zhang
- Shanghai University of Finance and Economics, Shanghai, China
| | - Jianhua Hu
- Herbert Irving Comprehensive Cancer Center, Columbia University, New York
| | - Norbert J Fortin
- Department of Neurobiology and Behavior, University of California, Irvine, California
| | - Ron D Frostig
- Department of Neurobiology and Behavior, University of California, Irvine, California; Department of Biomedical Engineering, University of California, Irvine, California
| | - Hernando Ombao
- Statistics Program, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| |
|
30
|
An B, Zhang B. Logistic regression with image covariates via the combination of L1 and Sobolev regularizations. PLoS One 2020; 15:e0234975. [PMID: 32589677 PMCID: PMC7319310 DOI: 10.1371/journal.pone.0234975] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/31/2019] [Accepted: 06/06/2020] [Indexed: 11/19/2022] Open
Abstract
The use of image covariates to build classification models has had a large impact in various fields, such as computer science and medicine. The aim of this paper is to develop an estimation method for the logistic regression model with image covariates. We propose a novel regularized estimation approach, where the regularization is a combination of L1 regularization and Sobolev norm regularization. The L1 penalty can perform variable selection, while the Sobolev norm penalty can capture the shape edges information of image data. We develop an efficient algorithm for the optimization problem. We also establish a nonasymptotic error bound on parameter estimation. Simulation studies and a real data application demonstrate that our proposed method performs very well.
Affiliation(s)
- Baiguo An
- School of Statistics, Capital University of Economics and Business, Beijing, China
| | - Beibei Zhang
- School of Statistics, Capital University of Economics and Business, Beijing, China
| |
|
31
|
Feng X, Li T, Song X, Zhu H. Bayesian Scalar on Image Regression With Nonignorable Nonresponse. J Am Stat Assoc 2019; 115:1574-1597. [PMID: 33627920 PMCID: PMC7901831 DOI: 10.1080/01621459.2019.1686391] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/21/2017] [Revised: 10/09/2019] [Accepted: 10/23/2019] [Indexed: 10/25/2022]
Abstract
Medical imaging has become an increasingly important tool in screening, diagnosis, prognosis, and treatment of various diseases given its information visualization and quantitative assessment. The aim of this article is to develop a Bayesian scalar-on-image regression model to integrate high-dimensional imaging data and clinical data to predict cognitive, behavioral, or emotional outcomes, while allowing for nonignorable missing outcomes. Such a nonignorable nonresponse consideration is motivated by examining the association between baseline characteristics and cognitive abilities for 802 Alzheimer patients enrolled in the Alzheimer's Disease Neuroimaging Initiative 1 (ADNI1), for which data are partially missing. Ignoring such missing data may distort the accuracy of statistical inference and provoke misleading results. To address this issue, we propose an imaging exponential tilting model to delineate the data missing mechanism and incorporate an instrumental variable to facilitate model identifiability followed by a Bayesian framework with Markov chain Monte Carlo algorithms to conduct statistical inference. This approach is validated in simulation studies where both the finite sample performance and asymptotic properties are evaluated and compared with the model with fully observed data and that with a misspecified ignorable missing mechanism. Our proposed methods are finally carried out on the ADNI1 dataset, which turns out to capture both of those clinical risk factors and imaging regions consistent with the existing literature that exhibits clinical significance. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
Affiliation(s)
- Xiangnan Feng
- School of Economics and Management, Southwest Jiaotong University, Chengdu, China
| | - Tengfei Li
- Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, NC
| | - Xinyuan Song
- Department of Statistics, Chinese University of Hong Kong, Shatin, NT, Hong Kong
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC
| |
|
32
|
Affiliation(s)
- Wei Hu
- Department of Statistics, University of California, Irvine, CA
| | - Weining Shen
- Department of Statistics, University of California, Irvine, CA
| | - Hua Zhou
- Department of Biostatistics, University of California, Los Angeles, CA
| | - Dehan Kong
- Department of Statistical Sciences, University of Toronto, Toronto
| |
|
33
|
Hu W, Shen W, Zhou H, Kong D. Matrix Linear Discriminant Analysis. Technometrics 2019; 62:196-205. [PMID: 32523233 PMCID: PMC7286587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
We propose a novel linear discriminant analysis (LDA) approach for the classification of high-dimensional matrix-valued data that commonly arises from imaging studies. Motivated by the equivalence of the conventional LDA and the ordinary least squares, we consider an efficient nuclear norm penalized regression that encourages a low-rank structure. Theoretical properties including a nonasymptotic risk bound and a rank consistency result are established. Simulation studies and an application to electroencephalography data show the superior performance of the proposed method over the existing approaches.
Affiliation(s)
- Wei Hu
- Department of Statistics, University of California, Irvine, CA
| | - Weining Shen
- Department of Statistics, University of California, Irvine, CA
| | - Hua Zhou
- Department of Biostatistics, University of California, Los Angeles, CA
| | - Dehan Kong
- Department of Statistical Sciences, University of Toronto, Toronto
| |
|
34
|
Tang X, Bi X, Qu A. Individualized Multilayer Tensor Learning With an Application in Imaging Analysis. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2019.1585254] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/04/2023]
Affiliation(s)
- Xiwei Tang
- Department of Statistics, University of Virginia, Charlottesville, VA
| | - Xuan Bi
- Department of Information and Decision Sciences, Carlson School of Management, University of Minnesota, Minneapolis, MN
| | - Annie Qu
- Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL
| |
|
35
|
Affiliation(s)
- Yuqing Pan
- Department of Statistics, Florida State University, Tallahassee, FL
| | - Qing Mai
- Department of Statistics, Florida State University, Tallahassee, FL
| | - Xin Zhang
- Department of Statistics, Florida State University, Tallahassee, FL
| |
|
36
|
Xue W, Bowman FD, Kang J. A Bayesian Spatial Model to Predict Disease Status Using Imaging Data From Various Modalities. Front Neurosci 2018; 12:184. [PMID: 29632471 PMCID: PMC5879954 DOI: 10.3389/fnins.2018.00184] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/19/2017] [Accepted: 03/06/2018] [Indexed: 11/24/2022] Open
Abstract
Relating disease status to imaging data stands to increase the clinical significance of neuroimaging studies. Many neurological and psychiatric disorders involve complex, systems-level alterations that manifest in functional and structural properties of the brain and possibly other clinical and biologic measures. We propose a Bayesian hierarchical model to predict disease status, which is able to incorporate information from both functional and structural brain imaging scans. We consider a two-stage whole brain parcellation, partitioning the brain into 282 subregions, and our model accounts for correlations between voxels from different brain regions defined by the parcellations. Our approach models the imaging data and uses posterior predictive probabilities to perform prediction. The estimates of our model parameters are based on samples drawn from the joint posterior distribution using Markov Chain Monte Carlo (MCMC) methods. We evaluate our method by examining the prediction accuracy rates based on leave-one-out cross validation, and we employ an importance sampling strategy to reduce the computation time. We conduct both whole-brain and voxel-level prediction and identify the brain regions that are highly associated with the disease based on the voxel-level prediction results. We apply our model to multimodal brain imaging data from a study of Parkinson's disease. We achieve extremely high accuracy, in general, and our model identifies key regions contributing to accurate prediction including caudate, putamen, and fusiform gyrus as well as several sensory system regions.
Affiliation(s)
- Wenqiong Xue
- Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
| | - F DuBois Bowman
- Department of Biostatistics, The Mailman School of Public Health, Columbia University, New York, NY, United States
| | - Jian Kang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, United States
| |
|
37
|
Kang J, Reich BJ, Staicu AM. Scalar-on-Image Regression via the Soft-Thresholded Gaussian Process. Biometrika 2018; 105:165-184. [PMID: 30686828 PMCID: PMC6345249 DOI: 10.1093/biomet/asx075] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 11/13/2022] Open
Abstract
This work concerns spatial variable selection for scalar-on-image regression. We propose a new class of Bayesian nonparametric models and develop an efficient posterior computational algorithm. The proposed soft-thresholded Gaussian process provides large prior support over the class of piecewise-smooth, sparse, and continuous spatially varying regression coefficient functions. In addition, under some mild regularity conditions, the soft-thresholded Gaussian process prior leads to posterior consistency for parameter estimation and variable selection for scalar-on-image regression, even when the number of predictors is larger than the sample size. The proposed method is compared to alternatives via simulation and applied to an electroencephalography study of alcoholism.
Affiliation(s)
- Jian Kang
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, U.S.A.
| | - Brian J Reich
- Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695, U.S.A.
| | - Ana-Maria Staicu
- Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695, U.S.A.
| |
|
38
|
Abstract
Traditional variable selection methods are compromised by overlooking useful information on covariates with similar functionality or spatial proximity, and by treating each covariate independently. Leveraging prior grouping information on covariates, we propose partition-based screening methods for ultrahigh-dimensional variables in the framework of generalized linear models. We show that partition-based screening exhibits the sure screening property with a vanishing false selection rate, and we propose a data-driven partition screening framework with unavailable or unreliable prior knowledge on covariate grouping and investigate its theoretical properties. We consider two special cases: correlation-guided partitioning and spatial location-guided partitioning. In the absence of a single partition, we propose a theoretically justified strategy for combining statistics from various partitioning methods. The utility of the proposed methods is demonstrated via simulation and analysis of functional neuroimaging data.
Affiliation(s)
- Jian Kang
- Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, Michigan 48109, U.S.A
| | - Hyokyoung G Hong
- Department of Statistics and Probability, Michigan State University, 619 Red Cedar Rd, East Lansing, Michigan 48823, U.S.A
| | - Yi Li
- Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, Michigan 48109, U.S.A
| |
|