1
Miranda MF. A canonical polyadic tensor basis for fast Bayesian estimation of multi-subject brain activation patterns. Front Neuroinform 2024; 18:1399391. [PMID: 39188665 PMCID: PMC11345152 DOI: 10.3389/fninf.2024.1399391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 07/29/2024] [Indexed: 08/28/2024]
Abstract
Task-evoked functional magnetic resonance imaging studies, such as the Human Connectome Project (HCP), are a powerful tool for exploring how brain activity is influenced by cognitive tasks like memory retention, decision-making, and language processing. A fast Bayesian function-on-scalar model is proposed for estimating population-level activation maps linked to the working memory task. The model is based on the canonical polyadic (CP) tensor decomposition of coefficient maps obtained for each subject. This decomposition effectively yields a tensor basis capable of extracting both common features and subject-specific features from the coefficient maps. These subject-specific features, in turn, are modeled as a function of covariates of interest using a Bayesian model that accounts for the correlation of the CP-extracted features. The dimensionality reduction achieved with the tensor basis allows for a fast MCMC estimation of population-level activation maps. This model is applied to one hundred unrelated subjects from the HCP dataset, yielding significant insights into brain signatures associated with working memory.
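As a rough illustration of the CP tensor basis idea in the abstract above, the following minimal sketch rebuilds a stack of coefficient maps from rank-R CP factors. It assumes the subject dimension is treated as one mode of the tensor; the sizes, the two spatial modes, and the names `cp_reconstruct`, `subject_features`, `basis_x`, and `basis_y` are hypothetical and are not taken from the paper.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from rank-R CP factors A (I x R), B (J x R), C (K x R)."""
    # Sum over r of the outer products A[:, r] o B[:, r] o C[:, r].
    return np.einsum("ir,jr,kr->ijk", A, B, C)

rng = np.random.default_rng(0)
n_subjects, n_x, n_y, rank = 5, 4, 3, 2  # hypothetical sizes

# Subject-mode factor: one low-dimensional feature vector per subject.
subject_features = rng.normal(size=(n_subjects, rank))
# Spatial-mode factors: shared across subjects, acting as the tensor basis.
basis_x = rng.normal(size=(n_x, rank))
basis_y = rng.normal(size=(n_y, rank))

# Each subject's map is a combination of the same rank-one spatial patterns.
coef_maps = cp_reconstruct(subject_features, basis_x, basis_y)
```

In this sketch only the `rank` entries per subject vary across subjects, which is the kind of dimensionality reduction that would let a downstream model work with a few features per subject instead of a full map.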
Affiliation(s)
- Michelle F. Miranda
- Department of Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
2
Chen M, Zhou Y. Causal mediation analysis with a three-dimensional image mediator. Stat Med 2024; 43:2869-2893. [PMID: 38733218 DOI: 10.1002/sim.10106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 03/20/2024] [Accepted: 04/29/2024] [Indexed: 05/13/2024]
Abstract
Causal mediation analysis is increasingly common in biology, psychology, epidemiology, and related fields. In particular, with the advent of the big data era, high-dimensional mediators are becoming more prevalent. In neuroscience, with the widespread application of magnetic resonance technology in brain imaging, studies that treat an image as a mediator have emerged. In this study, a novel causal mediation analysis method with a three-dimensional image mediator is proposed. We define the average causal effects under the potential outcome framework, explore several sufficient conditions for valid identification, and develop techniques for estimation and inference. To verify the effectiveness of the proposed method, a series of simulations under various scenarios is performed. Finally, the proposed method is applied to a study on the causal effect of mother's delivery mode on child's IQ development. It is found that cesarean section may have a negative effect on intellectual performance and that this effect is mediated by white matter development. Additional prospective and longitudinal studies may be necessary to validate these emerging findings.
Affiliation(s)
- Minghao Chen
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, People's Republic of China
- Yingchun Zhou
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, People's Republic of China
- Institute of Brain and Education Innovation, East China Normal University, Shanghai, People's Republic of China
3
Guha S, Rodriguez-Acosta J, Dinov ID. A Bayesian Multiplex Graph Classifier of Functional Brain Connectivity Across Diverse Tasks of Cognitive Control. Neuroinformatics 2024:10.1007/s12021-024-09670-w. [PMID: 38861097 DOI: 10.1007/s12021-024-09670-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/22/2024] [Indexed: 06/12/2024]
Abstract
This article seeks to investigate the impact of aging on functional connectivity across different cognitive control scenarios, particularly emphasizing the identification of brain regions significantly associated with early aging. By conceptualizing functional connectivity within each cognitive control scenario as a graph, with brain regions as nodes, the statistical challenge revolves around devising a regression framework to predict a binary scalar outcome (aging or normal) using multiple graph predictors. Popular regression methods utilizing multiplex graph predictors often face limitations in effectively harnessing information within and across graph layers, leading to potentially less accurate inference and prediction, especially for smaller sample sizes. To address this challenge, we propose the Bayesian Multiplex Graph Classifier (BMGC). Accounting for multiplex graph topology, our method models edge coefficients at each graph layer using bilinear interactions between the latent effects associated with the two nodes connected by the edge. This approach also employs a variable selection framework on node-specific latent effects from all graph layers to identify influential nodes linked to observed outcomes. Crucially, the proposed framework is computationally efficient and quantifies the uncertainty in node identification, coefficient estimation, and binary outcome prediction. BMGC outperforms alternative methods in terms of the aforementioned metrics in simulation studies. An additional BMGC validation was completed using an fMRI study of brain networks in adults. The proposed BMGC technique found that the sensory motor brain network obeys certain lateral symmetries, whereas the default mode network exhibits significant brain asymmetries associated with early aging.
Affiliation(s)
- Sharmistha Guha
- Department of Statistics, Texas A&M University, 3143 TAMU, College Station, 77843, TX, USA
- Jose Rodriguez-Acosta
- Department of Statistics, Texas A&M University, 3143 TAMU, College Station, 77843, TX, USA
- Ivo D Dinov
- Statistics Online Computational Resource, University of Michigan, 426 N. Ingalls St., Ann Arbor, 48109, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, 48109, MI, USA
4
Liu Z, Lee CY, Zhang H. Tensor quantile regression with low-rank tensor train estimation. Ann Appl Stat 2024; 18:1294-1318. [PMID: 38682044 PMCID: PMC11046526 DOI: 10.1214/23-aoas1835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2024]
Abstract
Neuroimaging studies often involve predicting a scalar outcome from an array of images, collectively called a tensor. The use of magnetic resonance imaging (MRI) provides a unique opportunity to investigate the structures of the brain. To learn the association between MRI images and human intelligence, we formulate a scalar-on-image quantile regression framework. However, the high dimensionality of the tensor makes estimating the coefficients for all elements computationally challenging. To address this, we propose a low-rank coefficient array estimation algorithm based on tensor train (TT) decomposition, which we demonstrate can effectively reduce the dimensionality of the coefficient tensor to a feasible level while ensuring adequacy to the data. Our method is more stable and efficient than the commonly used canonical polyadic (CP) rank approximation-based method. We also propose a generalized Lasso penalty on the coefficient tensor to take advantage of the spatial structure of the tensor, further reduce the dimensionality of the coefficient tensor, and improve the interpretability of the model. The consistency and asymptotic normality of the TT estimator are established under some mild conditions on the covariates and random errors in quantile regression models. The rate of convergence is obtained with regularization under the total variation penalty. Extensive numerical studies, including both synthetic and real MRI imaging data, are conducted to examine the empirical performance of the proposed method and its competitors.
Affiliation(s)
- Zihuan Liu
- Department of Biostatistics, Yale University
- Cheuk Yin Lee
- School of Science and Engineering, Chinese University of Hong Kong, Shenzhen
5
Xu T, Chen K, Li G. Tensor regression for incomplete observations with application to longitudinal studies. Ann Appl Stat 2024; 18:1195-1212. [PMID: 39360180 PMCID: PMC11446469 DOI: 10.1214/23-aoas1830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2024]
Abstract
Multivariate longitudinal data are frequently encountered in practice such as in our motivating longitudinal microbiome study. It is of general interest to associate such high-dimensional, longitudinal measures with some univariate continuous outcome. However, incomplete observations are common in a regular study design, as not all samples are measured at every time point, giving rise to the so-called blockwise missing values. Such missing structure imposes significant challenges for association analysis and defies many existing methods that require complete samples. In this paper we propose to represent multivariate longitudinal data as a three-way tensor array (i.e., sample-by-feature-by-time) and exploit a parsimonious scalar-on-tensor regression model for association analysis. We develop a regularized covariance-based estimation procedure that effectively leverages all available observations without imputation. The method achieves variable selection and smooth estimation of time-varying effects. The application to the motivating microbiome study reveals interesting links between the preterm infant's gut microbiome dynamics and their neurodevelopment. Additional numerical studies on synthetic data and a longitudinal aging study further demonstrate the efficacy of the proposed method.
Affiliation(s)
- Kun Chen
- Department of Statistics, University of Connecticut
- Gen Li
- Department of Biostatistics, University of Michigan, Ann Arbor
6
Brzyski D, Hu X, Goni J, Ances B, Randolph TW, Harezlak J. Matrix-Variate Regression for Sparse, Low-Rank Estimation of Brain Connectivities Associated With a Clinical Outcome. IEEE Trans Biomed Eng 2024; 71:1378-1390. [PMID: 37995175 PMCID: PMC11127715 DOI: 10.1109/tbme.2023.3336241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2023]
Abstract
OBJECTIVE We address the problem of finding brain connectivities that are associated with a clinical outcome or phenotype. METHODS The proposed framework regresses a (scalar) clinical outcome on matrix-variate predictors which arise in the form of brain connectivity matrices. For example, in a large cohort of subjects we estimate those regions of functional connectivities that are associated with neurocognitive scores. We approach this high-dimensional yet highly structured estimation problem by formulating a regularized estimation process that results in a low-rank coefficient matrix having a sparse set of nonzero entries which represent regions of biologically relevant connectivities. In contrast to the recent literature on estimating a sparse, low-rank matrix from a single noisy observation, our scalar-on-matrix regression framework produces a data-driven extraction of structures that are associated with a clinical response. The method, called Sparsity Inducing Nuclear-Norm Estimator (SpINNEr), simultaneously constrains the regression coefficient matrix in two ways: a nuclear norm penalty encourages low-rank structure while an l1 norm encourages entry-wise sparsity. RESULTS Our simulations show that SpINNEr outperforms other methods in estimation accuracy when the response-related entries (representing the brain's functional connectivity) are arranged in well-connected communities. SpINNEr is applied to investigate associations between HIV-related outcomes and functional connectivity in the human brain. CONCLUSION AND SIGNIFICANCE Overall, this work demonstrates the potential of SpINNEr to recover sparse and low-rank estimates under scalar-on-matrix regression framework.
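To make the two-penalty idea in the abstract above concrete, here is a minimal sketch of an objective that combines a nuclear norm penalty with an entry-wise l1 penalty in a scalar-on-matrix regression. The squared-error loss, the function name `spinner_like_objective`, the tuning parameters `lam_nuc` and `lam_l1`, and the simulated data are all assumptions for illustration; this is not the authors' implementation or their fitting algorithm.

```python
import numpy as np

def spinner_like_objective(B, A, y, lam_nuc, lam_l1):
    """Penalized least-squares objective for scalar-on-matrix regression.

    A: (n, p, p) stack of connectivity matrices, y: (n,) outcomes,
    B: (p, p) coefficient matrix. The squared-error loss is an assumption.
    """
    fitted = np.einsum("nij,ij->n", A, B)       # <A_i, B> for each subject
    loss = 0.5 * np.sum((y - fitted) ** 2)
    nuclear = np.linalg.norm(B, ord="nuc")      # encourages low rank
    l1 = np.abs(B).sum()                        # encourages entry-wise sparsity
    return loss + lam_nuc * nuclear + lam_l1 * l1

rng = np.random.default_rng(0)
n, p = 6, 4  # hypothetical sample size and number of brain regions
A = rng.normal(size=(n, p, p))
A = (A + A.transpose(0, 2, 1)) / 2              # symmetric, like connectivity matrices
y = rng.normal(size=n)

# At B = 0 both penalties vanish, so only the data-fit term remains.
value_at_zero = spinner_like_objective(np.zeros((p, p)), A, y, 1.0, 0.5)
```

Evaluating the objective is only the first step; minimizing it would need a solver suited to non-smooth penalties, which is outside this sketch.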
7
Niyogi PG, Lindquist MA, Maiti T. A tensor based varying-coefficient model for multi-modal neuroimaging data analysis. IEEE Trans Signal Process 2024; 72:1607-1619. [PMID: 39479188 PMCID: PMC11521373 DOI: 10.1109/tsp.2024.3375768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2024]
Abstract
All neuroimaging modalities have their own strengths and limitations. A current trend is toward interdisciplinary approaches that use multiple imaging methods to overcome limitations of each method in isolation. At the same time neuroimaging data is increasingly being combined with other non-imaging modalities, such as behavioral and genetic data. The data structure of many of these modalities can be expressed as time-varying multidimensional arrays (tensors), collected at different time-points on multiple subjects. Here, we consider a new approach for the study of neural correlates in the presence of tensor-valued brain images and tensor-valued covariates, where both data types are collected over the same set of time points. We propose a time-varying tensor regression model with an inherent structural composition of responses and covariates. Regression coefficients are expressed using the B-spline technique, and the basis function coefficients are estimated using CP-decomposition by minimizing a penalized loss function. We develop a varying-coefficient model for the tensor-valued regression model, where both covariates and responses are modeled as tensors. This development is a non-trivial extension of function-on-function concurrent linear models for complex and large structural data, where the inherent structures are preserved. In addition to the methodological and theoretical development, the efficacy of the proposed method based on both simulated and real data analysis (e.g., the combination of eye-tracking data and functional magnetic resonance imaging (fMRI) data) is also discussed.
Affiliation(s)
- Pratim Guha Niyogi
- Department of Biostatistics at Johns Hopkins Bloomberg School of Public Health
- Tapabrata Maiti
- Department of Statistics and Probability, Division of Mathematical Sciences, National Science Foundation (NSF)
8
Wang S, Constable T, Zhang H, Zhao Y. Heterogeneity Analysis on Multi-state Brain Functional Connectivity and Adolescent Neurocognition. J Am Stat Assoc 2024; 119:851-863. [PMID: 39371422 PMCID: PMC11451334 DOI: 10.1080/01621459.2024.2311363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Revised: 01/03/2024] [Accepted: 01/22/2024] [Indexed: 10/08/2024]
Abstract
Brain functional connectivity or connectome, a unique measure for brain functional organization, provides a great potential to explain the neurobiological underpinning of behavioral profiles. Existing connectome-based analyses highly concentrate on brain activities under a single cognitive state, and fail to consider heterogeneity when attempting to characterize brain-to-behavior relationships. In this work, we study the complex impact of multi-state functional connectivity on behaviors by analyzing the data from a recent landmark brain development and child health study. We propose a nonparametric, Bayesian supervised heterogeneity analysis to uncover neurodevelopmental subtypes with distinct effect mechanisms. We impose stochastic block structures to identify network-based functional phenotypes and develop a variational expectation-maximization algorithm to facilitate an efficient posterior computation. Through integrating resting-state and task-related functional connectomes, we dissect heterogeneous effect mechanisms on children's fluid intelligence from the functional network phenotypes including Fronto-parietal Network and Default Mode Network under different cognitive states. Based on extensive simulations, we further confirm the superior performance of our method on uncovering brain-to-behavior relationships.
Affiliation(s)
- Shiying Wang
- Department of Biostatistics, Yale University, New Haven, CT
- Todd Constable
- Department of Radiology & Biomedical Imaging, Yale University, New Haven, CT
- Heping Zhang
- Department of Biostatistics, Yale University, New Haven, CT
- Yize Zhao
- Department of Biostatistics, Yale University, New Haven, CT
9
Wang K, Xu Y. Bayesian tensor-on-tensor regression with efficient computation. Stat Interface 2024; 17:199-217. [PMID: 38469276 PMCID: PMC10927259 DOI: 10.4310/23-sii786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
We propose a Bayesian tensor-on-tensor regression approach to predict a multidimensional array (tensor) of arbitrary dimensions from another tensor of arbitrary dimensions, building upon the Tucker decomposition of the regression coefficient tensor. Traditional tensor regression methods making use of the Tucker decomposition either assume the dimension of the core tensor to be known or estimate it via cross-validation or some model selection criteria. However, no existing method can simultaneously estimate the model dimension (the dimension of the core tensor) and other model parameters. To fill this gap, we develop an efficient Markov Chain Monte Carlo (MCMC) algorithm to estimate both the model dimension and parameters for posterior inference. Besides the MCMC sampler, we also develop an ultra-fast optimization-based computing algorithm wherein the maximum a posteriori estimators for parameters are computed, and the model dimension is optimized via a simulated annealing algorithm. The proposed Bayesian framework provides a natural way for uncertainty quantification. Through extensive simulation studies, we evaluate the proposed Bayesian tensor-on-tensor regression model and show its superior performance compared to alternative methods. We also demonstrate its practical effectiveness by applying it to two real-world datasets, including facial imaging data and 3D motion data.
Affiliation(s)
- Kunbo Wang
- 3400 N. Charles Street, Baltimore, MD 21218
- Yanxun Xu
- 3400 N. Charles Street, Baltimore, MD 21218
10
Sanchez S DA, Guevara G RD, Calderón V SA. Comparison of two statistical methodologies for a binary classification problem of two-dimensional images. J Appl Stat 2023; 51:2279-2297. [PMID: 39267712 PMCID: PMC11389644 DOI: 10.1080/02664763.2023.2279012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2022] [Accepted: 10/29/2023] [Indexed: 09/15/2024]
Abstract
The present work intends to compare two statistical classification methods using images as covariates and under the comparison criterion of the ROC curve. The first implemented procedure is based on exploring a mathematical-statistical model using multidimensional arrangements, frequently known as tensors. It is based on the theoretical framework of the high-dimensional generalized linear model. The second methodology is situated in the field of functional data analysis, particularly in the space of functions that have a finite measure of the total variation. A simulation study is carried out to compare both classification methodologies using the area under the ROC curve (AUC). The model based on functional data had better performance than the tensor model. A real data application using medical images is presented.
Affiliation(s)
- Deniz A Sanchez S
- Facultad de Ciencias, Departamento de Estadística, Universidad Nacional de Colombia, Sede Bogotá, Bogotá, Colombia
- Rubén D Guevara G
- Facultad de Ciencias, Departamento de Estadística, Universidad Nacional de Colombia, Sede Bogotá, Bogotá, Colombia
- Sergio A Calderón V
- Facultad de Ciencias, Departamento de Estadística, Universidad Nacional de Colombia, Sede Bogotá, Bogotá, Colombia
11
Kim J, Sandri BJ, Rao RB, Lock EF. Bayesian predictive modeling of multi-source multi-way data. Comput Stat Data Anal 2023; 186:107783. [PMID: 37274461 PMCID: PMC10237362 DOI: 10.1016/j.csda.2023.107783] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/06/2023]
Abstract
A Bayesian approach to predict a continuous or binary outcome from data that are collected from multiple sources with a multi-way (i.e., multidimensional tensor) structure is described. As a motivating example, molecular data from multiple 'omics sources, each measured over multiple developmental time points, as predictors of early-life iron deficiency (ID) in a rhesus monkey model are considered. The method uses a linear model with a low-rank structure on the coefficients to capture multi-way dependence and model the variance of the coefficients separately across each source to infer their relative contributions. Conjugate priors facilitate an efficient Gibbs sampling algorithm for posterior inference, assuming a continuous outcome with normal errors or a binary outcome with a probit link. Simulations demonstrate that the model performs as expected in terms of misclassification rates and correlation of estimated coefficients with true coefficients, with large gains in performance by incorporating multi-way structure and modest gains when accounting for differing signal sizes across the different sources. Moreover, it provides robust classification of ID monkeys for the motivating application.
Affiliation(s)
- Jonathan Kim
- Division of Biostatistics, University of Minnesota, Minneapolis, 55455, USA
- Brian J. Sandri
- Division of Neonatology, Department of Pediatrics, University of Minnesota, Minneapolis, MN, USA
- Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, MN, USA
- Raghavendra B. Rao
- Division of Neonatology, Department of Pediatrics, University of Minnesota, Minneapolis, MN, USA
- Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, MN, USA
- Eric F. Lock
- Division of Biostatistics, University of Minnesota, Minneapolis, 55455, USA
12
Yao Y, Charkraborty D, Zhang L, Shen X, Pan W. Deep causal feature extraction and inference with neuroimaging genetic data. Stat Med 2023; 42:3665-3684. [PMID: 37336556 PMCID: PMC11193942 DOI: 10.1002/sim.9824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Revised: 04/04/2023] [Accepted: 05/29/2023] [Indexed: 06/21/2023]
Abstract
Alzheimer's disease (AD) is a severe public health issue in the world. Magnetic Resonance Imaging (MRI) offers a way to study brain differences between AD patients and healthy individuals through feature extraction and comparison. However, in most previous works, the extracted features were not aimed to be causal, hindering biological understanding and interpretation. In order to extract causal features, we propose using instrumental variable (IV) regression with genetic variants as IVs. Specifically, we propose Deep Feature Extraction via Instrumental Variable Regression (DeepFEIVR), which uses a nonlinear neural network to extract causal features from three-dimensional neuroimages to predict an outcome (eg, AD status in our application) while maintaining a linear relationship between the extracted features and IVs. DeepFEIVR not only can handle high dimensional individual-level data for model building, but also is applicable to GWAS summary data to test associations of the extracted features with the outcome in subsequent analysis. In addition, we propose an extension of DeepFEIVR, called DeepFEIVR-CA, for covariate adjustment (CA). We apply DeepFEIVR and DeepFEIVR-CA to the Alzheimer's Disease Neuroimaging Initiative (ADNI) individual-level data as training data for model building, then apply to the UK Biobank neuroimaging and the International Genomics of Alzheimer's Project (IGAP) AD GWAS summary data, showcasing how the extracted causal features are related to AD and various brain endophenotypes.
Affiliation(s)
- Yuchen Yao
- School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA
- Dipnil Charkraborty
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
- Lin Zhang
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
- Xiaotong Shen
- School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA
- Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
13
Zhang Y, Zhang X, Zhang H, Liu A, Liu CC. Low-rank latent matrix-factor prediction modeling for generalized high-dimensional matrix-variate regression. Stat Med 2023; 42:3616-3635. [PMID: 37314066 DOI: 10.1002/sim.9821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 04/19/2023] [Accepted: 06/01/2023] [Indexed: 06/15/2023]
Abstract
Motivated by diagnosing the COVID-19 disease using two-dimensional (2D) image biomarkers from computed tomography (CT) scans, we propose a novel latent matrix-factor regression model to predict responses that may come from an exponential distribution family, where covariates include high-dimensional matrix-variate biomarkers. A latent generalized matrix regression (LaGMaR) is formulated, where the latent predictor is a low-dimensional matrix factor score extracted from the low-rank signal of the matrix variate through a cutting-edge matrix factor model. Unlike the common approach in the literature of penalizing the vectorized covariate, which requires tuning parameters, our prediction modeling in LaGMaR conducts dimension reduction that respects the geometric characteristic of the intrinsic 2D structure of the matrix covariate and thus avoids iteration. This greatly relieves the computation burden, and meanwhile maintains structural information so that the latent matrix factor feature can perfectly replace the intractable matrix-variate owing to high-dimensionality. The estimation procedure of LaGMaR is subtly derived by transforming the bilinear form matrix factor model onto a high-dimensional vector factor model, so that the method of principal components can be applied. We establish bilinear-form consistency of the estimated matrix coefficient of the latent predictor and consistency of prediction. The proposed approach can be implemented conveniently. Through simulation experiments, the prediction capability of LaGMaR is shown to outperform some existing penalized methods under diverse scenarios of generalized matrix regressions. Through the application to a real COVID-19 dataset, the proposed approach is shown to predict COVID-19 efficiently.
Affiliation(s)
- Yuzhe Zhang
- School of Management, University of Science and Technology of China, Hefei, Anhui, China
- Xu Zhang
- School of Mathematical Sciences, South China Normal University, Guangzhou, Guangdong, China
- Hong Zhang
- School of Management, University of Science and Technology of China, Hefei, Anhui, China
- Aiyi Liu
- National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, USA
- Catherine C Liu
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Hong Kong SAR
14
Wang JX, Li Y, Reddick WE, Conklin HM, Glass JO, Onar-Thomas A, Gajjar A, Cheng C, Lu ZH. A high-dimensional mediation model for a neuroimaging mediator: Integrating clinical, neuroimaging, and neurocognitive data to mitigate late effects in pediatric cancer. Biometrics 2023; 79:2430-2443. [PMID: 35962595 DOI: 10.1111/biom.13729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2021] [Accepted: 07/06/2022] [Indexed: 11/30/2022]
Abstract
Pediatric cancer treatment, especially for brain tumors, can have profound and complicated late effects. With the survival rates increasing because of improved detection and treatment, a more comprehensive understanding of the impact of current treatments on neurocognitive function and brain structure is critically needed. A frontline medulloblastoma clinical trial (SJMB03) has collected data, including treatment, clinical, neuroimaging, and cognitive variables. Advanced methods for modeling and integrating these data are critically needed to understand the mediation pathway from the treatment through brain structure to neurocognitive outcomes. We propose an integrative Bayesian mediation analysis approach to model jointly a treatment exposure, a high-dimensional structural neuroimaging mediator, and a neurocognitive outcome and to uncover the mediation pathway. The high-dimensional imaging-related coefficients are modeled via a binary Ising-Gaussian Markov random field prior (BI-GMRF), addressing the sparsity, spatial dependency, and smoothness and increasing the power to detect brain regions with mediation effects. Numerical simulations demonstrate the estimation accuracy, power, and robustness. For the SJMB03 study, the BI-GMRF method has identified white matter microstructure that is damaged by cancer-directed treatment and impacts late neurocognitive outcomes. The results provide guidance on improving treatment planning to minimize long-term cognitive sequelae for pediatric brain tumor patients.
Collapse
Affiliation(s)
- Jade Xiaoqing Wang
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
| | - Yimei Li
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
| | - Wilburn E Reddick
- Department of Diagnostic Imaging, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
| | - Heather M Conklin
- Department of Psychology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
| | - John O Glass
- Department of Diagnostic Imaging, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
| | - Arzu Onar-Thomas
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
| | - Amar Gajjar
- Department of Pediatric Medicine, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Department of Oncology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
| | - Cheng Cheng
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
| | - Zhao-Hua Lu
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
| |
|
15
|
Wei B, Peng L, Guo Y, Manatunga A, Stevens J. Tensor response quantile regression with neuroimaging data. Biometrics 2023; 79:1947-1958. [PMID: 36482808 PMCID: PMC10250564 DOI: 10.1111/biom.13809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2022] [Accepted: 11/25/2022] [Indexed: 12/14/2022]
Abstract
Collecting neuroimaging data in the form of tensors (i.e. multidimensional arrays) has become more common in mental health studies, driven by an increasing interest in studying the associations between neuroimaging phenotypes and clinical disease manifestation. Motivated by a neuroimaging study of post-traumatic stress disorder (PTSD) from the Grady Trauma Project, we study a tensor response quantile regression framework, which enables novel analyses that confer a detailed view of the potentially heterogeneous association between a neuroimaging phenotype and relevant clinical predictors. We adopt a sensible low-rank structure to represent the association of interest, and propose a simple two-step estimation procedure which is easy to implement with existing software. We provide rigorous theoretical justifications for the intuitive two-step procedure. Simulation studies demonstrate good performance of the proposed method with realistic sample sizes in neuroimaging studies. We conduct the proposed tensor response quantile regression analysis of the motivating PTSD study to investigate the association between fMRI resting-state functional connectivity and PTSD symptom severity. Our results uncover non-homogeneous effects of PTSD symptoms on brain functional connectivity, which cannot be captured by existing tensor response methods.
Affiliation(s)
- Bo Wei
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, U.S.A
| | - Limin Peng
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, U.S.A
| | - Ying Guo
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, U.S.A
| | - Amita Manatunga
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, U.S.A
| | - Jennifer Stevens
- Department of Psychiatry and Behavioral Sciences, Emory University, Atlanta, GA, 30322, U.S.A
| |
|
16
|
Yan X, Yu J, Ding W, Wang H, Zhao P. A novel two-way functional linear model with applications in human mortality data analysis. J Appl Stat 2023; 51:2025-2038. [PMID: 39071246 PMCID: PMC11271083 DOI: 10.1080/02664763.2023.2253379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 08/15/2023] [Indexed: 07/30/2024]
Abstract
Recently, two-way or longitudinal functional data analysis has attracted much attention in many fields. However, little is known about how to appropriately characterize the association between a two-way functional predictor and a scalar response. Motivated by a mortality study, in this paper we propose a novel two-way functional linear model, where the response is a scalar and the functional predictor is a two-way trajectory. The model is intuitive and interpretable, and it naturally captures the relationship between each way of the two-way functional predictor and the scalar response. Further, we develop a new estimation method to estimate the regression functions in the framework of weak separability. The main technical tools for the construction of the regression functions are product functional principal component analysis and an iterative least squares procedure. The solid performance of our method is demonstrated in extensive simulation studies. We also analyze the mortality dataset to illustrate the usefulness of the proposed procedure.
Affiliation(s)
- Xingyu Yan
- School of Mathematics and Statistics and RIMS, Jiangsu Provincial Key Laboratory of Educational Big Data Science and Engineering, Jiangsu Normal University, Xuzhou, Jiangsu, People's Republic of China
| | - Jiaqian Yu
- School of Mathematics and Statistics and RIMS, Jiangsu Provincial Key Laboratory of Educational Big Data Science and Engineering, Jiangsu Normal University, Xuzhou, Jiangsu, People's Republic of China
| | - Weiyong Ding
- School of Mathematics and Statistics and RIMS, Jiangsu Provincial Key Laboratory of Educational Big Data Science and Engineering, Jiangsu Normal University, Xuzhou, Jiangsu, People's Republic of China
| | - Hao Wang
- School of Mathematics and Statistics, Anhui Normal University, Wuhu, People's Republic of China
| | - Peng Zhao
- School of Mathematics and Statistics and RIMS, Jiangsu Provincial Key Laboratory of Educational Big Data Science and Engineering, Jiangsu Normal University, Xuzhou, Jiangsu, People's Republic of China
| |
|
17
|
Lee I, Sinha D, Mai Q, Zhang X, Bandyopadhyay D. Bayesian regression analysis of skewed tensor responses. Biometrics 2023; 79:1814-1825. [PMID: 35983634 DOI: 10.1111/biom.13743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2021] [Accepted: 08/10/2022] [Indexed: 11/30/2022]
Abstract
Tensor regression analysis is finding vast emerging applications in a variety of clinical settings, including neuroimaging, genomics, and dental medicine. The motivation for this paper is a study of periodontal disease (PD) with an order-3 tensor response: multiple biomarkers measured at prespecified tooth-sites within each tooth, for each participant. A careful investigation would reveal considerable skewness in the responses, in addition to response missingness. To mitigate the shortcomings of existing analysis tools, we propose a new Bayesian tensor response regression method that facilitates interpretation of covariate effects on both marginal and joint distributions of highly skewed tensor responses, and accommodates missing-at-random responses under a closure property of our tensor model. Furthermore, we present a prudent evaluation of the overall covariate effects while identifying their possible variations on only a sparse subset of the tensor components. Our method promises Markov chain Monte Carlo (MCMC) tools that are readily implementable. We illustrate substantial advantages of our proposal over existing methods via simulation studies and application to a real data set derived from a clinical study of PD. The R package BSTN available in GitHub implements our model.
Affiliation(s)
- Inkoo Lee
- Department of Statistics, Rice University, Houston, Texas, USA
| | - Debajyoti Sinha
- Department of Statistics, Florida State University, Tallahassee, Florida, USA
| | - Qing Mai
- Department of Statistics, Florida State University, Tallahassee, Florida, USA
| | - Xin Zhang
- Department of Statistics, Florida State University, Tallahassee, Florida, USA
| | | |
|
18
|
Wang Y, Guo Y. LOCUS: A Regularized Blind Source Separation Method with Low-Rank Structure for Investigating Brain Connectivity. Ann Appl Stat 2023; 17:1307-1332. [PMID: 39040949 PMCID: PMC11262594 DOI: 10.1214/22-aoas1670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2024]
Abstract
Network-oriented research has become increasingly popular in many scientific areas. In neuroscience research, imaging-based network connectivity measures have become key to understanding brain organization, potentially serving as individual neural fingerprints. There are major challenges in analyzing connectivity matrices, including the high dimensionality of brain networks, unknown latent sources underlying the observed connectivity, and the large number of brain connections leading to spurious findings. In this paper we propose a novel blind source separation method with low-rank structure and uniform sparsity (LOCUS) as a fully data-driven decomposition method for network measures. Compared with the existing method, which vectorizes connectivity matrices and ignores brain network topology, LOCUS achieves more efficient and accurate source separation for connectivity matrices using low-rank structure. We propose a novel angle-based uniform sparsity regularization that demonstrates better performance than the existing sparsity controls for low-rank tensor methods. We propose a highly efficient iterative node-rotation algorithm that exploits the block multiconvexity of the objective function to solve the nonconvex optimization problem for learning LOCUS. We illustrate the advantage of LOCUS through extensive simulation studies. Application of LOCUS to the Philadelphia Neurodevelopmental Cohort neuroimaging study reveals biologically insightful connectivity traits that are not found using the existing method.
Affiliation(s)
- Yikai Wang
- Department of Biostatistics and Bioinformatics, Emory University
| | - Ying Guo
- Department of Biostatistics and Bioinformatics, Emory University
| |
|
19
|
Zhang N, Liu Y, Yang J. Contextual tensor decomposition by projected alternating least squares. COMMUN STAT-SIMUL C 2023. [DOI: 10.1080/03610918.2023.2196748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
Affiliation(s)
- Nan Zhang
- School of Data Science, Fudan University, Shanghai, China
| | - Yanshuo Liu
- School of Data Science, Fudan University, Shanghai, China
| | - Jichen Yang
- Wanjia Asset Management Co., Ltd, Shanghai, China
| |
|
20
|
Chen S, He K, He S, Ni Y, Wong RKW. Bayesian Nonlinear Tensor Regression with Functional Fused Elastic Net Prior. Technometrics 2023. [DOI: 10.1080/00401706.2023.2197471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/01/2023]
|
21
|
Statistical inference on the significance of rows and columns for matrix-valued data in an additive model. TEST-SPAIN 2023. [DOI: 10.1007/s11749-023-00852-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023]
|
22
|
Latimer KW, Freedman DJ. Low-dimensional encoding of decisions in parietal cortex reflects long-term training history. Nat Commun 2023; 14:1010. [PMID: 36823109 PMCID: PMC9950136 DOI: 10.1038/s41467-023-36554-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2021] [Accepted: 02/07/2023] [Indexed: 02/25/2023] Open
Abstract
Neurons in parietal cortex exhibit task-related activity during decision-making tasks. However, it remains unclear how long-term training to perform different tasks over months or even years shapes neural computations and representations. We examine lateral intraparietal area (LIP) responses during a visual motion delayed-match-to-category task. We consider two pairs of male macaque monkeys with different training histories: one trained only on the categorization task, and another first trained to perform fine motion-direction discrimination (i.e., pretrained). We introduce a novel analytical approach (generalized multilinear models) to quantify low-dimensional, task-relevant components in population activity. During the categorization task, we found stronger cosine-like motion-direction tuning in the pretrained monkeys than in the category-only monkeys, and that the pretrained monkeys' performance depended more heavily on fine discrimination between sample and test stimuli. These results suggest that sensory representations in LIP depend on the sequence of tasks that the animals have learned, underscoring the importance of considering training history in studies with complex behavioral tasks.
Affiliation(s)
- Kenneth W Latimer
- Department of Neurobiology, University of Chicago, Chicago, IL, USA.
| | - David J Freedman
- Department of Neurobiology, University of Chicago, Chicago, IL, USA
| |
|
23
|
Llosa-Vite C, Maitra R. Reduced-Rank Tensor-on-Tensor Regression and Tensor-Variate Analysis of Variance. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:2282-2296. [PMID: 35380954 DOI: 10.1109/tpami.2022.3164836] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
Fitting regression models with many multivariate responses and covariates can be challenging, but such responses and covariates sometimes have tensor-variate structure. We extend the classical multivariate regression model to exploit such structure in two ways. First, we impose four types of low-rank tensor formats on the regression coefficients. Second, we model the errors using the tensor-variate normal distribution, which imposes a Kronecker separable format on the covariance matrix. We obtain maximum likelihood estimators via block-relaxation algorithms and derive their computational complexity and asymptotic distributions. Our regression framework enables us to formulate tensor-variate analysis of variance (TANOVA) methodology. This methodology, when applied in a one-way TANOVA layout, enables us to identify cerebral regions significantly associated with the interaction of suicide attempters or non-attempter ideators and positive-, negative- or death-connoting words in a functional Magnetic Resonance Imaging study. Another application uses three-way TANOVA on the Labeled Faces in the Wild image dataset to distinguish facial characteristics related to ethnic origin, age group and gender. An R package, totr, implements the methodology.
|
24
|
Ke B, Zhao W, Wang L. Smoothed tensor quantile regression estimation for longitudinal data. Comput Stat Data Anal 2023. [DOI: 10.1016/j.csda.2022.107609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
25
|
Zhen Z, Paynabar K, Shi J. Image-Based Feedback Control Using Tensor Analysis. Technometrics 2022. [DOI: 10.1080/00401706.2022.2157880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
Affiliation(s)
- Zhong Zhen
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Tech, Atlanta, GA, 30032
| | - Kamran Paynabar
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Tech, Atlanta, GA, 30032
| | - Jianjun (Jan) Shi
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Tech, Atlanta, GA, 30032
| |
|
26
|
Weaver C, Xiao L, Lindquist MA. Single-index models with functional connectivity network predictors. Biostatistics 2022; 24:52-67. [PMID: 33948617 PMCID: PMC9748592 DOI: 10.1093/biostatistics/kxab015] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/29/2020] [Revised: 03/22/2021] [Accepted: 03/25/2021] [Indexed: 12/16/2022] Open
Abstract
Functional connectivity is defined as the undirected association between two or more functional magnetic resonance imaging (fMRI) time series. Increasingly, subject-level functional connectivity data have been used to predict and classify clinical outcomes and subject attributes. We propose a single-index model wherein response variables and sparse functional connectivity network-valued predictors are linked by an unspecified smooth function in order to accommodate potentially nonlinear relationships. We exploit the network structure of functional connectivity by imposing meaningful sparsity constraints, which lead not only to the identification of interactions between regions that are associated with the response but also to the assessment of whether the functional connectivity associated with a brain region is related to the response variable. We demonstrate the effectiveness of the proposed model in simulation studies and in an application to a resting-state fMRI data set from the Human Connectome Project to model fluid intelligence and sex and to identify predictive links between brain regions.
Affiliation(s)
- Caleb Weaver
- Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, NC 27606, USA
| | - Luo Xiao
- Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, NC 27606, USA
| | - Martin A Lindquist
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe Street, Baltimore, MD 21205, USA
| |
|
27
|
Abstract
Neuroimaging studies show a growing interest in learning the association between individual brain connectivity networks and clinical characteristics. It is also of great interest to identify brain sub-networks as biomarkers to predict clinical symptoms, such as disease status, potentially providing insight into neuropathology. This motivates the need for a new type of regression model in which the response variable is scalar and the predictors are networks, typically represented as adjacency matrices or weighted adjacency matrices; we refer to this as scalar-on-network regression. In this work, we develop a new boosting method for model fitting with sub-network marker selection. Our approach, as opposed to group lasso or other existing regularization methods, is essentially a gradient descent algorithm leveraging known network structure. We demonstrate the utility of our methods via simulation studies and analysis of the resting-state fMRI data in a cognitive developmental cohort study.
Affiliation(s)
| | - Kevin He
- University of Michigan, Department of Biostatistics
| | - Jian Kang
- University of Michigan, Department of Biostatistics
| |
|
28
|
Wu Y, Chen D, Li C, Tang N. Bayesian tensor logistic regression with applications to neuroimaging data analysis of Alzheimer's disease. Stat Methods Med Res 2022; 31:2368-2382. [PMID: 36154344 DOI: 10.1177/09622802221122409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
Alzheimer's disease (AD) can be diagnosed by fitting traditional logistic regression models to magnetic resonance imaging (MRI) data of the brain, which are regarded as a vector of covariates. However, parameter estimation is inefficient and computationally expensive because of the ultrahigh dimensionality and complicated structure of MRI data. To overcome this deficiency, this paper proposes a tensor logistic regression model (TLRM) for AD MRI data that treats the MRI tensor as covariates. Under this framework, a tensor candecomp/parafac (CP) decomposition is employed to reduce the ultrahigh-dimensional tensor to a high-dimensional level, and a novel Bayesian adaptive Lasso method is developed to simultaneously select important components of the tensor and estimate model parameters. The method incorporates the Pólya-Gamma method, which leads to a closed-form likelihood and avoids the Metropolis-Hastings algorithm, together with a Gibbs sampler in Markov chain Monte Carlo (MCMC). A tensor product technique is used to optimize the computation and speed up MCMC. The Bayes factor, together with the path sampling approach, is used to select the tensor rank in the CP decomposition. The effectiveness of the proposed method is illustrated in simulation studies and an MRI data analysis.
Affiliation(s)
- Ying Wu
- Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, Yunnan, China
| | - Dan Chen
- Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, Yunnan, China
| | - Chaoqian Li
- Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, Yunnan, China
| | - Niansheng Tang
- Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, Yunnan, China
| |
|
29
|
Han Y, Chen R, Zhang CH, Yao Q. Simultaneous Decorrelation of Matrix Time Series. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2151448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]
Affiliation(s)
- Yuefeng Han
- Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN
| | | | - Qiwei Yao
- Department of Statistics, London School of Economics, London, U.K
| |
|
30
|
Cheng Z, Xu X, Song Z, Zhao W. Randomized algorithms for tensor response regression. Stat Anal Data Min 2022. [DOI: 10.1002/sam.11603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Affiliation(s)
- Zhe Cheng
- School of Sciences, Nantong University, Nantong, China
| | - Xiangjian Xu
- School of Sciences, Nantong University, Nantong, China
| | - Zihao Song
- School of Sciences, Nantong University, Nantong, China
| | - Weihua Zhao
- School of Sciences, Nantong University, Nantong, China
| |
|
31
|
Han R, Luo Y, Wang M, Zhang AR. Exact clustering in tensor block model: Statistical optimality and computational limit. J R Stat Soc Series B Stat Methodol 2022. [DOI: 10.1111/rssb.12547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Affiliation(s)
- Rungang Han
- Department of Statistics, University of Wisconsin‐Madison, Madison, WI, USA
- Department of Statistical Science, Duke University, Durham, NC, USA
| | - Yuetian Luo
- Department of Statistics, University of Wisconsin‐Madison, Madison, WI, USA
| | - Miaoyan Wang
- Department of Statistics, University of Wisconsin‐Madison, Madison, WI, USA
| | - Anru R. Zhang
- Department of Statistics, University of Wisconsin‐Madison, Madison, WI, USA
- Department of Statistical Science, Duke University, Durham, NC, USA
- Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, USA
- Department of Computer Science, Duke University, Durham, NC, USA
- Department of Mathematics, Duke University, Durham, NC, USA
| |
|
32
|
Haliassos A, Konstantinidis K, Mandic DP. Supervised Learning for Nonsequential Data: A Canonical Polyadic Decomposition Approach. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:5162-5176. [PMID: 33822727 DOI: 10.1109/tnnls.2021.3069399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Efficient modeling of feature interactions underpins supervised learning for nonsequential tasks, characterized by a lack of inherent ordering of features (variables). The brute force approach of learning a parameter for each interaction of every order comes at an exponential computational and memory cost (curse of dimensionality). To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor, the order of which is equal to the number of features; for efficiency, it can be further factorized into a compact tensor train (TT) format. However, both TT and other tensor networks (TNs), such as tensor ring and hierarchical Tucker, are sensitive to the ordering of their indices (and hence to the features). To establish the desired invariance to feature ordering, we propose to represent the weight tensor through the canonical polyadic (CP) decomposition (CPD) and introduce the associated inference and learning algorithms, including suitable regularization and initialization schemes. It is demonstrated that the proposed CP-based predictor significantly outperforms other TN-based predictors on sparse data while exhibiting comparable performance on dense nonsequential tasks. Furthermore, for enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors. In conjunction with feature vector normalization, this is shown to yield dramatic improvements in performance for dense nonsequential tasks, matching models such as fully connected neural networks.
|
33
|
Towle-Miller LM, Miecznikowski JC. MOSCATO: a supervised approach for analyzing multi-Omic single-Cell data. BMC Genomics 2022; 23:557. [PMID: 35927608 PMCID: PMC9351124 DOI: 10.1186/s12864-022-08759-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2021] [Accepted: 07/13/2022] [Indexed: 11/17/2022] Open
Abstract
Background: Advancements in genomic sequencing continually improve personalized medicine, and recent breakthroughs generate multimodal data on a cellular level. We introduce MOSCATO, a technique for selecting features across multimodal single-cell datasets that relate to clinical outcomes. We summarize the single-cell data using tensors and perform regularized tensor regression to return clinically-associated variable sets for each 'omic' type. Results: Robustness was assessed over simulations based on available single-cell simulation methods, and applicability was assessed through an example using CITE-seq data to detect genes associated with leukemia. We find that MOSCATO performs favorably in selecting network features and is also applicable to real multimodal single-cell data. Conclusions: MOSCATO is a useful analytical technique for supervised feature selection in multimodal single-cell data. The flexibility of our approach enables future extensions on distributional assumptions and covariate adjustments.
|
34
|
Billio M, Casarin R, Iacopini M. Bayesian Markov-Switching Tensor Regression for Time-Varying Networks. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2102502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
Affiliation(s)
- Monica Billio
- Department of Economics, Ca’ Foscari University of Venice
| | | | - Matteo Iacopini
- Department of Econometrics and Data Science, Vrije Universiteit Amsterdam
| |
|
35
|
Kang K, Song X. Joint Modeling of Longitudinal Imaging and Survival Data. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2102027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Kai Kang
- Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China
| | - Xinyuan Song
- Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China
| |
|
36
|
Bobrovnikov M, Chai JT, Dinov ID. Interactive Visualization and Computation of 2D and 3D Probability Distributions. SN COMPUTER SCIENCE 2022; 3:327. [PMID: 37483660 PMCID: PMC10361712 DOI: 10.1007/s42979-022-01206-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 05/13/2022] [Indexed: 07/25/2023]
Abstract
Purpose: Mathematical modeling, probability estimation, and statistical inference represent core elements of modern artificial intelligence (AI) approaches for data-driven prediction, forecasting, classification, risk-estimation, and prognosis. Currently there are many tools that help calculate and visualize univariate probability distributions; however, very few resources venture beyond into multivariate distributions, which are commonly used in advanced statistical inference and AI decision-making. This article presents a new web-calculator that enables some calculation and visualization of bivariate and trivariate probability distributions. Methods: Several methods are explored to compute the joint bivariate and trivariate probability densities, including the optimal multivariate modeling using Gaussian copula. We developed an interactive webapp to visually illustrate the parallels between the mathematical formulation, computational implementation, and graphical depiction of multivariate probability density and cumulative distribution functions. To ensure the interface and functionality are hardware platform independent, scalable, and functional, the app and its component widgets are implemented using HTML5 and JavaScript. Results: We validated the webapp by testing the multivariate copula models under different experimental conditions and inspecting the performance in terms of accuracy and reliability of the estimated multivariate probability densities and distribution function values. Conclusion: This article demonstrates the construction, implementation, and utilization of multivariate probability calculators. The proposed webapp implementation is freely available online (https://socr.umich.edu/HTML5/BivariateNormal/BVN2/) and can be used to assist with education and research of a diverse array of data scientists, STEM instructors, and AI learners.
Affiliation(s)
- Mark Bobrovnikov
- Statistics Online Computational Resource (SOCR), University of Michigan, Ann Arbor, MI 48109, USA; https://socr.umich.edu
| | - Jared Tianyi Chai
- Statistics Online Computational Resource (SOCR), University of Michigan, Ann Arbor, MI 48109, USA; https://socr.umich.edu
| | - Ivo D. Dinov
- Statistics Online Computational Resource (SOCR), University of Michigan, Ann Arbor, MI 48109, USA; https://socr.umich.edu
| |
|
37
|
Zhou Y, Zhang AR, Zheng L, Wang Y. Optimal High-order Tensor SVD via Tensor-Train Orthogonal Iteration. IEEE TRANSACTIONS ON INFORMATION THEORY 2022; 68:3991-4019. [PMID: 36274655 PMCID: PMC9585995 DOI: 10.1109/tit.2022.3152733] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/03/2023]
Abstract
This paper studies a general framework for high-order tensor SVD. We propose a new computationally efficient algorithm, tensor-train orthogonal iteration (TTOI), that aims to estimate the low tensor-train rank structure from the noisy high-order tensor observation. The proposed TTOI consists of initialization via TT-SVD [1] and new iterative backward/forward updates. We develop the general upper bound on estimation error for TTOI with the support of several new representation lemmas on tensor matricizations. By developing a matching information-theoretic lower bound, we also prove that TTOI achieves the minimax optimality under the spiked tensor model. The merits of the proposed TTOI are illustrated through applications to estimation and dimension reduction of high-order Markov processes, numerical studies, and a real data example on New York City taxi travel records. The software of the proposed algorithm is available online (https://github.com/Lili-Zheng-stat/TTOI).
Affiliation(s)
- Yuchen Zhou
  - Department of Statistics and Data Science, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104, USA
- Anru R Zhang
  - Departments of Biostatistics & Bioinformatics, Computer Science, Mathematics, and Statistical Science, Duke University, Durham, NC 27710, USA
- Lili Zheng
  - Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005, USA
- Yazhen Wang
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
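The abstract above says that TTOI is initialized via TT-SVD. The following is a minimal NumPy sketch of a standard sequential-SVD form of TT-SVD only; it does not implement the paper's backward/forward updates and is not taken from the authors' repository. The names `tt_svd` and `tt_to_full`, the core layout `(r_prev, n_k, r_k)`, and the example dimensions and ranks are all assumptions made for illustration.

```python
import numpy as np


def tt_svd(tensor, ranks):
    """Sequential truncated SVDs producing tensor-train cores.

    ranks holds the d-1 target TT ranks; each core has shape (r_prev, n_k, r_k).
    """
    dims = tensor.shape
    d = len(dims)
    cores = []
    r_prev = 1
    mat = tensor.reshape(r_prev * dims[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(ranks[k], len(s))
        # Left singular vectors become the k-th core.
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        # Carry the remainder (singular values times right vectors) forward.
        mat = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores


def tt_to_full(cores):
    """Contract TT cores back into a full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape([c.shape[1] for c in cores])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Build a tensor with exact TT ranks (2, 3), then recover it.
    true_cores = [rng.standard_normal(s) for s in [(1, 4, 2), (2, 5, 3), (3, 6, 1)]]
    X = tt_to_full(true_cores)
    X_hat = tt_to_full(tt_svd(X, ranks=[2, 3]))
    print(np.allclose(X, X_hat))
```

For a noiseless tensor whose TT ranks match the requested ranks, the reconstruction should match the input up to floating-point error; with noisy observations the result is only an initial estimate, which is where iterative refinements such as the one the abstract describes come in.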
|
38
|
Li P, Sofuoglu SE, Aviyente S, Maiti T. Coupled support tensor machine classification for multimodal neuroimaging data. Stat Anal Data Min 2022. [DOI: 10.1002/sam.11587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Peide Li
  - Boehringer Ingelheim Pharmaceuticals, Duluth, Georgia, USA
- Selin Aviyente
  - College of Engineering, Michigan State University, East Lansing, Michigan, USA
- Tapabrata Maiti
  - College of Natural Science, Michigan State University, East Lansing, Michigan, USA
|
39
|
Cai JF, Li J, Xia D. Generalized Low-rank plus Sparse Tensor Estimation by Fast Riemannian Optimization. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2063131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Jian-Feng Cai
  - Department of Mathematics, Hong Kong University of Science and Technology
- Jingyang Li
  - Department of Mathematics, Hong Kong University of Science and Technology
- Dong Xia
  - Department of Mathematics, Hong Kong University of Science and Technology
|
40
|
Ibriga HS, Sun WW. Covariate-assisted Sparse Tensor Completion. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2066537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
41
|
Li Y, Zhu R, Yeh M, Qu A. Dermoscopic Image Classification with Neural Style Transfer. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2061496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Ruoqing Zhu
  - Department of Statistics, University of Illinois at Urbana-Champaign
- Annie Qu
  - Department of Statistics, University of California, Irvine
|
42
|
Xia D, Zhang AR, Zhou Y. Inference for low-rank tensors—no need to debias. Ann Stat 2022. [DOI: 10.1214/21-aos2146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2023]
Affiliation(s)
- Dong Xia
  - Department of Mathematics, Hong Kong University of Science and Technology
- Anru R. Zhang
  - Departments of Biostatistics & Bioinformatics, Computer Science, Mathematics, and Statistical Science, Duke University
- Yuchen Zhou
  - Department of Statistics and Data Science, The Wharton School, University of Pennsylvania
|
43
|
Li L, Zeng J, Zhang X. Generalized Liquid Association Analysis for Multimodal Data Integration. J Am Stat Assoc 2022; 118:1984-1996. [PMID: 38099062 PMCID: PMC10720690 DOI: 10.1080/01621459.2021.2024437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2021] [Accepted: 12/27/2021] [Indexed: 10/19/2022]
Abstract
Multimodal data are now prevalent in scientific research. One of the central questions in multimodal integrative analysis is to understand how two data modalities associate and interact with each other given another modality or demographic variables. The problem can be formulated as studying the associations among three sets of random variables, a question that has received relatively little attention in the literature. In this article, we propose a novel generalized liquid association analysis method, which offers a new angle on this important class of problems of studying three-way associations. We extend the notion of liquid association of Li (2002) from the univariate setting to the sparse, multivariate, and high-dimensional setting. We establish a population dimension reduction model, transform the problem into a sparse Tucker decomposition of a three-way tensor, and develop a higher-order orthogonal iteration algorithm for parameter estimation. We derive the non-asymptotic error bound and asymptotic consistency of the proposed estimator, while allowing the variable dimensions to be larger than and diverge with the sample size. We demonstrate the efficacy of the method through both simulations and a multimodal neuroimaging application for Alzheimer's disease research.
Affiliation(s)
- Lexin Li
  - University of California at Berkeley
|
44
|
Wang L, Zhang J, Li B, Liu X. Quantile trace regression via nuclear norm regularization. Stat Probab Lett 2022. [DOI: 10.1016/j.spl.2021.109299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
45
|
Fang J, Yi GY. Regularized matrix-variate logistic regression with response subject to misclassification. J Stat Plan Inference 2022. [DOI: 10.1016/j.jspi.2021.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
46
|
Han R, Willett R, Zhang AR. An optimal statistical and computational framework for generalized tensor estimation. Ann Stat 2022. [DOI: 10.1214/21-aos2061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Rungang Han
  - Department of Statistics, University of Wisconsin-Madison
- Rebecca Willett
  - Departments of Statistics and Computer Science, University of Chicago
- Anru R. Zhang
  - Department of Statistics, University of Wisconsin-Madison
|
47
|
He S, He K, Huang JZ. Improved Estimation of High-dimensional Additive Models Using Subspace Learning. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2034638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Shiyuan He
  - Center for Applied Statistics, Institute of Statistics and Big Data, Renmin University of China, Beijing 100872, China
- Kejun He
  - Center for Applied Statistics, Institute of Statistics and Big Data, Renmin University of China, Beijing 100872, China
- Jianhua Z. Huang
  - School of Data Science, The Chinese University of Hong Kong, Shenzhen 518172, China
|
48
|
Tong J, Zhao X. Deep survival algorithm based on nuclear norm. J Stat Comput Sim 2022. [DOI: 10.1080/00949655.2021.2015770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Jianyang Tong
  - School of Mathematics and Statistics, Center for Data Science, Lanzhou University, Lanzhou, People's Republic of China
  - School of Mathematics and Statistics, Yunnan University, Kunming, People's Republic of China
- Xuejing Zhao
  - School of Mathematics and Statistics, Center for Data Science, Lanzhou University, Lanzhou, People's Republic of China
|
49
|
Aberrant Structure MRI in Parkinson’s Disease and Comorbidity with Depression Based on Multinomial Tensor Regression Analysis. J Pers Med 2022; 12:jpm12010089. [PMID: 35055404 PMCID: PMC8779164 DOI: 10.3390/jpm12010089] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/22/2021] [Revised: 01/06/2022] [Accepted: 01/07/2022] [Indexed: 02/08/2023]
Abstract
Background: Depression is a prominent and highly prevalent nonmotor feature in patients with Parkinson’s disease (PD). The neural and pathophysiologic mechanisms of PD with depression (DPD) remain unclear. The current diagnosis of DPD largely depends on clinical evaluation. Methods: We proposed a new family of multinomial tensor regressions that leveraged whole-brain structural magnetic resonance imaging (MRI) data to discriminate among 196 non-depressed PD (NDPD) patients, 84 DPD patients, and 200 healthy controls (HC), and to assess the special brain microstructures in NDPD and DPD. Maximum likelihood estimation coupled with state-of-the-art gradient descent algorithms was used to predict the individual diagnosis of PD and the development of DPD in PD patients. Results: The proposed efficient approach not only achieved a high prediction accuracy (0.94) with a multi-class AUC (0.98) for distinguishing between NDPD, DPD, and HC on the testing set but also located the most discriminative regions for NDPD and DPD, including cortical regions, the cerebellum, the brainstem, the bilateral basal ganglia, and the thalamus and limbic regions. Conclusions: The proposed imaging technique based on tensor regression performs well without any prior feature information, facilitates a deeper understanding of the abnormalities in DPD and PD, and plays an essential role in the statistical analysis of high-dimensional complex MRI data to support the radiological diagnosis of comorbidity of depression with PD.
|
50
|
Ghannam M, Nkurunziza S. Improved estimation in tensor regression with multiple change-points. Electron J Stat 2022. [DOI: 10.1214/22-ejs2035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Mai Ghannam
  - University of Windsor, Mathematics and Statistics department, 401 Sunset Avenue, Windsor, ON, N9B 3P4
- Sévérien Nkurunziza
  - University of Windsor, Mathematics and Statistics department, 401 Sunset Avenue, Windsor, ON, N9B 3P4
|