1
Tian X, Wang Y, Wang S, Zhao Y, Zhao Y. Bayesian mixed model inference for genetic association under related samples with brain network phenotype. Biostatistics 2024:kxae008. [PMID: 38494649 DOI: 10.1093/biostatistics/kxae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 01/22/2024] [Accepted: 02/19/2024] [Indexed: 03/19/2024]
Abstract
Genetic association studies for brain connectivity phenotypes have gained prominence due to advances in noninvasive imaging techniques and quantitative genetics. Brain connectivity traits, characterized by network configurations and unique biological structures, present distinct challenges compared to other quantitative phenotypes. Furthermore, the presence of sample relatedness in most imaging genetics studies limits the feasibility of adopting existing network-response modeling. In this article, we fill this gap by proposing a Bayesian network-response mixed-effect model that considers a network-variate phenotype and incorporates population structures including pedigrees and unknown sample relatedness. To accommodate the inherent topological architecture associated with the genetic contributions to the phenotype, we model the effect components via a set of effect network configurations and impose an inter-network sparsity and intra-network shrinkage to dissect the phenotypic network configurations affected by the risk genetic variant. A Markov chain Monte Carlo (MCMC) algorithm is further developed to facilitate uncertainty quantification. We evaluate the performance of our model through extensive simulations. By further applying the method to study the genetic bases for brain structural connectivity using data from the Human Connectome Project with extensive family structures, we obtain plausible and interpretable results. Beyond brain connectivity genetic studies, our proposed model also provides a general linear mixed-effect regression framework for network-variate outcomes.
Affiliation(s)
- Xinyuan Tian
  - Department of Biostatistics, Yale University, 60 College St, New Haven, CT 06520, United States
- Yiting Wang
  - Department of Biostatistics, Yale University, 60 College St, New Haven, CT 06520, United States
- Selena Wang
  - Department of Biostatistics, Yale University, 60 College St, New Haven, CT 06520, United States
- Yi Zhao
  - Department of Biostatistics and Health Data Science, Indiana University, 410 W. 10th St, Indianapolis, IN 46202, United States
- Yize Zhao
  - Department of Biostatistics, Yale University, 60 College St, New Haven, CT 06520, United States
2
Chen S, He K, He S, Ni Y, Wong RKW. Bayesian Nonlinear Tensor Regression with Functional Fused Elastic Net Prior. Technometrics 2023. [DOI: 10.1080/00401706.2023.2197471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/01/2023]
3
Liu B, Zhang Q, Xue L, Song PXK, Kang J. Robust High-Dimensional Regression with Coefficient Thresholding and its Application to Imaging Data Analysis. J Am Stat Assoc 2022; 119:715-729. [PMID: 38818252 PMCID: PMC11136478 DOI: 10.1080/01621459.2022.2142590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2019] [Accepted: 10/18/2022] [Indexed: 11/06/2022]
Abstract
It is important to develop statistical techniques to analyze high-dimensional data in the presence of both complex dependence and possible heavy tails and outliers in real-world applications such as imaging data analyses. We propose a new robust high-dimensional regression with coefficient thresholding, in which an efficient nonconvex estimation procedure is proposed through a thresholding function and the robust Huber loss. The proposed regularization method accounts for complex dependence structures in predictors and is robust against heavy tails and outliers in outcomes. Theoretically, we rigorously analyze the landscape of the population and empirical risk functions for the proposed method. The fine landscape enables us to establish both statistical consistency and computational convergence under the high-dimensional setting. We also present an extension to incorporate spatial information into the proposed method. Finite-sample properties of the proposed methods are examined by extensive simulation studies. An application concerns a scalar-on-image regression analysis for an association of psychiatric disorder measured by the general factor of psychopathology with features extracted from the task functional MRI data in the Adolescent Brain Cognitive Development (ABCD) study.
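The abstract above names the Huber loss as the source of robustness to heavy tails and outliers. A minimal sketch of the standard Huber loss follows; it is not the authors' estimator, which additionally applies a thresholding function to the coefficients. The function names and the tuning constant `delta` are assumptions chosen for illustration.

```python
def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Standard Huber loss: quadratic for |r| <= delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    # Linear branch; matches the quadratic branch in value and slope at |r| = delta.
    return delta * (a - 0.5 * delta)


def mean_huber(y, y_hat, delta: float = 1.0) -> float:
    """Average Huber loss between observed outcomes y and fitted values y_hat."""
    return sum(huber_loss(yi - fi, delta) for yi, fi in zip(y, y_hat)) / len(y)


# A large residual contributes linearly rather than quadratically,
# so a single outlier has bounded influence on the fit.
print(huber_loss(0.5), huber_loss(3.0), 0.5 * 3.0 ** 2)
```

Because the loss grows only linearly in the tails, an outlying outcome pulls the fitted coefficients far less than it would under squared error.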
Affiliation(s)
- Qi Zhang
  - The Pennsylvania State University
4
Morris EL, He K, Kang J. Scalar on network regression via boosting. Ann Appl Stat 2022; 16:2755-2773. [DOI: 10.1214/22-aoas1612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Kevin He
  - Department of Biostatistics, University of Michigan
- Jian Kang
  - Department of Biostatistics, University of Michigan
5
Zhao Y, Chen T, Cai J, Lichenstein S, Potenza MN, Yip SW. Bayesian network mediation analysis with application to the brain functional connectome. Stat Med 2022; 41:3991-4005. [PMID: 35795965 PMCID: PMC10131252 DOI: 10.1002/sim.9488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2021] [Revised: 04/12/2022] [Accepted: 05/18/2022] [Indexed: 11/10/2022]
Abstract
The brain functional connectome, the collection of interconnected neural circuits along functional networks, facilitates a cutting-edge understanding of brain functioning, and has the potential to play a mediating role within the effect pathway between an exposure and an outcome. While existing mediation analytic approaches are capable of providing insight into complex processes, they mainly focus on a univariate mediator or mediator vector, without considering network-variate mediators. To fill the methodological gap and accomplish this exciting and urgent application, in this article, we propose an integrative mediation analysis under a Bayesian paradigm with networks entailing the mediation effect. To parameterize the network measurements, we introduce individually specified stochastic block models with unknown block allocation, and naturally bridge effect elements through the latent network mediators induced by the connectivity weights across network modules. To enable the identification of truly active mediating components, we simultaneously impose a feature selection across network mediators. We show the superiority of our model in estimating different effect components and selecting active mediating network structures. As a practical illustration of this approach's application to network neuroscience, we characterize the relationship between a therapeutic intervention and opioid abstinence as mediated by brain functional sub-networks.
Affiliation(s)
- Yize Zhao
  - Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut, USA
  - Yale Center for Analytical Sciences, Yale University School of Public Health, New Haven, Connecticut, USA
- Tianqi Chen
  - Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut, USA
- Jiachen Cai
  - Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut, USA
- Sarah Lichenstein
  - Department of Psychiatry, Yale University School of Medicine, New Haven, Connecticut, USA
- Marc N Potenza
  - Department of Psychiatry, Yale University School of Medicine, New Haven, Connecticut, USA
  - Child Study Center, Yale University School of Medicine, New Haven, Connecticut, USA
  - Department of Neuroscience, Yale University School of Medicine, New Haven, Connecticut, USA
  - Connecticut Mental Health Center, New Haven, Connecticut, USA
  - Connecticut Council on Problem Gambling, Wethersfield, Connecticut, USA
  - Wu Tsai Institute, Yale University School of Medicine, New Haven, Connecticut, USA
- Sarah W Yip
  - Department of Psychiatry, Yale University School of Medicine, New Haven, Connecticut, USA
  - Child Study Center, Yale University School of Medicine, New Haven, Connecticut, USA
6
Yu W, Wade S, Bondell HD, Azizi L. Non-stationary Gaussian process discriminant analysis with variable selection for high-dimensional functional data. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2098136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Weichang Yu
  - Melbourne Centre for Data Science, University of Melbourne
- Sara Wade
  - School of Mathematics, University of Edinburgh
- Lamiae Azizi
  - School of Mathematics and Statistics, University of Sydney
7
Song Y, Zhou X, Kang J, Aung MT, Zhang M, Zhao W, Needham BL, Kardia SLR, Liu Y, Meeker JD, Smith JA, Mukherjee B. Bayesian hierarchical models for high-dimensional mediation analysis with coordinated selection of correlated mediators. Stat Med 2021; 40:6038-6056. [PMID: 34404112 PMCID: PMC9257993 DOI: 10.1002/sim.9168] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/24/2020] [Revised: 07/30/2021] [Accepted: 08/05/2021] [Indexed: 01/18/2023]
Abstract
We consider Bayesian high-dimensional mediation analysis to identify among a large set of correlated potential mediators the active ones that mediate the effect from an exposure variable to an outcome of interest. Correlations among mediators are commonly observed in modern data analysis; examples include the activated voxels within connected regions in brain image data, regulatory signals driven by gene networks in genome data, and correlated exposure data from the same source. When correlations are present among active mediators, mediation analysis that fails to account for such correlation can be suboptimal and may lead to a loss of power in identifying active mediators. Building upon a recent high-dimensional mediation analysis framework, we propose two Bayesian hierarchical models, one with a Gaussian mixture prior that enables correlated mediator selection and the other with a Potts mixture prior that accounts for the correlation among active mediators in mediation analysis. We develop efficient sampling algorithms for both methods. Various simulations demonstrate that our methods enable effective identification of correlated active mediators, which could be missed by using existing methods that assume prior independence among active mediators. The proposed methods are applied to the LIFECODES birth cohort and the Multi-Ethnic Study of Atherosclerosis (MESA) and identified new active mediators with important biological implications.
Affiliation(s)
- Yanyi Song
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Max T. Aung
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Min Zhang
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Wei Zhao
  - Department of Epidemiology, University of Michigan, Ann Arbor, Michigan, USA
- Belinda L. Needham
  - Department of Epidemiology, University of Michigan, Ann Arbor, Michigan, USA
- Yongmei Liu
  - Division of Cardiology, Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA
- John D. Meeker
  - Department of Environmental Health Sciences, University of Michigan, Ann Arbor, Michigan, USA
- Jennifer A. Smith
  - Department of Epidemiology, University of Michigan, Ann Arbor, Michigan, USA
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
8
Li H, Wang Y, Yan G, Sun Y, Tanabe S, Liu CC, Quigg MS, Zhang T. A Bayesian State-Space Approach to Mapping Directional Brain Networks. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2020.1865985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Affiliation(s)
- Huazhang Li
  - Department of Statistics, University of Virginia, Charlottesville, VA
- Yaotian Wang
  - Department of Statistics, University of Pittsburgh, Pittsburgh, PA
- Guofen Yan
  - Department of Public Health Sciences, University of Virginia, Charlottesville, VA
- Yinge Sun
  - Department of Statistics, University of Virginia, Charlottesville, VA
- Seiji Tanabe
  - Department of Psychology, University of Virginia, Charlottesville, VA
- Chang-Chia Liu
  - Department of Neurosurgery, University of Virginia, Charlottesville, VA
- Mark S. Quigg
  - Department of Neurology, University of Virginia, Charlottesville, VA
- Tingting Zhang
  - Department of Statistics, University of Pittsburgh, Pittsburgh, PA
9
Roy A, Reich BJ, Guinness J, Shinohara RT, Staicu AM. Spatial Shrinkage Via the Product Independent Gaussian Process Prior. J Comput Graph Stat 2021. [DOI: 10.1080/10618600.2021.1923512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Arkaprava Roy
  - Department of Biostatistics, University of Florida, Gainesville, FL
- Brian J. Reich
  - Department of Statistics, North Carolina State University, Raleigh, NC
- Joseph Guinness
  - Department of Statistics and Data Science, Cornell University, Ithaca, NY
- Ana-Maria Staicu
  - Department of Statistics, North Carolina State University, Raleigh, NC
10
Wang B, Sudijono T, Kirveslahti H, Gao T, Boyer DM, Mukherjee S, Crawford L. A statistical pipeline for identifying physical features that differentiate classes of 3D shapes. Ann Appl Stat 2021. [DOI: 10.1214/20-aoas1430] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]
Affiliation(s)
- Bruce Wang
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University
- Tingran Gao
  - Committee on Computational and Applied Mathematics, Department of Statistics, University of Chicago
- Sayan Mukherjee
  - Department of Statistical Science, Department of Computer Science, Department of Mathematics, and Department of Bioinformatics & Biostatistics, Duke University
11
Zhou Y, He K. An improved tensor regression model via location smoothing. Stat (Int Stat Inst) 2021. [DOI: 10.1002/sta4.377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Ya Zhou
  - Center for Applied Statistics and Institute of Statistics and Big Data, Renmin University of China, Beijing, China
- Kejun He
  - Center for Applied Statistics and Institute of Statistics and Big Data, Renmin University of China, Beijing, China
12
Cai Q, Kang J, Yu T. Bayesian Network Marker Selection via the Thresholded Graph Laplacian Gaussian Prior. Bayesian Analysis 2020; 15:79-102. [PMID: 32802246 PMCID: PMC7428197 DOI: 10.1214/18-ba1142] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 05/06/2023]
Abstract
Selecting informative nodes over large-scale networks has become increasingly important in many research areas. Most existing methods focus on the local network structure and incur heavy computational costs for large-scale problems. In this work, we propose a novel prior model for Bayesian network marker selection in the generalized linear model (GLM) framework: the Thresholded Graph Laplacian Gaussian (TGLG) prior, which adopts the graph Laplacian matrix to characterize the conditional dependence between neighboring markers, accounting for the global network structure. Under mild conditions, we show that the proposed model enjoys posterior consistency with a diverging number of edges and nodes in the network. We also develop a Metropolis-adjusted Langevin algorithm (MALA) for efficient posterior computation, which is scalable to large-scale networks. We illustrate the advantages of the proposed method over existing alternatives via extensive simulation studies and an analysis of the breast cancer gene expression dataset in The Cancer Genome Atlas (TCGA).
Affiliation(s)
- Qingpo Cai
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Tianwei Yu
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
13
Tang X, Bi X, Qu A. Individualized Multilayer Tensor Learning With an Application in Imaging Analysis. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2019.1585254] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/04/2023]
Affiliation(s)
- Xiwei Tang
  - Department of Statistics, University of Virginia, Charlottesville, VA
- Xuan Bi
  - Department of Information and Decision Sciences, Carlson School of Management, University of Minnesota, Minneapolis, MN
- Annie Qu
  - Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL
14
NPBayes-fMRI: Non-parametric Bayesian General Linear Models for Single- and Multi-Subject fMRI Data. Statistics in Biosciences 2019. [DOI: 10.1007/s12561-017-9205-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/26/2022]
15
Jhuang AT, Fuentes M, Jones JL, Esteves G, Fancher CM, Furman M, Reich BJ. Spatial Signal Detection Using Continuous Shrinkage Priors. Technometrics 2019; 61:494-506. [PMID: 31723308 PMCID: PMC6853616 DOI: 10.1080/00401706.2018.1546622] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/20/2017] [Revised: 11/01/2018] [Accepted: 11/06/2018] [Indexed: 10/27/2022]
Abstract
Motivated by the problem of detecting changes in two-dimensional X-ray diffraction data, we propose a Bayesian spatial model for sparse signal detection in image data. Our model places considerable mass near zero and has heavy tails to reflect the prior belief that the image signal is zero for most pixels and large for an important subset. We show that the spatial prior places mass on nearby locations simultaneously being zero, and also allows for nearby locations to simultaneously be large signals. The form of the prior also facilitates efficient computing for large images. We conduct a simulation study to evaluate the properties of the proposed prior and show that it outperforms other spatial models. We apply our method in the analysis of X-ray diffraction data from a two-dimensional area detector to detect changes in the pattern when the material is exposed to an electric field.
Affiliation(s)
- An-Ting Jhuang
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695
- Montserrat Fuentes
  - College of Humanities and Sciences, Virginia Commonwealth University, Richmond, VA 23284
- Jacob L Jones
  - Department of Materials Science and Engineering, North Carolina State University, Raleigh, NC 27695
- Giovanni Esteves
  - Department of Materials Science and Engineering, North Carolina State University, Raleigh, NC 27695
- Chris M Fancher
  - Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
- Marschall Furman
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695
- Brian J Reich
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695
16
Abstract
Neuroimaging data often take the form of high dimensional arrays, also known as tensors. Addressing scientific questions arising from such data demands new regression models that take multidimensional arrays as covariates. Simply turning an image array into a vector would both cause extremely high dimensionality and destroy the inherent spatial structure of the array. In a recent work, Zhou et al. (2013) proposed a family of generalized linear tensor regression models based upon the CP (CANDECOMP/PARAFAC) decomposition of the regression coefficient array. Low rank approximation brings the ultrahigh dimensionality to a manageable level and leads to efficient estimation. In this article, we propose a tensor regression model based on the more flexible Tucker decomposition. Compared to the CP model, the Tucker regression model allows a different number of factors along each mode. Such flexibility leads to several advantages that are particularly suited to neuroimaging analysis, including further reduction of the number of free parameters, accommodation of images with skewed dimensions, explicit modeling of interactions, and a principled way of image downsizing. We also compare the Tucker model with CP numerically on both simulated data and real magnetic resonance imaging data, and demonstrate its effectiveness in finite sample performance.
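To make the parameter-count comparison in this abstract concrete, the sketch below counts the raw entries of the factor matrices (and, for Tucker, the core array) under each decomposition. These are unadjusted counts that ignore identifiability constraints, so they are not the exact degrees-of-freedom formulas from the paper; the function names and the example dimensions are assumptions chosen for illustration.

```python
from math import prod


def cp_param_count(dims, rank):
    """Raw entry count for a rank-R CP model: one p_d x R factor matrix per mode."""
    return rank * sum(dims)


def tucker_param_count(dims, ranks):
    """Raw entry count for a Tucker model: core array plus one p_d x R_d factor per mode."""
    if len(dims) != len(ranks):
        raise ValueError("dims and ranks must have the same length")
    return prod(ranks) + sum(p * r for p, r in zip(dims, ranks))


# Hypothetical image with skewed dimensions: two long modes and one short mode.
dims = (64, 64, 8)
print("vectorized coefficients:", prod(dims))
print("CP, rank 3:", cp_param_count(dims, 3))
print("Tucker, ranks (3, 3, 1):", tucker_param_count(dims, (3, 3, 1)))
```

The point of the comparison is that CP forces the same rank on every mode, whereas Tucker can assign a small rank to the short mode, which is one way the "skewed dimensions" advantage mentioned in the abstract can arise.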
Affiliation(s)
- Da Xu
  - University of California, Berkeley
- Hua Zhou
  - University of California, Los Angeles
- Lexin Li
  - University of California, Berkeley
17
Teng M, Nathoo FS, Johnson TD. Bayesian analysis of functional magnetic resonance imaging data with spatially varying auto-regressive orders. J R Stat Soc Ser C Appl Stat 2018. [DOI: 10.1111/rssc.12320] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Affiliation(s)
- Ming Teng
  - University of Michigan, Ann Arbor, USA
18
Bharath K, Kurtek S, Rao A, Baladandayuthapani V. Radiologic image-based statistical shape analysis of brain tumours. J R Stat Soc Ser C Appl Stat 2018; 67:1357-1378. [PMID: 30420787 PMCID: PMC6225782 DOI: 10.1111/rssc.12272] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/22/2022]
Abstract
We propose a curve-based Riemannian geometric approach for general shape-based statistical analyses of tumours obtained from radiologic images. A key component of the framework is a suitable metric that enables comparisons of tumour shapes, provides tools for computing descriptive statistics and implementing principal component analysis on the space of tumour shapes and allows for a rich class of continuous deformations of a tumour shape. The utility of the framework is illustrated through specific statistical tasks on a data set of radiologic images of patients diagnosed with glioblastoma multiforme, a malignant brain tumour with poor prognosis. In particular, our analysis discovers two patient clusters with very different survival, subtype and genomic characteristics. Furthermore, it is demonstrated that adding tumour shape information to survival models containing clinical and genomic variables results in a significant increase in predictive power.
Affiliation(s)
- Arvind Rao
  - University of Texas MD Anderson Cancer Center, Houston, USA
19
Happ C, Greven S, Schmid VJ. The impact of model assumptions in scalar-on-image regression. Stat Med 2018; 37:4298-4317. [PMID: 30132932 DOI: 10.1002/sim.7915] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/10/2018] [Revised: 06/20/2018] [Accepted: 06/27/2018] [Indexed: 11/11/2022]
Abstract
Complex statistical models such as scalar-on-image regression often require strong assumptions to overcome the issue of nonidentifiability. While in theory, it is well understood that model assumptions can strongly influence the results, this seems to be underappreciated, or played down, in practice. This article gives a systematic overview of the main approaches for scalar-on-image regression with a special focus on their assumptions. We categorize the assumptions and develop measures to quantify the degree to which they are met. The impact of model assumptions and the practical usage of the proposed measures are illustrated in a simulation study and in an application to neuroimaging data. The results show that different assumptions indeed lead to quite different estimates with similar predictive ability, raising the question of their interpretability. We give recommendations for making modeling and interpretation decisions in practice based on the new measures and simulations using hypothetical coefficient images and the observed data.
Affiliation(s)
- Clara Happ
  - Department of Statistics, LMU Munich, Munich, Germany
- Sonja Greven
  - Department of Statistics, LMU Munich, Munich, Germany
20
21
Kang J, Reich BJ, Staicu AM. Scalar-on-Image Regression via the Soft-Thresholded Gaussian Process. Biometrika 2018; 105:165-184. [PMID: 30686828 PMCID: PMC6345249 DOI: 10.1093/biomet/asx075] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 11/13/2022]
Abstract
This work concerns spatial variable selection for scalar-on-image regression. We propose a new class of Bayesian nonparametric models and develop an efficient posterior computational algorithm. The proposed soft-thresholded Gaussian process provides large prior support over the class of piecewise-smooth, sparse, and continuous spatially-varying regression coefficient functions. In addition, under some mild regularity conditions the soft-thresholded Gaussian process prior leads to the posterior consistency for parameter estimation and variable selection for scalar-on-image regression, even when the number of predictors is larger than the sample size. The proposed method is compared to alternatives via simulation and applied to an electroencephalography study of alcoholism.
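The key device named in this abstract is soft-thresholding of a smooth latent surface. The sketch below shows the standard soft-thresholding operator applied pointwise to a smooth curve. The curve is a sine wave standing in for a Gaussian process draw, and the threshold `lam` and all names are assumptions for illustration; this is not the authors' model or sampler.

```python
import math


def soft_threshold(x: float, lam: float) -> float:
    """Shrink x toward zero by lam; values with |x| <= lam become exactly 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


# Stand-in for one smooth latent draw on a 1-D grid. This is a sine wave,
# NOT a real Gaussian process sample; it only mimics the smoothness of one.
grid = [k / 20 for k in range(21)]
latent = [math.sin(2 * math.pi * s) for s in grid]

lam = 0.5  # hypothetical threshold
coef = [soft_threshold(v, lam) for v in latent]

n_zero = sum(1 for c in coef if c == 0.0)
print(f"{n_zero} of {len(coef)} grid points have an exactly zero coefficient")
```

Thresholding a smooth curve this way yields a coefficient function that is exactly zero over whole regions, smooth where it is nonzero, and continuous at the boundaries between the two, which matches the "piecewise-smooth, sparse, and continuous" description in the abstract.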
Affiliation(s)
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, U.S.A.
- Brian J Reich
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695, U.S.A.
- Ana-Maria Staicu
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695, U.S.A.
22
Chiang S, Guindani M, Yeh HJ, Dewar S, Haneef Z, Stern JM, Vannucci M. A Hierarchical Bayesian Model for the Identification of PET Markers Associated to the Prediction of Surgical Outcome after Anterior Temporal Lobe Resection. Front Neurosci 2017; 11:669. [PMID: 29259537 PMCID: PMC5723403 DOI: 10.3389/fnins.2017.00669] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/26/2017] [Accepted: 11/17/2017] [Indexed: 01/19/2023]
Abstract
We develop an integrative Bayesian predictive modeling framework that identifies individual pathological brain states based on the selection of fluoro-deoxyglucose positron emission tomography (PET) imaging biomarkers and evaluates the association of those states with a clinical outcome. We consider data from a study on temporal lobe epilepsy (TLE) patients who subsequently underwent anterior temporal lobe resection. Our modeling framework looks at the observed profiles of regional glucose metabolism in PET as the phenotypic manifestation of a latent individual pathologic state, which is assumed to vary across the population. The modeling strategy we adopt allows the identification of patient subgroups characterized by latent pathologies differentially associated to the clinical outcome of interest. It also identifies imaging biomarkers characterizing the pathological states of the subjects. In the data application, we identify a subgroup of TLE patients at high risk for post-surgical seizure recurrence after anterior temporal lobe resection, together with a set of discriminatory brain regions that can be used to distinguish the latent subgroups. We show that the proposed method achieves high cross-validated accuracy in predicting post-surgical seizure recurrence.
Affiliation(s)
- Sharon Chiang
  - Department of Statistics, Rice University, Houston, TX, United States
  - School of Medicine, Baylor College of Medicine, Houston, TX, United States
- Michele Guindani
  - Department of Statistics, University of California, Irvine, Irvine, CA, United States
- Hsiang J Yeh
  - Department of Neurology, University of California, Los Angeles, Los Angeles, CA, United States
- Sandra Dewar
  - Department of Neurology, University of California, Los Angeles, Los Angeles, CA, United States
- Zulfi Haneef
  - Department of Neurology, Baylor College of Medicine, Houston, TX, United States
- John M Stern
  - Department of Neurology, University of California, Los Angeles, Los Angeles, CA, United States
- Marina Vannucci
  - Department of Statistics, Rice University, Houston, TX, United States
23
Zhu H, Shen D, Peng X, Liu LY. MWPCR: Multiscale Weighted Principal Component Regression for High-dimensional Prediction. J Am Stat Assoc 2016; 112:1009-1021. [PMID: 29151657 DOI: 10.1080/01621459.2016.1261710] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/20/2022]
Abstract
We propose a multiscale weighted principal component regression (MWPCR) framework for the use of high dimensional features with strong spatial features (e.g., smoothness and correlation) to predict an outcome variable, such as disease status. This development is motivated by identifying imaging biomarkers that could potentially aid detection, diagnosis, assessment of prognosis, prediction of response to treatment, and monitoring of disease status, among many others. The MWPCR can be regarded as a novel integration of principal components analysis (PCA), kernel methods, and regression models. In MWPCR, we introduce various weight matrices to prewhiten high dimensional feature vectors, perform matrix decomposition for both dimension reduction and feature extraction, and build a prediction model by using the extracted features. Examples of such weight matrices include an importance score weight matrix for the selection of individual features at each location and a spatial weight matrix for the incorporation of the spatial pattern of feature vectors. We integrate the importance score weights with the spatial weights in order to recover the low dimensional structure of high dimensional features. We demonstrate the utility of our methods through extensive simulations and real data analyses of the Alzheimer's Disease Neuroimaging Initiative (ADNI) data set.
Affiliation(s)
- Hongtu Zhu
- Professor of Biostatistics, Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77230, and University of North Carolina, Chapel Hill, NC, 27599
- Dan Shen
- Assistant Professor in Interdisciplinary Data Sciences Consortium and Department of Mathematics and Statistics, University of South Florida, Tampa, FL 33620
- Leo Yufeng Liu
- Doctoral student under the supervision of Dr. Hongtu Zhu
|
24
|
Chekouo T, Stingo FC, Guindani M, Do KA. A Bayesian predictive model for imaging genetics with application to schizophrenia. Ann Appl Stat 2016. [DOI: 10.1214/16-aoas948] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/19/2022]
|
25
|
Zhang L, Guindani M, Versace F, Engelmann JM, Vannucci M. A spatiotemporal nonparametric Bayesian model of multi-subject fMRI data. Ann Appl Stat 2016. [DOI: 10.1214/16-aoas926] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 11/19/2022]
|
26
|
Feng W, Sarkar A, Lim CY, Maiti T. Variable selection for binary spatial regression: Penalized quasi-likelihood approach. Biometrics 2016; 72:1164-1172. [PMID: 27061299 DOI: 10.1111/biom.12525] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/01/2014] [Revised: 01/01/2016] [Accepted: 03/01/2016] [Indexed: 11/29/2022]
Abstract
We consider the problem of selecting covariates in a spatial regression model when the response is binary. The penalized likelihood-based approach has proved effective for simultaneous variable selection and estimation. In the context of a spatially dependent binary variable, a uniquely interpretable likelihood is not available; rather, a quasi-likelihood might be more suitable. We develop a penalized quasi-likelihood with spatial dependence for simultaneous variable selection and parameter estimation, along with an efficient computational algorithm. The theoretical properties, including asymptotic normality and consistency, are studied under an increasing-domain asymptotics framework. An extensive simulation study is conducted to validate the methodology. Real data examples are provided for illustration and applicability. Although theoretical justification has not been made, we also investigate the empirical performance of the proposed penalized quasi-likelihood approach for spatial count data to explore the suitability of this method for a general exponential family of distributions.
Affiliation(s)
- Abdhi Sarkar
- Michigan State University, East Lansing, Michigan 48824, U.S.A.
- Tapabrata Maiti
- Michigan State University, East Lansing, Michigan 48824, U.S.A.
|
27
|
Li F, Zhang T, Wang Q, Gonzalez MZ, Maresh EL, Coan JA. Spatial Bayesian variable selection and grouping for high-dimensional scalar-on-image regression. Ann Appl Stat 2015. [DOI: 10.1214/15-aoas818] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 11/19/2022]
|