1
Wang L, Wang G, Gao AS. Exploring heterogeneity and dynamics of meteorological influences on US PM2.5: A distributed learning approach with spatiotemporal varying coefficient models. Spatial Statistics 2024; 61:100826. [PMID: 38779141 PMCID: PMC11108057 DOI: 10.1016/j.spasta.2024.100826] [Indexed: 05/25/2024]
Abstract
Particulate matter (PM) has emerged as a primary air quality concern due to its substantial impact on human health. Many recent research works suggest that PM2.5 concentrations depend on meteorological conditions. Enhancing current pollution control strategies necessitates a more holistic comprehension of PM2.5 dynamics and the precise quantification of spatiotemporal heterogeneity in the relationship between meteorological factors and PM2.5 levels. The spatiotemporal varying coefficient model stands as a prominent spatial regression technique adept at addressing this heterogeneity. Amidst the challenges posed by the substantial scale of modern spatiotemporal datasets, we propose a pioneering distributed estimation method (DEM) founded on multivariate spline smoothing across a domain's triangulation. This DEM algorithm ensures an easily implementable, highly scalable, and communication-efficient strategy, demonstrating almost linear speedup potential. We validate the effectiveness of our proposed DEM through extensive simulation studies, demonstrating that it achieves coefficient estimations akin to those of global estimators derived from complete datasets. Applying the proposed model and method to the US daily PM2.5 and meteorological data, we investigate the influence of meteorological variables on PM2.5 concentrations, revealing both spatial and seasonal variations in this relationship.
Affiliation(s)
- Lily Wang
  - Department of Statistics, George Mason University, 4400 University Drive, MS 4A7, Fairfax, 22030, VA, USA
- Guannan Wang
  - Department of Mathematics, William & Mary, 120 Jones Hall, Williamsburg, 23185, VA, USA
- Annie S. Gao
  - McLean High School, 1633 Davidson Rd, McLean, 22101, VA, USA

2
Areed WD, Price A, Thompson H, Malseed R, Mengersen K. Spatial non-parametric Bayesian clustered coefficients. Sci Rep 2024; 14:9677. [PMID: 38678077 PMCID: PMC11055928 DOI: 10.1038/s41598-024-59973-w] [Received: 11/21/2023] [Accepted: 04/17/2024] [Indexed: 04/29/2024]
Abstract
In the field of population health research, understanding the similarities between geographical areas and quantifying their shared effects on health outcomes is crucial. In this paper, we synthesise a number of existing methods to create a new approach that specifically addresses this goal. The approach is called a Bayesian spatial Dirichlet process clustered heterogeneous regression model. This non-parametric framework allows for inference on the number of clusters and the clustering configurations, while simultaneously estimating the parameters for each cluster. We demonstrate the efficacy of the proposed algorithm using simulated data and further apply it to analyse influential factors affecting children's health development domains in Queensland. The study provides valuable insights into the contributions of regional similarities in education and demographics to health outcomes, aiding targeted interventions and policy design.
Affiliation(s)
- Wala Draidi Areed
  - School of Mathematical Science, Centre for Data Science, Queensland University of Technology, Brisbane, QLD, Australia
- Aiden Price
  - School of Mathematical Science, Centre for Data Science, Queensland University of Technology, Brisbane, QLD, Australia
- Helen Thompson
  - School of Mathematical Science, Centre for Data Science, Queensland University of Technology, Brisbane, QLD, Australia
- Reid Malseed
  - Children's Health Queensland, Brisbane, QLD, Australia
- Kerrie Mengersen
  - School of Mathematical Science, Centre for Data Science, Queensland University of Technology, Brisbane, QLD, Australia

3
Lin Z, Si Y, Kang J. Latent subgroup identification in image-on-scalar regression. Ann Appl Stat 2024; 18:468-486. [PMID: 38846637 PMCID: PMC11156244 DOI: 10.1214/23-aoas1797] [Indexed: 06/09/2024]
Abstract
Image-on-scalar regression has been a popular approach to modeling the association between brain activities and scalar characteristics in neuroimaging research. The associations could be heterogeneous across individuals in the population, as indicated by recent large-scale neuroimaging studies, for example, the Adolescent Brain Cognitive Development (ABCD) Study. The ABCD data can inform our understanding of heterogeneous associations and how to leverage the heterogeneity and tailor interventions to increase the number of youths who benefit. It is of great interest to identify subgroups of individuals from the population such that: (1) within each subgroup the brain activities have homogeneous associations with the clinical measures; (2) across subgroups the associations are heterogeneous, and (3) the group allocation depends on individual characteristics. Existing image-on-scalar regression methods and clustering methods cannot directly achieve this goal. We propose a latent subgroup image-on-scalar regression model (LASIR) to analyze large-scale, multisite neuroimaging data with diverse sociodemographics. LASIR introduces the latent subgroup for each individual and group-specific, spatially varying effects, with an efficient stochastic expectation maximization algorithm for inferences. We demonstrate that LASIR outperforms existing alternatives for subgroup identification of brain activation patterns with functional magnetic resonance imaging data via comprehensive simulations and applications to the ABCD study. We have released our reproducible codes for public use with the software package available on GitHub.
Affiliation(s)
- Zikai Lin
  - Department of Biostatistics, University of Michigan
- Yajuan Si
  - Survey Research Center, Institute for Social Research, University of Michigan
- Jian Kang
  - Department of Biostatistics, University of Michigan

4
Menacher A, Nichols TE, Holmes C, Ganjgahi H. Bayesian Lesion Estimation with a Structured Spike-and-Slab Prior. J Am Stat Assoc 2024; 119:66-80. [PMID: 39132605 PMCID: PMC11315456 DOI: 10.1080/01621459.2023.2278201] [Received: 07/21/2022] [Accepted: 10/24/2023] [Indexed: 08/13/2024]
Abstract
Neural demyelination and brain damage accumulated in white matter appear as hyperintense areas on T2-weighted MRI scans in the form of lesions. Modeling binary images at the population level, where each voxel represents the existence of a lesion, plays an important role in understanding aging and inflammatory diseases. We propose a scalable hierarchical Bayesian spatial model, called BLESS, capable of handling binary responses by placing continuous spike-and-slab mixture priors on spatially-varying parameters and enforcing spatial dependency on the parameter dictating the amount of sparsity within the probability of inclusion. The use of mean-field variational inference with dynamic posterior exploration, which is an annealing-like strategy that improves optimization, allows our method to scale to large sample sizes. Our method also accounts for underestimation of posterior variance due to variational inference by providing an approximate posterior sampling approach based on Bayesian bootstrap ideas and spike-and-slab priors with random shrinkage targets. Besides accurate uncertainty quantification, this approach is capable of producing novel cluster size based imaging statistics, such as credible intervals of cluster size, and measures of reliability of cluster occurrence. Lastly, we validate our results via simulation studies and an application to the UK Biobank, a large-scale lesion mapping study with a sample size of 40,000 subjects.
Affiliation(s)
- Habib Ganjgahi
  - Department of Statistics, University of Oxford
  - Nuffield Department of Population Health, University of Oxford

5
Zhang D, Li L, Sripada C, Kang J. Image response regression via deep neural networks. J R Stat Soc Series B Stat Methodol 2023; 85:1589-1614. [PMID: 38584801 PMCID: PMC10994199 DOI: 10.1093/jrsssb/qkad073] [Received: 01/17/2022] [Revised: 06/22/2023] [Accepted: 06/28/2023] [Indexed: 04/09/2024]
Abstract
Delineating associations between images and covariates is a central aim of imaging studies. To tackle this problem, we propose a novel non-parametric approach in the framework of spatially varying coefficient models, where the spatially varying functions are estimated through deep neural networks. Our method incorporates spatial smoothness, handles subject heterogeneity, and provides straightforward interpretations. It is also highly flexible and accurate, making it ideal for capturing complex association patterns. We establish estimation and selection consistency and derive asymptotic error bounds. We demonstrate the method's advantages through intensive simulations and analyses of two functional magnetic resonance imaging data sets.
Affiliation(s)
- Daiwei Zhang
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Lexin Li
  - Department of Biostatistics and Epidemiology, University of California, Berkeley, CA, USA
- Chandra Sripada
  - Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
  - Department of Philosophy, University of Michigan, Ann Arbor, MI, USA
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA

6
Zhu H, Li T, Zhao B. Statistical Learning Methods for Neuroimaging Data Analysis with Applications. Annu Rev Biomed Data Sci 2023; 6:73-104. [PMID: 37127052 DOI: 10.1146/annurev-biodatasci-020722-100353] [Indexed: 05/03/2023]
Abstract
The aim of this review is to provide a comprehensive survey of statistical challenges in neuroimaging data analysis, from neuroimaging techniques to large-scale neuroimaging studies and statistical learning methods. We briefly review eight popular neuroimaging techniques and their potential applications in neuroscience research and clinical translation. We delineate four themes of neuroimaging data and review major image processing analysis methods for processing neuroimaging data at the individual level. We briefly review four large-scale neuroimaging-related studies and a consortium on imaging genomics and discuss four themes of neuroimaging data analysis at the population level. We review nine major population-based statistical analysis methods and their associated statistical challenges and present recent progress in statistical methodology to address these challenges.
Affiliation(s)
- Hongtu Zhu
  - Department of Biostatistics, Department of Statistics, Department of Genetics, and Department of Computer Science, University of North Carolina, Chapel Hill, North Carolina, USA
  - Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, North Carolina, USA
- Tengfei Li
  - Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, North Carolina, USA
  - Department of Radiology, University of North Carolina, Chapel Hill, North Carolina, USA
- Bingxin Zhao
  - Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA

7
Morris EL, He K, Kang J. Scalar on network regression via boosting. Ann Appl Stat 2022; 16:2755-2773. [DOI: 10.1214/22-aoas1612] [Indexed: 11/19/2022]
Affiliation(s)
- Kevin He
  - Department of Biostatistics, University of Michigan
- Jian Kang
  - Department of Biostatistics, University of Michigan