1
Wang L, Wang G, Gao AS. Exploring heterogeneity and dynamics of meteorological influences on US PM2.5: A distributed learning approach with spatiotemporal varying coefficient models. Spatial Statistics 2024; 61:100826. [PMID: 38779141 PMCID: PMC11108057 DOI: 10.1016/j.spasta.2024.100826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
Particulate matter (PM) has emerged as a primary air quality concern due to its substantial impact on human health. Many recent research works suggest that PM2.5 concentrations depend on meteorological conditions. Enhancing current pollution control strategies necessitates a more holistic comprehension of PM2.5 dynamics and the precise quantification of spatiotemporal heterogeneity in the relationship between meteorological factors and PM2.5 levels. The spatiotemporal varying coefficient model stands as a prominent spatial regression technique adept at addressing this heterogeneity. Amidst the challenges posed by the substantial scale of modern spatiotemporal datasets, we propose a pioneering distributed estimation method (DEM) founded on multivariate spline smoothing across a domain's triangulation. This DEM algorithm ensures an easily implementable, highly scalable, and communication-efficient strategy, demonstrating almost linear speedup potential. We validate the effectiveness of our proposed DEM through extensive simulation studies, demonstrating that it achieves coefficient estimations akin to those of global estimators derived from complete datasets. Applying the proposed model and method to the US daily PM2.5 and meteorological data, we investigate the influence of meteorological variables on PM2.5 concentrations, revealing both spatial and seasonal variations in this relationship.
Affiliation(s)
- Lily Wang
  - Department of Statistics, George Mason University, 4400 University Drive, MS 4A7, Fairfax, 22030, VA, USA
- Guannan Wang
  - Department of Mathematics, William & Mary, 120 Jones Hall, Williamsburg, 23185, VA, USA
- Annie S. Gao
  - McLean High School, 1633 Davidson Rd, McLean, 22101, VA, USA
2
Yu S, Li WV. spVC for the detection and interpretation of spatial gene expression variation. Genome Biol 2024; 25:103. [PMID: 38641849 PMCID: PMC11027374 DOI: 10.1186/s13059-024-03245-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Accepted: 04/10/2024] [Indexed: 04/21/2024]
Abstract
Spatially resolved transcriptomics technologies have opened new avenues for understanding gene expression heterogeneity in spatial contexts. However, existing methods for identifying spatially variable genes often focus solely on statistical significance, limiting their ability to capture continuous expression patterns and integrate spot-level covariates. To address these challenges, we introduce spVC, a statistical method based on a generalized Poisson model. spVC seamlessly integrates constant and spatially varying effects of covariates, facilitating comprehensive exploration of gene expression variability and enhancing interpretability. Simulation and real data applications confirm spVC's accuracy in these tasks, highlighting its versatility in spatial transcriptomics analysis.
Affiliation(s)
- Shan Yu
  - Department of Statistics, University of Virginia, Charlottesville, 22903, VA, USA.
- Wei Vivian Li
  - Department of Statistics, University of California, Riverside, 92521, CA, USA.
3
Jiang S, Cao J, Colditz GA, Rosner B. Predicting the onset of breast cancer using mammogram imaging data with irregular boundary. Biostatistics 2023; 24:358-371. [PMID: 34435196 PMCID: PMC10102887 DOI: 10.1093/biostatistics/kxab032] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/05/2021] [Revised: 06/26/2021] [Accepted: 07/29/2021] [Indexed: 11/12/2022]
Abstract
With mammography being the primary breast cancer screening strategy, it is essential to make full use of the mammogram imaging data to better identify women who are at higher and lower than average risk. Our primary goal in this study is to extract mammogram-based features that augment the well-established breast cancer risk factors to improve prediction accuracy. In this article, we propose a supervised functional principal component analysis (sFPCA) over triangulations method for extracting features that are ordered by the magnitude of association with the failure time outcome. The proposed method accommodates the irregular boundary issue posed by the breast area within the mammogram imaging data with flexible bivariate splines over triangulations. We also provide an eigenvalue decomposition algorithm that is computationally efficient. Compared to the conventional unsupervised FPCA method, the proposed method results in a lower Brier Score and higher area under the ROC curve (AUC) in simulation studies. We apply our method to data from the Joanne Knight Breast Health Cohort at Siteman Cancer Center. Our approach not only obtains the best prediction performance compared to unsupervised FPCA and benchmark models but also reveals important risk patterns within the mammogram images. This demonstrates the importance of utilizing additional supervised image-based features to clarify breast cancer risk.
Affiliation(s)
- Shu Jiang
  - Division of Public Health Sciences, Washington University School of Medicine, MO, USA, 63110
- Jiguo Cao
  - Department of Statistics and Actuarial Science, Simon Fraser University, BC, Canada, V5A 1S6
- Graham A Colditz
  - Division of Public Health Sciences, Washington University School of Medicine, MO, USA, 63110
- Bernard Rosner
  - Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, MA, USA, 02115
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, MA, USA, 02115
4
Yu S, Kusmec AM, Wang L, Nettleton D. Fusion Learning of Functional Linear Regression with Application to Genotype-by-Environment Interaction Studies. Journal of Agricultural, Biological and Environmental Statistics 2023. [DOI: 10.1007/s13253-023-00529-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
5
Hu L, Tang Y, Xu Q. Generalized varying-coefficient additive model for locally stationary time series. J Stat Comput Sim 2022. [DOI: 10.1080/00949655.2022.2135708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Affiliation(s)
- Lixia Hu
  - School of Statistics and Mathematics, Interdisciplinary Research Institute of Data Science, Shanghai Lixin University of Accounting and Finance, Shanghai, People's Republic of China
- Yiming Tang
  - School of Statistics and Mathematics, Interdisciplinary Research Institute of Data Science, Shanghai Lixin University of Accounting and Finance, Shanghai, People's Republic of China
- Qunfang Xu
  - Business School of Ningbo University, Ningbo, People's Republic of China
6
White PA, Frye H, Christensen MF, Gelfand AE, Silander JA. Spatial functional data modeling of plant reflectances. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Henry Frye
  - Department of Ecology and Evolutionary Biology, University of Connecticut
- John A. Silander
  - Department of Ecology and Evolutionary Biology, University of Connecticut
7
Estimation for partially linear additive regression with spatial data. Stat Pap (Berl) 2022. [DOI: 10.1007/s00362-022-01326-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
8
Li Y, Qiu Y, Xu Y. From multivariate to functional data analysis: fundamentals, recent developments, and emerging areas. J Multivariate Anal 2022; 188:104806. [PMID: 39040141 PMCID: PMC11261241 DOI: 10.1016/j.jmva.2021.104806] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/21/2022]
Abstract
Functional data analysis (FDA), which is a branch of statistics on modeling infinite dimensional random vectors residing in functional spaces, has become a major research area for Journal of Multivariate Analysis. We review some fundamental concepts of FDA, their origins and connections from multivariate analysis, and some of its recent developments, including multi-level functional data analysis, high-dimensional functional regression, and dependent functional data analysis. We also discuss the impact of these new methodology developments on genetics, plant science, wearable device data analysis, image data analysis, and business analytics. Two real data examples are provided to motivate our discussions.
Affiliation(s)
- Yehua Li
  - University of California - Riverside, Riverside, CA 92521, USA
- Yumou Qiu
  - Iowa State University, Ames, IA 50011, USA
- Yuhang Xu
  - Bowling Green State University, Bowling Green, OH 43403, USA
9
Wang Y, Kim M, Yu S, Li X, Wang G, Wang L. Nonparametric estimation and inference for spatiotemporal epidemic models. J Nonparametr Stat 2021. [DOI: 10.1080/10485252.2021.1988084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Yueying Wang
  - Department of Statistics, Iowa State University, Ames, IA, USA
- Myungjin Kim
  - Department of Statistics, Iowa State University, Ames, IA, USA
- Shan Yu
  - Department of Statistics, University of Virginia, Charlottesville, VA, USA
- Xinyi Li
  - School of Mathematical and Statistical Sciences, Clemson University, Clemson, SC, USA
- Guannan Wang
  - Department of Mathematics, William & Mary College, Williamsburg, VA, USA
- Li Wang
  - Department of Statistics, Iowa State University, Ames, IA, USA
10
Spatially Varying Coefficient Models with Sign Preservation of the Coefficient Functions. Journal of Agricultural, Biological and Environmental Statistics 2021. [DOI: 10.1007/s13253-021-00443-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/22/2022]
11
Xu G, Bai Y. Estimation of nonparametric additive models with high order spatial autoregressive errors. Can J Stat 2020. [DOI: 10.1002/cjs.11565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Guoying Xu
  - Department of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Yang Bai
  - Department of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
12
Zhang J, Sun WW, Li L. Mixed-Effect Time-Varying Network Model and Application in Brain Connectivity Analysis. J Am Stat Assoc 2020; 115:2022-2036. [PMID: 34321703 DOI: 10.1080/01621459.2019.1677242] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/10/2023]
Abstract
Time-varying networks are fast emerging in a wide range of scientific and business applications. Most existing dynamic network models are limited to a single-subject and discrete-time setting. In this article, we propose a mixed-effect network model that characterizes the continuous time-varying behavior of the network at the population level, meanwhile taking into account both the individual subject variability as well as the prior module information. We develop a multistep optimization procedure for a constrained likelihood estimation and derive the associated asymptotic properties. We demonstrate the effectiveness of our method through both simulations and an application to a study of brain development in youth. Supplementary materials for this article are available online.
Affiliation(s)
- Jingfei Zhang
  - Department of Management Science, Miami Business School, University of Miami, Miami, FL
- Will Wei Sun
  - Krannert School of Management, Purdue University, West Lafayette, IN
- Lexin Li
  - Department of Biostatistics and Epidemiology, School of Public Health, University of California at Berkeley, Berkeley, CA