1
Liang M, Koslovsky MD, Hébert ET, Businelle MS, Vannucci M. Functional Concurrent Regression Mixture Models Using Spiked Ewens-Pitman Attraction Priors. Bayesian Analysis 2024; 19:1067-1095. [PMID: 39465034 PMCID: PMC11507269 DOI: 10.1214/23-ba1380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2024]
Abstract
Functional concurrent, or varying-coefficient, regression models are a form of functional data analysis methods in which functional covariates and outcomes are collected concurrently. Two active areas of research for this class of models are identifying influential functional covariates and clustering their relations across observations. In various applications, researchers have applied and developed methods to address these objectives separately. However, no approach currently performs both tasks simultaneously. In this paper, we propose a fully Bayesian functional concurrent regression mixture model that simultaneously performs functional variable selection and clustering for subject-specific trajectories. Our approach introduces a novel spiked Ewens-Pitman attraction prior that identifies and clusters subjects' trajectories marginally for each functional covariate while using similarities in subjects' auxiliary covariate patterns to inform clustering allocation. Using simulated data, we evaluate the clustering, variable selection, and parameter estimation performance of our approach and compare its performance with alternative spiked processes. We then apply our method to functional data collected in a novel, smartphone-based smoking cessation intervention study to investigate individual-level dynamic relations between smoking behaviors and potential risk factors.
Affiliation(s)
- Mingrui Liang
  - Department of Statistics, Rice University, Houston, TX, USA
- Emily T Hébert
  - Department of Health Promotion and Behavioral Sciences, School of Public Health, University of Texas Health Science Center, Austin, TX 78701, USA
- Michael S Businelle
  - Department of Family and Preventive Medicine, College of Medicine, University of Oklahoma, Oklahoma City, OK 73104, USA
2
Garcia NL, Rodrigues-Motta M, Migon HS, Petkova E, Tarpey T, Ogden RT, Giordano JO, Perez MM. Unsupervised Bayesian classification for models with scalar and functional covariates. J R Stat Soc Ser C Appl Stat 2024; 73:658-681. [PMID: 39072300 PMCID: PMC11271982 DOI: 10.1093/jrsssc/qlae006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 10/23/2023] [Accepted: 01/16/2024] [Indexed: 07/30/2024]
Abstract
We consider unsupervised classification by means of a latent multinomial variable that categorizes a scalar response into one of the L components of a mixture model incorporating scalar and functional covariates. This process can be thought of as a hierarchical model, with the first level modelling a scalar response according to a mixture of parametric distributions and the second level modelling the mixture probabilities by means of a generalized linear model with functional and scalar covariates. The traditional approach of treating functional covariates as vectors not only suffers from the curse of dimensionality, since functional covariates can be measured at very small intervals, leading to a highly parametrized model, but also fails to take into account the nature of the data. We use basis expansions to reduce the dimensionality and a Bayesian approach to estimate the parameters while providing predictions of the latent classification vector. The method is motivated by two data examples that are not easily handled by existing methods. The first example concerns identifying placebo responders in a clinical trial (normal mixture model) and the second concerns predicting illness in milking cows (zero-inflated Poisson mixture model).
Affiliation(s)
- Nancy L Garcia
  - Department of Statistics, Universidade Estadual de Campinas, Campinas, Brazil
- Helio S Migon
  - Department of Statistics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Eva Petkova
  - Department of Population Health, Grossman School of Medicine, New York University, New York, USA
  - Department of Child and Adolescent Psychiatry, Grossman School of Medicine, New York University, New York, USA
- Thaddeus Tarpey
  - Department of Population Health, Grossman School of Medicine, New York University, New York, USA
- R Todd Ogden
  - Department of Biostatistics, Columbia University, New York, USA
- Julio O Giordano
  - College of Agriculture and Life Sciences, Cornell University, Cornell, USA
- Martin M Perez
  - College of Agriculture and Life Sciences, Cornell University, Cornell, USA
3
Wang S, Kim S, Ryan Cho H, Chang W. Nonparametric predictive model for sparse and irregular longitudinal data. Biometrics 2024; 80:ujad023. [PMID: 38372401 DOI: 10.1093/biomtc/ujad023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Revised: 07/07/2023] [Accepted: 12/06/2023] [Indexed: 02/20/2024]
Abstract
We propose a kernel-based estimator to predict the mean response trajectory for sparse and irregularly measured longitudinal data. The kernel estimator is constructed by imposing weights based on the subject-wise similarity between predictor trajectories in the L2 metric space, where we assume that similar predictor trajectories over time result in similar trends in the response trajectory among subjects. To deal with the curse of dimensionality caused by multiple predictors, we propose a multiplicative model with multivariate Gaussian kernels. This model is capable of achieving dimension reduction as well as selecting functional covariates with predictive significance. The asymptotic properties of the proposed nonparametric estimator are investigated under mild regularity conditions. We illustrate the robustness and flexibility of our proposed method via extensive simulation studies and an application to the Framingham Heart Study.
Affiliation(s)
- Shixuan Wang
  - Department of Statistics, Miami University, Oxford, OH 45056, United States
- Seonjin Kim
  - Department of Statistics, Miami University, Oxford, OH 45056, United States
- Hyunkeun Ryan Cho
  - Department of Biostatistics, University of Iowa, Iowa City, IA 52246, United States
- Won Chang
  - Department of Mathematical Science, University of Cincinnati, Cincinnati, OH 45221, United States
4
Shape-constrained estimation in functional regression with Bernstein polynomials. Comput Stat Data Anal 2023. [DOI: 10.1016/j.csda.2022.107614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
5
Lee KY, Li L. Functional sufficient dimension reduction through average Fréchet derivatives. Ann Stat 2022; 50:904-929. [PMID: 37041758 PMCID: PMC10085580 DOI: 10.1214/21-aos2131] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
Abstract
Sufficient dimension reduction (SDR) embodies a family of methods that aim to reduce dimensionality without loss of information in a regression setting. In this article, we propose a new method for nonparametric function-on-function SDR, where both the response and the predictor are functions. We first develop the notions of functional central mean subspace and functional central subspace, which form the population targets of our functional SDR. We then introduce an average Fréchet derivative estimator, which extends the gradient of the regression function to the operator level and enables us to develop estimators for our functional dimension reduction spaces. We show that the resulting functional SDR estimators are unbiased and exhaustive and, more importantly, that this holds without imposing any distributional assumptions, such as the linearity or constant variance conditions that are commonly imposed by all existing functional SDR methods. We establish the uniform convergence of the estimators for the functional dimension reduction spaces, while allowing both the number of Karhunen-Loève expansions and the intrinsic dimension to diverge with the sample size. We demonstrate the efficacy of the proposed methods through both simulations and two real data examples.
Affiliation(s)
- Kuang-Yao Lee
  - Department of Statistical Science, Temple University
- Lexin Li
  - Department of Biostatistics, University of California, Berkeley
6
Wang Z, Dong H, Ma P, Wang Y. Estimation and model selection for nonparametric function-on-function regression. J Comput Graph Stat 2022; 31:835-845. [PMID: 36594047 PMCID: PMC9802009 DOI: 10.1080/10618600.2022.2037434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2021] [Revised: 10/23/2021] [Accepted: 01/21/2022] [Indexed: 01/07/2023]
Abstract
Regression models with a functional response and functional covariate have received significant attention recently. While various nonparametric and semiparametric models have been developed, there is an urgent need for model selection and diagnostic methods. In this article, we develop a unified framework for estimation and model selection in nonparametric function-on-function regression. We propose a general nonparametric functional regression model with the model space constructed through smoothing spline analysis of variance (SS ANOVA). The proposed model reduces to some of the existing models when selected components in the SS ANOVA decomposition are eliminated. We propose new estimation procedures under either an L1 or an L2 penalty and show that the combination of the SS ANOVA decomposition and the L1 penalty provides powerful tools for model selection and diagnostics. We establish consistency and convergence rates for estimates of the regression function and each component in its decomposition under both the L1 and L2 penalties. Simulation studies and real examples show that the proposed methods perform well. Technical details and additional simulation results are available in online supplementary materials.
Affiliation(s)
- Zhanfeng Wang
  - International Institute of Finance, The School of Management, University of Science and Technology of China
- Hao Dong
  - Department of Statistics and Applied Probability, University of California, Santa Barbara
- Ping Ma
  - Department of Statistics, University of Georgia
- Yuedong Wang
  - Department of Statistics and Applied Probability, University of California, Santa Barbara
7
Jeon JM, Park BU, Van Keilegom I. Additive regression for non-Euclidean responses and predictors. Ann Stat 2021. [DOI: 10.1214/21-aos2048] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022]
8
Ghosal R, Maity A. Variable selection in nonparametric functional concurrent regression. Can J Stat 2021. [DOI: 10.1002/cjs.11654] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Affiliation(s)
- Rahul Ghosal
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Arnab Maity
  - Department of Statistics, North Carolina State University, Raleigh, NC, USA
9
Cai X, Xue L, Cao J. Robust penalized M-estimation for function-on-function linear regression. Stat (Int Stat Inst) 2021. [DOI: 10.1002/sta4.390] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Affiliation(s)
- Xiong Cai
  - College of Statistics and Data Science, Faculty of Science, Beijing University of Technology, Beijing 100124, China
- Liugen Xue
  - College of Statistics and Data Science, Faculty of Science, Beijing University of Technology, Beijing 100124, China
- Jiguo Cao
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC V5A1S6, Canada
10
Meyer MJ, Malloy EJ, Coull BA. Bayesian Wavelet-packet Historical Functional Linear Models. Statistics and Computing 2021; 31:14. [PMID: 36324372 PMCID: PMC9624484 DOI: 10.1007/s11222-020-09981-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2019] [Accepted: 10/21/2020] [Indexed: 06/16/2023]
Abstract
Historical Functional Linear Models (HFLM) quantify associations between a functional predictor and functional outcome where the predictor is an exposure variable that occurs before, or at least concurrently with, the outcome. Prior work on the HFLM has largely focused on estimation of a surface that represents a time-varying association between the functional outcome and the functional exposure. This existing work has employed frequentist and spline-based estimation methods, with little attention paid to formal inference or adjustment for multiple testing and no approaches that implement wavelet bases. In this work, we propose a new functional regression model that estimates the time-varying, lagged association between a functional outcome and a functional exposure. Building on recently developed function-on-function regression methods, the model employs a novel use of the wavelet-packet decomposition of the exposure and outcome functions that allows us to strictly enforce the temporal ordering of exposure and outcome, which is not possible with existing wavelet-based functional models. Using a fully Bayesian approach, we conduct formal inference on the time-varying lagged association, while adjusting for multiple testing. We investigate the operating characteristics of our wavelet-packet HFLM and compare them to those of two existing estimation procedures in simulation. We also assess several inference techniques and use the model to analyze data on the impact of lagged exposure to particulate matter finer than 2.5 μm, or PM2.5, on heart rate variability in a cohort of journeyman boilermakers during the morning of a typical day's shift.
Affiliation(s)
- Mark J Meyer
  - Department of Mathematics and Statistics, Georgetown University
- Brent A Coull
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health
11
Cao G, Wang S, Wang L. Estimation and inference for functional linear regression models with partially varying regression coefficients. Stat (Int Stat Inst) 2020. [DOI: 10.1002/sta4.286] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Affiliation(s)
- Guanqun Cao
  - Department of Mathematics and Statistics, Auburn University, Auburn, AL 36849, USA
- Shuoyang Wang
  - Department of Mathematics and Statistics, Auburn University, Auburn, AL 36849, USA
- Lily Wang
  - Department of Statistics, Iowa State University, Ames, IA 50011, USA
12
McCauley SR, Clark SD, Quest BW, Streeter RM, Oxford EM. Review of canine dilated cardiomyopathy in the wake of diet-associated concerns. J Anim Sci 2020; 98:skaa155. [PMID: 32542359 PMCID: PMC7447921 DOI: 10.1093/jas/skaa155] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 01/30/2020] [Accepted: 05/04/2020] [Indexed: 12/12/2022] Open
Abstract
Dilated cardiomyopathy (DCM) has been in the literature and news because of recent opinion-based journal articles and public releases by regulatory agencies. DCM is commonly associated with a genetic predisposition in certain dog breeds and can also occur secondary to other diseases and nutritional deficiencies. Recent communications in veterinary journals have discussed a potential relationship between grain-free and/or novel protein diets and DCM, citing a subjective increase in DCM in dog breeds that are not known to have a genetic predisposition for the disease. This literature review describes clinical presentations of DCM, common sequelae, treatment and preventative measures, histopathologic features, and a discussion of the varied etiological origins of the disease. In addition, current literature limitations are addressed, in order to ascertain multiple variables leading to the development of DCM. Future studies are needed to evaluate one variable at a time and to minimize confounding variables and speculation. Furthermore, to prevent sampling bias with the current FDA reports, the veterinary community should be asked to provide information for all cases of DCM in dogs. This should include cases during the same time period, regardless of the practitioner's proposed etiology, because no definitive association has been established between DCM in dogs and diets with specific characteristics, such as, but not limited to, grain-free diets, those containing legumes, novel protein diets, and those produced by small manufacturers. In summary, in order to determine if certain ingredients, categories of diets, or manufacturing processes are related to an increased risk of DCM, further studies investigating these variables are necessary.
13
Staicu AM, Islam MN, Dumitru R, van Heugten E. Longitudinal dynamic functional regression. J R Stat Soc Ser C Appl Stat 2020; 69:25-46. [PMID: 31929657 PMCID: PMC6953745 DOI: 10.1111/rssc.12376] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Abstract
The paper develops a parsimonious modelling framework to study the time-varying association between scalar outcomes and functional predictors observed at many instances in longitudinal studies. The methods enable us to reconstruct the full trajectory of the response and are applicable to Gaussian and non-Gaussian responses. The idea is to model the time-varying functional predictors by using orthogonal basis functions and to expand the time-varying regression coefficient by using the same basis. Numerical investigation through simulation studies and data analysis shows excellent performance in terms of accurate prediction and efficient computation when compared with existing alternatives. The methods are inspired by and applied to an animal science application, where the interest is in studying the association between the feed intake of lactating sows and the minute-by-minute temperature throughout the 21 days of their lactation period. R code and an R illustration are provided.
14
A Novel, Dose-Adjusted Tacrolimus Trough-Concentration Model for Predicting and Estimating Variance After Kidney Transplantation. Drugs R D 2019; 19:201-212. [PMID: 31073875 PMCID: PMC6544741 DOI: 10.1007/s40268-019-0271-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022] Open
Abstract
Background and Objective: Given that a high intrapatient variability (IPV) of tacrolimus whole blood concentration increases the risk for a poor kidney transplant outcome, some experts advocate routine IPV monitoring for detection of high-risk patients. However, attempts to estimate the variance of tacrolimus trough concentrations (TTC) are limited by the need for patients to receive a fixed dose over time and/or the use of linear statistical models. A goal of this study is to overcome the current limitations through the novel application of statistical methodology generalizing the relationship between TTC and dose through the use of nonparametric functional regression modeling. Methods: With TTC as a response and dose as a covariate, the model employs an unknown bivariate function, allowing for the potentially complex, nonlinear relationship between the two parameters. A dose-adjusted variance of TTC is then derived based on standard functional principal component analysis (FPCA). To assess the model, it was compared against an FPCA-based model and linear mixed-effects models using prediction error, bias, and coverage probabilities for simulated data as well as phase III data from the Astellas new drug application studies for extended-release tacrolimus. Results: Our numerical investigation indicates that the new model better predicts dose-adjusted TTCs compared with the prediction of linear mixed-effects models. Estimated coverage probabilities also indicate that the new model accurately accounts for the variance of TTC during the periods of large fluctuation in dose, whereas the linear mixed-effects model consistently underestimates the coverage probabilities because of the inaccurate characterization of TTC fluctuation. Conclusion: This is the first known application of a functional regression model to assess complex relationships between TTC and dose in a real clinical setting. This new method has applicability in future clinical trials including real-world data sets due to flexibility of the nonparametric modeling approach.
15
Conditional Analysis for Mixed Covariates, with Application to Feed Intake of Lactating Sows. Journal of Probability and Statistics 2019. [DOI: 10.1155/2019/3743762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
We propose a novel modeling framework to study the effect of covariates of various types on the conditional distribution of the response. The methodology accommodates flexible model structure, allows for joint estimation of the quantiles at all levels, and provides a computationally efficient estimation algorithm. Extensive numerical investigation confirms good performance of the proposed method. The methodology is motivated by and applied to a lactating sow study, where the primary interest is to understand how the dynamic change of minute-by-minute temperature in the farrowing rooms within a day (functional covariate) is associated with low quantiles of feed intake of lactating sows, while accounting for other sow-specific information (vector covariate).
16
Wong RKW, Li Y, Zhu Z. Partially Linear Functional Additive Models for Multivariate Functional Data. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2017.1411268] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 10/18/2022]
Affiliation(s)
- Yehua Li
  - Department of Statistics, University of California, Riverside, CA
- Zhengyuan Zhu
  - Department of Statistics & Statistical Laboratory, Iowa State University, Ames, IA
17
Reimherr M, Sriperumbudur B, Taoufik B. Optimal prediction for additive function-on-function regression. Electron J Stat 2018. [DOI: 10.1214/18-ejs1505] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]