1
Yu Y, Tang L, Ren K, Chen Z, Chen S, Shi J. Bayesian Regression Analysis for Dependent Data with an Elliptical Shape. Entropy (Basel, Switzerland) 2024; 26:1072. [PMID: 39766700 PMCID: PMC11675188 DOI: 10.3390/e26121072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2024] [Revised: 11/21/2024] [Accepted: 12/05/2024] [Indexed: 01/11/2025]
Abstract
This paper proposes a parametric hierarchical model for functional data with an elliptical shape, using a Gaussian process prior to capture the data dependencies that reflect systematic errors while modeling the underlying curved shape through a von Mises-Fisher distribution. The model definition, Bayesian inference, and MCMC algorithm are discussed. The effectiveness of the model is demonstrated through the reconstruction of curved trajectories using both simulated and real-world examples. The discussion in this paper focuses on two-dimensional problems, but the framework can be extended to higher-dimensional spaces, making it adaptable to a wide range of applications.
Affiliation(s)
- Yian Yu
  - Department of Statistics and Data Science, College of Science, Southern University of Science and Technology, Shenzhen 518055, China
- Long Tang
  - Department of Statistics and Data Science, College of Science, Southern University of Science and Technology, Shenzhen 518055, China
- Kang Ren
  - HUST-GYENNO CNS Intelligent Digital Medicine Technology Center, Wuhan 430074, China
- Zhonglue Chen
  - HUST-GYENNO CNS Intelligent Digital Medicine Technology Center, Wuhan 430074, China
- Shengdi Chen
  - Department of Neurology and Institute of Neurology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
- Jianqing Shi
  - Department of Statistics and Data Science, College of Science, Southern University of Science and Technology, Shenzhen 518055, China
  - National Center for Applied Mathematics, Shenzhen 518000, China
2
Mekaoussi H, Heddam S, Bouslimanni N, Kim S, Zounemat-Kermani M. Predicting biochemical oxygen demand in wastewater treatment plant using advance extreme learning machine optimized by Bat algorithm. Heliyon 2023; 9:e21351. [PMID: 37954260 PMCID: PMC10637896 DOI: 10.1016/j.heliyon.2023.e21351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 10/08/2023] [Accepted: 10/19/2023] [Indexed: 11/14/2023]
Abstract
Wastewater quality modelling plays a vital role in the planning and management of wastewater treatment plants (WWTP). This paper develops a new hybrid machine learning model based on an extreme learning machine (ELM) optimized by the Bat algorithm (ELM-Bat) for modelling five-day effluent biochemical oxygen demand (BOD5). Specifically, this hybrid model combines the Bat algorithm, used for model parameter optimization, with the standalone ELM. The proposed model was developed using historical measured effluent wastewater quality variables, i.e., the chemical oxygen demand (COD), temperature, pH, total suspended solids (TSS), specific conductance (SC) and the wastewater flow (Q). The performance of the hybrid ELM-Bat was compared with that of the multilayer perceptron neural network (MLPNN), the random forest regression (RFR), the Gaussian process regression (GPR), the random vector functional link network (RVFL), and the multiple linear regression (MLR) models. By comparing several combinations of input variables, the improvement in prediction accuracy achieved through the hybrid ELM-Bat was quantified. All models were first calibrated using the training dataset and later tested using the validation dataset, based on four performance metrics, namely root mean square error (RMSE), mean absolute error (MAE), the correlation coefficient (R), and the Nash-Sutcliffe model efficiency (NSE). Overall, it is concluded that the ELM-Bat is the most accurate model when all six variables are included as inputs, and it outperforms all other benchmark models in terms of predictive accuracy, exhibiting R, NSE, RMSE and MAE values of approximately 0.885, 0.781, 2.621, and 1.989, respectively.
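The four performance metrics named in this abstract have standard textbook definitions, which can be sketched as follows. This is a minimal illustration, not the authors' code; the function name `regression_metrics` and the example values are assumptions made here. Note that R lies in [-1, 1] and NSE cannot exceed 1, while RMSE and MAE are in the units of the response.

```python
import math

def regression_metrics(observed, predicted):
    """Return RMSE, MAE, Pearson R and Nash-Sutcliffe efficiency (NSE).

    Hypothetical helper for illustration; uses the standard definitions.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    errors = [p - o for o, p in zip(observed, predicted)]
    sum_sq_err = sum(e * e for e in errors)
    rmse = math.sqrt(sum_sq_err / n)
    mae = sum(abs(e) for e in errors) / n
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # Pearson correlation between observed and predicted values
    r = cov / math.sqrt(var_o * var_p)
    # NSE: 1 minus squared error relative to the variance of the observations
    nse = 1.0 - sum_sq_err / var_o
    return {"RMSE": rmse, "MAE": mae, "R": r, "NSE": nse}

# Made-up values: predictions shifted by a constant bias of 1
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

In this toy case the predictions track the observations perfectly apart from the bias, so R is 1 while NSE drops well below 1, which shows why the two are reported together.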
Affiliation(s)
- Hayat Mekaoussi
  - Institute of veterinary and agronomic sciences, Agronomy Department, Hydraulics Division, University Batna 1-Hadj Lakhdar- Allées 19 mai, Route de Biskra Batna, 05000 Algeria
  - Laboratory of Research in Biodiversity Interaction Ecosystem and Biotechnology (LRIBEB) University 20 Août 1955 Skikda, Algeria
- Salim Heddam
  - Laboratory of Research in Biodiversity Interaction Ecosystem and Biotechnology (LRIBEB), Faculty of Science, Agronomy Department, University 20 Août 1955-Skikda, Route El Hadaik, BP 26, Skikda, Algeria
- Nouri Bouslimanni
  - Institute of veterinary and agronomic sciences, Agronomy Department, Chemical Division, University Batna 1-Hadj Lakhdar- Allées 19 mai, Route de Biskra Batna, 05000 Algeria
- Sungwon Kim
  - Department of Railroad Construction and Safety Engineering, Dongyang University, Yeongju 36040, Republic of Korea
3
Shieh D, Ogden RT. Permutation-Based Inference for Function-on-Scalar Regression With an Application in PET Brain Imaging. J Nonparametr Stat 2023; 35:820-838. [PMID: 38046382 PMCID: PMC10688779 DOI: 10.1080/10485252.2023.2206926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 04/19/2023] [Indexed: 12/05/2023]
Abstract
The density of various proteins throughout the human brain can be studied through the use of positron emission tomography (PET) imaging. We report here on data from a study of serotonin transporter (5-HTT) binding. While PET imaging data analysis is most commonly performed on data that are aggregated into several discrete a priori regions of interest, in this study, primary interest is on measures of 5-HTT binding potential that are made at many locations along a continuous anatomically defined tract, one that was chosen to follow serotonergic axons. Our goal is to characterize the binding patterns along this tract and also to determine how such patterns differ between control subjects and depressed patients. Due to the nature of our data, we utilize function-on-scalar regression modeling to make optimal use of our data. Inference on both main effects (position along the tract; diagnostic group) and their interactions is made using permutation testing strategies that do not require distributional assumptions. Also, to investigate the question of homogeneity we implement a permutation testing strategy, which adapts a "block bootstrapping" approach from time series analysis to the functional data setting.
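The idea of distribution-free inference by permutation can be sketched for the simplest case of comparing two groups of curves sampled on a common grid. This is a deliberately simplified illustration, not the function-on-scalar procedure of the paper: the test statistic (largest pointwise gap between group mean curves), the function names, and the made-up data are all assumptions made here.

```python
import random

def mean_curve(curves):
    """Pointwise mean of equal-length curves (lists of floats)."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]

def max_abs_difference(group_a, group_b):
    """Test statistic: largest pointwise gap between the two group mean curves."""
    return max(abs(a - b) for a, b in zip(mean_curve(group_a), mean_curve(group_b)))

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Monte Carlo permutation p-value for a difference between group mean curves.

    Whole curves are shuffled between groups, so the dependence along each
    curve is preserved and no distributional assumption is required.
    """
    rng = random.Random(seed)
    observed = max_abs_difference(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if max_abs_difference(pooled[:n_a], pooled[n_a:]) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations
    return (exceed + 1) / (n_permutations + 1)

# Made-up example: two clearly separated groups of five curves each
controls = [[0.1 * i, 0.0, 0.1 * i] for i in range(5)]
patients = [[5.0 + 0.1 * i, 5.0, 5.0] for i in range(5)]
p_value = permutation_p_value(controls, patients)
```

Permuting whole curves rather than individual measurements is the key design choice, since observations along one tract are strongly dependent.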
Affiliation(s)
- Denise Shieh
  - Department of Biostatistics, Columbia University, New York, NY 10032, USA
- R. Todd Ogden
  - Department of Biostatistics, Columbia University, New York, NY 10032, USA
4
Leroy A, Latouche P, Guedj B, Gey S. MAGMA: inference and prediction using multi-task Gaussian processes with common mean. Mach Learn 2022. [DOI: 10.1007/s10994-022-06172-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
A novel multi-task Gaussian process (GP) framework is proposed, by using a common mean process for sharing information across tasks. In particular, we investigate the problem of time series forecasting, with the objective to improve multiple-step-ahead predictions. The common mean process is defined as a GP for which the hyper-posterior distribution is tractable. Therefore an EM algorithm is derived for handling both hyper-parameters optimisation and hyper-posterior computation. Unlike previous approaches in the literature, the model fully accounts for uncertainty and can handle irregular grids of observations while maintaining explicit formulations, by modelling the mean process in a unified GP framework. Predictive analytical equations are provided, integrating information shared across tasks through a relevant prior mean. This approach greatly improves the predictive performances, even far from observations, and may reduce significantly the computational complexity compared to traditional multi-task GP models. Our overall algorithm is called Magma (standing for Multi tAsk GPs with common MeAn). The quality of the mean process estimation, predictive performances, and comparisons to alternatives are assessed in various simulated scenarios and on real datasets.
5
Meyer MJ, Morris JS, Gazes RP, Coull BA. Ordinal probit functional outcome regression with application to computer-use behavior in rhesus monkeys. Ann Appl Stat 2022; 16:537-550. [DOI: 10.1214/21-aoas1513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Mark J. Meyer
  - Department of Mathematics and Statistics, Georgetown University
- Jeffrey S. Morris
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania
- Regina Paxton Gazes
  - Department of Psychology and Program in Animal Behavior, Bucknell University
- Brent A. Coull
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health
6
Bhattacharjee S, Müller HG. Concurrent object regression. Electron J Stat 2022. [DOI: 10.1214/22-ejs2040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Hans-Georg Müller
  - Department of Statistics, University of California, Davis, CA 95616, USA
7
Robust Non-Parametric Mortality and Fertility Modelling and Forecasting: Gaussian Process Regression Approaches. Forecasting 2021. [DOI: 10.3390/forecast3010013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Rapid declines in mortality and fertility have become major issues in many developed countries over the past few decades. An accurate model for forecasting demographic movements is important for decision making in social welfare policies and resource budgeting among the government and many industry sectors. This article introduces a novel non-parametric approach using Gaussian process regression with a natural cubic spline mean function and a spectral mixture covariance function for mortality and fertility modelling and forecasting. Unlike most of the existing approaches in the demographic modelling literature, which rely on time parameters to determine how the whole mortality or fertility curve shifts from one year to another, we build the mortality and fertility curves from their components, the age-specific mortality and fertility rates, and assume that each of them follows a Gaussian process over time, fitting the whole curves in a discrete but intensive style. In numerical examples using the mortality and fertility data of several developed countries, the proposed Gaussian process regression approach shows significant improvements in forecast accuracy and robustness over other mainstream demographic modelling approaches in short-, mid- and long-term forecasting.
8
Wang Z, Noh M, Lee Y, Shi JQ. A general robust t-process regression model. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2020.107093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
9
An Experimental and Statistical Study on Rebar Corrosion Considering the Temperature Effect Using Gaussian Process Regression. Applied Sciences-Basel 2020. [DOI: 10.3390/app10175937] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]
Abstract
Temperature is an important factor that affects corrosion potential in rebars. The temperature effect must be removed from the corrosion potential for precise measurement of corrosion rates. To separate the temperature effect from the corrosion potential, in this study rebar specimens were not embedded in concrete but, instead, were placed in an uncontrolled air environment. Gaussian process regression (GPR) was applied to the temperature and the non-corrosion potential data in order to remove the temperature effect from the corrosion potential. The results indicated that the corrosion potential was affected by the temperature. Furthermore, the GPR models of all the experimental cases showed high coefficients of determination (R² > 0.90) and low root mean square errors (RMSE < 0.08), meaning that these models had high reliability. The fitted GPR models were used to successfully remove the temperature effect from the corrosion potential. This demonstrates that the GPR method can be appropriately used to assess the temperature effect on rebar corrosion.
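The general idea of fitting a GPR model to one covariate and subtracting the fitted mean can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the study's implementation: the squared-exponential kernel, the fixed hyperparameters, the centring on the sample mean, and all names and data values below are choices made here for illustration. In practice the hyperparameters would be estimated, for example by maximising the marginal likelihood.

```python
import math

def sq_exp_kernel(x1, x2, signal_var=1.0, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def solve_cholesky(lower, b):
    """Solve (L L^T) a = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i]
    a = [0.0] * n
    for i in reversed(range(n)):
        a[i] = (z[i] - sum(lower[k][i] * a[k] for k in range(i + 1, n))) / lower[i][i]
    return a

def gp_posterior_mean(x_train, y_train, x_test, noise_var=0.01,
                      signal_var=1.0, length_scale=1.0):
    """GP posterior mean at x_test, with the response centred on its sample mean."""
    n = len(x_train)
    y_mean = sum(y_train) / n
    # Covariance of the noisy training responses: K + noise_var * I
    cov = [[sq_exp_kernel(x_train[i], x_train[j], signal_var, length_scale)
            + (noise_var if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve_cholesky(cholesky(cov), [y - y_mean for y in y_train])
    return [y_mean + sum(sq_exp_kernel(xs, x_train[i], signal_var, length_scale) * alpha[i]
                         for i in range(n)) for xs in x_test]

# Made-up data: temperature (deg C) and measured potential (arbitrary units)
temperature = [10.0, 12.0, 15.0, 18.0, 21.0, 25.0]
potential = [-0.30, -0.28, -0.26, -0.23, -0.21, -0.17]
fitted = gp_posterior_mean(temperature, potential, temperature,
                           noise_var=1e-4, signal_var=0.01, length_scale=5.0)
# Residual potential after subtracting the fitted temperature trend
corrected = [y - m for y, m in zip(potential, fitted)]
```

The pure-Python linear algebra keeps the sketch dependency-free; a numerical library would be the usual choice for anything beyond a handful of points.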
10
SOH and RUL Prediction of Lithium-Ion Batteries Based on Gaussian Process Regression with Indirect Health Indicators. Energies 2020. [DOI: 10.3390/en13020375] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Indexed: 12/12/2022]
Abstract
The state of health (SOH) and remaining useful life (RUL) of lithium-ion batteries are two important factors which are normally predicted using the battery capacity. However, it is difficult to directly measure the capacity of lithium-ion batteries for online applications. In this paper, indirect health indicators (IHIs) are extracted from the curves of voltage, current, and temperature in the process of charging and discharging lithium-ion batteries, which respond to the battery capacity degradation process. A few reasonable indicators are selected as the inputs of SOH prediction by the grey relation analysis method. The short-term SOH prediction is carried out by combining the Gaussian process regression (GPR) method with probability predictions. Then, considering that there is a certain mapping relationship between SOH and RUL, three IHIs and the present SOH value are utilized to predict RUL of lithium-ion batteries through the GPR model. The results show that the proposed method has high prediction accuracy.
11
Gaussian process methods for nonparametric functional regression with mixed predictors. Comput Stat Data Anal 2019. [DOI: 10.1016/j.csda.2018.07.009] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022]
12
Cao C, Shi JQ, Lee Y. Robust functional regression model for marginal mean and subject-specific inferences. Stat Methods Med Res 2018; 27:3236-3254. [PMID: 29298601 DOI: 10.1177/0962280217695346] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/16/2022]
Abstract
We introduce flexible robust functional regression models, using various heavy-tailed processes, including a Student t-process. We propose efficient algorithms in estimating parameters for the marginal mean inferences and in predicting conditional means as well as interpolation and extrapolation for the subject-specific inferences. We develop bootstrap prediction intervals (PIs) for conditional mean curves. Numerical studies show that the proposed model provides a robust approach against data contamination or distribution misspecification, and the proposed PIs maintain the nominal confidence levels. A real data application is presented as an illustrative example.
Affiliation(s)
- Chunzheng Cao
  - School of Mathematics and Statistics, Nanjing University of Information Science and Technology, China
  - Department of Statistics, Seoul National University, Korea
- Jian Qing Shi
  - School of Mathematics and Statistics, Newcastle University, UK
- Youngjo Lee
  - Department of Statistics, Seoul National University, Korea
13
Lithium-Ion Battery Prognostics with Hybrid Gaussian Process Function Regression. Energies 2018. [DOI: 10.3390/en11061420] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 11/16/2022]
14
Wu D, Ma J. A Two-Layer Mixture Model of Gaussian Process Functional Regressions and Its MCMC EM Algorithm. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:4894-4904. [PMID: 29993960 DOI: 10.1109/tnnls.2017.2782711] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/08/2023]
Abstract
The mixture of Gaussian processes (GPs) is capable of learning any general stochastic process based on a given set of (sample) curves for the regression and prediction problems. However, it is ineffective for curve clustering and prediction, when the sample curves are derived from different stochastic processes as independent sources linearly mixed together. In this paper, we propose a two-layer mixture model of GP functional regressions (GPFRs) to describe such a mixture of general stochastic processes or independent sources, especially for curve clustering and prediction. Specifically, in the lower layer, the mixture of GPFRs (MGPFRs) is developed for a cluster (or class) of curves within the input space. In the higher layer, the mixture of MGPFRs is further established to divide the curves into clusters according to their components in the output space. For the parameter estimation of the two-layer mixture of GPFRs, we develop a Monte Carlo EM algorithm based on a Markov chain Monte Carlo (MCMC) method, in short, the MCMC EM algorithm. We validate the hierarchical mixture of GPFRs and MCMC EM algorithm using synthetic and real-world data sets. Our results show that our new model outperforms the conventional mixture models in curve clustering and prediction.
15
Yang J, Cox DD, Lee JS, Ren P, Choi T. Efficient Bayesian hierarchical functional data analysis with basis function approximations using Gaussian-Wishart processes. Biometrics 2017; 73:1082-1091. [PMID: 28395117 PMCID: PMC5634932 DOI: 10.1111/biom.12705] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 06/01/2016] [Revised: 03/01/2017] [Accepted: 03/01/2017] [Indexed: 11/28/2022]
Abstract
Functional data are defined as realizations of random functions (mostly smooth functions) varying over a continuum, which are usually collected on discretized grids with measurement errors. In order to accurately smooth noisy functional observations and deal with the issue of high-dimensional observation grids, we propose a novel Bayesian method based on the Bayesian hierarchical model with a Gaussian-Wishart process prior and basis function representations. We first derive an induced model for the basis-function coefficients of the functional data, and then use this model to conduct posterior inference through Markov chain Monte Carlo methods. Compared to the standard Bayesian inference that suffers serious computational burden and instability in analyzing high-dimensional functional data, our method greatly improves the computational scalability and stability, while inheriting the advantage of simultaneously smoothing raw observations and estimating the mean-covariance functions in a nonparametric way. In addition, our method can naturally handle functional data observed on random or uncommon grids. Simulation and real studies demonstrate that our method produces similar results to those obtainable by the standard Bayesian inference with low-dimensional common grids, while efficiently smoothing and estimating functional data with random and high-dimensional observation grids when the standard Bayesian inference fails. In conclusion, our method can efficiently smooth and estimate high-dimensional functional data, providing one way to resolve the curse of dimensionality for Bayesian functional data analysis with Gaussian-Wishart processes.
Affiliation(s)
- Jingjing Yang
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, U.S.A
- Dennis D Cox
  - Department of Statistics, Rice University, Houston, Texas 77005, U.S.A
- Jong Soo Lee
  - Department of Mathematical Sciences, University of Massachusetts Lowell, Lowell, Massachusetts 01854, U.S.A
- Peng Ren
  - Suntrust Banks Inc, Atlanta, Georgia 30308, U.S.A
- Taeryon Choi
  - Department of Statistics, Korea University, Seoul 136-701, Republic of Korea
16
Abstract
Researchers are increasingly interested in regression models for functional data. This article discusses a comprehensive framework for additive (mixed) models for functional responses and/or functional covariates based on the guiding principle of reframing functional regression in terms of corresponding models for scalar data, allowing the adaptation of a large body of existing methods for these novel tasks. The framework encompasses many existing as well as new models. It includes regression for ‘generalized’ functional data, mean regression, quantile regression as well as generalized additive models for location, shape and scale (GAMLSS) for functional data. It admits many flexible linear, smooth or interaction terms of scalar and functional covariates as well as (functional) random effects and allows flexible choices of bases, particularly splines and functional principal components, and corresponding penalties for each term. It covers functional data observed on common (dense) or curve-specific (sparse) grids. Penalized-likelihood-based and gradient-boosting-based inference for these models are implemented in R packages refund and FDboost, respectively. We also discuss identifiability and computational complexity for the functional regression models covered. A running example on a longitudinal multiple sclerosis imaging study serves to illustrate the flexibility and utility of the proposed model class. Reproducible code for this case study is made available online.
Affiliation(s)
- Sonja Greven
  - Department of Statistics, Ludwig-Maximilians-Universität München, Germany
- Fabian Scheipl
  - Department of Statistics, Ludwig-Maximilians-Universität München, Germany
17
Xu Y, Müller P, Wahed AS, Thall PF. Bayesian Nonparametric Estimation for Dynamic Treatment Regimes with Sequential Transition Times. J Am Stat Assoc 2016; 111:921-935. [PMID: 28018015 PMCID: PMC5175473 DOI: 10.1080/01621459.2015.1086353] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 05/01/2014] [Revised: 06/01/2015] [Indexed: 10/23/2022]
Abstract
We analyze a dataset arising from a clinical trial involving multi-stage chemotherapy regimes for acute leukemia. The trial design was a 2 × 2 factorial for frontline therapies only. Motivated by the idea that subsequent salvage treatments affect survival time, we model therapy as a dynamic treatment regime (DTR), that is, an alternating sequence of adaptive treatments or other actions and transition times between disease states. These sequences may vary substantially between patients, depending on how the regime plays out. To evaluate the regimes, mean overall survival time is expressed as a weighted average of the means of all possible sums of successive transition times. We assume a Bayesian nonparametric survival regression model for each transition time, with a dependent Dirichlet process prior and Gaussian process base measure (DDP-GP). Posterior simulation is implemented by Markov chain Monte Carlo (MCMC) sampling. We provide general guidelines for constructing a prior using empirical Bayes methods. The proposed approach is compared with inverse probability of treatment weighting, including a doubly robust augmented version of this approach, for both single-stage and multi-stage regimes with treatment assignment depending on baseline covariates. The simulations show that the proposed nonparametric Bayesian approach can substantially improve inference compared to existing methods. An R program for implementing the DDP-GP-based Bayesian nonparametric analysis is freely available at https://www.ma.utexas.edu/users/yxu/.
Affiliation(s)
- Yanxun Xu
  - Division of Statistics and Scientific Computing, The University of Texas at Austin, Austin, TX
- Peter Müller
  - Department of Mathematics, The University of Texas at Austin, Austin, TX
- Abdus S. Wahed
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA
- Peter F. Thall
  - Department of Biostatistics, The University of Texas M.D. Anderson Cancer Center, Houston, TX
18
Bao J, Hanson T, McMillan GP, Knight K. Assessment of DPOAE test-retest difference curves via hierarchical Gaussian processes. Biometrics 2016; 73:334-343. [PMID: 27332505 DOI: 10.1111/biom.12550] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/01/2015] [Revised: 03/01/2016] [Accepted: 03/01/2016] [Indexed: 11/25/2022]
Abstract
Distortion product otoacoustic emissions (DPOAE) testing is a promising alternative to behavioral hearing tests and auditory brainstem response testing of pediatric cancer patients. The central goal of this study is to assess whether significant changes in the DPOAE frequency/emissions curve (DP-gram) occur in pediatric patients in a test-retest scenario. This is accomplished through the construction of normal reference charts, or credible regions, that DP-gram differences lie in, as well as contour probabilities that measure how abnormal (or in a certain sense rare) a test-retest difference is. A challenge is that the data were collected over varying frequencies, at different time points from baseline, and on possibly one or both ears. A hierarchical structural equation Gaussian process model is proposed to handle the different sources of correlation in the emissions measurements, wherein both subject-specific random effects and variance components governing the smoothness and variability of each child's Gaussian process are coupled together.
Affiliation(s)
- Junshu Bao
  - Department of Mathematics and Computer Science, Duquesne University, Pittsburgh, Pennsylvania, U.S.A
- Timothy Hanson
  - Department of Statistics, University of South Carolina, Columbia, South Carolina, U.S.A
- Garnett P McMillan
  - National Center for Rehabilitative Auditory Research, VA Rehabilitation Research & Development, Portland, Oregon, U.S.A
- Kristin Knight
  - Oregon Health and Science University, Pediatric Audiology, Portland, Oregon, U.S.A
19
Tang X, Hong Z, Hu Y, Lian H. Gaussian Process Models for Non Parametric Functional Regression with Functional Responses. Commun Stat-Theor M 2015. [DOI: 10.1080/03610926.2013.847101] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/23/2022]
20
Choi T, Woo Y. A Partially Linear Model Using a Gaussian Process Prior. Commun Stat-Simul C 2015. [DOI: 10.1080/03610918.2013.833226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
21
Tan T, Choi JY, Hwang H. Fuzzy Clusterwise Functional Extended Redundancy Analysis. Behaviormetrika 2015. [DOI: 10.2333/bhmk.42.37] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022]
22
Wang B, Shi JQ. Generalized Gaussian Process Regression Model for Non-Gaussian Functional Data. J Am Stat Assoc 2014. [DOI: 10.1080/01621459.2014.889021] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 10/25/2022]
23
Huang M, Li R, Wang H, Yao W. Estimating Mixture of Gaussian Processes by Kernel Smoothing. Journal of Business & Economic Statistics: A Publication of the American Statistical Association 2014; 32:259-270. [PMID: 24976675 PMCID: PMC4068740 DOI: 10.1080/07350015.2013.868084] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 05/31/2023]
Abstract
When the functional data are not homogeneous, e.g., there exist multiple classes of functional curves in the dataset, traditional estimation methods may fail. In this paper, we propose a new estimation procedure for the Mixture of Gaussian Processes, to incorporate both functional and inhomogeneous properties of the data. Our method can be viewed as a natural extension of high-dimensional normal mixtures. However, the key difference is that smoothed structures are imposed for both the mean and covariance functions. The model is shown to be identifiable, and can be estimated efficiently by a combination of the ideas from EM algorithm, kernel regression, and functional principal component analysis. Our methodology is empirically justified by Monte Carlo simulations and illustrated by an analysis of a supermarket dataset.
Affiliation(s)
- Mian Huang
  - School of Statistics and Management and Key Laboratory of Mathematical Economics at SHUFE, Ministry of Education, Shanghai University of Finance and Economics (SHUFE), Shanghai, 200433, P. R. China
- Runze Li
  - Department of Statistics and The Methodology Center, The Pennsylvania State University, University Park, PA 16802-2111
- Hansheng Wang
  - Department of Business Statistics and Econometrics, Guanghua School of Management, Peking University, Beijing, 100871, P. R. China
- Weixin Yao
  - Department of Statistics, Kansas State University, Manhattan, Kansas 66506
24
Chamroukhi F, Glotin H, Samé A. Model-based functional mixture discriminant analysis with hidden process regression for curve classification. Neurocomputing 2013. [DOI: 10.1016/j.neucom.2012.10.030] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 10/27/2022]
25
Woo Y, Choi T, Kim W. A Comparative Study on the Performance of Bayesian Partially Linear Models. Communications for Statistical Applications and Methods 2012. [DOI: 10.5351/ckss.2012.19.6.885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
26
|
Shi JQ, Wang B, Will EJ, West RM. Mixed-effects Gaussian process functional regression models with application to dose-response curve prediction. Stat Med 2012; 31:3165-77. [PMID: 22865484 DOI: 10.1002/sim.4502] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 01/28/2011] [Accepted: 11/24/2011] [Indexed: 11/07/2022]
Abstract
We propose a new semiparametric model for functional regression analysis, combining a parametric mixed-effects model with a nonparametric Gaussian process regression model, namely a mixed-effects Gaussian process functional regression model. The parametric component can provide explanatory information between the response and the covariates, whereas the nonparametric component can add nonlinearity. We can model the mean and covariance structures simultaneously, combining the information borrowed from other subjects with the information collected from each individual subject. We apply the model to dose-response curves that describe changes in the responses of subjects for differing levels of the dose of a drug or agent and have a wide application in many areas. We illustrate the method for the management of renal anaemia. An individual dose-response curve is improved when more information is included by this mechanism from the subject/patient over time, enabling a patient-specific treatment regime.
Affiliation(s)
- J Q Shi
- School of Mathematics and Statistics, Newcastle University, Newcastle, NE1 7RU, U.K.
|
27
|
|
28
|
Maximum likelihood ratio test for the stability of sequence of Gaussian random processes. Comput Stat Data Anal 2011. [DOI: 10.1016/j.csda.2011.01.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/19/2022]
|
29
|
Yi G, Shi JQ, Choi T. Penalized Gaussian process regression and classification for high-dimensional nonlinear data. Biometrics 2011; 67:1285-94. [PMID: 21385168 DOI: 10.1111/j.1541-0420.2011.01576.x] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 12/01/2022]
Abstract
The model based on a Gaussian process (GP) prior and a kernel covariance function can be used to fit nonlinear data with multidimensional covariates. It has been used as a flexible nonparametric approach for curve fitting, classification, clustering, and other statistical problems, and has been widely applied to complex nonlinear systems in many different areas, particularly in machine learning. However, the model is challenging to use for large-scale and high-dimensional data sets, for example, the meat data discussed in this article, which have 100 highly correlated covariates. For such data, it suffers from large variance in parameter estimation and high predictive errors, and, numerically, from unstable computation. In this article, a penalized likelihood framework is applied to the model based on GPs. Different penalties are investigated, and their suitability for the characteristics of GP models is discussed. The asymptotic properties are also discussed with the relevant proofs. Several applications to real biomechanical and bioinformatics data sets are reported.
Affiliation(s)
- G Yi
- School of Mathematics & Statistics, Newcastle University, United Kingdom; Department of Statistics, Korea University, South Korea
|
30
|
Choi T, Shi JQ, Wang B. A Gaussian process regression approach to a single-index model. J Nonparametr Stat 2011. [DOI: 10.1080/10485251003768019] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 10/19/2022]
|
31
|
Chen T, Wang B. Bayesian variable selection for Gaussian process regression: Application to chemometric calibration of spectrometers. Neurocomputing 2010. [DOI: 10.1016/j.neucom.2010.04.014] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
|
32
|
|