1
|
Ramli M, Budiantara IN, Ratnasari V. A method for parameter hypothesis testing in nonparametric regression with Fourier series approach. MethodsX 2023; 11:102468. PMID: 37964783; PMCID: PMC10641682; DOI: 10.1016/j.mex.2023.102468.
Abstract
The nonparametric regression model with the Fourier series approach was first introduced by Bilodeau in 1994, and several researchers have since developed it further. However, this research has been limited to parameter estimation; no work has addressed parameter hypothesis testing. Parameter hypothesis testing is a statistical method used to test the significance of parameters; in the nonparametric regression model with the Fourier series approach, it determines whether the estimated parameters have a significant influence on the model. Therefore, the purpose of this research is to develop parameter hypothesis testing for the nonparametric regression model with the Fourier series approach. We use the likelihood ratio test (LRT), which compares the likelihood functions under the parameter space of the null hypothesis and under the full parameter space. Using the LRT method, we obtain the form of the test statistic, its distribution, and the rejection region of the null hypothesis. To apply the method, we use return-on-assets (ROA) data from 47 publicly listed banks on the Indonesia Stock Exchange in 2020. The highlights of this research are:
- The Fourier series function is assumed to be a non-smooth function.
- The test statistic is derived using the LRT method and follows an F distribution.
- The estimated parameters in the ROA model have a significant influence on the model.
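The LRT-to-F-test logic summarized above can be sketched numerically: fit a reduced (trend-only) model and a full model containing Fourier cosine terms by ordinary least squares, then compare residual sums of squares. This is an illustrative sketch, not the authors' code; the toy data, the two-cosine basis, and all helper names are assumptions.

```python
import math

def fourier_design(x, num_cos):
    """Design row: intercept, linear trend, then cos(kx) terms (a common Fourier-series basis)."""
    return [1.0, x] + [math.cos((k + 1) * x) for k in range(num_cos)]

def fit_rss(X, y):
    """Ordinary least squares via the normal equations; returns the residual sum of squares."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    c = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for i in range(p):  # Gaussian elimination with partial pivoting
        piv = max(range(i, p), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        c[i], c[piv] = c[piv], c[i]
        for r in range(i + 1, p):
            f = A[r][i] / A[i][i]
            for j in range(i, p):
                A[r][j] -= f * A[i][j]
            c[r] -= f * c[i]
    b = [0.0] * p
    for i in range(p - 1, -1, -1):
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, p))) / A[i][i]
    return sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
               for row, yi in zip(X, y))

def f_statistic(rss_null, rss_full, df_extra, n, p_full):
    """F = ((RSS0 - RSS1) / q) / (RSS1 / (n - p)); reject H0 when F is large."""
    return ((rss_null - rss_full) / df_extra) / (rss_full / (n - p_full))

# Toy data: a trend plus an oscillation that the trend-only null model cannot capture.
xs = [0.1 * i for i in range(60)]
ys = [1.0 + 0.5 * x + 2.0 * math.cos(x) + 0.05 * math.sin(3.0 * x) for x in xs]
X_full = [fourier_design(x, num_cos=2) for x in xs]  # intercept, trend, cos(x), cos(2x)
X_null = [row[:2] for row in X_full]                 # intercept and trend only
F = f_statistic(fit_rss(X_null, ys), fit_rss(X_full, ys),
                df_extra=2, n=len(xs), p_full=4)
```

Under the null hypothesis the statistic would be compared against an F quantile with (q, n - p) degrees of freedom; a large F, as here, rejects the null that the Fourier terms are unnecessary.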
Affiliation(s)
- Mustain Ramli
- Department of Statistics, Faculty of Science and Data Analytics, Institut Teknologi Sepuluh Nopember, Kampus ITS-Sukolilo, Surabaya 60111, Indonesia
- I Nyoman Budiantara
- Department of Statistics, Faculty of Science and Data Analytics, Institut Teknologi Sepuluh Nopember, Kampus ITS-Sukolilo, Surabaya 60111, Indonesia
- Vita Ratnasari
- Department of Statistics, Faculty of Science and Data Analytics, Institut Teknologi Sepuluh Nopember, Kampus ITS-Sukolilo, Surabaya 60111, Indonesia
2
Putra R, Fadhlurrahman MG, Gunardi. Determination of the best knot and bandwidth in geographically weighted truncated spline nonparametric regression using generalized cross validation. MethodsX 2023; 10:101994. PMID: 36691670; PMCID: PMC9860359; DOI: 10.1016/j.mex.2022.101994.
Abstract
This study develops nonparametric regression for data containing spatial heterogeneity, with local parameter estimates for each observation location. Geographically Weighted Truncated Spline Nonparametric Regression (GWTSNR) combines Truncated Spline Nonparametric Regression (TSNR) and Geographically Weighted Regression (GWR), so it is necessary to select both the optimum knot points for the TSNR component and the best geographic weighting (bandwidth) for the GWR component; both are chosen using Generalized Cross Validation (GCV). The case study analyzes the morbidity rate in North Sumatra in 2020. Models are estimated with one, two, and three knot points and with Gaussian, bisquare, tricube, and exponential kernel weighting functions. Based on the minimum GCV value, the best model for the 2020 North Sumatra morbidity rate data uses one knot and the bisquare kernel. Based on the GWTSNR model, the significant predictors in each district/city fall into eight groups. Furthermore, GWTSNR models the morbidity rate better than TSNR, with an adjusted R-squared of 96.235 versus 70.159. Some of the highlights of the proposed approach are:
- The method combines nonparametric and spatial regression for morbidity rate modeling.
- Three knot-point settings were tested in the truncated spline component and four geographic weightings in the spatial component, with the best knot and bandwidth determined by Generalized Cross Validation.
- Regions of North Sumatra in 2020 are grouped according to the predictors that are significant for modeling morbidity rates.
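The GCV-based choice of the best knot can be illustrated with a deliberately simplified smoother: a piecewise-constant fit with a single knot, whose hat-matrix trace is just the number of fitted means. This is a sketch of the selection criterion only, with the one-knot piecewise-constant fit standing in for the truncated spline; all names and data are hypothetical.

```python
def piecewise_fit(xs, ys, knot):
    """Fit a mean on each side of the knot; return (RSS, tr(H)).
    For this simple linear smoother, tr(H) equals the number of fitted means."""
    left = [y for x, y in zip(xs, ys) if x <= knot]
    right = [y for x, y in zip(xs, ys) if x > knot]
    rss = 0.0
    for part in (left, right):
        m = sum(part) / len(part)
        rss += sum((y - m) ** 2 for y in part)
    return rss, 2.0

def gcv(rss, n, trace_h):
    """GCV(model) = n * RSS / (n - tr(H))^2; smaller is better."""
    return n * rss / (n - trace_h) ** 2

def best_knot(xs, ys, candidates):
    """Pick the candidate knot minimizing the GCV score."""
    n = len(xs)
    def score(k):
        rss, tr = piecewise_fit(xs, ys, k)
        return gcv(rss, n, tr)
    return min(candidates, key=score)

xs = [float(i) for i in range(10)]
ys = [0.0] * 5 + [10.0] * 5                   # a step between x = 4 and x = 5
chosen = best_knot(xs, ys, [1.5, 4.5, 7.5])   # -> 4.5
```

The same criterion carries over to the real setting: each candidate (knot, bandwidth) pair yields a linear smoother with its own RSS and hat-matrix trace, and the pair with the smallest GCV score wins.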
3
Weaver C, Xiao L, Lindquist MA. Single-index models with functional connectivity network predictors. Biostatistics 2022; 24:52-67. PMID: 33948617; DOI: 10.1093/biostatistics/kxab015.
Abstract
Functional connectivity is defined as the undirected association between two or more functional magnetic resonance imaging (fMRI) time series. Increasingly, subject-level functional connectivity data have been used to predict and classify clinical outcomes and subject attributes. We propose a single-index model wherein response variables and sparse functional connectivity network valued predictors are linked by an unspecified smooth function in order to accommodate potentially nonlinear relationships. We exploit the network structure of functional connectivity by imposing meaningful sparsity constraints, which lead not only to the identification of interactions between regions that are associated with the response but also to an assessment of whether the functional connectivity associated with a brain region is related to the response variable. We demonstrate the effectiveness of the proposed model in simulation studies and in an application to a resting-state fMRI data set from the Human Connectome Project to model fluid intelligence and sex and to identify predictive links between brain regions.
Affiliation(s)
- Caleb Weaver
- Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, NC 27606, USA
- Luo Xiao
- Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, NC 27606, USA
- Martin A Lindquist
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe Street, Baltimore, MD 21205, USA
4
Cai X, Coffman DL, Piper ME, Li R. Estimation and inference for the mediation effect in a time-varying mediation model. BMC Med Res Methodol 2022; 22:113. PMID: 35436861; PMCID: PMC9014585; DOI: 10.1186/s12874-022-01585-x.
Abstract
Background: Traditional mediation analysis typically examines the relations among an intervention, a time-invariant mediator, and a time-invariant outcome variable. Although there may be a total effect of the intervention on the outcome, there is a need to understand the process by which the intervention affects the outcome (i.e., the indirect effect through the mediator). This indirect effect is frequently assumed to be time-invariant. With improvements in data collection technology, it is possible to obtain repeated assessments over time, resulting in intensive longitudinal data. This calls for an extension of traditional mediation analysis to incorporate time-varying variables as well as time-varying effects.
Methods: We focus on estimation and inference for the time-varying mediation model, which allows mediation effects to vary as a function of time. We propose a two-step approach to estimate the time-varying mediation effect, and we use a simulation-based approach to derive the corresponding point-wise confidence band.
Results: Simulation studies show that the proposed procedures perform well when comparing the confidence band and the true underlying model. We further apply the proposed model and the statistical inference procedure to data collected from a smoking cessation study.
Conclusions: We present a model for estimating time-varying mediation effects that allows both time-varying outcomes and mediators. Simulation-based inference is also proposed and implemented in a user-friendly R package. Supplementary information is available in the online version at 10.1186/s12874-022-01585-x.
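At a single time point, the mediation calculation reduces to estimating the treatment-to-mediator slope a(t) and the mediator-to-outcome slope b(t) adjusted for treatment, and multiplying them. The sketch below does exactly that for one time slice using the Frisch-Waugh residual trick; it is a simplified illustration, not the paper's estimator (which smooths coefficients over time and builds simulation-based bands), and all names and the toy data are assumptions.

```python
def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """Simple-regression slope of y on x."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
           sum((a - mx) ** 2 for a in x)

def residual(x, y):
    """Residuals of y after simple regression on x."""
    s = slope(x, y)
    mx, my = mean(x), mean(y)
    return [b - (my + s * (a - mx)) for a, b in zip(x, y)]

def mediation_effect(x, m, y):
    """Pointwise indirect effect a(t) * b(t) at one time slice.
    a(t): slope of M on X; b(t): slope of Y on M adjusting for X
    (via the Frisch-Waugh partialling-out identity)."""
    a = slope(x, m)
    b = slope(residual(x, m), residual(x, y))
    return a * b

x = [0.0, 1.0, 2.0, 3.0]                       # treatment at time t
m = [1.0, 1.0, 3.0, 7.0]                       # mediator at time t (M = 2X + noise)
y = [3.0 * mi + xi for mi, xi in zip(m, x)]    # outcome at time t (Y = 3M + X)
effect_t = mediation_effect(x, m, y)           # a(t) * b(t) -> 6.0
```

Repeating this at each assessment time yields a curve of indirect effects over time, which is the quantity the paper's confidence band is built around.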
Affiliation(s)
- Xizhen Cai
- Department of Mathematics and Statistics, Williams College, Williamstown, MA, USA
- Donna L Coffman
- Department of Epidemiology and Biostatistics, Temple University, Philadelphia, PA, USA
- Megan E Piper
- Center for Tobacco Research and Intervention, School of Medicine and Public Health, University of Wisconsin, Madison, WI, USA; Department of Medicine, School of Medicine and Public Health, University of Wisconsin, Madison, WI, USA
- Runze Li
- Department of Statistics, Pennsylvania State University, University Park, PA, USA
5
Meng C, Yu J, Chen Y, Zhong W, Ma P. Smoothing splines approximation using Hilbert curve basis selection. J Comput Graph Stat 2022; 31:802-812. PMID: 36407675; PMCID: PMC9674117; DOI: 10.1080/10618600.2021.2002161.
Abstract
Smoothing splines have been used pervasively in nonparametric regression. However, the computational burden of smoothing splines is significant when the sample size n is large: when the number of predictors d ≥ 2, the computational cost of the standard approach is of order O(n^3). Many methods have been developed to approximate smoothing spline estimators by using q basis functions instead of n, reducing the computational cost to order O(nq^2); these are called basis selection methods. Despite their algorithmic benefits, most basis selection methods require the assumption that the sample is uniformly distributed on a hypercube, and their performance may deteriorate when that assumption is not met. To overcome this obstacle, we develop an efficient algorithm that is adaptive to the unknown probability density function of the predictors. Theoretically, we show the proposed estimator has the same convergence rate as the full-basis estimator when q is roughly of order O(n^{2d/{(pr+1)(d+2)}}), where p ∈ [1, 2] and r ≈ 4 are constants that depend on the type of spline. Numerical studies on various synthetic datasets demonstrate the superior performance of the proposed estimator in comparison with mainstream competitors.
Affiliation(s)
- Cheng Meng
- Institute of Statistics and Big Data, Renmin University of China
- Jun Yu
- School of Mathematics and Statistics, Beijing Institute of Technology
- Ping Ma
- Department of Statistics, University of Georgia
6
He Y, Lan X, Zhou Z, Wang F. Analyzing the spatial network structure of agricultural greenhouse gases in China. Environ Sci Pollut Res Int 2021; 28:7929-7944. PMID: 33043424; DOI: 10.1007/s11356-020-10945-3.
Abstract
Investigating the regional correlation and the factors affecting agricultural greenhouse gas (GHG) emissions can help establish a regional mechanism for the synergistic reduction of emissions and produce chain-like reductions. Departing from the traditional geographical relationship analysis framework and its linear analysis approach, we use social network analysis to discern the regional correlations in agricultural GHG emissions from a relational network viewpoint, clarify the network function of each node, and explain agricultural GHG correlation from spatial, economic, and technological viewpoints by nonparametric regression. The results indicate that (1) the emission network is stable and there is a relationship of control between regions; (2) Central China is the most important region in agricultural GHG networks, the importance of the northwest and southwest has increased, and the northeast has remained relatively independent; (3) influencing regions are mainly concentrated in the middle reaches of the Yangtze River and the northwest, while dependent regions are concentrated in municipalities such as Beijing and Tianjin and in the coastal regions of the southeast; and (4) the interprovincial agricultural GHG correlation can be enhanced by shortening the spatial distance, strengthening economic ties, and increasing the diffusion of technology. Implementing a "leader-follower" strategy according to the role of each region and enhancing the intermediary's "conduit" role will ultimately lead to the formation of an interprovincial interactive and cooperative emission reduction mechanism.
Affiliation(s)
- Yanqiu He
- College of Management, Sichuan Agricultural University, Chengdu, 611130, China
- Xiang Lan
- Sichuan Provincial Bureau of Statistics, Sichuan, China
- Zuoang Zhou
- Sichuan Provincial Bureau of Statistics, Sichuan, China
- Fang Wang
- College of Management, Sichuan Agricultural University, Chengdu, 611130, China
7
Cox LA. Implications of nonlinearity, confounding, and interactions for estimating exposure concentration-response functions in quantitative risk analysis. Environ Res 2020; 187:109638. PMID: 32450424; PMCID: PMC7235595; DOI: 10.1016/j.envres.2020.109638.
Abstract
Recent advances in understanding of biological mechanisms and adverse outcome pathways for many exposure-related diseases show that certain common mechanisms involve thresholds and nonlinearities in biological exposure concentration-response (C-R) functions. These range from ultrasensitive molecular switches in signaling pathways, to assembly and activation of inflammasomes, to rupture of lysosomes and pyroptosis of cells. Realistic dose-response modeling and risk analysis must confront the reality of nonlinear C-R functions. This paper reviews several challenges that thresholds and nonlinearities pose for traditional statistical regression modeling of C-R functions, together with methods for overcoming them. Statistically significant positive exposure-response regression coefficients can arise from many non-causal sources, such as model specification errors, incompletely controlled confounding, exposure estimation errors, attribution of the effects of interactions to individual factors, associations among explanatory variables, or coincident historical trends. If so, the unadjusted regression coefficients do not necessarily predict how or whether reducing exposure would reduce risk. We discuss statistical options for controlling for such threats, and advocate causal Bayesian networks and dynamic simulation models as potentially valuable complements to nonparametric regression modeling for assessing causally interpretable nonlinear C-R functions and understanding how time patterns of exposures affect risk. We conclude that these approaches are promising for extending the great advances made in statistical C-R modeling methods in recent decades to clarify how to design regulations that are more causally effective in protecting human health.
8
Thomas S, Johnson JA, Xie F. 3125 steps to perfect health: a nonparametric approach to developing the EQ-5D-5L value set. Qual Life Res 2020; 29:3109-3118. PMID: 32705459; DOI: 10.1007/s11136-020-02589-0.
Abstract
PURPOSE: The EQ-5D-5L is a commonly used instrument for assessing the utility of different health states. Health state utility values are a key component of health technology evaluations, which support evidence-based decisions on health resource allocation and therefore rely on the accuracy of the value set used. This paper takes an alternative approach to developing an EQ-5D-5L value set for Canada. The aim is to introduce a robust method that is likely to generate a value set with improved accuracy and that can be used to generate value sets for other populations without modification. METHODS: The common approach to developing a value set for preference-based instruments is to ask a population sample to value a subset of the health states using an established preference elicitation technique. The relationship between the elicited health states and the preferences then informs a model that predicts utility values for the unsampled health states described by the instrument. The true relationship is unknown, and the functional forms chosen in the modelling process vary across valuation studies. We use nonparametric local constant regression to estimate an EQ-5D-5L value set for Canada and propose this method as an alternative for value set development because it does not require specifying a functional form at the outset. RESULTS: Compared to the existing valuation model for Canada, the nonparametric method improves in-sample fit, reducing the average squared prediction error by 94.46% and the mean absolute error by 79.37%. In four of five sets of out-of-sample studies, the new approach performs significantly better than nine comparison models. Despite placing no restriction on the functional form of the resulting valuations, the value set generated by this approach is logically consistent: in 100% of the pairs of health states in which one state is dominant, the health state values respect this ordering. The value set also appears to differ substantially from the comparators. CONCLUSIONS: Overall, the results suggest that nonparametric regression is a promising tool for estimating EQ-5D-5L value sets and may be a good option in a standardised methodology for value set development.
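The local constant (Nadaraya-Watson) estimator at the heart of this approach has a compact form: a kernel-weighted average of observed valuations near the target health state. A one-dimensional sketch with a Gaussian kernel follows; the real application smooths over the five EQ-5D-5L dimensions, and the names and toy data here are illustrative only.

```python
import math

def local_constant(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimator: kernel-weighted average of responses near x0.
    No functional form is assumed; the bandwidth controls the smoothing."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# Toy valuations on a one-dimensional "severity" axis.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]
est = local_constant(1.5, xs, ys, bandwidth=0.5)  # between the middle two responses
```

Because the estimate at each point is just a weighted average of observed values, it adapts to whatever shape the data exhibit, which is why no functional form needs to be chosen at the outset.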
Affiliation(s)
- Stephanie Thomas
- Economics, Finance and Property, Curtin University, Building 408, Room 3015, Kent Street, Bentley, WA, 6102, Australia
- Jeffrey A Johnson
- School of Public Health, University of Alberta, 2-040 Li Ka Shing Centre for Health Research Innovation, Edmonton, AB, T6G 2E1, Canada
- Feng Xie
- Health Research Methods, Evidence and Impact, Communications Research Lab (CRL) 223, McMaster University, 1280 Main Street West, Hamilton, ON, L8S 4K1, Canada
9
Madrid Padilla OH, Sharpnack J, Chen Y, Witten DM. Adaptive nonparametric regression with the K-nearest neighbour fused lasso. Biometrika 2020; 107:293-310. PMID: 32454528; DOI: 10.1093/biomet/asz071.
Abstract
The fused lasso, also known as total-variation denoising, is a locally adaptive function estimator over a regular grid of design points. In this article, we extend the fused lasso to settings in which the points do not occur on a regular grid, leading to a method for nonparametric regression. This approach, which we call the K-nearest-neighbours fused lasso, involves computing the K-nearest-neighbours graph of the design points and then performing the fused lasso over this graph. We show that this procedure has a number of theoretical advantages over competing methods: specifically, it inherits local adaptivity from its connection to the fused lasso, and it inherits manifold adaptivity from its connection to the K-nearest-neighbours approach. In a simulation study and an application to flu data, we show that excellent results are obtained. For completeness, we also study an estimator that makes use of an ε-graph rather than a K-nearest-neighbours graph and contrast it with the K-nearest-neighbours fused lasso.
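The construction is easy to sketch: build the symmetrized K-nearest-neighbours graph of the design points, then penalize differences of the fitted values across its edges. The sketch below uses one-dimensional points for brevity (real design points are typically multivariate) and only builds the graph and evaluates the fused lasso objective; minimizing it in practice requires a dedicated total-variation solver. All names here are assumptions.

```python
def knn_edges(points, k):
    """Undirected edge set of the symmetrized k-nearest-neighbours graph (1-D points)."""
    edges = set()
    for i, p in enumerate(points):
        nearest = sorted((abs(q - p), j) for j, q in enumerate(points) if j != i)
        for _, j in nearest[:k]:
            edges.add((min(i, j), max(i, j)))  # symmetrize by storing unordered pairs
    return edges

def fused_lasso_objective(theta, ys, edges, lam):
    """(1/2) * sum_i (y_i - theta_i)^2 + lam * sum_{(i,j) in E} |theta_i - theta_j|."""
    fit = 0.5 * sum((y - t) ** 2 for y, t in zip(ys, theta))
    penalty = lam * sum(abs(theta[i] - theta[j]) for i, j in edges)
    return fit + penalty

pts = [0.0, 1.0, 10.0]
E = knn_edges(pts, k=1)                       # -> {(0, 1), (1, 2)}
obj = fused_lasso_objective([0.0, 0.0, 1.0],  # candidate fitted values
                            [0.0, 0.0, 1.0],  # observed responses
                            E, lam=1.0)       # -> 1.0
```

The penalty encourages fitted values at graph-adjacent design points to fuse, which is what yields the piecewise-constant, locally adaptive fits described above.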
Affiliation(s)
- James Sharpnack
- Department of Statistics, University of California, One Shields Avenue, Davis, California, U.S.A
- Yanzhen Chen
- Department of Information Systems, Business Statistics and Operations Management, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Daniela M Witten
- Department of Statistics, University of Washington, Seattle, Washington, U.S.A
10
Meng C, Zhang X, Zhang J, Zhong W, Ma P. More efficient approximation of smoothing splines via space-filling basis selection. Biometrika 2020; 107:723-735. PMID: 32831354; DOI: 10.1093/biomet/asaa019.
Abstract
We consider the problem of approximating smoothing spline estimators in a nonparametric regression model. When applied to a sample of size n, the smoothing spline estimator can be expressed as a linear combination of n basis functions, requiring O(n^3) computational time when the number d of predictors is two or more. Such a sizeable computational cost hinders the broad applicability of smoothing splines. In practice, the full-sample smoothing spline estimator can be approximated by an estimator based on q randomly selected basis functions, resulting in a computational cost of O(nq^2). It is known that these two estimators converge at the same rate when q is of order O(n^{2/(pr+1)}), where p depends on the true function and r depends on the type of spline; such a q is called the essential number of basis functions. In this article, we develop a more efficient basis selection method. By selecting basis functions corresponding to approximately equally spaced observations, the proposed method chooses a set of basis functions with great diversity. The asymptotic analysis shows that the proposed smoothing spline estimator can decrease the essential number of basis functions to around O(n^{1/(pr+1)}) when d ≤ pr + 1. Applications to synthetic and real-world datasets show that the proposed method leads to a smaller prediction error than other basis selection methods.
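The "approximately equally spaced observations" idea can be sketched in one dimension: sort the sample and take q order statistics at evenly spaced ranks as the basis points, which spreads the selected basis across the observed range instead of sampling it at random. The paper's method is a multivariate space-filling selection; this 1-D sketch and its names are illustrative assumptions.

```python
def equally_spaced_indices(xs, q):
    """Indices of q sample points sitting at (approximately) equally spaced ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])  # ranks of the sample
    n = len(xs)
    return [order[round(j * (n - 1) / (q - 1))] for j in range(q)]

xs = [5.0, 1.0, 3.0, 2.0, 4.0]
idx = equally_spaced_indices(xs, q=3)
basis_points = [xs[i] for i in idx]   # -> [1.0, 3.0, 5.0]
```

Random selection could by chance cluster all q basis points in one region; rank-based (or, in higher dimensions, space-filling) selection guarantees the diversity that drives the smaller essential number of basis functions.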
Affiliation(s)
- Cheng Meng
- Department of Statistics, University of Georgia, 310 Herty Dr., Athens, Georgia 30602, U.S.A
- Xinlian Zhang
- Department of Statistics, University of Georgia, 310 Herty Dr., Athens, Georgia 30602, U.S.A
- Jingyi Zhang
- Department of Statistics, University of Georgia, 310 Herty Dr., Athens, Georgia 30602, U.S.A
- Wenxuan Zhong
- Department of Statistics, University of Georgia, 310 Herty Dr., Athens, Georgia 30602, U.S.A
- Ping Ma
- Department of Statistics, University of Georgia, 310 Herty Dr., Athens, Georgia 30602, U.S.A
11
Hayakawa S, Suzuki T. On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces. Neural Netw 2019; 123:343-361. PMID: 31901565; DOI: 10.1016/j.neunet.2019.12.014.
Abstract
Deep learning has been applied to various tasks in the field of machine learning and has shown superiority to other common procedures such as kernel methods. To provide a better theoretical understanding of the reasons for its success, we discuss the performance of deep learning and other methods on a nonparametric regression problem with Gaussian noise. Whereas existing theoretical studies of deep learning have been based mainly on mathematical theories of well-known function classes such as Hölder and Besov classes, we focus on function classes with discontinuity and sparsity, which arise naturally in practice. To highlight the effectiveness of deep learning, we compare deep learning with a class of linear estimators representative of a class of shallow estimators. It is shown that the minimax risk of a linear estimator on the convex hull of a target function class does not differ from that of the original target function class. This results in the suboptimality of linear methods over a simple but non-convex function class, on which deep learning can attain nearly the minimax-optimal rate. In addition to this extreme case, we consider function classes with sparse wavelet coefficients. On these function classes, deep learning also attains the minimax rate up to logarithmic factors of the sample size, and linear methods are still suboptimal if the assumed sparsity is strong. We also point out that the parameter sharing of deep neural networks can remarkably reduce the complexity of the model in our setting.
Affiliation(s)
- Satoshi Hayakawa
- Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Japan
- Taiji Suzuki
- Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Japan; Center for Advanced Intelligence Project, RIKEN, Japan
12
Paszulewicz J, Wolski P, Gajdek M. Is laterality adaptive? Pitfalls in disentangling the laterality-performance relationship. Cortex 2019; 125:175-189. PMID: 31999962; DOI: 10.1016/j.cortex.2019.11.019.
Abstract
Unlike non-human animal studies that have progressively demonstrated the advantages of being asymmetrical at an individual, group and population level, human studies show a quite inconsistent picture. Specifically, it is hardly clear if and how the strength of lateralization that an individual is equipped with relates to their cognitive performance. While some of these inconsistencies can be attributed to procedural and conceptual differences, the issue is aggravated by the fact that the intrinsic mathematical interdependence of the measures of laterality and performance produces spurious correlations that can be mistaken for evidence of an adaptive advantage of asymmetry. Leask and Crow [Leask, S. J., & Crow, T. J. (1997), How far does the brain lateralize?: an unbiased method for determining the optimum degree of hemispheric specialization. Neuropsychologia, 35(10), 1381-1387] devised a method of overcoming this problem that has been subsequently used in several large-sample studies investigating the asymmetry-performance relationship. In our paper we show that the original Leask and Crow method and its later variants fall victim to inherent nonlinear dependencies and produce artifacts. By applying the Leask and Crow method to random data and with mathematical analysis, we demonstrate that what has been believed to describe the true asymmetry-performance relation in fact only reflects the idiosyncrasies of the method itself. We think that the approach taken by Leask in his later paper [Leask, S. (2003), Principal curve analysis avoids assumptions of dependence between measures of hand skill. Laterality, 8(4), 307-316. doi:10.1080/13576500342000004] might be preferable.
Affiliation(s)
- Piotr Wolski
- Institute of Psychology, Jagiellonian University, Kraków, Poland
- Marek Gajdek
- Emeritus Associate Professor of Kielce University of Technology, Kielce, Poland
13
Hammell AE, Helwig NE, Kaczkurkin AN, Sponheim SR, Lissek S. The temporal course of over-generalized conditioned threat expectancies in posttraumatic stress disorder. Behav Res Ther 2019; 124:103513. PMID: 31864116; DOI: 10.1016/j.brat.2019.103513.
Abstract
One key conditioning abnormality in posttraumatic stress disorder (PTSD) is heightened generalization of fear from a conditioned danger-cue (CS+) to similarly appearing safe stimuli. The present work represents the first effort to track the time-course of heightened generalization in PTSD with the prediction of heightened PTSD-related over-generalization in earlier but not later trials. This prediction derives from past discriminative fear-conditioning studies providing incidental evidence that over-generalization in PTSD may be reduced with sufficient learning trials. In the current study, we re-analyzed previously published conditioned fear-generalization data (Kaczkurkin et al., 2017) including combat veterans with PTSD (n = 15) or subthreshold PTSD (SubPTSD: n = 18), and trauma controls (TC: n = 19). This re-analysis aimed to identify the trial-by-trial course of group differences in generalized perceived risk across three classes of safe generalization stimuli (GSs) parametrically varying in similarity to a CS+ paired with shock. Those with PTSD and SubPTSD, relative to TC, displayed significantly elevated generalization to all GSs combined in early but not late generalization trials. Additionally, over-generalization in PTSD and SubPTSD persisted across trials to a greater extent for classes of GSs bearing higher resemblance to CS+. Such results suggest that PTSD-related over-generalization of conditioned threat expectancies can be reduced with sufficient exposure to unreinforced GSs and accentuate the importance of analyzing trial-by-trial changes when assessing over-generalization in clinical populations.
Affiliation(s)
- Abbey E Hammell, Department of Psychology, University of Minnesota, Elliot Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA
- Nathaniel E Helwig, Department of Psychology, University of Minnesota, Elliot Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA; School of Statistics, University of Minnesota, Ford Hall, 224 Church Street SE, Minneapolis, MN, 55455, USA
- Antonia N Kaczkurkin, Department of Psychological Sciences, Vanderbilt University, 2301 Vanderbilt Place, Nashville, TN, 37240-7817, USA
- Scott R Sponheim, Minneapolis Veterans Affairs Health Care System, 1 Veterans Drive, Minneapolis, MN, 55417, USA; Department of Psychiatry, University of Minnesota, F282/2A West Building, 2450 Riverside Avenue S, Minneapolis, MN, 55454, USA
- Shmuel Lissek, Department of Psychology, University of Minnesota, Elliot Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA
14
Haris A, Shojaie A, Simon N. Nonparametric regression with adaptive truncation via a convex hierarchical penalty. Biometrika 2019; 106:87-107. [PMID: 31427821 DOI: 10.1093/biomet/asy056] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2017] [Indexed: 11/13/2022] Open
Abstract
We consider the problem of nonparametric regression with a potentially large number of covariates. We propose a convex, penalized estimation framework that is particularly well suited to high-dimensional sparse additive models and combines the appealing features of finite basis representation and smoothing penalties. In the case of additive models, a finite basis representation provides a parsimonious representation for fitted functions but is not adaptive when component functions possess different levels of complexity. In contrast, a smoothing spline-type penalty on the component functions is adaptive but does not provide a parsimonious representation. Our proposal simultaneously achieves parsimony and adaptivity in a computationally efficient way. We demonstrate these properties through empirical studies and show that our estimator converges at the minimax rate for functions within a hierarchical class. We further establish minimax rates for a large class of sparse additive models. We also develop an efficient algorithm that scales similarly to the lasso with the number of covariates and sample size.
Affiliation(s)
- Asad Haris, Department of Biostatistics, University of Washington, 1705 NE Pacific Street, Seattle, Washington, USA
- Ali Shojaie, Department of Biostatistics, University of Washington, 1705 NE Pacific Street, Seattle, Washington, USA
- Noah Simon, Department of Biostatistics, University of Washington, 1705 NE Pacific Street, Seattle, Washington, USA
15
Eckle K, Schmidt-Hieber J. A comparison of deep networks with ReLU activation function and linear spline-type methods. Neural Netw 2018; 110:232-242. [PMID: 30616095 DOI: 10.1016/j.neunet.2018.11.005] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2018] [Revised: 11/16/2018] [Accepted: 11/20/2018] [Indexed: 10/27/2022]
Abstract
Deep neural networks (DNNs) generate much richer function spaces than shallow networks. However, since the function spaces induced by shallow networks already have several approximation-theoretic drawbacks, this fact alone does not necessarily explain the success of deep networks. In this article we take another route and compare the expressive power of DNNs with ReLU activation function to linear spline methods. We show that MARS (multivariate adaptive regression splines) is improperly learnable by DNNs in the sense that for any given function that can be expressed as a function in MARS with M parameters there exists a multilayer neural network with O(M log(M/ε)) parameters that approximates this function up to sup-norm error ε. We show a similar result for expansions with respect to the Faber-Schauder system. Based on this, we derive risk comparison inequalities that bound the statistical risk of fitting a neural network by the statistical risk of spline-based methods. This shows that deep networks perform better than, or only slightly worse than, the considered spline methods. We provide a constructive proof for the function approximations.
Affiliation(s)
- Konstantin Eckle, Leiden University, Mathematical Institute, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
- Johannes Schmidt-Hieber, Leiden University, Mathematical Institute, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
16
Li P, Li X, Chen L. The asymptotic normality of internal estimator for nonparametric regression. J Inequal Appl 2018; 2018:231. [PMID: 30839660 PMCID: PMC6132389 DOI: 10.1186/s13660-018-1832-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/25/2018] [Accepted: 09/01/2018] [Indexed: 06/09/2023]
Abstract
In this paper, we study the asymptotic properties of the internal estimator of nonparametric regression with independent and dependent data. Under some weak conditions, we present results on the asymptotic normality of the estimator. Our results extend several corresponding results in the literature.
Affiliation(s)
- Penghua Li, Automotive Electronics Engineering Research Center, College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, China
- Xiaoqin Li, School of Mathematical Sciences, Anhui University, Hefei, China
- Liping Chen, School of Electrical Engineering and Automation, Hefei University of Technology, Hefei, China
17
Abstract
Quantile regression estimates conditional quantiles and has wide applications in the real world. Estimating high conditional quantiles is an important problem. The regular quantile regression (QR) method typically specifies a linear or non-linear model and then estimates its coefficients to obtain the estimated conditional quantiles. This approach may be restricted by the model setting. To overcome this problem, this paper proposes a direct nonparametric quantile regression method with a five-step algorithm. Monte Carlo simulations show good efficiency for the proposed direct QR estimator relative to the regular QR estimator. The paper also applies the proposed method to two real-world examples. The simulations and the examples illustrate that the proposed direct nonparametric quantile regression model fits the data sets better than the regular quantile regression method.
Affiliation(s)
- Mei Ling Huang, Department of Mathematics & Statistics, Brock University, St. Catharines, Ontario, L2S 3A1, Canada
18
Joshi RR. Diversity and motif conservation in protein 3D structural landscape: exploration by a new multivariate simulation method. J Mol Model 2018; 24:76. [PMID: 29500695 DOI: 10.1007/s00894-018-3614-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2017] [Accepted: 01/31/2018] [Indexed: 11/29/2022]
Abstract
In this paper, diversity and conservation in the 'landscape' of random variation of protein tertiary structures are explored for quantitative feature-vector models of major types of functionally important 3D structural motifs. For this, I have deployed a recently developed nonparametric regression (NPR)-based multidimensional copula method of simulation. Apart from improved accuracy of multidimensional random sample generation, the simulation provides additional insight into diversity in the protein structural landscape in terms of random variation in the feature-vector. It shows the relative importance of several features, with biological implications, in conservation of motifs. Mapping of this landscape in distance-preserving 2D eigenspace also shows consistency in demarcation of different motif classes and preservation of their characteristic patterns in this 2D space.
Affiliation(s)
- Rajani R Joshi, Department of Mathematics, Indian Institute of Technology Bombay, Mumbai, India
19
Abstract
Characterizing the correspondence between an ordinal measurement and a continuous measurement is often of interest in mental health studies. To this end, Peng, Li, Guo, and Manatunga (2011) introduced the concept of broad sense agreement (BSA) and developed nonparametric estimation and inference for a BSA measure. In this work, we propose a nonparametric regression framework for BSA, which provides a robust tool to further investigate population heterogeneity in BSA. We develop inferential procedures including regression function estimation and hypothesis testing. Extensive simulation studies demonstrate satisfactory performance of the proposed method. We also apply the new method to a recent Grady Trauma Study and reveal an interesting impact of depression severity on the alignment between a self-reported symptom instrument and clinician diagnosis in posttraumatic stress disorder (PTSD) patients.
Affiliation(s)
- Akm Fazlur Rahman, Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Limin Peng, Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Amita Manatunga, Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Ying Guo, Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
20
Bhadra A, Carroll RJ. Exact sampling of the unobserved covariates in Bayesian spline models for measurement error problems. Stat Comput 2016; 26:827-840. [PMID: 27418743 PMCID: PMC4941830 DOI: 10.1007/s11222-015-9572-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/02/2014] [Accepted: 04/24/2015] [Indexed: 06/06/2023]
Abstract
In truncated polynomial spline or B-spline models where the covariates are measured with error, a fully Bayesian approach to model fitting requires the covariates and model parameters to be sampled at every Markov chain Monte Carlo iteration. Sampling the unobserved covariates poses a major computational problem, and usually Gibbs sampling is not possible. This forces the practitioner to use a Metropolis-Hastings step, which might suffer from unacceptable performance due to poor mixing and might require careful tuning. In this article we show that, for truncated polynomial spline or B-spline models of degree equal to one, the complete conditional distribution of the covariates measured with error is available explicitly as a mixture of double-truncated normals, thereby enabling a Gibbs sampling scheme. We demonstrate via a simulation study that our technique performs favorably in terms of computational efficiency and statistical performance. Our results indicate up to 62% and 54% increases in mean integrated squared error efficiency when compared to existing alternatives for truncated polynomial splines and B-splines, respectively. Furthermore, there is evidence that the gain in efficiency increases with the measurement error variance, indicating the proposed method is a particularly valuable tool for challenging applications that present high measurement error. We conclude with a demonstration on a nutritional epidemiology data set from the NIH-AARP study and by pointing out some possible extensions of the current work.
Affiliation(s)
- Anindya Bhadra, Department of Statistics, Purdue University, 250 N. University Street, West Lafayette, IN 47907-2066, USA
- Raymond J. Carroll, Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX 77843-3143, USA
21
Abstract
The appearance of massive data has become increasingly common in contemporary scientific research. When sample size n is huge, classical learning methods become computationally costly for the regression purpose. Recently, the orthogonal greedy algorithm (OGA) has been revitalized as an efficient alternative in the context of kernel-based statistical learning. In a learning problem, accurate and fast prediction is often of interest. This makes an appropriate termination crucial for OGA. In this paper, we propose a new termination rule for OGA via investigating its predictive performance. The proposed rule is conceptually simple and convenient for implementation, which suggests an [Formula: see text] number of essential updates in an OGA process. It therefore provides an appealing route to conduct efficient learning for massive data. With a sample dependent kernel dictionary, we show that the proposed method is strongly consistent with an [Formula: see text] convergence rate to the oracle prediction. The promising performance of the method is supported by both simulation and real data examples.
Affiliation(s)
- Chen Xu, The Pennsylvania State University
- Runze Li, The Pennsylvania State University
22
Abstract
For non-stationary processes, the time-varying correlation structure provides useful insights into the underlying model dynamics. We study estimation and inferences for local autocorrelation process in locally stationary time series. Our constructed simultaneous confidence band can be used to address important hypothesis testing problems, such as whether the local autocorrelation process is indeed time-varying and whether the local autocorrelation is zero. In particular, our result provides an important generalization of the R function acf() to locally stationary Gaussian processes. Simulation studies and two empirical applications are developed. For the global temperature series, we find that the local autocorrelations are time-varying and have a "V" shape during 1910-1960. For the S&P 500 index, we conclude that the returns satisfy the efficient-market hypothesis whereas the magnitudes of returns show significant local autocorrelations.
23
Kim S, Zhao Z, Shao X. Nonparametric Functional Central Limit Theorem for Time Series Regression with Application to Self-normalized Confidence Interval. J MULTIVARIATE ANAL 2015; 133:277-290. [PMID: 25386031 PMCID: PMC4223815 DOI: 10.1016/j.jmva.2014.09.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
This paper is concerned with the inference of a nonparametric mean function in a time series context. The commonly used kernel smoothing estimate is asymptotically normal, and the traditional inference procedure consistently estimates the asymptotic variance function and relies upon normal approximation. Consistent estimation of the asymptotic variance function involves another level of nonparametric smoothing. In practice, the choice of the extra bandwidth parameter can be difficult, the inference results can be sensitive to bandwidth selection, and the normal approximation can be quite unsatisfactory in small samples, leading to poor coverage. To alleviate the problem, we propose to extend the recently developed self-normalized approach, a bandwidth-free inference procedure developed for parametric inference, to construct point-wise confidence intervals for the nonparametric mean function. To justify the asymptotic validity of the self-normalized approach, we establish a functional central limit theorem for recursive nonparametric mean regression function estimates under primitive conditions and show that the limiting process is a Gaussian process with non-stationary and dependent increments. The superior finite sample performance of the new approach is demonstrated through simulation studies.
Affiliation(s)
- Seonjin Kim, Department of Statistics, Miami University, 311 Upham Hall, Oxford, OH 45056
- Zhibiao Zhao, Department of Statistics, Penn State University, 326 Thomas, University Park, PA 16802
- Xiaofeng Shao, Department of Statistics, University of Illinois, 725 South Wright Street, Urbana, IL 61801
24
Abstract
We propose a new multivariate generalized Cp (MGCp) criterion for tuning parameter selection in nonparametric regression, applicable when there are multiple covariates whose values may be irregularly spaced. Apart from an asymptotically negligible remainder, the MGCp criterion has expected value equal to the sum of squared errors of a fitted derivative (rather than of a fitted mean response). Thus, unlike traditional criteria for tuning parameter selection, MGCp is not prone to undersmoothed derivative estimation. We illustrate a scientific application in a case study that explores the relationship among three measures of liver function. Since recent technological developments hold promise for assessing two of these measures outside of medical and laboratory facilities, better understanding of the aforementioned relationship may allow enhanced monitoring of liver function, especially in developing countries and among persons for whom access to medical and laboratory facilities is limited.
Affiliation(s)
- Richard Charnigo, Department of Statistics, University of Kentucky, 725 Rose Street, Lexington, KY 40536, USA
- Cidambi Srinivasan, Department of Statistics, University of Kentucky, 725 Rose Street, Lexington, KY 40536, USA
25
Chen H, Wang Y, Li R, Shear K. A note on a nonparametric regression test through penalized splines. Stat Sin 2014; 24:1143-1160. [PMID: 25076817 DOI: 10.5705/ss.2012.230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
We examine a test of a nonparametric regression function based on penalized spline smoothing. We show that, similarly to a penalized spline estimator, the asymptotic power of the penalized spline test falls into either a small-K or a large-K scenario characterized by the number of knots K and the smoothing parameter. However, the optimal rate of K and the smoothing parameter maximizing power for testing is different from the optimal rate minimizing the mean squared error for estimation. Our investigation reveals that, compared to estimation, some under-smoothing may be desirable for testing problems. Furthermore, we compare the proposed test with the likelihood ratio test (LRT). We show that when the true function is more complicated, containing multiple modes, the test proposed here may have greater power than the LRT. Finally, we investigate the properties of the test through simulations and apply it to two data examples.
26
Kruppa J, Liu Y, Biau G, Kohler M, König IR, Malley JD, Ziegler A. Probability estimation with machine learning methods for dichotomous and multicategory outcome: theory. Biom J 2014; 56:534-63. [PMID: 24478134 DOI: 10.1002/bimj.201300068] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2013] [Revised: 09/27/2013] [Accepted: 10/01/2013] [Indexed: 01/08/2023]
Abstract
Probability estimation for binary and multicategory outcomes using logistic and multinomial logistic regression has a long-standing tradition in biostatistics. However, biases may occur if the model is misspecified. In contrast, outcome probabilities for individuals can be estimated consistently with machine learning approaches, including k-nearest neighbors (k-NN), bagged nearest neighbors (b-NN), random forests (RF), and support vector machines (SVM). Because machine learning methods are rarely used by applied biostatisticians, the primary goal of this paper is to explain the concept of probability estimation with these methods and to summarize recent theoretical findings. Probability estimation in k-NN, b-NN, and RF can be embedded into the class of nonparametric regression learning machines; therefore, we start with the construction of nonparametric regression estimates and review results on consistency and rates of convergence. In SVMs, outcome probabilities for individuals are estimated consistently by repeatedly solving classification problems. For SVMs we review the classification problem and then dichotomous probability estimation. Next we extend the algorithms for estimating probabilities using k-NN, b-NN, and RF to multicategory outcomes and discuss approaches for the multicategory probability estimation problem using SVM. In simulation studies for dichotomous and multicategory dependent variables we demonstrate the general validity of the machine learning methods and compare them with logistic regression. However, each method fails in at least one simulation scenario. We conclude with a discussion of the failures and give recommendations for selecting and tuning the methods. Applications to real data and example code are provided in a companion article (doi:10.1002/bimj.201300077).
Affiliation(s)
- Jochen Kruppa, Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Ratzeburger Allee 160, Haus 24, 23562 Lübeck, Germany
27
Abstract
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
28
Abstract
We provide a novel and completely different approach to dimension-reduction problems from the existing literature. We cast the dimension-reduction problem in a semiparametric estimation framework and derive estimating equations. Viewing this problem from the new angle allows us to derive a rich class of estimators, and obtain the classical dimension reduction techniques as special cases in this class. The semiparametric approach also reveals that in the inverse regression context while keeping the estimation structure intact, the common assumption of linearity and/or constant variance on the covariates can be removed at the cost of performing additional nonparametric regression. The semiparametric estimators without these common assumptions are illustrated through simulation studies and a real data example. This article has online supplementary material.
Affiliation(s)
- Yanyuan Ma, Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX 77843-3143
29
Abstract
Bayesian nonparametric regression with dependent wavelets has dual shrinkage properties: there is shrinkage through a dependent prior put on functional differences, and shrinkage through the setting of most of the wavelet coefficients to zero through Bayesian variable selection methods. The methodology can deal with unequally spaced data and is efficient because of the existence of fast moves in model space for the MCMC computation. The methodology is illustrated on the problem of modeling the oscillations of Cepheid variable stars; these are a class of pulsating variable stars with the useful property that their periods of variability are strongly correlated with their absolute luminosity. Once this relationship has been calibrated, knowledge of the period gives knowledge of the luminosity. This makes these stars useful as "standard candles" for estimating distances in the universe.