1. The dimensionality reductions of environmental variables have a significant effect on the performance of species distribution models. Ecol Evol 2023; 13:e10747. PMID: 38020673; PMCID: PMC10659948; DOI: 10.1002/ece3.10747.
Abstract
How to effectively obtain species-related low-dimensional data from massive environmental variables has become an urgent problem for species distribution models (SDMs). In this study, we explored whether dimensionality reduction of environmental variables can improve the predictive performance of SDMs. We first used two linear techniques (principal component analysis (PCA) and independent component analysis) and two nonlinear techniques (kernel principal component analysis (KPCA) and uniform manifold approximation and projection) as dimensionality reduction techniques (DRTs) for high-dimensional environmental data. We then established five SDMs based on the dimensionality-reduced environmental variables for 23 real plant species and nine virtual species, and compared their predictive performance with that of SDMs based on environmental variables selected with Pearson's correlation coefficient (PCC). In addition, we studied the effects of DRTs, model complexity, and sample size on the predictive performance of SDMs. SDMs using any DRT other than KPCA outperformed those using PCC, and SDMs using linear DRTs outperformed those using nonlinear DRTs. Moreover, applying DRTs to the environmental variables affected the predictive performance of SDMs at least as much as model complexity and sample size did. At the complex level of model complexity, PCA improved the predictive performance of SDMs by up to 2.55% relative to PCC; at the middle level of sample size, PCA improved it by 2.68%. Our study demonstrates that DRTs have a significant effect on the predictive performance of SDMs. Specifically, linear DRTs, especially PCA, are more effective at improving predictive performance under relatively complex models or large sample sizes.
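As an illustration of the linear DRT step described above (a minimal sketch with synthetic data, not the authors' pipeline), PCA reduces to a few lines of NumPy: center the variables, then project onto the leading right-singular vectors of the centered data matrix.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components.

    X: (n_samples, n_features) array of environmental variables.
    Returns the (n_samples, n_components) reduced representation.
    """
    Xc = X - X.mean(axis=0)                          # center each variable
    # SVD of the centered data; rows of Vt are the principal axes
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(0)
# 200 sites described by 19 correlated "bioclimatic" variables (synthetic)
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 19)) + 0.1 * rng.normal(size=(200, 19))
Z = pca_reduce(X, n_components=3)
print(Z.shape)  # (200, 3)
```

Because the synthetic variables are driven by three latent factors, three components retain nearly all of the variance here; in practice the number of components is a tuning choice.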
2. An Interactive Image Segmentation Method Based on Multi-Level Semantic Fusion. Sensors (Basel) 2023; 23:6394. PMID: 37514688; PMCID: PMC10383896; DOI: 10.3390/s23146394.
Abstract
Understanding and analyzing 2D/3D sensor data is crucial for a wide range of machine learning-based applications, including object detection, scene segmentation, and salient object detection. In this context, interactive object segmentation is a vital task in image editing and medical diagnosis, involving the accurate separation of the target object from its background based on user annotation information. However, existing interactive object segmentation methods struggle to effectively leverage such information to guide object-segmentation models. To address these challenges, this paper proposes an interactive image-segmentation technique for static images based on multi-level semantic fusion. Our method utilizes user-guidance information both inside and outside the target object to segment it from the static image, making it applicable to both 2D and 3D sensor data. The proposed method introduces a cross-stage feature aggregation module, enabling the effective propagation of multi-scale features from previous stages to the current stage. This mechanism prevents the loss of semantic information caused by repeated upsampling and downsampling within the network, allowing the current stage to make better use of semantic information from the previous stage. Additionally, we incorporate a feature channel attention mechanism to address the issue of rough segmentation edges. This mechanism captures richer feature details at the feature-channel level, leading to finer segmentation edges. In an experimental evaluation on the PASCAL Visual Object Classes (VOC) 2012 dataset, our interactive image segmentation method based on multi-level semantic fusion achieves an intersection over union (IOU) approximately 2.1% higher than a currently popular interactive image-segmentation method for static images. The comparative analysis highlights the improved performance and effectiveness of our method.
Furthermore, our method exhibits potential applications in various fields, including medical imaging and robotics. Its compatibility with other machine learning methods for visual semantic analysis allows for integration into existing workflows. These aspects emphasize the significance of our contributions in advancing interactive image-segmentation techniques and their practical utility in real-world applications.
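The reported gain is measured in intersection over union (IOU). As a reference point, the standard definition for two binary masks (not the paper's evaluation code) can be computed as:

```python
import numpy as np

def iou(pred, target):
    """Intersection over union of two boolean segmentation masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, target).sum() / union

a = np.zeros((4, 4), dtype=bool); a[:2, :] = True   # top half of the image
b = np.zeros((4, 4), dtype=bool); b[:, :2] = True   # left half of the image
print(iou(a, b))  # intersection 4, union 12 -> approx 0.333
```

For multi-class segmentation, benchmarks such as PASCAL VOC typically average this quantity over classes (mean IOU).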
3. Incomplete Tests of Conditional Association for the Assessment of Model Assumptions. Psychometrika 2022; 87:1214-1237. PMID: 35124767; PMCID: PMC9636116; DOI: 10.1007/s11336-022-09841-1.
Abstract
Many of the models that have been proposed for response data share the assumptions that define the monotone homogeneity (MH) model. Observable properties implied by the MH model allow these assumptions to be tested. For binary response data, the most restrictive of these properties is called conditional association (CA). All the other properties considered can be viewed as incomplete tests of CA that alleviate the practical limitations encountered when assessing the MH model assumptions using CA. We find that assessing the MH model assumptions with an incomplete test of CA, rather than with CA itself, is generally associated with a substantial loss of information. We also examine the sensitivity of the observable properties to model violation and discuss the implications of the results. It is argued that more research is required on the extent to which the assumptions and the model specifications influence the inferences made from response data.
4. Generalized quasi-linear mixed-effects model. Stat Methods Med Res 2022; 31:1280-1291. PMID: 35286226; DOI: 10.1177/09622802221085864.
Abstract
The generalized linear mixed model (GLMM) is one of the most common methods for the analysis of longitudinal and clustered data in the biological sciences. However, issues of model complexity and misspecification can occur when applying the GLMM. To address these issues, we extend the standard GLMM to a nonlinear mixed-effects model based on quasi-linear modeling. An estimation algorithm for the proposed model is provided by extending the penalized quasi-likelihood and restricted maximum likelihood methods known from GLMM inference. The conditional AIC is also formulated for the proposed model. The proposed model should provide a more flexible fit than the GLMM when there is a nonlinear relation between fixed and random effects; otherwise, it reduces to the GLMM. The performance of the proposed model under misspecification is evaluated in several simulation studies. In the analysis of respiratory illness data from a randomized controlled trial, we observe that the proposed model can capture heterogeneity; that is, it can detect a patient subgroup with specific clinical characteristics in which the treatment is effective.
5. Mass and heat balances for biological nitrogen removal in an activated sludge process: to couple or not to couple? Environmental Technology 2021; 42:4047-4056. PMID: 32188337; DOI: 10.1080/09593330.2020.1744737.
Abstract
Models adapt constantly, usually increasing the degree of detail with which they describe physical phenomena. In water resource recovery facilities, models based on mass and/or heat balances have been used to describe and improve operation. While both mass and heat balances have proven their worth individually, the question arises to what extent their coupling, which entails increased model complexity, warrants the supposedly more precise simulation results. To answer this question, the need for and effects of coupling mass and heat balances in modelling studies were evaluated in this work for a biological nitrogen removal process treating highly concentrated wastewater. This evaluation consisted of assessing the effect of coupling mass and heat balances on the prediction of (1) nitrogen removal efficiency, (2) temperature, and (3) heat recovery. In general, mass balances are sufficient for evaluating nitrogen removal efficiency and effluent nitrogen concentrations. If one wishes to evaluate the effect of temperature changes (e.g. daily, weekly, seasonal) on nitrogen removal efficiency, the use of temperature profiles as an input variable to a mass balance-based model is recommended over the coupling of mass and heat balances. In terms of temperature prediction, considering a constant biological heat generation term in the heat balance model provides sufficient information, i.e. without the coupling of mass and heat balances. Likewise, for evaluating the heat recovery potential of the system, constant biological heat generation values provide valuable information, at least under normal operating conditions, i.e. when the solids retention time is large enough to maintain nitrification.
6. SDMtune: An R package to tune and evaluate species distribution models. Ecol Evol 2020; 10:11488-11506. PMID: 33144979; PMCID: PMC7593178; DOI: 10.1002/ece3.6786.
Abstract
Balancing model complexity is a key challenge of modern computational ecology, particularly so since the spread of machine learning algorithms. Species distribution models are often implemented using a wide variety of machine learning algorithms that can be fine-tuned to achieve the best model prediction while avoiding overfitting. We have released SDMtune, a new R package that aims to facilitate training, tuning, and evaluation of species distribution models in a unified framework. The main innovations of this package are its functions to perform data-driven variable selection, and a novel genetic algorithm to tune model hyperparameters. Real-time and interactive charts are displayed during the execution of several functions to help users understand the effect of removing a variable or varying model hyperparameters on model performance. SDMtune supports three different metrics to evaluate model performance: the area under the receiver operating characteristic curve, the true skill statistic, and Akaike's information criterion corrected for small sample sizes. It implements four statistical methods: artificial neural networks, boosted regression trees, maximum entropy modeling, and random forest. Moreover, it includes functions to display the outputs and create a final report. SDMtune therefore represents a new, unified and user-friendly framework for the still-growing field of species distribution modeling.
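Two of the evaluation metrics SDMtune supports, the true skill statistic (TSS) and AICc, have simple closed forms. The sketch below uses the standard textbook formulas, not SDMtune's R implementation:

```python
import math

def tss(tp, fp, fn, tn):
    """True skill statistic: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

def aicc(log_likelihood, k, n):
    """Akaike's information criterion corrected for small samples.

    k: number of estimated parameters, n: sample size (requires n > k + 1).
    """
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# Balanced confusion matrix: sensitivity 0.8 + specificity 0.8 - 1 -> approx 0.6
print(tss(tp=40, fp=10, fn=10, tn=40))
print(aicc(log_likelihood=-120.0, k=5, n=50))  # AIC 250 plus the correction term
```

TSS ranges from -1 to 1 (0 = no better than random); the AICc correction vanishes as n grows, recovering plain AIC.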
7.
Abstract
Differential item functioning (DIF) is a pernicious statistical issue that can mask true group differences on a target latent construct. A considerable amount of research has focused on evaluating methods for testing DIF, such as using likelihood ratio tests in item response theory (IRT). Most of this research has focused on the asymptotic properties of DIF testing, in part because many latent variable methods require large samples to obtain stable parameter estimates. Much less research has evaluated these methods in small samples, despite the fact that many social and behavioral scientists frequently encounter small samples in practice. In this article, we examine the extent to which model complexity (the number of model parameters estimated simultaneously) affects the recovery of DIF in small samples. We compare three models that vary in complexity: logistic regression with sum scores, the 1-parameter logistic IRT model, and the 2-parameter logistic IRT model. We expected that logistic regression with sum scores and the 1-parameter logistic IRT model would more accurately estimate DIF because these models yield more stable estimates, despite being misspecified. Indeed, a simulation study and an empirical example of adolescent substance use show that, even when data are generated from (or assumed to follow) a 2-parameter logistic IRT model, using parsimonious models in small samples leads to more powerful tests of DIF while adequately controlling the Type I error rate. We also provide evidence for minimum sample sizes needed to detect DIF, and we evaluate whether applying corrections for multiple testing is advisable. Finally, we provide recommendations for applied researchers who conduct DIF analyses in small samples.
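For reference, the 2-parameter logistic IRT model compared above has a simple closed form; the sketch below is illustrative, not the authors' code. The 1PL model is the special case in which the discrimination parameter is constrained to be equal across items:

```python
import math

def p_correct(theta, a, b):
    """2-parameter logistic IRT item response function.

    theta: latent trait level, a: discrimination, b: difficulty.
    Returns the probability of endorsing (answering correctly) the item.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# At theta == b the probability is 0.5 regardless of discrimination
print(p_correct(theta=0.0, a=1.7, b=0.0))  # 0.5
# Higher discrimination -> steeper response curve away from the difficulty point
print(p_correct(1.0, a=2.0, b=0.0) > p_correct(1.0, a=0.5, b=0.0))  # True
```

The complexity comparison in the article is then concrete: a 2PL fit estimates an (a, b) pair per item, while a 1PL fit estimates only the b parameters plus one shared slope.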
8. [Optimizing MaxEnt model in the prediction of species distribution]. Ying Yong Sheng Tai Xue Bao (The Journal of Applied Ecology) 2019; 30:2116-2128. PMID: 31257787; DOI: 10.13287/j.1001-9332.201906.029.
Abstract
The Maximum Entropy (MaxEnt) model has been widely used in recent years. However, MaxEnt is highly inclined to produce misleading results if it is not well optimized. We summarized research on model optimization for sampling bias correction, model complexity tuning, presence-absence threshold selection, and model evaluation. Spatial filtering performs best for sampling bias correction, while the restricted background method shows the lowest efficacy. Model complexity is mainly determined by three factors: the number of environmental variables, the model feature types, and the regularization multiplier. Variable filtering is needed when the sample size is smaller than the number of environmental variables. The criterion for variable selection should focus on ecological significance rather than the collinearity between variables. The choice of feature types has relatively limited effects on the predictive performance of the model; therefore, it is advisable to choose simpler models. To control overfitting, it is necessary to conduct species-specific tuning of the regularization multiplier, which is usually larger than the default setting. There are three criteria, namely objectivity, equality, and discriminability, for selecting a threshold to convert continuous predictions (e.g. probability of presence) into binary results. Maximizing the sum of sensitivity and specificity is a sound method for threshold selection. Model evaluation methods can be classified into two main types: threshold-independent and threshold-dependent. Among the threshold-independent evaluations, information criteria may offer significant advantages over AUC and COR. The true skill statistic is a better index for threshold-dependent evaluations, because it takes both omission and commission errors into account and is robust to the pseudo-absence assumption and species prevalence.
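The recommended threshold rule, maximizing the sum of sensitivity and specificity, can be sketched as follows (an illustrative implementation with synthetic scores, not the reviewed software):

```python
import numpy as np

def max_sss_threshold(scores, labels):
    """Pick the presence threshold maximizing sensitivity + specificity.

    scores: continuous predictions (e.g. probability of presence),
    labels: 1 for presence, 0 for (pseudo-)absence.
    """
    best_t, best_sss = None, -np.inf
    for t in np.unique(scores):          # candidate thresholds = observed scores
        pred = scores >= t
        sens = np.mean(pred[labels == 1])   # true positive rate
        spec = np.mean(~pred[labels == 0])  # true negative rate
        if sens + spec > best_sss:
            best_t, best_sss = t, sens + spec
    return best_t

scores = np.array([0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9])
labels = np.array([0,   0,   0,    1,   0,   1,   1,   1  ])
print(max_sss_threshold(scores, labels))  # 0.4
```

Ties are broken in favor of the lowest qualifying threshold here; other conventions (e.g. the midpoint between adjacent scores) are equally defensible.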
9. A categorical perspective towards aerodynamic models for aeroelastic analyses of bridge decks. Royal Society Open Science 2019; 6:181848. PMID: 31032037; PMCID: PMC6458369; DOI: 10.1098/rsos.181848.
Abstract
Reliable modelling in structural engineering is crucial for the serviceability and safety of structures. The huge variety of aerodynamic models for aeroelastic analyses of bridges poses natural questions about their complexity and, thus, quality. Moreover, a direct comparison of aerodynamic models is typically either not possible or not meaningful, as the models can be based on very different physical assumptions. Therefore, to address the question of principal comparability and complexity of models, a more abstract approach, accounting for the effect of basic physical assumptions, is necessary. This paper presents an application of a recently introduced category theory-based modelling approach to a diverse set of models from bridge aerodynamics. Initially, the categorical approach is extended to allow an adequate description of aerodynamic models. The complexity of the selected aerodynamic models is then evaluated, based on which model comparability is established. Finally, the utility of the approach for model comparison and characterization is demonstrated on an illustrative example from bridge aeroelasticity. The outcome of this study is intended to serve as an alternative framework for model comparison and to inform future assessment studies of mathematical models for engineering applications.
10.
Abstract
Bifactor and other hierarchical models have become central to representing and explaining observations in psychopathology, health, and other areas of clinical science, as well as in the behavioral sciences more broadly. This prominence comes after a relatively rapid period of rediscovery, however, and certain features remain poorly understood. Here, hierarchical models are compared and contrasted with other models of superordinate structure, with a focus on implications for model comparisons and interpretation. Issues pertaining to the specification and estimation of bifactor and other hierarchical models are reviewed in exploratory as well as confirmatory modeling scenarios, as are emerging findings about model fit and selection. Bifactor and other hierarchical models provide a powerful mechanism for parsing shared and unique components of variance, but care is required in specifying and making inferences about them.
11. The Stochastic Complexity of Spin Models: Are Pairwise Models Really Simple? Entropy 2018; 20:e20100739. PMID: 33265828; PMCID: PMC7512302; DOI: 10.3390/e20100739.
Abstract
Models can be simple for different reasons: because they yield a simple and computationally efficient interpretation of a generic dataset (e.g., in terms of pairwise dependencies), as in statistical learning, or because they capture the laws of a specific phenomenon, as in physics, leading to non-trivial falsifiable predictions. In information theory, the simplicity of a model is quantified by its stochastic complexity, which measures the number of bits needed to encode its parameters. To understand what simple models look like, we study the stochastic complexity of spin models with interactions of arbitrary order. We show that bijections within the space of possible interactions preserve the stochastic complexity, which allows us to partition the space of all models into equivalence classes. We thus find that the simplicity of a model is determined not by the order of its interactions, but rather by their mutual arrangement. Models in which statistical dependencies are localized on non-overlapping groups of a few variables are simple, affording predictions on independencies that are easy to falsify. By contrast, fully connected pairwise models, which are often used in statistical learning, appear to be highly complex because of their extended set of interactions, and they are hard to falsify.
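The stochastic complexity penalty can be approximated to first order by the BIC-like term (k/2) log n; the full MDL penalty adds a model-geometry term omitted here. The toy comparison below (illustrative parameter counts, not the paper's calculation) shows why a fully connected pairwise model on 20 spins carries a larger first-order penalty than a model whose interactions are localized in small clusters:

```python
import math

def parametric_complexity_first_order(k, n):
    """First-order (BIC-like) approximation to the MDL complexity penalty:
    (k / 2) * log(n) nats for k parameters and n samples.
    (The full stochastic complexity adds a model-geometry term omitted here.)
    """
    return 0.5 * k * math.log(n)

n_spins, n_samples = 20, 1000
# Fully connected pairwise model: one coupling per spin pair, plus one field per spin
k_pairwise = n_spins * (n_spins - 1) // 2 + n_spins
# A "localized" model: 5 independent 4-spin clusters, each with all
# interactions of any order within the cluster (2^4 - 1 nonempty subsets)
k_clustered = 5 * (2 ** 4 - 1)
print(k_pairwise, k_clustered)  # 210 75
print(parametric_complexity_first_order(k_pairwise, n_samples) >
      parametric_complexity_first_order(k_clustered, n_samples))  # True
```

The paper's point is stronger than this parameter count: even at equal k, the geometry term can make localized arrangements simpler than extended ones.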
12. Genomic Selection in Plant Breeding: Methods, Models, and Perspectives. Trends in Plant Science 2017; 22:961-975. PMID: 28965742; DOI: 10.1016/j.tplants.2017.08.011.
Abstract
Genomic selection (GS) facilitates the rapid selection of superior genotypes and accelerates the breeding cycle. In this review, we discuss the history, principles, and basis of GS and genomic-enabled prediction (GP) as well as the genetics and statistical complexities of GP models, including genomic genotype×environment (G×E) interactions. We also examine the accuracy of GP models and methods for two cereal crops and two legume crops based on random cross-validation. GS applied to maize breeding has shown tangible genetic gains. Based on GP results, we speculate how GS in germplasm enhancement (i.e., prebreeding) programs could accelerate the flow of genes from gene bank accessions to elite lines. Recent advances in hyperspectral image technology could be combined with GS and pedigree-assisted breeding.
13. Good-bye to tropical alpine plant giants under warmer climates? Loss of range and genetic diversity in Lobelia rhynchopetalum. Ecol Evol 2016; 6:8931-8941. PMID: 28035281; PMCID: PMC5192889; DOI: 10.1002/ece3.2603.
Abstract
The main aim of this paper is to address the consequences of climate warming for loss of habitat and genetic diversity in the enigmatic tropical alpine giant rosette plants, using the Ethiopian endemic Lobelia rhynchopetalum as a model. We modeled the habitat suitability of L. rhynchopetalum and assessed how its range is affected under two climate models and four emission scenarios. We used three statistical algorithms calibrated to represent two different complexity levels of the response. We analyzed genetic diversity using amplified fragment length polymorphisms and assessed the impact of the projected range loss. Under all model and scenario combinations, and consistently across algorithms and complexity levels, this afro-alpine flagship species faces massive range reduction. Only 3.4% of its habitat, on average, is projected to remain suitable by 2080, resulting in a loss of 82% (CI 75%-87%) of its genetic diversity. The remaining suitable habitat is projected to be fragmented among, and reduced to, four mountain peaks, further reducing the probability of long-term persistence of viable populations. Because tropical alpine giant rosette plants have developed similar morphological and physiological traits through convergent evolution in response to diurnal freeze-thaw cycles, they most likely respond to climate change in a similar way as our study species. We conclude that specialized high-alpine giant rosette plants, such as L. rhynchopetalum, are likely to face a very high risk of extinction following climate warming.
14.
Abstract
The field of disease ecology, the study of the spread and impact of parasites and pathogens within their host populations and communities, has a long history of using mathematical models. For over 100 years, researchers have used mathematics to describe the spread of disease-causing agents, understand the relationship between host density and transmission, and plan control strategies. The use of mathematical modelling in disease ecology exploded in the late 1970s and early 1980s through the work of Anderson and May (Anderson and May, 1978, 1981, 1992; May and Anderson, 1978), who developed the fundamental frameworks for studying microparasite (e.g. viruses, bacteria and protozoa) and macroparasite (e.g. helminth) dynamics, emphasizing the importance of features such as the parasite's basic reproduction number (R0) and the critical community size that form the basis of disease ecology research to this day. Since these initial models of disease population dynamics, which primarily focused on human diseases, theoretical disease research has expanded hugely to encompass livestock and wildlife disease systems, and to explore evolutionary questions such as the evolution of parasite virulence or drug resistance. More recently, there have been efforts to broaden the field still further, moving beyond the standard 'one-host-one-parasite' paradigm of the original models to incorporate many aspects of the complexity of natural systems, including multiple potential host species and interactions among multiple parasite species.
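The microparasite framework referenced above is typified by the SIR model, in which the basic reproduction number is R0 = beta/gamma. A minimal forward-Euler sketch (illustrative parameter values, not taken from the cited works):

```python
def simulate_sir(beta, gamma, s0, i0, r0_init, dt=0.01, steps=2000):
    """Forward-Euler integration of the classic SIR microparasite model."""
    s, i, r = s0, i0, r0_init
    n = s + i + r
    for _ in range(steps):
        new_inf = beta * s * i / n * dt   # new infections this step
        new_rec = gamma * i * dt          # new recoveries this step
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    return s, i, r

beta, gamma = 3.0, 1.0      # transmission and recovery rates (per unit time)
R0 = beta / gamma           # basic reproduction number
print(R0)                   # 3.0 -> an epidemic can invade, since R0 > 1
s, i, r = simulate_sir(beta, gamma, s0=999.0, i0=1.0, r0_init=0.0)
print(s < 999.0 and r > 0.0)  # True: the infection spread through the population
```

With R0 < 1 the same code shows the infection dying out, which is the threshold behavior Anderson and May's frameworks formalized.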
15. Less is more: an adaptive branch-site random effects model for efficient detection of episodic diversifying selection. Mol Biol Evol 2015; 32:1342-1353. PMID: 25697341; DOI: 10.1093/molbev/msv022.
Abstract
Over the past two decades, comparative sequence analysis using codon-substitution models has been honed into a powerful and popular approach for detecting signatures of natural selection from molecular data. A substantial body of work has focused on developing a class of "branch-site" models which permit selective pressures on sequences, quantified by the ω ratio, to vary among both codon sites and individual branches in the phylogeny. We develop and present a method in this class, adaptive branch-site random effects likelihood (aBSREL), whose key innovation is variable parametric complexity chosen with an information theoretic criterion. By applying models of different complexity to different branches in the phylogeny, aBSREL delivers statistical performance matching or exceeding best-in-class existing approaches, while running an order of magnitude faster. Based on simulated data analysis, we offer guidelines for what extent and strength of diversifying positive selection can be detected reliably and suggest that there is a natural limit on the optimal parametric complexity for "branch-site" models. An aBSREL analysis of 8,893 Euteleostomes gene alignments demonstrates that over 80% of branches in typical gene phylogenies can be adequately modeled with a single ω ratio model, that is, current models are unnecessarily complicated. However, there are a relatively small number of key branches, whose identities are derived from the data using a model selection procedure, for which it is essential to accurately model evolutionary complexity.
16. On the Minimum Description Length Complexity of Multinomial Processing Tree Models. Journal of Mathematical Psychology 2010; 54:291-303. PMID: 20514139; PMCID: PMC2875838; DOI: 10.1016/j.jmp.2010.02.001.
Abstract
Multinomial processing tree (MPT) modeling is a statistical methodology that has been widely and successfully applied for measuring hypothesized latent cognitive processes in selected experimental paradigms. This paper concerns the model complexity of MPT models. Complexity is a key and necessary concept to consider in the evaluation and selection of quantitative models. A complex model with many parameters often overfits data, capturing noise over and above the underlying regularities, and should therefore be appropriately penalized. It has been well established and demonstrated in multiple studies that, in addition to the number of parameters, a model's functional form, which refers to the way in which parameters are combined in the model equation, can also have significant effects on complexity. Given that MPT models vary greatly in their functional forms (tree structures and parameter/category assignments), it is of interest to evaluate these effects on complexity. Addressing this issue from the minimum description length (MDL) viewpoint, we prove a series of propositions concerning the various ways in which functional form contributes to the complexity of MPT models. Computational issues of complexity are also discussed.
17.
Abstract
Hodges & Sargent (2001) developed a measure of a hierarchical model's complexity, degrees of freedom (DF), that is consistent with definitions for scatterplot smoothers, is interpretable in terms of simple models, and enables control of a fit's complexity by means of a prior distribution on complexity. DF describes the complexity of the whole fitted model, but in general it is unclear how to allocate DF to individual effects. We give a new definition of DF for arbitrary normal-error linear hierarchical models, consistent with Hodges & Sargent's, that naturally partitions the n observations into DF for individual effects and for error. The new conception of an effect's DF is the ratio of the effect's modeled variance matrix to the total variance matrix. This gives a way to describe the sizes of different parts of a model (e.g., spatial clustering vs. heterogeneity), to place DF-based priors on smoothing parameters, and to describe how a smoothed effect competes with other effects. It also avoids difficulties with the most common definition of DF for residuals. We conclude by comparing DF to the effective number of parameters p(D) of Spiegelhalter et al. (2002). Technical appendices and a dataset are available online as supplemental materials.
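One way to read this definition numerically: for a single normal random effect with design matrix Z, modeled variance tau2 * Z @ Z.T, and i.i.d. error variance sigma2, the effect's DF is the trace of the "ratio" of its variance matrix to the total marginal variance matrix. The setup and names below are illustrative assumptions, not the authors' notation:

```python
import numpy as np

def effect_df(Z, tau2, sigma2):
    """DF of one random effect, read as the trace of the ratio of the
    effect's modeled variance matrix to the total variance matrix:
        DF = trace( G @ inv(G + sigma2 * I) ),  G = tau2 * Z @ Z.T
    """
    n = Z.shape[0]
    G = tau2 * Z @ Z.T                 # effect's modeled variance matrix
    V = G + sigma2 * np.eye(n)         # total marginal variance matrix
    return np.trace(G @ np.linalg.inv(V))

# 12 observations in 3 groups with random intercepts
Z = np.kron(np.eye(3), np.ones((4, 1)))
df_smooth = effect_df(Z, tau2=1e6, sigma2=1.0)   # nearly unsmoothed effect
df_shrunk = effect_df(Z, tau2=1e-6, sigma2=1.0)  # heavily shrunk effect
print(round(df_smooth, 3), round(df_shrunk, 6))  # approaches 3 and 0, resp.
```

As the effect's variance grows the DF approaches the number of group means (one DF per group); as it shrinks toward zero the effect is smoothed away and its DF goes to zero, with the remainder of the n observations allocated to error.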