1
|
Spatial+: A novel approach to spatial confounding. Biometrics 2022; 78:1279-1290. [PMID: 35258102 PMCID: PMC10084199 DOI: 10.1111/biom.13656] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/24/2020] [Accepted: 04/08/2021] [Indexed: 12/30/2022]
Abstract
In spatial regression models, collinearity between covariates and spatial effects can lead to significant bias in effect estimates. This problem, known as spatial confounding, is encountered modeling forestry data to assess the effect of temperature on tree health. Reliable inference is difficult as results depend on whether or not spatial effects are included in the model. We propose a novel approach, spatial+, for dealing with spatial confounding when the covariate of interest is spatially dependent but not fully determined by spatial location. Using a thin plate spline model formulation we see that, in this case, the bias in covariate effect estimates is a direct result of spatial smoothing. Spatial+ reduces the sensitivity of the estimates to smoothing by replacing the covariates by their residuals after spatial dependence has been regressed away. Through asymptotic analysis we show that spatial+ avoids the bias problems of the spatial model. This is also demonstrated in a simulation study. Spatial+ is straightforward to implement using existing software and, as the response variable is the same as that of the spatial model, standard model selection criteria can be used for comparisons. A major advantage of the method is also that it extends to models with non-Gaussian response distributions. Finally, while our results are derived in a thin plate spline setting, the spatial+ methodology transfers easily to other spatial model formulations.
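The two-stage idea described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: a ridge-penalised Gaussian radial basis stands in for the thin plate spline, the smoothing parameter `lam` is fixed rather than estimated, and the helper names (`rbf_basis`, `penalised_ls`, `spatial_plus`) and the simulated data are hypothetical.

```python
import numpy as np

def rbf_basis(coords, centres, scale):
    """Intercept column followed by Gaussian radial basis functions (a stand-in for a spline basis)."""
    d2 = ((coords[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.column_stack([np.ones(len(coords)), np.exp(-d2 / (2.0 * scale**2))])

def penalised_ls(X, y, penalty):
    """Minimise ||y - X b||^2 + sum_j penalty[j] * b[j]^2."""
    return np.linalg.solve(X.T @ X + np.diag(penalty), X.T @ y)

def spatial_plus(y, x, B, lam=1e-2):
    """Return the covariate effect estimate and the spatially de-trended covariate."""
    k = B.shape[1]
    pen_B = np.concatenate([[0.0], np.full(k - 1, lam)])  # intercept left unpenalised
    # Stage 1: regress the covariate on the spatial basis and keep the residuals.
    r_x = x - B @ penalised_ls(B, x, pen_B)
    # Stage 2: fit the spatial model for y with x replaced by its residuals.
    X = np.column_stack([r_x, B])
    coef = penalised_ls(X, y, np.concatenate([[0.0], pen_B]))
    return coef[0], r_x

# Simulated example (all values are assumptions chosen for illustration).
rng = np.random.default_rng(1)
n = 400
coords = rng.uniform(size=(n, 2))
grid = np.linspace(0.0, 1.0, 6)
centres = np.array([(a, b) for a in grid for b in grid])
B = rbf_basis(coords, centres, scale=0.2)

f_x = np.sin(np.pi * coords[:, 0]) + coords[:, 1]   # spatial part of the covariate
x = f_x + rng.normal(scale=0.5, size=n)             # covariate is not fully spatial
beta_true = 2.0
y = beta_true * x - 1.5 * f_x + rng.normal(scale=0.3, size=n)

beta_hat, r_x = spatial_plus(y, x, B)
print(f"spatial+ estimate: {beta_hat:.2f} (true value {beta_true})")
```

Because the response is unchanged between the ordinary spatial model and this variant, the two fits could in principle be compared with the same model selection criterion, as the abstract notes.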
|
2
|
Rejoinder to the discussions of "Spatial+: A novel approach to spatial confounding". Biometrics 2022; 78:1309-1312. [PMID: 35363888 DOI: 10.1111/biom.13653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2022] [Accepted: 02/14/2022] [Indexed: 12/27/2022]
Abstract
In this rejoinder, we set out some of the main points that we took from the discussions of our paper "Spatial+: A novel approach to spatial confounding." The comments provided by the discussants include excellent questions and suggestions for extensions and improvements to spatial+. The discussions also highlight the growing interest in understanding spatial confounding, underpinned by the many recent contributions to the literature on this topic.
|
3
|
Relation of Incident Type 1 Diabetes to Recent COVID-19 Infection: Cohort Study Using e-Health Record Linkage in Scotland. Diabetes Care 2022; 46:921-928. [PMID: 35880797 DOI: 10.2337/dc22-0385] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 02/24/2022] [Accepted: 05/03/2022] [Indexed: 02/03/2023]
Abstract
OBJECTIVE Studies using claims databases reported that SARS-CoV-2 infection >30 days earlier was associated with an increase in the incidence of type 1 diabetes. Using exact dates of diabetes diagnosis from the national register in Scotland linked to virology laboratory data, we sought to replicate this finding. RESEARCH DESIGN AND METHODS A cohort of 1,849,411 individuals aged <35 years without diabetes, including all those in Scotland who subsequently tested positive for SARS-CoV-2, was followed from 1 March 2020 to 22 November 2021. Incident type 1 diabetes was ascertained from the national registry. Using Cox regression, we tested the association of time-updated infection with incident diabetes. Trends in incidence of type 1 diabetes in the population from 2015 through 2021 were also estimated in a generalized additive model. RESULTS There were 365,080 individuals who had at least one detected SARS-CoV-2 infection during follow-up and 1074 who developed type 1 diabetes. The rate ratio for incident type 1 diabetes associated with first positive test for SARS-CoV-2 (reference category: no previous infection) was 0.86 (95% CI 0.62, 1.21) for infection >30 days earlier and 2.62 (95% CI 1.81, 3.78) for infection in the previous 30 days. However, negative and positive SARS-CoV-2 tests were more frequent in the days surrounding diabetes presentation. In those aged 0-14 years, incidence of type 1 diabetes during 2020-2021 was 20% higher than the 7-year average. CONCLUSIONS Type 1 diabetes incidence in children increased during the pandemic. However, the cohort analysis suggests that SARS-CoV-2 infection itself was not the cause of this increase.
|
4
|
Was R < 1 before the English lockdowns? On modelling mechanistic detail, causality and inference about Covid-19. PLoS One 2021; 16:e0257455. [PMID: 34550990 PMCID: PMC8457481 DOI: 10.1371/journal.pone.0257455] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/15/2021] [Accepted: 09/01/2021] [Indexed: 12/15/2022]
Abstract
Detail is a double edged sword in epidemiological modelling. The inclusion of mechanistic detail in models of highly complex systems has the potential to increase realism, but it also increases the number of modelling assumptions, which become harder to check as their possible interactions multiply. In a major study of the Covid-19 epidemic in England, Knock et al. (2020) fit an age structured SEIR model with added health service compartments to data on deaths, hospitalization and test results from Covid-19 in seven English regions for the period March to December 2020. The simplest version of the model has 684 states per region. One main conclusion is that only full lockdowns brought the pathogen reproduction number, R, below one, with R ≫ 1 in all regions on the eve of March 2020 lockdown. We critically evaluate the Knock et al. epidemiological model, and the semi-causal conclusions made using it, based on an independent reimplementation of the model designed to allow relaxation of some of its strong assumptions. In particular, Knock et al. model the effect on transmission of both non-pharmaceutical interventions and other effects, such as weather, using a piecewise linear function, b(t), with 12 breakpoints at selected government announcement or intervention dates. We replace this representation by a smoothing spline with time varying smoothness, thereby allowing the form of b(t) to be substantially more data driven, and we check that the corresponding smoothness assumption is not driving our results. We also reset the mean incubation time and time from first symptoms to hospitalisation, used in the model, to values implied by the papers cited by Knock et al. as the source of these quantities. We conclude that there is no sound basis for using the Knock et al. model and their analysis to make counterfactual statements about the number of deaths that would have occurred with different lockdown timings. 
However, if fits of this epidemiological model structure are viewed as a reasonable basis for inference about the time course of incidence and R, then without very strong modelling assumptions, the pathogen reproduction number was probably below one, and incidence in substantial decline, some days before either of the first two English national lockdowns. This result coincides with that obtained by more direct attempts to reconstruct incidence. Of course it does not imply that lockdowns had no effect, but it does suggest that other non-pharmaceutical interventions (NPIs) may have been much more effective than Knock et al. imply, and that full lockdowns were probably not the cause of R dropping below one.
|
5
|
Additive stacking for disaggregate electricity demand forecasting. Ann Appl Stat 2021. [DOI: 10.1214/20-aoas1417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
6
|
Inferring UK COVID-19 fatal infection trajectories from daily mortality data: Were infections already in decline before the UK lockdowns? Biometrics 2021; 78:1127-1140. [PMID: 33783826 PMCID: PMC8251436 DOI: 10.1111/biom.13462] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 12/19/2020] [Revised: 03/03/2021] [Accepted: 03/17/2021] [Indexed: 12/18/2022]
Abstract
The number of new infections per day is a key quantity for effective epidemic management. It can be estimated relatively directly by testing of random population samples. Without such direct epidemiological measurement, other approaches are required to infer whether the number of new cases is likely to be increasing or decreasing: for example, estimating the pathogen-effective reproduction number, R, using data gathered from the clinical response to the disease. For coronavirus disease 2019 (Covid-19/SARS-Cov-2), such R estimation is heavily dependent on modelling assumptions, because the available clinical case data are opportunistic observational data subject to severe temporal confounding. Given this difficulty, it is useful to retrospectively reconstruct the time course of infections from the least compromised available data, using minimal prior assumptions. A Bayesian inverse problem approach applied to UK data on first-wave Covid-19 deaths and the disease duration distribution suggests that fatal infections were in decline before full UK lockdown (24 March 2020), and that fatal infections in Sweden started to decline only a day or two later. An analysis of UK data using the model of Flaxman et al. gives the same result under relaxation of its prior assumptions on R, suggesting an enhanced role for non-pharmaceutical interventions short of full lockdown in the UK context. Similar patterns appear to have occurred in the subsequent two lockdowns.
|
7
|
|
8
|
Rejoinder on: Inference and computation with Generalized Additive Models and their extensions. TEST-SPAIN 2020. [DOI: 10.1007/s11749-020-00716-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
|
9
|
Abstract
Regression models in which a response variable is related to smooth functions of some predictor variables are popular as a result of their appealing balance between flexibility and interpretability. Since the original generalized additive models of Hastie and Tibshirani (Generalized additive models. Chapman & Hall, Boca Raton, 1990) numerous model extensions have been proposed, and a variety of practically useful computational strategies have emerged. This paper provides an overview of some widely applicable frameworks for this type of modelling, emphasizing the similarities between the different approaches, and the equivalence of smoothing, Gaussian latent process models and Gaussian random effects. The focus is particularly on Bayes empirical smoother theory, fully Bayesian inference via stochastic simulation or integrated nested Laplace approximation and boosting.
|
10
|
|
11
|
Abstract
Integrated nested Laplace approximation provides accurate and efficient approximations for marginal distributions in latent Gaussian random field models. Computational feasibility of the original Rue et al. (2009) methods relies on efficient approximation of Laplace approximations for the marginal distributions of the coefficients of the latent field, conditional on the data and hyperparameters. The computational efficiency of these approximations depends on the Gaussian field having a Markov structure. This note provides equivalent efficiency without requiring the Markov property, which allows for straightforward use of latent Gaussian fields without a sparse structure, such as reduced rank multi-dimensional smoothing splines. The method avoids the approximation for conditional modes used in Rue et al. (2009), and uses a log determinant approximation based on a simple quasi-Newton update. The latter has a desirable property not shared by the most commonly used variant of the original method.
|
12
|
|
13
|
Analyzing the Time Course of Pupillometric Data. Trends Hear 2019; 23:2331216519832483. [PMID: 31081486 PMCID: PMC6535748 DOI: 10.1177/2331216519832483] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 03/07/2018] [Revised: 11/30/2018] [Accepted: 12/12/2018] [Indexed: 11/21/2022]
Abstract
This article provides a tutorial for analyzing pupillometric data. Pupil dilation has become increasingly popular in psychological and psycholinguistic research as a measure to trace language processing. However, there is no general consensus about procedures to analyze the data, with most studies analyzing extracted features from the pupil dilation data instead of analyzing the pupil dilation trajectories directly. Recent studies have started to apply nonlinear regression and other methods to analyze the pupil dilation trajectories directly, utilizing all available information in the continuously measured signal. This article applies a nonlinear regression analysis, generalized additive mixed modeling, and illustrates how to analyze the full-time course of the pupil dilation signal. The regression analysis is particularly suited for analyzing pupil dilation in the fields of psychological and psycholinguistic research because generalized additive mixed models can include complex nonlinear interactions for investigating the effects of properties of stimuli (e.g., formant frequency) or participants (e.g., working memory score) on the pupil dilation signal. To account for the variation due to participants and items, nonlinear random effects can be included. However, one of the challenges for analyzing time series data is dealing with the autocorrelation in the residuals, which is rather extreme for the pupillary signal. On the basis of simulations, we explain potential causes of this extreme autocorrelation, and on the basis of the experimental data, we show how to reduce their adverse effects, allowing a much more coherent interpretation of pupillary data than possible with feature-based techniques.
|
14
|
Model averaging in ecology: a review of Bayesian, information-theoretic, and tactical approaches for predictive inference. ECOL MONOGR 2018. [DOI: 10.1002/ecm.1309] [Citation(s) in RCA: 129] [Impact Index Per Article: 21.5] [Indexed: 02/04/2023]
|
15
|
|
16
|
A Simultaneous Equation Approach to Estimating HIV Prevalence With Nonignorable Missing Responses. J Am Stat Assoc 2017. [DOI: 10.1080/01621459.2016.1224713] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 10/21/2022]
|
17
|
Computing AIC for black-box models using generalized degrees of freedom: A comparison with cross-validation. COMMUN STAT-SIMUL C 2017. [DOI: 10.1080/03610918.2017.1315728] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 10/19/2022]
|
18
|
|
19
|
A generalized Fellner-Schall method for smoothing parameter optimization with application to Tweedie location, scale and shape models. Biometrics 2017; 73:1071-1081. [PMID: 28192595 DOI: 10.1111/biom.12666] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 06/01/2016] [Revised: 01/01/2017] [Accepted: 01/01/2017] [Indexed: 11/29/2022]
Abstract
We consider the optimization of smoothing parameters and variance components in models with a regular log likelihood subject to quadratic penalization of the model coefficients, via a generalization of the method of Fellner (1986) and Schall (1991). In particular: (i) we generalize the original method to the case of penalties that are linear in several smoothing parameters, thereby covering the important cases of tensor product and adaptive smoothers; (ii) we show why the method's steps increase the restricted marginal likelihood of the model, that it tends to converge faster than the EM algorithm, or obvious accelerations of this, and investigate its relation to Newton optimization; (iii) we generalize the method to any Fisher regular likelihood. The method represents a considerable simplification over existing methods of estimating smoothing parameters in the context of regular likelihoods, without sacrificing generality: for example, it is only necessary to compute with the same first and second derivatives of the log-likelihood required for coefficient estimation, and not with the third or fourth order derivatives required by alternative approaches. Examples are provided which would have been impossible or impractical with pre-existing Fellner-Schall methods, along with an example of a Tweedie location, scale and shape model which would be a challenge for alternative methods, and a sparse additive modeling example where the method facilitates computational efficiency gains of several orders of magnitude.
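For the Gaussian case, an update of this general kind can be sketched as below. This is a hedged illustration, not the paper's code: it assumes the multiplicative update form lambda_j <- lambda_j * sigma^2 * [tr(S_lambda^- S_j) - tr((X'X + S_lambda)^{-1} S_j)] / (beta' S_j beta), estimates sigma^2 from the residual sum of squares and effective degrees of freedom, and uses a hypothetical toy design with two penalised coefficient blocks. Function names and data are assumptions.

```python
import numpy as np

def fellner_schall_gaussian(X, y, S_list, lam0, n_iter=50):
    """Iterate a Fellner-Schall-type update for several smoothing parameters (Gaussian response)."""
    n = len(y)
    lam = np.asarray(lam0, dtype=float).copy()
    for _ in range(n_iter):
        S_lam = sum(l * S for l, S in zip(lam, S_list))   # total penalty, linear in lam
        A = X.T @ X + S_lam
        A_inv = np.linalg.inv(A)
        beta = A_inv @ (X.T @ y)                          # penalised least squares estimate
        edf = np.trace(A_inv @ (X.T @ X))                 # effective degrees of freedom
        sigma2 = np.sum((y - X @ beta) ** 2) / (n - edf)  # assumed scale estimate
        S_pinv = np.linalg.pinv(S_lam)
        # All smoothing parameters are updated from the same beta, A_inv and S_pinv.
        new_lam = lam.copy()
        for j, S in enumerate(S_list):
            num = np.trace(S_pinv @ S) - np.trace(A_inv @ S)
            new_lam[j] = lam[j] * sigma2 * num / (beta @ S @ beta)
        lam = new_lam
    return lam, beta, sigma2

# Toy data: an intercept plus two penalised coefficient blocks (illustrative assumptions).
rng = np.random.default_rng(0)
n, p1, p2 = 200, 5, 8
X = np.column_stack([np.ones(n), rng.normal(size=(n, p1 + p2))])
beta_true = np.concatenate([[1.0], [1.5, -1.0, 2.0, -2.0, 1.0], 0.3 * np.array([1, -1, 2, -2, 1, -1, 1, -1])])
y = X @ beta_true + rng.normal(size=n)

S1 = np.zeros((1 + p1 + p2, 1 + p1 + p2))
S1[1:1 + p1, 1:1 + p1] = np.eye(p1)
S2 = np.zeros_like(S1)
S2[1 + p1:, 1 + p1:] = np.eye(p2)

lam, beta_hat, sigma2 = fellner_schall_gaussian(X, y, [S1, S2], lam0=[1.0, 1.0])
print("smoothing parameters:", lam, "scale estimate:", sigma2)
```

Only matrix inverses and traces involving the penalised Hessian appear here, which reflects the abstract's point that nothing beyond the first and second derivative information used for coefficient estimation is needed.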
|
20
|
|
21
|
|
22
|
Comment. J Am Stat Assoc 2017. [DOI: 10.1080/01621459.2016.1270050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
23
|
A Comparison of Inferential Methods for Highly Nonlinear State Space Models in Ecology and Epidemiology. Stat Sci 2016. [DOI: 10.1214/15-sts534] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 11/19/2022]
|
24
|
|
25
|
|
26
|
|
27
|
Estimation techniques used in studies of copepod population dynamics — A review of underlying assumptions. ACTA ACUST UNITED AC 2012. [DOI: 10.1080/00364827.1997.10413657] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 10/22/2022]
|
28
|
|
29
|
Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J R Stat Soc Series B Stat Methodol 2010. [DOI: 10.1111/j.1467-9868.2010.00749.x] [Citation(s) in RCA: 3532] [Impact Index Per Article: 252.3] [Indexed: 11/27/2022]
|
30
|
Statistical inference for noisy nonlinear ecological dynamic systems. Nature 2010; 466:1102-4. [DOI: 10.1038/nature09319] [Citation(s) in RCA: 268] [Impact Index Per Article: 19.1] [Received: 05/04/2010] [Accepted: 06/28/2010] [Indexed: 11/09/2022]
|
31
|
|
32
|
The effects of group size, leaf size, and density on the performance of a leaf-mining moth. J Anim Ecol 2009; 78:152-60. [DOI: 10.1111/j.1365-2656.2008.01469.x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/30/2022]
|
33
|
|
34
|
Fast stable direct fitting and smoothness selection for generalized additive models. J R Stat Soc Series B Stat Methodol 2008. [DOI: 10.1111/j.1467-9868.2007.00646.x] [Citation(s) in RCA: 448] [Impact Index Per Article: 28.0] [Indexed: 11/28/2022]
|
35
|
|
36
|
Abstract
A general method for constructing low-rank tensor product smooths for use as components of generalized additive models or generalized additive mixed models is presented. A penalized regression approach is adopted in which tensor product smooths of several variables are constructed from smooths of each variable separately, these "marginal" smooths being represented using a low-rank basis with an associated quadratic wiggliness penalty. The smooths offer several advantages: (i) they have one wiggliness penalty per covariate and are hence invariant to linear rescaling of covariates, making them useful when there is no "natural" way to scale covariates relative to each other; (ii) they have a useful tuneable range of smoothness, unlike single-penalty tensor product smooths that are scale invariant; (iii) the relatively low rank of the smooths means that they are computationally efficient; (iv) the penalties on the smooths are easily interpretable in terms of function shape; (v) the smooths can be generated completely automatically from any marginal smoothing bases and associated quadratic penalties, giving the modeler considerable flexibility to choose the basis penalty combination most appropriate to each modeling task; and (vi) the smooths can easily be written as components of a standard linear or generalized linear mixed model, allowing them to be used as components of the rich family of such models implemented in standard software, and to take advantage of the efficient and stable computational methods that have been developed for such models. A small simulation study shows that the methods can compare favorably with recently developed smoothing spline ANOVA methods.
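One common way to build such a smooth from marginal pieces is sketched below, as an illustration rather than the paper's exact construction. The tensor product model matrix is formed as the row-wise Kronecker product of two marginal bases, and each covariate gets its own penalty by combining its marginal penalty with an identity matrix. The Gaussian-bump marginal basis, the second-difference penalty, the fixed smoothing parameters and all function names are assumptions made for the example.

```python
import numpy as np

def marginal_basis(v, k):
    """Stand-in marginal smooth: k Gaussian bumps on [0, 1] with a second-difference penalty."""
    knots = np.linspace(0.0, 1.0, k)
    width = knots[1] - knots[0]
    B = np.exp(-0.5 * ((v[:, None] - knots[None, :]) / width) ** 2)
    D = np.diff(np.eye(k), n=2, axis=0)      # (k - 2) x k second-difference matrix
    return B, D.T @ D

def tensor_product(Bx, Sx, Bz, Sz):
    """Row-wise Kronecker product basis with one wiggliness penalty per covariate."""
    n, kx = Bx.shape
    kz = Bz.shape[1]
    X = (Bx[:, :, None] * Bz[:, None, :]).reshape(n, kx * kz)
    S_x = np.kron(Sx, np.eye(kz))            # penalises wiggliness in the x direction
    S_z = np.kron(np.eye(kx), Sz)            # penalises wiggliness in the z direction
    return X, S_x, S_z

# Toy data on the unit square (illustrative assumptions).
rng = np.random.default_rng(2)
n, kx, kz = 300, 6, 5
x, z = rng.uniform(size=n), rng.uniform(size=n)
y = np.sin(2 * np.pi * x) * z + rng.normal(scale=0.1, size=n)

Bx, Sx = marginal_basis(x, kx)
Bz, Sz = marginal_basis(z, kz)
X, S_x, S_z = tensor_product(Bx, Sx, Bz, Sz)

# Penalised least squares fit with fixed (not estimated) smoothing parameters.
lam_x, lam_z = 0.1, 0.1
beta = np.linalg.solve(X.T @ X + lam_x * S_x + lam_z * S_z, X.T @ y)
fitted = X @ beta
```

Keeping `lam_x` and `lam_z` separate lets the amount of smoothing differ by direction, which is what allows the smooth to adapt when the covariates are measured in unrelated units.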
|
37
|
POPULATION CYCLES IN THE PINE LOOPER MOTH: DYNAMICAL TESTS OF MECHANISTIC HYPOTHESES. ECOL MONOGR 2005. [DOI: 10.1890/03-4056] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.4] [Indexed: 11/18/2022]
|
38
|
|
39
|
Stable and Efficient Multiple Smoothing Parameter Estimation for Generalized Additive Models. J Am Stat Assoc 2004. [DOI: 10.1198/016214504000000980] [Citation(s) in RCA: 1161] [Impact Index Per Article: 58.1] [Indexed: 11/21/2022]
|
40
|
|
41
|
|
42
|
GAMs with integrated model selection using penalized regression splines and applications to environmental modelling. Ecol Modell 2002. [DOI: 10.1016/s0304-3800(02)00193-x] [Citation(s) in RCA: 401] [Impact Index Per Article: 18.2] [Indexed: 10/27/2022]
|
43
|
|
44
|
Abstract
Understanding spatial population dynamics is fundamental for many questions in ecology and conservation. Many theoretical mechanisms have been proposed whereby spatial structure can promote population persistence, in particular for exploiter-victim systems (host-parasite/pathogen, predator-prey) whose interactions are inherently oscillatory and therefore prone to extinction of local populations. Experiments have confirmed that spatial structure can extend persistence, but it has rarely been possible to identify the specific mechanisms involved. Here we use a model-based approach to identify the effects of spatial population processes in experimental systems of bean plants (Phaseolus lunatus), herbivorous mites (Tetranychus urticae) and predatory mites (Phytoseiulus persimilis). On isolated plants, and in a spatially undivided experimental system of 90 plants, prey and predator populations collapsed; however, introducing habitat structure allowed long-term persistence. Using mechanistic models, we determine that spatial population structure did not contribute to persistence, and spatially explicit models are not needed. Rather, habitat structure reduced the success of predators at locating prey outbreaks, allowing between-plant asynchrony of local population cycles due to random colonization events.
|
45
|
Abstract
Objective functions that arise when fitting nonlinear models often contain local minima that are of little significance except for their propensity to trap minimization algorithms. The standard methods for attempting to deal with this problem treat the objective function as fixed and employ stochastic minimization approaches in the hope of randomly jumping out of local minima. This article suggests a simple trick for performing such minimizations that can be employed in conjunction with most conventional nonstochastic fitting methods. The trick is to stochastically perturb the objective function by bootstrapping the data to be fit. Each bootstrap objective shares the large-scale structure of the original objective but has different small-scale structure. Minimizations of bootstrap objective functions are alternated with minimizations of the original objective function starting from the parameter values with which minimization of the previous bootstrap objective terminated. An example is presented, fitting a nonlinear population dynamic model to population dynamic data and including a comparison of the suggested method with simulated annealing. Convergence diagnostics are discussed.
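The alternation described in this abstract might look roughly like the sketch below. It is an illustration under assumptions, not the article's code: a simple deterministic compass search stands in for "any conventional nonstochastic fitting method", the model is a toy frequency fit `y = sin(w t)` whose sum of squares has many local minima, each bootstrap minimisation is started from the latest original-objective result, and the best original-objective value seen is retained. All names are hypothetical.

```python
import numpy as np

def local_minimise(f, p0, step=0.05, tol=1e-6, max_iter=10_000):
    """Deterministic compass search: a stand-in for a conventional local optimiser."""
    p = np.atleast_1d(np.asarray(p0, dtype=float)).copy()
    fp = f(p)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for i in range(p.size):
            for s in (step, -step):
                q = p.copy()
                q[i] += s
                fq = f(q)
                if fq < fp:
                    p, fp, improved = q, fq, True
        if not improved:
            step *= 0.5          # shrink the step once no neighbour improves
    return p

def sse(p, t, y):
    """Sum of squared errors for the toy model y = sin(w * t)."""
    return float(np.sum((y - np.sin(p[0] * t)) ** 2))

def bootstrap_restart(t, y, p0, n_rounds=20, seed=0):
    rng = np.random.default_rng(seed)
    original = lambda p: sse(p, t, y)
    current = local_minimise(original, p0)
    best = current.copy()
    for _ in range(n_rounds):
        # Perturb the objective by resampling the data with replacement.
        idx = rng.integers(0, len(y), size=len(y))
        perturbed = local_minimise(lambda p: sse(p, t[idx], y[idx]), current)
        # Return to the original objective, starting where the bootstrap fit ended.
        current = local_minimise(original, perturbed)
        if original(current) < original(best):
            best = current.copy()
    return best

rng = np.random.default_rng(3)
t = np.linspace(0.0, 10.0, 80)
y = np.sin(2.0 * t) + rng.normal(scale=0.2, size=t.size)

plain = local_minimise(lambda p: sse(p, t, y), [0.7])
best = bootstrap_restart(t, y, [0.7])
print("plain local fit SSE:", sse(plain, t, y), "bootstrap restart SSE:", sse(best, t, y))
```

Because the best original-objective value is tracked, this variant can never end up worse than the plain local fit from the same start; whether it escapes a given local minimum depends on the data and the number of rounds.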
|
46
|
|
47
|
|
48
|
|
49
|
|
50
|
Abstract
Epidemiological theory predicts that pathogens of high virulence should not become endemic. We show, using an empirically based lattice map model, that a pathogen that is too virulent to persist if its host population is spatially well mixed, can persist if the host population is spatially distributed, because of internally generated complex spatial dynamics, provided that the area occupied by the host population is sufficiently large. The dynamics are not an artefact of spatial or temporal discretization. The results uncover a mechanism for the persistence of virulent pathogens, suggesting a means by which pathogens of high virulence could achieve sustained as well as short-term biological pest control.
|