1
|
Abstract
We propose an extensive framework for additive regression models for correlated functional responses, allowing for multiple partially nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data. Additionally, our framework includes linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. It accommodates densely or sparsely observed functional responses and predictors which may be observed with additional error and includes both spline-based and functional principal component-based terms. Estimation and inference in this framework are based on standard additive mixed models, allowing us to take advantage of established methods and robust, flexible algorithms. We provide easy-to-use open source software in the pffr() function for the R-package refund. Simulations show that the proposed method recovers relevant effects reliably, handles small sample sizes well and also scales to larger data sets. Applications with spatially and longitudinally observed functional data demonstrate the flexibility in modeling and interpretability of results of our approach.
|
Journal Article |
10 |
112 |
2
|
Marron JS, Alonso AM. Overview of object oriented data analysis. Biom J 2014; 56:732-53. [PMID: 24421177 DOI: 10.1002/bimj.201300072] [Citation(s) in RCA: 99] [Impact Index Per Article: 9.0] [Received: 04/22/2013] [Revised: 10/28/2013] [Accepted: 11/02/2013] [Indexed: 11/09/2022]
Abstract
Object oriented data analysis is the statistical analysis of populations of complex objects. In the special case of functional data analysis, these data objects are curves, where a variety of Euclidean approaches, such as principal components analysis, have been very successful. Challenges in modern medical image analysis motivate the statistical analysis of populations of more complex data objects that are elements of mildly non-Euclidean spaces, such as Lie groups and symmetric spaces, or of strongly non-Euclidean spaces, such as spaces of tree-structured data objects. These new contexts for object oriented data analysis create several potentially large new interfaces between mathematics and statistics. The notion of object oriented data analysis also impacts data analysis, through providing a framework for discussion of the many choices needed in many modern complex data analyses, especially in interdisciplinary contexts.
|
Review |
11 |
99 |
3
|
McLean MW, Hooker G, Staicu AM, Scheipl F, Ruppert D. Functional Generalized Additive Models. J Comput Graph Stat 2014; 23:249-269. [PMID: 24729671 DOI: 10.1080/10618600.2012.729985] [Citation(s) in RCA: 71] [Impact Index Per Article: 6.5] [Indexed: 10/27/2022]
Abstract
We introduce the functional generalized additive model (FGAM), a novel regression model for association studies between a scalar response and a functional predictor. We model the link-transformed mean response as the integral with respect to t of F{X(t), t} where F(·,·) is an unknown regression function and X(t) is a functional covariate. Rather than having an additive model in a finite number of principal components as in Müller and Yao (2008), our model incorporates the functional predictor directly and thus our model can be viewed as the natural functional extension of generalized additive models. We estimate F(·,·) using tensor-product B-splines with roughness penalties. A pointwise quantile transformation of the functional predictor is also considered to ensure each tensor-product B-spline has observed data on its support. The methods are evaluated using simulated data and their predictive performance is compared with other competing scalar-on-function regression alternatives. We illustrate the usefulness of our approach through an application to brain tractography, where X(t) is a signal from diffusion tensor imaging at position, t, along a tract in the brain. In one example, the response is disease-status (case or control) and in a second example, it is the score on a cognitive test. R code for performing the simulations and fitting the FGAM can be found in supplemental materials available online.
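For orientation, the model described in this abstract can be written out as follows. This is a sketch based only on the abstract's wording; the link symbol g, the intercept \theta_0, and the domain \mathcal{T} are notational assumptions, not taken from the source.

```latex
% Link-transformed mean as the integral over t of F{X(t), t}
% (g, \theta_0 and \mathcal{T} are assumed notation)
g\{E(Y \mid X)\} = \theta_0 + \int_{\mathcal{T}} F\{X(t), t\}\, dt

% Special case: if F(x, t) = \beta(t)\, x, this reduces to a
% functional linear model
g\{E(Y \mid X)\} = \theta_0 + \int_{\mathcal{T}} \beta(t)\, X(t)\, dt
```

The second display shows why the model can be read as a nonlinear extension of scalar-on-function linear regression: the surface F replaces the product \beta(t) X(t).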
|
Journal Article |
11 |
71 |
4
|
Abstract
Functional principal component analysis (FPCA) has become the most widely used dimension reduction tool for functional data analysis. We consider functional data measured at random, subject-specific time points, contaminated with measurement error, allowing for both sparse and dense functional data, and propose novel information criteria to select the number of principal components in such data. We propose a Bayesian information criterion based on marginal modeling that can consistently select the number of principal components for both sparse and dense functional data. For dense functional data, we also developed an Akaike information criterion (AIC) based on the expected Kullback-Leibler information under a Gaussian assumption. In connecting with factor analysis in multivariate time series data, we also consider the information criteria by Bai & Ng (2002) and show that they are still consistent for dense functional data, if a prescribed undersmoothing scheme is undertaken in the FPCA algorithm. We perform intensive simulation studies and show that the proposed information criteria vastly outperform existing methods for this type of data. Surprisingly, our empirical evidence shows that our information criteria proposed for dense functional data also perform well for sparse functional data. An empirical example using colon carcinogenesis data is also provided to illustrate the results.
|
Research Support, Non-U.S. Gov't |
12 |
59 |
5
|
Wilson A, Chiu YHM, Hsu HHL, Wright RO, Wright RJ, Coull BA. Bayesian distributed lag interaction models to identify perinatal windows of vulnerability in children's health. Biostatistics 2018; 18:537-552. [PMID: 28334179 DOI: 10.1093/biostatistics/kxx002] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 03/30/2016] [Accepted: 01/13/2017] [Indexed: 01/09/2023]
Abstract
Epidemiological research supports an association between maternal exposure to air pollution during pregnancy and adverse children's health outcomes. Advances in exposure assessment and statistics allow for estimation of both critical windows of vulnerability and exposure effect heterogeneity. Simultaneous estimation of windows of vulnerability and effect heterogeneity can be accomplished by fitting a distributed lag model (DLM) stratified by subgroup. However, this can provide an incomplete picture of how effects vary across subgroups because it does not allow for subgroups to have the same window but different within-window effects or to have different windows but the same within-window effect. Because the timing of some developmental processes is common across subpopulations of infants while for others the timing differs across subgroups, both scenarios are important to consider when evaluating health risks of prenatal exposures. We propose a new approach that partitions the DLM into a constrained functional predictor that estimates windows of vulnerability and a scalar effect representing the within-window effect directly. The proposed method allows for heterogeneity in only the window, only the within-window effect, or both. In a simulation study we show that a model assuming a shared component across groups results in lower bias and mean squared error for the estimated windows and effects when that component is in fact constant across groups. We apply the proposed method to estimate windows of vulnerability in the association between prenatal exposures to fine particulate matter and each of birth weight and asthma incidence, and estimate how these associations vary by sex and maternal obesity status in a Boston-area prospective pre-birth cohort study.
|
Journal Article |
7 |
51 |
6
|
Lindquist MA. Functional Causal Mediation Analysis With an Application to Brain Connectivity. J Am Stat Assoc 2012; 107:1297-1309. [PMID: 25076802 DOI: 10.1080/01621459.2012.695640] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 10/27/2022]
Abstract
Mediation analysis is often used in the behavioral sciences to investigate the role of intermediate variables that lie on the causal path between a randomized treatment and an outcome variable. Typically, mediation is assessed using structural equation models (SEMs), with model coefficients interpreted as causal effects. In this article, we present an extension of SEMs to the functional data analysis (FDA) setting that allows the mediating variable to be a continuous function rather than a single scalar measure, thus providing the opportunity to study the functional effects of the mediator on the outcome. We provide sufficient conditions for identifying the average causal effects of the functional mediators using the extended SEM, as well as weaker conditions under which an instrumental variable estimand may be interpreted as an effect. The method is applied to data from a functional magnetic resonance imaging (fMRI) study of thermal pain that sought to determine whether activation in certain brain regions mediated the effect of applied temperature on self-reported pain. Our approach provides valuable information about the timing of the mediating effect that is not readily available when using the standard nonfunctional approach. To the best of our knowledge, this work provides the first application of causal inference to the FDA framework.
|
Journal Article |
13 |
49 |
7
|
Meyer MJ, Coull BA, Versace F, Cinciripini P, Morris JS. Bayesian function-on-function regression for multilevel functional data. Biometrics 2015; 71:563-74. [PMID: 25787146 PMCID: PMC4575250 DOI: 10.1111/biom.12299] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 12/01/2013] [Revised: 12/01/2014] [Accepted: 01/01/2015] [Indexed: 11/30/2022]
Abstract
Medical and public health research increasingly involves the collection of complex and high dimensional data. In particular, functional data, where the unit of observation is a curve or set of curves that are finely sampled over a grid, is frequently obtained. Moreover, researchers often sample multiple curves per person resulting in repeated functional measures. A common question is how to analyze the relationship between two functional variables. We propose a general function-on-function regression model for repeatedly sampled functional data on a fine grid, presenting a simple model as well as a more extensive mixed model framework, and introducing various functional Bayesian inferential procedures that account for multiple testing. We examine these models via simulation and a data analysis with data from a study that used event-related potentials to examine how the brain processes various types of images.
|
Research Support, N.I.H., Extramural |
10 |
36 |
8
|
Zhu H, Yao F, Zhang HH. Structured functional additive regression in reproducing kernel Hilbert spaces. J R Stat Soc Series B Stat Methodol 2013; 76:581-603. [PMID: 25013362 DOI: 10.1111/rssb.12036] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 11/27/2022]
Abstract
Functional additive models (FAMs) provide a flexible yet simple framework for regressions involving functional predictors. The utilization of a data-driven basis in an additive rather than linear structure naturally extends the classical functional linear model. However, the critical issue of selecting nonlinear additive components has been less studied. In this work, we propose a new regularization framework for the structure estimation in the context of Reproducing Kernel Hilbert Spaces. The proposed approach takes advantage of the functional principal components, which greatly facilitates the implementation and the theoretical analysis. The selection and estimation are achieved by penalized least squares using a penalty which encourages the sparse structure of the additive components. Theoretical properties such as the rate of convergence are investigated. The empirical performance is demonstrated through simulation studies and a real data application.
|
Journal Article |
12 |
31 |
9
|
Sagittal plane walking biomechanics in individuals with knee osteoarthritis after quadriceps strengthening. Osteoarthritis Cartilage 2019; 27:771-780. [PMID: 30660722 PMCID: PMC6475608 DOI: 10.1016/j.joca.2018.12.026] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 08/30/2018] [Revised: 12/12/2018] [Accepted: 12/23/2018] [Indexed: 02/02/2023]
Abstract
OBJECTIVE To compare sagittal walking gait biomechanics between participants with knee osteoarthritis (KOA) who increased quadriceps strength following a lower-extremity strengthening intervention (responders) and those who did not increase strength following the same strengthening protocol (non-responders) both at baseline and following the lower extremity strengthening protocol. DESIGN Fifty-three participants with radiographic KOA (47% female, 62.3 ± 7.1 years, BMI = 28.5 ± 3.9 kg/m²) were enrolled in 10 sessions of lower extremity strengthening over a 28-day period. Maximum isometric quadriceps strength and walking gait biomechanics were collected on the involved limb at baseline and 4-weeks following the strengthening intervention. Responders were classified as individuals who increased quadriceps strength greater than the upper limit of the 95% confidence interval (CI) for the minimal detectable change (MDC) in quadriceps strength (29 Nm) determined in a previous study. 2 × 2 functional analyses of variance were used to evaluate the effects of group (responders and non-responders) and time (baseline and 4-weeks) on time-normalized waveforms for knee flexion angle (KFA), vertical ground reaction force (vGRF), and internal knee extension moment (KEM). RESULTS A significant group × time interaction for KFA demonstrated greater KFA in the first half of stance at baseline and greater knee extension in the second half of stance at 4-weeks in responders compared to non-responders. There was no significant group × time interaction for vGRF or internal KEM. CONCLUSIONS Quadriceps strengthening may be used to stimulate small changes in KFA in individuals with KOA.
|
research-article |
6 |
28 |
10
|
Hébert-Losier K, Schelin L, Tengman E, Strong A, Häger CK. Curve analyses reveal altered knee, hip, and trunk kinematics during drop-jumps long after anterior cruciate ligament rupture. Knee 2018. [PMID: 29525548 DOI: 10.1016/j.knee.2017.12.005] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Indexed: 02/02/2023]
Abstract
BACKGROUND Anterior cruciate ligament (ACL) ruptures may lead to knee dysfunctions later in life. Single-leg tasks are often evaluated, but bilateral movements may also be compromised. Our aim was to use curve analyses to examine double-leg drop-jump kinematics in ACL-reconstructed, ACL-deficient, and healthy-knee cohorts. METHODS Subjects with unilateral ACL ruptures treated more than two decades ago (17-28 years) conservatively with physiotherapy (ACLPT, n=26) or in combination with reconstructive surgery (ACLR, n=28) and healthy-knee controls (n=25) performed 40-cm drop-jumps. Three-dimensional knee, hip, and trunk kinematics were analyzed during Rebound, Flight, and Landing phases. Curves were time-normalized and compared between groups (injured and non-injured legs of ACLPT and ACLR vs. non-dominant and dominant legs of controls) and within groups (between legs) using functional analysis of variance methods. RESULTS Compared to controls, ACL groups exhibited less knee and hip flexion on both legs during Rebound and greater knee external rotation on their injured leg at the start of Rebound and Landing. ACLR also showed less trunk flexion during Rebound. Between-leg differences were observed in ACLR only, with the injured leg more internally rotated at the hip. Overall, kinematic curves were similar between ACLR and ACLPT. However, compared to controls, deviations spanned a greater proportion of the drop-jump movement at the hip in ACLR and at the knee in ACLPT. CONCLUSIONS Trunk and bilateral leg kinematics during double-leg drop-jumps are still compromised long after ACL-rupture care, independent of treatment. Curve analyses indicate the presence of distinct compensatory mechanisms in ACLPT and ACLR compared to controls.
|
|
7 |
28 |
11
|
Ankle plantarflexion strength in rearfoot and forefoot runners: a novel clusteranalytic approach. Hum Mov Sci 2014; 35:104-20. [PMID: 24746605 DOI: 10.1016/j.humov.2014.03.008] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 01/29/2013] [Revised: 03/16/2014] [Accepted: 03/23/2014] [Indexed: 12/11/2022]
Abstract
The purpose of the present study was to test for differences in ankle plantarflexion strengths of habitually rearfoot and forefoot runners. In order to approach this issue, we revisit the problem of classifying different footfall patterns in human runners. A dataset of 119 subjects running shod and barefoot (speed 3.5 m/s) was analyzed. The footfall patterns were clustered by a novel statistical approach, which is motivated by advances in the statistical literature on functional data analysis. We explain the novel statistical approach in detail and compare it to the classically used strike index of Cavanagh and Lafortune (1980). The two groups found by the new cluster approach are well interpretable as forefoot and rearfoot footfall groups. The subsequent comparison study of the clustered subjects reveals that runners with a forefoot footfall pattern are capable of producing significantly higher joint moments in a maximum voluntary contraction (MVC) of their ankle plantarflexor muscle-tendon units; difference in means: 0.28 Nm/kg. This effect remains significant after controlling for an additional gender effect and for differences in training levels. Our analysis confirms the hypothesis that forefoot runners have a higher mean MVC plantarflexion strength than rearfoot runners. Furthermore, we demonstrate that our proposed stochastic cluster analysis provides a robust and useful framework for clustering foot strikes.
|
Research Support, Non-U.S. Gov't |
11 |
28 |
12
|
Chen X, Robinson DG, Storey JD. The functional false discovery rate with applications to genomics. Biostatistics 2021; 22:68-81. [PMID: 31135886 PMCID: PMC7846131 DOI: 10.1093/biostatistics/kxz010] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 02/15/2018] [Revised: 03/08/2019] [Accepted: 03/24/2019] [Indexed: 12/15/2022]
Abstract
The false discovery rate (FDR) measures the proportion of false discoveries among a set of hypothesis tests called significant. This quantity is typically estimated based on p-values or test statistics. In some scenarios, there is additional information available that may be used to more accurately estimate the FDR. We develop a new framework for formulating and estimating FDRs and q-values when an additional piece of information, which we call an "informative variable", is available. For a given test, the informative variable provides information about the prior probability a null hypothesis is true or the power of that particular test. The FDR is then treated as a function of this informative variable. We consider two applications in genomics. Our first application is a genetics of gene expression (eQTL) experiment in yeast where every genetic marker and gene expression trait pair is tested for associations. The informative variable in this case is the distance between each genetic marker and gene. Our second application is to detect differentially expressed genes in an RNA-seq study carried out in mice. The informative variable in this study is the per-gene read depth. The framework we develop is quite general, and it should be useful in a broad range of scientific applications.
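As background for the p-value-based FDR estimation mentioned in this abstract, the sketch below implements the classical Benjamini-Hochberg adjustment. It is not the functional FDR method of the paper: it ignores any informative variable, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Classical BH step-up adjustment only; this does NOT use an
    informative variable as the functional FDR framework does.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Tests whose adjusted value falls below a chosen level (for example 0.05) are the ones called significant at that FDR level.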
|
Research Support, N.I.H., Extramural |
4 |
24 |
13
|
Li J, Huang C, Zhu H. A Functional Varying-Coefficient Single-Index Model for Functional Response Data. J Am Stat Assoc 2017; 112:1169-1181. [PMID: 29200540 DOI: 10.1080/01621459.2016.1195742] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 10/21/2022]
Abstract
Motivated by the analysis of imaging data, we propose a novel functional varying-coefficient single index model (FVCSIM) to carry out the regression analysis of functional response data on a set of covariates of interest. FVCSIM represents a new extension of varying-coefficient single index models for scalar responses collected from cross-sectional and longitudinal studies. An efficient estimation procedure is developed to iteratively estimate varying coefficient functions, link functions, index parameter vectors, and the covariance function of individual functions. We systematically examine the asymptotic properties of all estimators including the weak convergence of the estimated varying coefficient functions, the asymptotic distribution of the estimated index parameter vectors, and the uniform convergence rate of the estimated covariance function and its spectrum. Simulation studies are carried out to assess the finite-sample performance of the proposed procedure. We apply FVCSIM to investigate the development of white matter diffusivities along the corpus callosum skeleton obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study.
|
Research Support, Non-U.S. Gov't |
8 |
24 |
14
|
Gellar JE, Colantuoni E, Needham DM, Crainiceanu CM. Variable-Domain Functional Regression for Modeling ICU Data. J Am Stat Assoc 2014; 109:1425-1439. [PMID: 25663725 DOI: 10.1080/01621459.2014.940044] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/11/2023]
Abstract
We introduce a class of scalar-on-function regression models with subject-specific functional predictor domains. The fundamental idea is to consider a bivariate functional parameter that depends both on the functional argument and on the width of the functional predictor domain. Both parametric and nonparametric models are introduced to fit the functional coefficient. The nonparametric model is theoretically and practically invariant to functional support transformation, or support registration. Methods were motivated by and applied to a study of association between daily measures of the Intensive Care Unit (ICU) Sequential Organ Failure Assessment (SOFA) score and two outcomes: in-hospital mortality, and physical impairment at hospital discharge among survivors. Methods are generally applicable to a large number of new studies that record continuous variables over unequal domains.
|
Journal Article |
11 |
23 |
15
|
Serban N, Staicu AM, Carroll RJ. Multilevel cross-dependent binary longitudinal data. Biometrics 2013; 69:903-13. [PMID: 24131242 DOI: 10.1111/biom.12083] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 05/01/2012] [Revised: 06/01/2013] [Accepted: 07/01/2013] [Indexed: 11/30/2022]
Abstract
We provide insights into new methodology for the analysis of multilevel binary data observed longitudinally, when the repeated longitudinal measurements are correlated. The proposed model is logistic functional regression conditioned on three latent processes describing the within- and between-variability, and describing the cross-dependence of the repeated longitudinal measurements. We estimate the model components without employing mixed-effects modeling but assuming an approximation to the logistic link function. The primary objectives of this article are to highlight the challenges in the estimation of the model components, to compare two approximations to the logistic regression function, linear and exponential, and to discuss their advantages and limitations. The linear approximation is computationally efficient whereas the exponential approximation applies for rare events functional data. Our methods are inspired by and applied to a scientific experiment on spectral backscatter from long range infrared light detection and ranging (LIDAR) data. The models are general and relevant to many new binary functional data sets, with or without dependence between repeated functional measurements.
|
Research Support, U.S. Gov't, Non-P.H.S. |
12 |
22 |
16
|
Şentürk D, Nguyen DV. Varying Coefficient Models for Sparse Noise-contaminated Longitudinal Data. Stat Sin 2011; 21:1831-1856. [PMID: 25589822 DOI: 10.5705/ss.2009.328] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 11/06/2022]
Abstract
In this paper we propose a varying coefficient model for highly sparse longitudinal data that allows for error-prone time-dependent variables and time-invariant covariates. We develop a new estimation procedure, based on covariance representation techniques, that enables effective borrowing of information across all subjects in sparse and irregular longitudinal data observed with measurement error, a challenge for which there is currently no adequate solution. More specifically, sparsity is addressed via a functional analysis approach that considers the observed longitudinal data as noise contaminated realizations of a random process that produces smooth trajectories. This approach allows for estimation based on pooled data, borrowing strength from all subjects, in targeting the mean functions and auto- and cross-covariances to overcome sparse noisy designs. The resulting estimators are shown to be uniformly consistent. Consistent predictions for the response trajectories are also obtained via conditional expectation under Gaussian assumptions. Asymptotic distributions of the predicted response trajectories are derived, allowing for construction of asymptotic pointwise confidence bands. Efficacy of the proposed method is investigated in simulation studies and compared to the commonly used local polynomial smoothing method. The proposed method is illustrated with a sparse longitudinal data set, examining the age-varying relationship between calcium absorption and dietary calcium. Prediction of individual calcium absorption curves as a function of age is also examined.
|
Journal Article |
14 |
22 |
17
|
Pini A, Vantini S. The interval testing procedure: A general framework for inference in functional data analysis. Biometrics 2016; 72:835-45. [PMID: 26811864 DOI: 10.1111/biom.12476] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 03/01/2015] [Revised: 12/01/2015] [Accepted: 12/01/2015] [Indexed: 11/28/2022]
Abstract
We introduce in this work the Interval Testing Procedure (ITP), a novel inferential technique for functional data. The procedure can be used to test different functional hypotheses, e.g., distributional equality between two or more functional populations, or equality of the mean function of a functional population to a reference. ITP involves three steps: (i) the representation of data on a (possibly high-dimensional) functional basis; (ii) the test of each possible set of consecutive basis coefficients; (iii) the computation of the adjusted p-values associated with each basis component, by means of a new strategy proposed here. We define a new type of error control, the interval-wise control of the family wise error rate, particularly suited for functional data. We show that ITP is provided with such a control. A simulation study comparing ITP with other testing procedures is reported. ITP is then applied to the analysis of hemodynamical features involved with cerebral aneurysm pathology. ITP is implemented in the fdatest R package.
|
Journal Article |
9 |
22 |
18
|
Brien DC, Riek HC, Yep R, Huang J, Coe B, Areshenkoff C, Grimes D, Jog M, Lang A, Marras C, Masellis M, McLaughlin P, Peltsch A, Roberts A, Tan B, Beaton D, Lou W, Swartz R, Munoz DP. Classification and staging of Parkinson's disease using video-based eye tracking. Parkinsonism Relat Disord 2023; 110:105316. [PMID: 36822878 DOI: 10.1016/j.parkreldis.2023.105316] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/23/2022] [Revised: 01/11/2023] [Accepted: 02/04/2023] [Indexed: 02/10/2023]
Abstract
INTRODUCTION 83% of those diagnosed with Parkinson's Disease (PD) eventually progress to PD with mild cognitive impairment (PD-MCI) followed by dementia (PDD) - suggesting a complex spectrum of pathology concomitant with aging. Biomarkers sensitive and specific to this spectrum are required if useful diagnostics are to be developed that may supplement current clinical testing procedures. We used video-based eye tracking and machine learning to develop a simple, non-invasive test sensitive to PD and the stages of cognitive dysfunction. METHODS From 121 PD (45 Cognitively Normal/45 MCI/20 Dementia/11 Other) and 106 healthy controls, we collected video-based eye tracking data on an interleaved pro/anti-saccade task. Features of saccade, pupil, and blink behavior were used to train a classifier to predict confidence scores for PD/PD-MCI/PDD diagnosis. RESULTS The Receiver Operator Characteristic Area Under the Curve (ROC-AUC) of the classifier was 0.88, with the cognitive-dysfunction subgroups showing progressively increased AUC, and the AUC of PDD being 0.95. The classifier reached a sensitivity of 83% and a specificity of 78%. The confidence scores predicted PD motor and cognitive performance scores. CONCLUSION Biomarkers of saccade, pupil, and blink were extracted from video-based eye tracking to create a classifier with high sensitivity to the landscape of PD cognitive and motor dysfunction. A complex landscape of PD is revealed through a quick, non-invasive eye tracking task and our model provides a framework for such a task to be used as a supplementary screening tool in the clinic.
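The ROC-AUC, sensitivity, and specificity reported in this abstract can be computed from classifier confidence scores as sketched below. This is a generic illustration, not the study's pipeline: the function names, the 0.5 threshold, and the scores are all hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Classify score >= threshold as positive; return (sensitivity, specificity)."""
    true_pos = sum(1 for s in scores_pos if s >= threshold)
    true_neg = sum(1 for s in scores_neg if s < threshold)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


if __name__ == "__main__":
    # Hypothetical confidence scores, for illustration only.
    patients = [0.9, 0.8, 0.4]
    controls = [0.7, 0.3, 0.2]
    print(roc_auc(patients, controls))
    print(sensitivity_specificity(patients, controls, threshold=0.5))
```

The AUC summarizes performance across all thresholds, whereas a sensitivity and specificity pair corresponds to one particular threshold on the confidence score.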
|
|
2 |
22 |
19
|
Wang X, Nan B, Zhu J, Koeppe R. Regularized 3D functional regression for brain image data via Haar wavelets. Ann Appl Stat 2014; 8:1045-1064. [PMID: 26082826 DOI: 10.1214/14-aoas736] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/19/2022]
Abstract
The primary motivation and application in this article come from brain imaging studies on cognitive impairment in elderly subjects with brain disorders. We propose a regularized Haar wavelet-based approach for the analysis of three-dimensional brain image data in the framework of functional data analysis, which automatically takes into account the spatial information among neighboring voxels. We conduct extensive simulation studies to evaluate the prediction performance of the proposed approach and its ability to identify regions related to the outcome of interest, with the underlying assumption that only a few relatively small subregions are truly predictive of the outcome of interest. We then apply the proposed approach to searching for brain subregions that are associated with cognition using PET images of patients with Alzheimer's disease, patients with mild cognitive impairment, and normal controls.
|
Journal Article |
11 |
21 |
20
|
Zhang L, Baladandayuthapani V, Zhu H, Baggerly KA, Majewski T, Czerniak BA, Morris JS. Functional CAR models for large spatially correlated functional datasets. J Am Stat Assoc 2016; 111:772-786. [PMID: 28018013 DOI: 10.1080/01621459.2015.1042581] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 10/23/2022]
Abstract
We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on functions defined on higher dimensional domains such as images. Through simulation studies, we demonstrate that accounting for the spatial correlation in our modeling leads to improved functional regression performance. Applied to a high-throughput spatially correlated copy number dataset, the model identifies genetic markers not identified by comparable methods that ignore spatial correlations.
|
Journal Article |
9 |
21 |
21
|
Hasenstab K, Scheffler A, Telesca D, Sugar CA, Jeste S, DiStefano C, Şentürk D. A multi-dimensional functional principal components analysis of EEG data. Biometrics 2017; 73:999-1009. [PMID: 28072468 PMCID: PMC5517364 DOI: 10.1111/biom.12635] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 04/01/2016] [Revised: 11/01/2016] [Accepted: 11/01/2016] [Indexed: 11/28/2022]
Abstract
The electroencephalography (EEG) data created in event-related potential (ERP) experiments have a complex high-dimensional structure. Each stimulus presentation, or trial, generates an ERP waveform which is an instance of functional data. The experiments are made up of sequences of multiple trials, resulting in longitudinal functional data; moreover, responses are recorded at multiple electrodes on the scalp, adding an electrode dimension. Traditional EEG analyses involve multiple simplifications of this structure to increase the signal-to-noise ratio, effectively collapsing the functional and longitudinal components by identifying key features of the ERPs and averaging them across trials. Motivated by an implicit learning paradigm used in autism research in which the functional, longitudinal, and electrode components all have critical interpretations, we propose a multidimensional functional principal components analysis (MD-FPCA) technique which does not collapse any of the dimensions of the ERP data. The proposed decomposition is based on separation of the total variation into subject- and subunit-level variation, which are further decomposed in a two-stage functional principal components analysis. The proposed methodology is shown to be useful for modeling longitudinal trends in the ERP functions, leading to novel insights into the learning patterns of children with Autism Spectrum Disorder (ASD) and their typically developing peers as well as comparisons between the two groups. Finite sample properties of MD-FPCA are further studied via extensive simulations.
|
research-article |
8 |
21 |
22
|
Rosquist PG, Collins G, Merrell AJ, Tuttle NJ, Tracy JB, Bird ET, Seeley MK, Fullwood DT, Christensen WF, Bowden AE. Estimation of 3D Ground Reaction Force Using Nanocomposite Piezo-Responsive Foam Sensors During Walking. Ann Biomed Eng 2017; 45:2122-2134. [PMID: 28512701 DOI: 10.1007/s10439-017-1852-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/21/2016] [Accepted: 05/10/2017] [Indexed: 11/25/2022]
Abstract
This paper describes a method for the estimation of the 3D ground reaction force (GRF) during human walking using novel nanocomposite piezo-responsive foam (NCPF) sensors. Nine subjects (5 male, 4 female) walked on a force-instrumented treadmill at 1.34 m/s for 120 s each while wearing a shoe that was instrumented with four NCPF sensors. GRF data, measured via the treadmill, and sensor data, measured via the NCPF inserts, were used in a tenfold cross validation process to calibrate a separate model for each individual. The calibration model estimated average anterior-posterior, mediolateral and vertical GRF with mean average errors (MAE) of 6.52 N (2.14%), 4.79 N (6.34%), and 15.4 N (2.15%), respectively. Two additional models were created using the sensor data from all subjects and subject demographics. A tenfold cross validation process for this combined data set resulted in models that estimated average anterior-posterior, mediolateral and vertical GRF with less than 8.16 N (2.41%), 6.63 N (7.37%), and 19.4 N (2.31%) errors, respectively. Intra-subject estimates based on the model had a higher accuracy than inter-subject estimates, likely due to the relatively small subject cohort used in creating the model. The novel NCPF sensors demonstrate the ability to accurately estimate 3D GRF during human movement outside of the traditional biomechanics laboratory setting.
|
Journal Article |
8 |
20 |
23
|
Hadjipantelis PZ, Aston JAD, Müller HG, Evans JP. Unifying Amplitude and Phase Analysis: A Compositional Data Approach to Functional Multivariate Mixed-Effects Modeling of Mandarin Chinese. J Am Stat Assoc 2015; 110:545-559. [PMID: 26692591 PMCID: PMC4647844 DOI: 10.1080/01621459.2015.1006729] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 06/01/2015] [Revised: 03/01/2015] [Indexed: 12/05/2022]
Abstract
Mandarin Chinese is characterized by being a tonal language; the pitch (or F0) of its utterances carries considerable linguistic information. However, speech samples from different individuals are subject to changes in amplitude and phase, which must be accounted for in any analysis that attempts to provide a linguistically meaningful description of the language. A joint model for amplitude, phase, and duration is presented, which combines elements from functional data analysis, compositional data analysis, and linear mixed effects models. By decomposing functions via a functional principal component analysis, and connecting registration functions to compositional data analysis, a joint multivariate mixed effect model can be formulated, which gives insights into the relationship between the different modes of variation as well as their dependence on linguistic and nonlinguistic covariates. The model is applied to the COSPRO-1 dataset, a comprehensive database of spoken Taiwanese Mandarin, containing approximately 50,000 phonetically diverse sample F0 contours (syllables), and reveals that phonetic information is jointly carried by both amplitude and phase variation. Supplementary materials for this article are available online.
|
research-article |
10 |
16 |
24
|
Scheffler A, Telesca D, Li Q, Sugar CA, Distefano C, Jeste S, Şentürk D. Hybrid principal components analysis for region-referenced longitudinal functional EEG data. Biostatistics 2020; 21:139-157. [PMID: 30084925 DOI: 10.1093/biostatistics/kxy034] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 06/20/2017] [Revised: 01/25/2018] [Accepted: 06/11/2018] [Indexed: 11/12/2022]
Abstract
Electroencephalography (EEG) data possess a complex structure that includes regional, functional, and longitudinal dimensions. Our motivating example is a word segmentation paradigm in which typically developing (TD) children, and children with autism spectrum disorder (ASD) were exposed to a continuous speech stream. For each subject, continuous EEG signals recorded at each electrode were divided into one-second segments and projected into the frequency domain via fast Fourier transform. Following a spectral principal components analysis, the resulting data consist of region-referenced principal power indexed regionally by scalp location, functionally across frequencies, and longitudinally by one-second segments. Standard EEG power analyses often collapse information across the longitudinal and functional dimensions by averaging power across segments and concentrating on specific frequency bands. We propose a hybrid principal components analysis for region-referenced longitudinal functional EEG data, which utilizes both vector and functional principal components analyses and does not collapse information along any of the three dimensions of the data. The proposed decomposition only assumes weak separability of the higher-dimensional covariance process and utilizes a product of one dimensional eigenvectors and eigenfunctions, obtained from the regional, functional, and longitudinal marginal covariances, to represent the observed data, providing a computationally feasible non-parametric approach. A mixed effects framework is proposed to estimate the model components coupled with a bootstrap test for group level inference, both geared towards sparse data applications. Analysis of the data from the word segmentation paradigm leads to valuable insights about group-region differences among the TD and verbal and minimally verbal children with ASD. Finite sample properties of the proposed estimation framework and bootstrap inference procedure are further studied via extensive simulations.
|
Research Support, N.I.H., Extramural |
5 |
16 |
25
|
Kim JS, Staicu AM, Maity A, Carroll RJ, Ruppert D. Additive Function-on-Function Regression. J Comput Graph Stat 2017; 27:234-244. [PMID: 29780218 DOI: 10.1080/10618600.2017.1356730] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 10/19/2022]
Abstract
We study additive function-on-function regression where the mean response at a particular time point depends on the time point itself, as well as the entire covariate trajectory. We develop a computationally efficient estimation methodology based on a novel combination of spline bases with an eigenbasis to represent the trivariate kernel function. We discuss prediction of a new response trajectory, propose an inference procedure that accounts for total variability in the predicted response curves, and construct pointwise prediction intervals. The estimation/inferential procedure accommodates realistic scenarios, such as correlated error structure as well as sparse and/or irregular designs. We investigate our methodology in finite sample size through simulations and two real data applications. Supplementary Material for this article is available online.
|
Journal Article |
8 |
16 |