1. Johns JT, Crainiceanu C, Zipunnikov V, Gellar J. Variable-Domain Functional Principal Component Analysis. J Comput Graph Stat 2019. [DOI: 10.1080/10618600.2019.1604373] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 10/27/2022]
Affiliation(s)
- Jordan T. Johns
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
- Ciprian Crainiceanu
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
- Vadim Zipunnikov
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
2. Szczesniak RD, Li D, Su W, Brokamp C, Pestian J, Seid M, Clancy JP. Phenotypes of Rapid Cystic Fibrosis Lung Disease Progression during Adolescence and Young Adulthood. Am J Respir Crit Care Med 2017; 196:471-478. [PMID: 28410569 PMCID: PMC5564675 DOI: 10.1164/rccm.201612-2574oc] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 12/28/2016] [Accepted: 04/13/2017] [Indexed: 01/12/2023] Open
Abstract
RATIONALE: Individuals with cystic fibrosis are at risk for prolonged drops in lung function, clinically termed rapid decline, during discrete periods of the disease.
OBJECTIVES: To identify phenotypes of rapid pulmonary decline and determine how these phenotypes are related to patient characteristics.
METHODS: A longitudinal cohort study of patients with cystic fibrosis aged 6-21 years was conducted using the Cystic Fibrosis Foundation Patient Registry. A statistical approach for clustering longitudinal profiles, sparse functional principal components analysis, was used to classify patients into distinct phenotypes by evaluating trajectories of FEV1 decline. Phenotypes were compared with respect to baseline and mortality characteristics.
MEASUREMENTS AND MAIN RESULTS: Three distinct phenotypes of rapid decline were identified, corresponding to early, middle, and late timing of maximal FEV1 loss, in the overall cohort (n = 18,387). The majority of variation (first functional principal component, 94%) among patient profiles was characterized by differences in mean longitudinal FEV1 trajectories. Average degree of rapid decline was similar among phenotypes (roughly -3% predicted/yr); however, average timing differed, with early, middle, and late phenotypes experiencing rapid decline at 12.9, 16.3, and 18.5 years of age, respectively. Individuals with the late phenotype had the highest initial FEV1 but experienced the greatest loss of lung function. The early phenotype was more likely to have respiratory infections and acute exacerbations at baseline or to develop them subsequently, compared with other phenotypes.
CONCLUSIONS: By identifying phenotypes and associated risk factors, timing of interventions may be more precisely targeted for subgroups at highest risk of lung function loss.
Affiliation(s)
- Rhonda D. Szczesniak
  - Division of Biostatistics and Epidemiology
  - Division of Pulmonary Medicine
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio
- Dan Li
  - Alzheimer’s Therapeutic Research Institute, Keck School of Medicine, University of Southern California, Los Angeles, California
- Weiji Su
  - Department of Mathematical Sciences, University of Cincinnati, Cincinnati, Ohio
- John Pestian
  - Division of Biomedical Informatics
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio
- Michael Seid
  - Division of Pulmonary Medicine
  - James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio
- John P. Clancy
  - Division of Pulmonary Medicine
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio
3. Bai J, Ivanescu A, Crainiceanu CM. Discussion of the paper ‘A general framework for functional regression modelling’. STAT MODEL 2017. [DOI: 10.1177/1471082x16681335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
This discussion provides our reaction to the article by Greven and Scheipl. It contains an overview of their article and a description of the many areas of research that remain open and could benefit from further methodological and computational development.
Affiliation(s)
- Jiawei Bai
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Andrada Ivanescu
  - Department of Mathematical Sciences, Montclair State University, Montclair, NJ, USA
4. Fisher A, Caffo B, Schwartz B, Zipunnikov V. Fast, Exact Bootstrap Principal Component Analysis for p > 1 million. J Am Stat Assoc 2016; 111:846-860. [PMID: 27616801 DOI: 10.1080/01621459.2015.1062383] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 10/23/2022]
Abstract
Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same n-dimensional subspace and can be efficiently represented by their low dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low dimensional coordinates, without calculating or storing the p-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram recordings (p = 900, n = 392), and to a dataset of brain magnetic resonance images (MRIs) (p ≈ 3 million, n = 352). For the MRI dataset, our method allows for standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods.
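The subspace argument in this abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names `reduce_to_subspace` and `bootstrap_pc_coords`, the p x n data layout (columns are subjects), and the toy dimensions are assumptions made for illustration. The idea shown is that, once the data are written in an orthonormal basis of the sample's subspace, each bootstrap PCA only needs an SVD of a small matrix with at most n rows.

```python
import numpy as np

def reduce_to_subspace(X):
    """Center a p x n data matrix (columns = subjects) and return an
    orthonormal basis V (p x m) plus the m x n coordinates of every
    subject in that basis, so that the centered data equal V @ scores."""
    Xc = X - X.mean(axis=1, keepdims=True)
    V, d, Ut = np.linalg.svd(Xc, full_matrices=False)
    return V, d[:, None] * Ut

def bootstrap_pc_coords(scores, idx, k):
    """Leading k principal components of one bootstrap sample, expressed
    as low-dimensional coordinates in the basis V (no p-dimensional work)."""
    Sb = scores[:, idx]
    Sb = Sb - Sb.mean(axis=1, keepdims=True)  # recenter the resampled subjects
    A, _, _ = np.linalg.svd(Sb, full_matrices=False)
    return A[:, :k]

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))  # toy data: p = 2000, n = 20
V, scores = reduce_to_subspace(X)
coords = [bootstrap_pc_coords(scores, rng.integers(0, 20, size=20), k=3)
          for _ in range(100)]

# Bootstrap spread of the first component's low-dimensional coordinates.
# Signs are aligned to the first replicate because components are sign-ambiguous.
first = np.array([c[:, 0] * np.sign(c[:, 0] @ coords[0][:, 0]) for c in coords])
coord_se = first.std(axis=0)
```

A p-dimensional bootstrap component, when one is actually needed, is recovered as `V @ coords[b]`; summaries that depend only on the low-dimensional coordinates avoid that product entirely.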
5. Morgenthaler TI, Croft JB, Dort LC, Loeding LD, Mullington JM, Thomas SM. Development of the National Healthy Sleep Awareness Project Sleep Health Surveillance Questions. J Clin Sleep Med 2015; 11:1057-62. [PMID: 26235156 DOI: 10.5664/jcsm.5026] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 07/20/2015] [Accepted: 07/20/2015] [Indexed: 11/13/2022]
Abstract
OBJECTIVES: For the first time ever, as emphasized by inclusion in the Healthy People 2020 goals, sleep health is an emphasis of national health aims. The National Healthy Sleep Awareness Project (NHSAP) was tasked to propose questions for inclusion in the next Behavioral Risk Factor Surveillance System (BRFSS), a survey that includes a number of questions that target behaviors thought to impact health, as a means to measure community sleep health. The total number of questions could not exceed five, and had to include an assessment of the risk for obstructive sleep apnea (OSA).
METHODS: An appointed workgroup met via teleconference and face-to-face venues to develop an inventory of published survey questions being used to identify sleep health, to develop a framework on which to analyze the strengths and weaknesses of current survey questions concerning sleep, and to develop recommendations for sleep health and disease surveillance questions going forward.
RESULTS: The recommendation was to focus on certain existing BRFSS questions pertaining to sleep duration, quality, satisfaction, daytime alertness, and to add to these other BRFSS existing questions to make a modified STOP-BANG questionnaire (minus the N for neck circumference) to assess for risk of OSA.
CONCLUSIONS: Sleep health is an important dimension of health that has previously received less attention in national health surveys. We believe that 5 questions recommended for the upcoming BRFSS question banks will assist as important measures of sleep health, and may help to evaluate the effectiveness of interventions to improve sleep health in our nation.
Affiliation(s)
- Janet B Croft
  - Centers for Disease Control and Prevention, Atlanta, GA
6. Swihart BJ, Punjabi NM, Crainiceanu CM. Modeling sleep fragmentation in sleep hypnograms: An instance of fast, scalable discrete-state, discrete-time analyses. Comput Stat Data Anal 2015; 89:1-11. [PMID: 27182097 DOI: 10.1016/j.csda.2015.03.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/23/2022]
Abstract
Methods are introduced for the analysis of large sets of sleep study data (hypnograms) using a 5-state 20-transition-type structure defined by the American Academy of Sleep Medicine. Applications of these methods to the hypnograms of 5598 subjects from the Sleep Heart Health Study provide the first analysis of sleep hypnogram data of such size and complexity in a community cohort with a range of sleep-disordered breathing severity; introduce a novel approach to compare 5-state (20-transition-type) to 3-state (6-transition-type) sleep structures to assess information loss from combining sleep state categories; extend current approaches of multivariate survival data analysis to clustered, recurrent event discrete-state discrete-time processes; and provide scalable solutions for data analyses required by the case study. The analysis provides detailed new insights into the association between sleep-disordered breathing and sleep architecture. The example data and both R and SAS code are included in online supplementary materials.
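The state and transition-type counts in this abstract follow from simple combinatorics: k states give k(k - 1) ordered transition types, so 5 states give 20 and 3 states give 6. The sketch below counts transitions in a toy hypnogram and collapses it to three states. The state labels (`W`, `N1`, `N2`, `N3`, `REM`), the three-state mapping, and the function names are assumptions for illustration; the abstract does not list them, and this is not the paper's R or SAS code.

```python
from collections import Counter
from itertools import permutations

# Assumed labels for a 5-state structure and an assumed 3-state collapse.
STATES_5 = ("W", "N1", "N2", "N3", "REM")
TO_3 = {"W": "W", "N1": "NREM", "N2": "NREM", "N3": "NREM", "REM": "REM"}

def transition_counts(hypnogram):
    """Count ordered transitions between consecutive epochs, ignoring
    epochs where the state does not change."""
    return Counter((a, b) for a, b in zip(hypnogram, hypnogram[1:]) if a != b)

def collapse(hypnogram, mapping=TO_3):
    """Relabel each epoch with its coarser state."""
    return [mapping[s] for s in hypnogram]

n_types_5 = len(list(permutations(STATES_5, 2)))                    # 5 * 4
n_types_3 = len(list(permutations(sorted(set(TO_3.values())), 2)))  # 3 * 2

hyp = ["W", "N1", "N2", "N2", "N3", "N2", "REM", "W"]  # toy epoch sequence
fine = transition_counts(hyp)
coarse = transition_counts(collapse(hyp))
```

In this toy sequence the moves among N1, N2, and N3 disappear after collapsing, which is the kind of information loss the abstract refers to when comparing the two structures.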
Affiliation(s)
- Bruce J Swihart
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, United States
- Ciprian M Crainiceanu
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, United States
7. Shou H, Zipunnikov V, Crainiceanu CM, Greven S. Structured functional principal component analysis. Biometrics 2014; 71:247-257. [PMID: 25327216 DOI: 10.1111/biom.12236] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 05/01/2013] [Revised: 07/01/2013] [Accepted: 08/01/2014] [Indexed: 11/30/2022]
Abstract
Motivated by modern observational studies, we introduce a class of functional models that expand nested and crossed designs. These models account for the natural inheritance of the correlation structures from sampling designs in studies where the fundamental unit is a function or image. Inference is based on functional quadratics and their relationship with the underlying covariance structure of the latent processes. A computationally fast and scalable estimation procedure is developed for high-dimensional data. Methods are used in applications including high-frequency accelerometer data for daily activity, pitch linguistic data for phonetic analysis, and EEG data for studying electrical brain activity during sleep.
Affiliation(s)
- Haochang Shou
  - Department of Biostatistics and Epidemiology, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, U.S.A
- Vadim Zipunnikov
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, U.S.A
- Ciprian M Crainiceanu
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, U.S.A
- Sonja Greven
  - Department of Statistics, Ludwig-Maximilians-Universität München, Munich, Germany
8. Di C, Crainiceanu CM, Jank WS. Multilevel sparse functional principal component analysis. Stat (Int Stat Inst) 2014; 3:126-143. [PMID: 24872597 DOI: 10.1002/sta4.50] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 11/06/2022]
Abstract
We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both between and within subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations the proposed method is able to discover dominating modes of variations and reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.
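For context, the between-subject and within-subject decomposition that this abstract builds on can be sketched for the simpler dense, balanced case. The sketch is a method-of-moments illustration under stated assumptions, not the sparse estimator of the paper (which has to smooth covariances from few observations per curve): every subject is assumed to have the same number of visits on a common grid, and the function name `mfpca_dense` is hypothetical. The between covariance is estimated from products of different visits of the same subject, and the within covariance is the total minus the between part.

```python
import numpy as np

def mfpca_dense(X, n_between=1, n_within=1):
    """X has shape (subjects, visits, grid points). Returns the leading
    eigenpairs of the between-subject and within-subject covariances."""
    I, J, T = X.shape
    R = X - X.mean(axis=(0, 1))                    # remove the overall mean curve
    same_visit = np.einsum("ijs,ijt->st", R, R)    # sum of same-visit products
    K_T = same_visit / (I * J)                     # total covariance
    S = R.sum(axis=1)                              # per-subject sum over visits
    # Products over distinct visit pairs j != k of the same subject.
    K_B = (S.T @ S - same_visit) / (I * J * (J - 1))
    K_W = K_T - K_B                                # includes any noise variance

    def top_eig(K, k):
        vals, vecs = np.linalg.eigh((K + K.T) / 2)
        order = np.argsort(vals)[::-1][:k]
        return vals[order], vecs[:, order]

    return top_eig(K_B, n_between), top_eig(K_W, n_within)

# Toy data: one between-subject mode (sine) and one within-subject mode (cosine).
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 40)
phi, psi = np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)
I, J = 300, 4
xi = rng.normal(scale=2.0, size=(I, 1, 1))     # subject-level scores
zeta = rng.normal(scale=1.0, size=(I, J, 1))   # visit-level scores
X = xi * phi + zeta * psi + rng.normal(scale=0.1, size=(I, J, 40))
(b_vals, b_vecs), (w_vals, w_vecs) = mfpca_dense(X)
```

With sparse sampling the raw cross-products above are not available on a full grid, which is why the paper treats covariance estimation, score estimation, and curve prediction as separate problems.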
Affiliation(s)
- Chongzhi Di
  - Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, M2-B500, Seattle, WA 98115, USA
- Ciprian M Crainiceanu
  - Department of Biostatistics, Johns Hopkins University, 615 North Wolfe Street, Baltimore, MD 21205, USA
- Wolfgang S Jank
  - Department of Information Systems and Decision Sciences, University of South Florida, Tampa, FL 33620, USA
9. Staicu AM, Li Y, Crainiceanu CM, Ruppert D. Likelihood Ratio Tests for Dependent Data with Applications to Longitudinal and Functional Data Analysis. Scand Stat Theory Appl 2014. [DOI: 10.1111/sjos.12075] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
Affiliation(s)
- Yingxing Li
  - The Wang Yanan Institute for Studies in Economics; Xiamen University
- David Ruppert
  - Department of Statistical Science and School of Operations Research and Information Engineering; Cornell University
10. Langrock R, Swihart BJ, Caffo BS, Punjabi NM, Crainiceanu CM. Combining hidden Markov models for comparing the dynamics of multiple sleep electroencephalograms. Stat Med 2013; 32:3342-56. [PMID: 23348835 PMCID: PMC3753805 DOI: 10.1002/sim.5747] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 06/14/2011] [Accepted: 01/04/2013] [Indexed: 11/11/2022]
Abstract
In this manuscript, we consider methods for the analysis of populations of electroencephalogram signals during sleep for the study of sleep disorders using hidden Markov models (HMMs). Notably, we propose an easily implemented method for simultaneously modeling multiple time series that involve large amounts of data. We apply these methods to study sleep-disordered breathing (SDB) in the Sleep Heart Health Study (SHHS), a landmark study of SDB and cardiovascular consequences. We use the entire, longitudinally collected, SHHS cohort to develop HMM population parameters, which we then apply to obtain subject-specific Markovian predictions. From these predictions, we create several indices of interest, such as transition frequencies between latent states. Our HMM analysis of electroencephalogram signals uncovers interesting findings regarding differences in brain activity during sleep between those with and without SDB. These findings include stability of the percent time spent in HMM latent states across matched diseased and non-diseased groups and differences in the rate of transitioning.
Affiliation(s)
- Roland Langrock
  - School of Mathematics and Statistics, University of St Andrews, The Observatory, Buchanan Gardens, St Andrews, Fife, KY16 PLZ, Scotland, UK.
11. Woodard DB, Crainiceanu C, Ruppert D. Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors. J Comput Graph Stat 2013; 22:10.1080/10618600.2012.694765. [PMID: 24293988 PMCID: PMC3842620 DOI: 10.1080/10618600.2012.694765] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
Abstract
We propose a new method for regression using a parsimonious and scientifically interpretable representation of functional predictors. Our approach is designed for data that exhibit features such as spikes, dips, and plateaus whose frequency, location, size, and shape vary stochastically across subjects. We propose Bayesian inference of the joint functional and exposure models, and give a method for efficient computation. We contrast our approach with existing state-of-the-art methods for regression with functional predictors, and show that our method is more effective and efficient for data that include features occurring at varying locations. We apply our methodology to a large and complex dataset from the Sleep Heart Health Study, to quantify the association between sleep characteristics and health outcomes. Software and technical appendices are provided in online supplemental materials.
12. Li S, Eloyan A, Joel S, Mostofsky S, Pekar J, Bassett SS, Caffo B. Analysis of group ICA-based connectivity measures from fMRI: application to Alzheimer's disease. PLoS One 2012; 7:e49340. [PMID: 23226208 PMCID: PMC3511486 DOI: 10.1371/journal.pone.0049340] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 06/28/2012] [Accepted: 10/10/2012] [Indexed: 11/18/2022] Open
Abstract
Functional magnetic resonance imaging (fMRI) is a powerful tool for the in vivo study of the pathophysiology of brain disorders and disease. In this manuscript, we propose an analysis stream for fMRI functional connectivity data and apply it to a novel study of Alzheimer's disease. In the first stage, spatial independent component analysis is applied to group fMRI data to obtain common brain networks (spatial maps) and subject-specific mixing matrices (time courses). In the second stage, functional principal component analysis is utilized to decompose the mixing matrices into population-level eigenvectors and subject-specific loadings. Inference is performed using permutation-based exact logistic regression for matched pairs data. The method is applied to a novel fMRI study of Alzheimer's disease risk under a verbal paired associates task. We found empirical evidence of alternative ICA-based metrics of connectivity when comparing subjects evidencing mild cognitive impairment relative to carefully matched controls.
Affiliation(s)
- Shanshan Li
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA.
13. Crainiceanu CM, Staicu AM, Ray S, Punjabi N. Bootstrap-based inference on the difference in the means of two correlated functional processes. Stat Med 2012; 31:3223-40. [PMID: 22855258 DOI: 10.1002/sim.5439] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 02/01/2011] [Accepted: 04/18/2012] [Indexed: 11/06/2022]
Abstract
We propose nonparametric inference methods on the mean difference between two correlated functional processes. We compare methods that (1) incorporate different levels of smoothing of the mean and covariance; (2) preserve the sampling design; and (3) use parametric and nonparametric estimation of the mean functions. We apply our method to estimating the mean difference between average normalized δ power of sleep electroencephalograms for 51 subjects with severe sleep apnea and 51 matched controls in the first 4 h after sleep onset. We obtain data from the Sleep Heart Health Study, the largest community cohort study of sleep. Although methods are applied to a single case study, they can be applied to a large number of studies that have correlated functional data.
Affiliation(s)
- Ciprian M Crainiceanu
  - Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe St., Baltimore, MD 21205, U.S.A.
14. Swihart BJ, Caffo BS, Crainiceanu CM, Punjabi NM. Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data. Stat Med 2012; 31:855-70. [PMID: 22241689 DOI: 10.1002/sim.4457] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/31/2010] [Accepted: 10/14/2011] [Indexed: 11/11/2022]
Abstract
Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased with non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation allows synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear generalized estimating equations (GEE) models for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis.
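The classical likelihood equivalence mentioned in this abstract can be written out for a single observation. What follows is the standard textbook argument, not the paper's own re-derivation, and the notation is assumed for illustration: $t_i$ is the observed time at risk, $\delta_i \in \{0, 1\}$ indicates whether a transition occurred, $x_i$ are covariates, and $\lambda_i$ is a constant hazard.

```latex
\begin{align*}
\ell_i^{\mathrm{exp}}(\beta)
  &= \delta_i \log \lambda_i - \lambda_i t_i,
  \qquad \lambda_i = \exp(x_i^\top \beta), \\
\ell_i^{\mathrm{Pois}}(\beta)
  &= \delta_i \log(\lambda_i t_i) - \lambda_i t_i - \log(\delta_i!) \\
  &= \ell_i^{\mathrm{exp}}(\beta) + \delta_i \log t_i .
\end{align*}
```

Here the Poisson model treats $\delta_i$ as a count with mean $\mu_i = \lambda_i t_i = \exp(x_i^\top \beta + \log t_i)$, which is a log-linear model with offset $\log t_i$, and $\log(\delta_i!) = 0$ because $\delta_i$ is 0 or 1. The two log-likelihoods differ only by $\delta_i \log t_i$, which does not involve $\beta$, so both give the same estimates. With piecewise constant hazards the same identity applies to each time segment separately.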
Affiliation(s)
- Bruce J Swihart
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA.
15. Guan Y, Li Y, Sinha R. Cocaine Dependence Treatment Data: Methods for Measurement Error Problems With Predictors Derived From Stationary Stochastic Processes. J Am Stat Assoc 2011; 106:480-493. [PMID: 21984854 DOI: 10.1198/jasa.2011.ap10291] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022]
Abstract
In a cocaine dependence treatment study, we use linear and nonlinear regression models to model posttreatment cocaine craving scores and first cocaine relapse time. A subset of the covariates are summary statistics derived from baseline daily cocaine use trajectories, such as baseline cocaine use frequency and average daily use amount. These summary statistics are subject to estimation error and can therefore cause biased estimators for the regression coefficients. Unlike classical measurement error problems, the error we encounter here is heteroscedastic with an unknown distribution, and there are no replicates for the error-prone variables or instrumental variables. We propose two robust methods to correct for the bias: a computationally efficient method-of-moments-based method for linear regression models and a subsampling extrapolation method that is generally applicable to both linear and nonlinear regression models. Simulations and an application to the cocaine dependence treatment data are used to illustrate the efficacy of the proposed methods. Asymptotic theory and variance estimation for the proposed subsampling extrapolation method and some additional simulation results are described in the online supplementary material.
Affiliation(s)
- Yongtao Guan
  - Division of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT 06520
16. Crainiceanu CM, Caffo BS, Luo S, Zipunnikov VM, Punjabi NM. Population Value Decomposition, a Framework for the Analysis of Image Populations. J Am Stat Assoc 2011; 106:10.1198/jasa.2011.ap10089. [PMID: 24415813 PMCID: PMC3886284 DOI: 10.1198/jasa.2011.ap10089] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 11/21/2022]
Abstract
Images, often stored in multidimensional arrays, are fast becoming ubiquitous in medical and public health research. Analyzing populations of images is a statistical problem that raises a host of daunting challenges. The most significant challenge is the massive size of the datasets incorporating images recorded for hundreds or thousands of subjects at multiple visits. We introduce the population value decomposition (PVD), a general method for simultaneous dimensionality reduction of large populations of massive images. We show how PVD can be seamlessly incorporated into statistical modeling, leading to a new, transparent, and rapid inferential framework. Our PVD methodology was motivated by and applied to the Sleep Heart Health Study, the largest community-based cohort study of sleep containing more than 85 billion observations on thousands of subjects at two visits. This article has supplementary material online.
Affiliation(s)
- Ciprian M. Crainiceanu
  - Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe St., Baltimore, MD 21205
- Brian S. Caffo
  - Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe St., Baltimore, MD 21205
- Sheng Luo
  - Division of Biostatistics, School of Public Health, University of Texas Health Science Center at Houston, 1200 Herman Pressler Dr, Houston, TX 77030
- Vadim M. Zipunnikov
  - Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe St., Baltimore, MD 21205
- Naresh M. Punjabi
  - Department of Epidemiology, Johns Hopkins University, 615 N. Wolfe St., Baltimore, MD 21205
18. Caffo BS, Crainiceanu CM, Verduzco G, Joel S, Mostofsky SH, Bassett SS, Pekar JJ. Two-stage decompositions for the analysis of functional connectivity for fMRI with application to Alzheimer's disease risk. Neuroimage 2010; 51:1140-9. [PMID: 20227508 DOI: 10.1016/j.neuroimage.2010.02.081] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 12/04/2009] [Revised: 02/25/2010] [Accepted: 02/28/2010] [Indexed: 11/25/2022] Open
Abstract
Functional connectivity is the study of correlations in measured neurophysiological signals. Altered functional connectivity has been shown to be associated with a variety of cognitive and memory impairments and dysfunction, including Alzheimer's disease. In this manuscript we use a two-stage application of the singular value decomposition to obtain data driven population-level measures of functional connectivity in functional magnetic resonance imaging (fMRI). The method is computationally simple and amenable to high dimensional fMRI data with large numbers of subjects. Simulation studies suggest the ability of the decomposition methods to recover population brain networks and their associated loadings. We further demonstrate the utility of these decompositions in a functional logistic regression model. The method is applied to a novel fMRI study of Alzheimer's disease risk under a verbal paired associates task. We found an indication of alternative connectivity in clinically asymptomatic at-risk subjects when compared to controls, which was not significant in the light of multiple comparisons adjustment. The relevant brain network loads primarily on the temporal lobe and overlaps significantly with the olfactory areas and temporal poles.
Affiliation(s)
- Brian S Caffo
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
19. Crainiceanu CM, Goldsmith AJ. Bayesian Functional Data Analysis Using WinBUGS. J Stat Softw 2010; 32:i11. [PMID: 21743798 PMCID: PMC3130307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023] Open
Abstract
We provide user friendly software for Bayesian analysis of functional data models using WinBUGS 1.4. The excellent properties of Bayesian analysis in this context are due to: (1) dimensionality reduction, which leads to low dimensional projection bases; (2) mixed model representation of functional models, which provides a modular approach to model extension; and (3) orthogonality of the principal component bases, which contributes to excellent chain convergence and mixing properties. Our paper provides one more, essential, reason for using Bayesian analysis for functional models: the existence of software.
Affiliation(s)
- Ciprian M. Crainiceanu
  - Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe St. E3636, Baltimore, MD 21205, United States of America, URL: http://www.biostat.jhsph.edu/~ccrainic/
- A. Jeffrey Goldsmith
  - Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe St. E3037, Baltimore, MD 21205, United States of America
20.
Abstract
We introduce Generalized Multilevel Functional Linear Models (GMFLMs), a novel statistical framework for regression models where exposure has a multilevel functional structure. We show that GMFLMs are, in fact, generalized multilevel mixed models (GLMMs). Thus, GMFLMs can be analyzed using the mixed effects inferential machinery and can be generalized within a well researched statistical framework. We propose and compare two methods for inference: 1) a two-stage frequentist approach; and 2) a joint Bayesian analysis. Our methods are motivated by and applied to the Sleep Heart Health Study (SHHS), the largest community cohort study of sleep. However, our methods are general and easy to apply to a wide spectrum of emerging biological and medical data sets. Supplemental materials for this article are available online.
Affiliation(s)
- Ciprian M Crainiceanu
  - Ciprian M. Crainiceanu is Associate Professor, Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland 21205
21. Di CZ, Crainiceanu CM, Caffo BS, Punjabi NM. Multilevel functional principal component analysis. Ann Appl Stat 2009. [DOI: 10.1214/08-aoas206] [Citation(s) in RCA: 205] [Impact Index Per Article: 13.7] [Indexed: 11/19/2022]