1
Wu H, Li YF. Clustering Spatially Correlated Functional Data With Multiple Scalar Covariates. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:7074-7088. [PMID: 35020597 DOI: 10.1109/tnnls.2021.3137795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
We propose a probabilistic model for clustering spatially correlated functional data with multiple scalar covariates. The motivating application is to partition the 29 provinces of the Chinese mainland into a few groups characterized by the epidemic severity of COVID-19, while the spatial dependence and effects of risk factors are considered. The model can be regarded as an extension of mixture models, which allows different subsets of covariates to influence the component weights and the component densities by modeling the parameters of the mixture as functions of the covariates. In this way, provinces with similar spatial factors are a priori more likely to be clustered together. Posterior predictive inference in this model formalizes the desired prediction. Further, the identifiability of the proposed model is analyzed, and sufficient conditions to guarantee "generic" identifiability are provided. An L1-penalized estimator is developed to assist variable selection and robust estimation when the number of explanatory covariates is large. An efficient expectation-maximization algorithm is presented for parameter estimation. Simulation studies and real-data examples are presented to investigate the empirical performance of the proposed method. Finally, it is worth noting that the proposed model has a wide range of practical applications, e.g., health management, environmental science, and ecological studies.
2
Drouin P, Stamm A, Chevreuil L, Graillot V, Barbin L, Gourraud PA, Laplaud DA, Bellanger L. Semi-supervised clustering of quaternion time series: Application to gait analysis in multiple sclerosis using motion sensor data. Stat Med 2023; 42:433-456. [PMID: 36509423 PMCID: PMC10108058 DOI: 10.1002/sim.9625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2021] [Revised: 09/02/2022] [Accepted: 11/24/2022] [Indexed: 12/14/2022]
Abstract
Recent approaches in gait analysis involve the use of wearable motion sensors to extract spatio-temporal parameters that characterize multiple aspects of an individual's gait. In particular, the medical community could largely benefit from this type of device, as it could provide clinicians with a valuable tool for assessing gait impairment. Motion sensor data are, however, complex, and there is an urgent unmet need to develop sound statistical methods for analyzing such data and extracting clinically relevant information. In this article, we measure gait by following the hip rotation over time, and the resulting statistical unit is a time series of unit quaternions. We explore the possibility of forming groups of patients with similar walking impairment by taking into account their walking data and their global disease severity with semi-supervised clustering. We generalize a compromise-based method named hclustcompro to unit quaternion time series by combining it with the proper dissimilarity, quaternion dynamic time warping. We apply this method to patients diagnosed with multiple sclerosis to form groups of patients with similar walking deficiencies while accounting for the clinical assessment of their overall disability. We also compare the compromise-based clustering approach with the method mergeTrees, which falls into a sub-class of ensemble clustering named collaborative clustering. The results provide a first proof of both the interest of using wearable motion sensors for assessing gait impairment and the use of prior knowledge to guide the clustering process. They also demonstrate that compromise-based clustering is a more appropriate approach in this context.
Affiliation(s)
- Pierre Drouin
  - Laboratoire de Mathématiques Jean Leray, Université de Nantes, Nantes, France
  - UmanIT, Nantes, France
- Aymeric Stamm
  - Laboratoire de Mathématiques Jean Leray, Université de Nantes, Nantes, France
- Laetitia Barbin
  - CRTI-Inserm U1064, CIC, Service de Neurologie, CHU et Université de Nantes, Nantes, France
- Pierre-Antoine Gourraud
  - Centre de Recherche en Transplantation et Immunologie, UMR 1064, ATIP-Avenir, Université de Nantes, CHU de Nantes, INSERM, Nantes, France
- David-Axel Laplaud
  - CRTI-Inserm U1064, CIC, Service de Neurologie, CHU et Université de Nantes, Nantes, France
- Lise Bellanger
  - Laboratoire de Mathématiques Jean Leray, Université de Nantes, Nantes, France
3
Cremona MA, Chiaromonte F. Probabilistic K-means with local alignment for clustering and motif discovery in functional data. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2156522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
Affiliation(s)
- Marzia A. Cremona
  - Dept. of Operations and Decision Systems, Université Laval, CHU de Québec – Université Laval Research Center
- Francesca Chiaromonte
  - Dept. of Statistics, The Pennsylvania State University, Inst. of Economics and EMbeDS, Sant’Anna School of Advanced Studies
4
Iorio C, Frasso G, D’Ambrosio A, Siciliano R. Boosted-oriented probabilistic smoothing-spline clustering of series. STAT METHOD APPL-GER 2022. [DOI: 10.1007/s10260-022-00665-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
Abstract
Fuzzy clustering methods allow the objects to belong to several clusters simultaneously, with different degrees of membership. However, a factor that influences the performance of fuzzy algorithms is the value of the fuzzifier parameter. In this paper, we propose a fuzzy clustering procedure for data (time) series that does not depend on the definition of a fuzzifier parameter. It comes from two approaches, theoretically motivated for unsupervised and supervised classification cases, respectively. The first is the Probabilistic Distance clustering procedure. The second is the well-known Boosting philosophy. Our idea is to adopt a boosting perspective for unsupervised learning problems; in particular, we address non-hierarchical clustering problems. The global performance of the proposed method is investigated by various experiments.
5
Guo X, Kurtek S, Bharath K. Variograms for kriging and clustering of spatial functional data with phase variation. Spatial Statistics 2022; 51:100687. [PMID: 36777259 PMCID: PMC9912960 DOI: 10.1016/j.spasta.2022.100687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Spatial, amplitude and phase variations in spatial functional data are confounded. Conclusions from the popular functional trace-variogram, which quantifies spatial variation, can be misleading when analyzing misaligned functional data with phase variation. To remedy this, we describe a framework that extends amplitude-phase separation methods in functional data to the spatial setting, with a view towards performing clustering and spatial prediction. We propose a decomposition of the trace-variogram into amplitude and phase components, and quantify how spatial correlations between functional observations manifest in their respective amplitude and phase. This enables us to generate separate amplitude and phase clustering methods for spatial functional data, and develop a novel spatial functional interpolant at unobserved locations based on combining separate amplitude and phase predictions. Through simulations and real data analyses, we demonstrate advantages of our approach when compared to standard ones that ignore phase variation, through more accurate predictions and more interpretable clustering results.
Affiliation(s)
- Xiaohan Guo
  - Department of Statistics, The Ohio State University, 1958 Neil Avenue, Columbus, OH 43210, USA
- Sebastian Kurtek
  - Department of Statistics, The Ohio State University, 1958 Neil Avenue, Columbus, OH 43210, USA
- Karthik Bharath
  - School of Mathematical Sciences, University of Nottingham, University Park, Nottingham NG7 2RD, UK
6
Band depth based initialization of K-means for functional data clustering. ADV DATA ANAL CLASSI 2022. [DOI: 10.1007/s11634-022-00510-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Abstract
The k-Means algorithm is one of the most popular choices for clustering data but is well known to be sensitive to the initialization process. There is a substantial number of methods that aim at finding optimal initial seeds for k-Means, though none of them is universally valid. This paper presents an extension to longitudinal data of one such method, the BRIk algorithm, which relies on clustering a set of centroids derived from bootstrap replicates of the data and on the use of the versatile Modified Band Depth. In our approach we improve the BRIk method by adding a step where we fit appropriate B-splines to our observations and a resampling process that allows computational feasibility and the handling of issues such as noise or missing data. We have derived two techniques for providing suitable initial seeds, stressing respectively the multivariate or the functional nature of the data. Our results with simulated and real data sets indicate that our Functional Data Approach to the BRIk method (FABRIk) and our Functional Data Extension of the BRIk method (FDEBRIk) are more effective than previous proposals at providing seeds to initialize k-Means in terms of clustering recovery.
7
Hao W, Hao H, Ren CF, Wang X, Gao B. Associations Between Posterior Communicating Artery Aneurysms and Morphological Characteristics of Surrounding Arteries. Front Neurol 2022; 13:874466. [PMID: 35911913 PMCID: PMC9326252 DOI: 10.3389/fneur.2022.874466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2022] [Accepted: 05/30/2022] [Indexed: 11/13/2022]
Abstract
Objectives: To explore the associations between posterior communicating artery (PComA) aneurysms and morphological characteristics of arteries upstream of and around the PComA bifurcation site. Methods: In this study, fifty-seven patients with PComA aneurysms and sixty-two control subjects without aneurysms were enrolled. The centerlines of the internal carotid artery (ICA) and important branches were generated for the measurement and analysis of morphological parameters, such as carotid siphon types, diameters of two fitting circles and the angle formed by them (D1, D2, and ϕ), length (L) and tortuosity (TL) of the ICA segment between the ophthalmic artery and PComA bifurcations, bifurcation angle (θ), tortuosity (TICA and TPComA), and flow direction changes (θICA and θPComA) around the PComA bifurcation site. Results: No significant difference (p > 0.05) was found in the siphon types (p = 0.467) or L (p = 0.114). Significant differences (p < 0.05) were detected in D1 (p = 0.036), TL (p < 0.001), D2 (p = 0.004), ϕ (p = 0.008), θ (p = 0.001), TICA (p < 0.001), TPComA (p = 0.012), θICA (p < 0.001), and θPComA (p < 0.001) between the two groups. TICA had the largest area under the curve (AUC) (0.843) in the receiver operating characteristic (ROC) analysis for diagnosing the probability of PComA aneurysm presence and was identified as the only potent morphological parameter (OR = 11.909) associated with PComA aneurysm presence. Conclusions: The high tortuosity of the ICA segment around the PComA bifurcation is associated with PComA aneurysm presence.
Affiliation(s)
- Weili Hao
  - Department of Medical Research, Shijiazhuang People's Hospital, Shijiazhuang, China
- Hong Hao
  - Department of Medical Research, Shijiazhuang People's Hospital, Shijiazhuang, China
- Chun-Feng Ren
  - The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Xiangling Wang
  - Department of Catheterization Room, Shijiazhuang People's Hospital, Shijiazhuang, China
- Bulang Gao
  - Department of Medical Research, Shijiazhuang People's Hospital, Shijiazhuang, China
  - *Correspondence: Bulang Gao
8
Lynch ML, DeGruttola V. Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression. International Journal of Data Science and Analytics 2022; 14:305-318. [PMID: 35528805 PMCID: PMC9064718 DOI: 10.1007/s41060-022-00323-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2021] [Accepted: 03/31/2022] [Indexed: 11/30/2022]
Abstract
This paper describes an ensemble cluster analysis of bivariate profiles of HIV biomarkers, viral load and CD4 cell counts, which jointly measure disease progression. Data are from a prevalent cohort of HIV positive participants in a clinical trial of vitamin supplementation in Botswana. These individuals were HIV positive upon enrollment, but with unknown times of infection. To categorize groups of participants based on their patterns of progression of HIV infection using both biomarkers, we combine univariate shape-based cluster results for multiple biomarkers through the use of ensemble clustering methods. We first describe univariate clustering for each of the individual biomarker profiles, and make use of shape-respecting distances for clustering the longitudinal profile data. In our data, profiles are subject to either missing or irregular measurements as well as unobserved initiation times of the process of interest. Shape-respecting distances that can handle such data issues, preserve time-ordering, and identify similar profile shapes are useful in identifying patterns of disease progression from longitudinal biomarker data. However, their performance with regard to clustering differs by severity of the data issues mentioned above. We provide an empirical investigation of shape-respecting distances (Fréchet and dynamic time warping (DTW)) on benchmark shape data, and use DTW in cluster analysis of biomarker profile observations. These reveal a primary group of ‘typical progressors,’ as well as a smaller group that shows relatively rapid progression. We then refine the analysis using ensemble clustering for both markers to obtain a single classification. The information from joint evaluation of the two biomarkers combined with ensemble clustering reveals subgroups of patients not identifiable through univariate analyses; noteworthy subgroups are those that appear to represent recently and chronically infected subsets.
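As a rough illustration of the dynamic time warping (DTW) distance this abstract refers to, the sketch below implements the classic dynamic-programming recursion for two numeric sequences. It is a minimal textbook version, not the authors' implementation: the function name `dtw_distance`, the absolute-difference local cost, and the absence of any warping-window constraint are all assumptions made here for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW between two numeric sequences (hypothetical helper).

    Assumptions: local cost is |a_i - b_j|, no warping-window constraint,
    and both sequences are non-empty. Sequences may differ in length,
    which is why DTW suits irregularly sampled profiles.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments,
            # which preserves time ordering.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated value can be absorbed by the warping, whereas a pointwise comparison would not even be defined for sequences of unequal length.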
Affiliation(s)
- Miranda L Lynch
  - Hauptman-Woodward Medical Research Institute, 700 Ellicott Street, Buffalo, NY 14203 USA
- Victor DeGruttola
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA 02115 USA
9
Elías A, Jiménez R, Paganoni AM, Sangalli LM. Integrated Depths for Partially Observed Functional Data. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2070171] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/18/2022]
Affiliation(s)
- Antonio Elías
  - OASYS Group, Department of Applied Mathematics, Universidad de Málaga
- Raúl Jiménez
  - Department of Statistics, University Carlos III of Madrid
- Anna M. Paganoni
  - MOX Laboratory for Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano
- Laura M. Sangalli
  - MOX Laboratory for Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano
10
LoMauro A, Colli A, Colombo L, Aliverti A. Breathing patterns recognition: A functional data analysis approach. Computer Methods and Programs in Biomedicine 2022; 217:106670. [PMID: 35172250 DOI: 10.1016/j.cmpb.2022.106670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2021] [Revised: 01/27/2022] [Accepted: 01/27/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND AND OBJECTIVE The ongoing pandemic proved how fundamental it is to assess a subject's respiratory functionality, and breathing pattern measurement during quiet breathing is feasible in almost all patients, even uncooperative ones. The breathing pattern consists of tidal volume and respiratory rate in an individual, assessed by data tracks of lung or chest wall volume over time. State-of-the-art analysis of these data requires operator-dependent choices such as identification of local minima in the track, elimination of anomalous breaths, and identification of breath clusters corresponding to different breathing patterns. METHODS A semi-automatic, robust and reproducible procedure was proposed to pre-process and analyse respiratory tracks, based on Functional Data Analysis (FDA) techniques, to identify representative breath curves and the corresponding breathing patterns. This was achieved through three steps: 1) breath separation through precise localization of the minima of the volume trace; 2) detection of functional outlier breaths according to time-duration, magnitude and shape; 3) breath clustering to identify different patterns of interest, through K-medoids with Alignment. The method was first validated on simulated tracks and then applied to real data in conditions of clinical interest: operational volume change, exercise, mechanical ventilation, paradoxical breathing and age. RESULTS The total error in the accuracy of minima detection and in was less than 5%; the artificial outliers were almost completely removed, with an accuracy of 99%. During incremental exercise and independently of the bike resistance level, five clusters were identified (quiet breathing; recovery phase; onset of exercise; maximal and intermediate levels of exercise). During mechanical ventilation, the procedure was able to separate the non-ventilated from the ventilatory-supported breathing and to identify the worsening of paradoxical breathing due to the disease progression and the breathing pattern changes in healthy subjects due to age. CONCLUSIONS We proposed a robust, validated, automatic breathing pattern identification algorithm that extracted representative curves and that could be implemented in clinical practice for objective comparison of the breathing patterns within and between subjects. In all case studies the identified patterns proved to be coherent with the clinical conditions and the physiopathology of the subjects, therefore reinforcing the potential clinical translational value of the method.
Affiliation(s)
- A LoMauro
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, P.zza L. da Vinci 32; 20133 Milano, Italy
- A Colli
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, P.zza L. da Vinci 32; 20133 Milano, Italy
- L Colombo
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, P.zza L. da Vinci 32; 20133 Milano, Italy
- A Aliverti
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, P.zza L. da Vinci 32; 20133 Milano, Italy
11
Tang L, Zeng P, Qing Shi J, Kim WS. Model-based joint curve registration and classification. J Appl Stat 2022; 50:1178-1198. [PMID: 37009594 PMCID: PMC10062228 DOI: 10.1080/02664763.2021.2023118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
In this paper, we consider the problem of classification of misaligned multivariate functional data. We propose to use a model-based approach for the joint registration and classification of such data. The observed functional inputs are modeled as a functional nonlinear mixed effects model containing a nonlinear functional fixed effect constructed upon warping functions to account for curve alignment, and a nonlinear functional random effects component to address the variability among subjects. The warping functions are also modeled to accommodate common effects within groups and the variability between subjects. Then, a functional logistic regression model defined upon the representation of the aligned curves and scalar inputs is used to account for curve classification. EM-based algorithms are developed to perform maximum likelihood inference of the proposed models. The identifiability of the registration model and the asymptotic properties of the proposed method are established. The performance of the proposed procedure is illustrated via simulation studies and an analysis of a hyoid bone movement data application. Although the statistical developments proposed in this paper were motivated by the hyoid bone movement study, the methodology is designed and presented in generality and can be applied to numerous areas of scientific research.
Affiliation(s)
- Lin Tang
  - Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, Yunnan, People's Republic of China
- Pengcheng Zeng
  - Institute of Mathematical Sciences, ShanghaiTech University, Shanghai, People's Republic of China
- Jian Qing Shi
  - Department of Statistics and Data Science, Southern University of Science and Technology, Shenzhen, People's Republic of China
  - National Center for Applied Mathematics, Shenzhen, People's Republic of China
- Won-Seok Kim
  - Department of Rehabilitation Medicine, Seoul National University College of Medicine, Seoul National University Bundang Hospital, Seongnam, South Korea
12
Belli E, Vantini S. Measure inducing classification and regression trees for functional data. Stat Anal Data Min 2021. [DOI: 10.1002/sam.11569] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]
Affiliation(s)
- Edoardo Belli
  - MOX ‐ Department of Mathematics, Politecnico di Milano, Milano, Italy
- Simone Vantini
  - MOX ‐ Department of Mathematics, Politecnico di Milano, Milano, Italy
13
Abstract
A robust approach for clustering functional directional data is proposed. The proposal adapts “impartial trimming” techniques to this particular framework. Impartial trimming uses the dataset itself to tell us which curves appear to be the most outlying. A feasible algorithm, justified by some theoretical properties, is proposed for its practical implementation. A “warping” approach is also introduced, which allows controlled time warping to be included in the robust clustering procedure to detect typical “templates”. The proposed methodology is illustrated in a real data analysis problem where it is applied to cluster aircraft trajectories.
14
Carroll C, Müller H, Kneip A. Cross‐component registration for multivariate functional data, with application to growth curves. Biometrics 2021. [DOI: 10.1111/biom.13340] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Affiliation(s)
- Cody Carroll
  - Department of Statistics, University of California, Davis, California
- Alois Kneip
  - Department of Economics, Universität Bonn, Bonn, Germany
15
Boschi T, Chiaromonte F, Secchi P, Li B. Covariance‐based low‐dimensional registration for function‐on‐function regression. Stat (Int Stat Inst) 2021. [DOI: 10.1002/sta4.404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Tobia Boschi
  - Department of Statistics, Penn State University, University Park, Pennsylvania 16802, USA
- Francesca Chiaromonte
  - Department of Statistics, Penn State University, University Park, Pennsylvania 16802, USA
  - EMbeDS, Sant'Anna School of Advanced Studies, Pisa 56127, Italy
- Piercesare Secchi
  - Department of Mathematics, Politecnico di Milano, Milan 20133, Italy
  - Center for Analysis, Decisions and Society, Human Technopole of Milano, Milan 20157, Italy
- Bing Li
  - Department of Statistics, Penn State University, University Park, Pennsylvania 16802, USA
16
17
18
Cheam AS, Fredette M. On the importance of similarity characteristics of curve clustering and its applications. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2020.04.024] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/24/2022]
19
Bellini G, Cipriano M, De Angeli N, Gargano JP, Gianella M, Goi G, Rossi G, Masciadri A, Comai S. Alzheimer’s Garden: Understanding Social Behaviors of Patients with Dementia to Improve Their Quality of Life. Lecture Notes in Computer Science 2020. [PMCID: PMC7479800 DOI: 10.1007/978-3-030-58805-2_46] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
This paper aims at understanding the social behavior of people with dementia through the use of technology, specifically by analyzing localization data of patients of an Alzheimer’s assisted care home in Italy. The analysis will make it possible to promote social relations by enhancing the facility’s spaces and activities, with the ultimate objective of improving residents’ quality of life. To assess social wellness and evaluate the effectiveness of the village areas and activities, this work introduces measures of sociability for both residents and places. Our data analysis is based on classical statistical methods and innovative machine learning techniques. First, we analyze the correlation between relational indicators and factors such as the outdoor temperature and the patients’ movements inside the facility. Then, we use statistical and accessibility analyses to determine the spaces residents appreciate the most and those in need of enhancements. We observe that patients’ sociability is strongly related to the considered factors. According to our analysis, outdoor areas are less frequented and need spatial redesign to promote accessibility and attendance among patients. The data awareness obtained from our analysis will also be of great help to caregivers, doctors, and psychologists in enhancing assisted care home social activities, adjusting patient-specific treatments, and deepening the comprehension of the disease.
20
Wang L, Xiong Q, Wu G, Gautam A, Jiang J, Liu S, Zhao W, Guan H. Spatio-Temporal Variation Characteristics of PM2.5 in the Beijing-Tianjin-Hebei Region, China, from 2013 to 2018. International Journal of Environmental Research and Public Health 2019; 16:ijerph16214276. [PMID: 31689921 PMCID: PMC6862089 DOI: 10.3390/ijerph16214276] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/06/2019] [Revised: 10/23/2019] [Accepted: 11/01/2019] [Indexed: 11/16/2022]
Abstract
Air pollution, including particulate matter (PM2.5) pollution, is extremely harmful to the environment as well as human health. The Beijing–Tianjin–Hebei (BTH) Region has experienced heavy PM2.5 pollution within China. In this study, a six-year time series (January 2013–December 2018) of PM2.5 mass concentration data from 102 air quality monitoring stations was studied to understand the spatio-temporal variation characteristics of the BTH region. The average annual PM2.5 mass concentration in the BTH region decreased from 98.9 μg/m³ in 2013 to 64.9 μg/m³ in 2017. Therefore, China has achieved its Air Pollution Prevention and Control Plan goal of reducing the concentration of fine particulate matter in the BTH region by 25% by 2017. The PM2.5 pollution in BTH plain areas showed a more significant change than in mountainous areas, with the highest PM2.5 mass concentration in winter and the lowest in summer. The results of spatial autocorrelation and cluster analyses showed that the PM2.5 mass concentration in the BTH region from 2013 to 2018 exhibited significant spatial agglomeration, and that concentrations were high in the south and low in the north. Changes in PM2.5 mass concentration in the BTH region were affected by both socio-economic factors and meteorological factors. Our results can provide a point of reference for making PM2.5 pollution control decisions.
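The claim that the 25% reduction target was met can be checked directly from the two annual means quoted in the abstract. The short calculation below uses only those two figures; rounding to three significant digits is a choice made here.

```latex
\[
\frac{98.9 - 64.9}{98.9} \;=\; \frac{34.0}{98.9} \;\approx\; 0.344 \;=\; 34.4\% \;>\; 25\%
\]
```

On these figures the relative decrease between 2013 and 2017 is roughly 34%, which exceeds the stated 25% goal.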
Affiliation(s)
- Lili Wang
  - College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
- Qiulin Xiong
  - Faculty of Geomatics, East China University of Technology, Nanchang 330013, China
- Gaofeng Wu
  - College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
- Atul Gautam
  - College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
- Jianfang Jiang
  - College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
- Shuang Liu
  - College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
- Wenji Zhao
  - College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
- Hongliang Guan
  - College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
21
Pini A, Markström JL, Schelin L. Test–retest reliability measures for curve data: an overview with recommendations and supplementary code. Sports Biomech 2019; 21:179-200. [DOI: 10.1080/14763141.2019.1655089] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 10/25/2022]
Affiliation(s)
- Alessia Pini
  - Department of Statistics, Umeå School of Business, Economics and Statistics, Umeå University, Umeå, Sweden
  - Department of Statistical Sciences, Catholic University of the Sacred Heart, Milan, Italy
- Jonas L Markström
  - Department of Community Medicine and Rehabilitation, Physiotherapy, Umeå University, Umeå, Sweden
- Lina Schelin
  - Department of Statistics, Umeå School of Business, Economics and Statistics, Umeå University, Umeå, Sweden
22
23
Abramowicz K, Schelin L, Sjöstedt de Luna S, Strandberg J. Multiresolution clustering of dependent functional data with application to climate reconstruction. Stat (Int Stat Inst) 2019. [DOI: 10.1002/sta4.240] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]
Affiliation(s)
- Konrad Abramowicz
- Department of Mathematics and Mathematical Statistics, Umeå University, Umeå, Sweden
| | - Lina Schelin
- Umeå School of Business, Economics and Statistics, Umeå University, Umeå, Sweden
| | | | - Johan Strandberg
- Department of Mathematics and Mathematical Statistics, Umeå University, Umeå, Sweden
| |
|
24
|
Zeng P, Qing Shi J, Kim WS. Simultaneous Registration and Clustering for Multidimensional Functional Data. J Comput Graph Stat 2019. [DOI: 10.1080/10618600.2019.1607744] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
Affiliation(s)
- Pengcheng Zeng
- School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, UK
| | - Jian Qing Shi
- School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, UK
| | - Won-Seok Kim
- Department of Rehabilitation Medicine, Seoul National University College of Medicine, Seoul National University Bundang Hospital, Seongnam, South Korea
| |
|
25
|
A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data. STAT METHOD APPL-GER 2019. [DOI: 10.1007/s10260-018-00446-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/27/2022]
|
26
|
Rošťáková Z, Rosipal R. Profiling continuous sleep representations for better understanding of the dynamic character of normal sleep. Artif Intell Med 2019; 97:152-167. [DOI: 10.1016/j.artmed.2018.12.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/02/2018] [Revised: 12/12/2018] [Accepted: 12/27/2018] [Indexed: 10/27/2022]
|
27
|
Wrobel J, Zipunnikov V, Schrack J, Goldsmith J. Registration for exponential family functional data. Biometrics 2019; 75:48-57. [PMID: 30129091 PMCID: PMC10585654 DOI: 10.1111/biom.12963] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 12/01/2017] [Revised: 08/01/2018] [Accepted: 08/01/2018] [Indexed: 12/01/2022]
Abstract
We introduce a novel method for separating amplitude and phase variability in exponential family functional data. Our method alternates between two steps: the first uses generalized functional principal components analysis to calculate template functions, and the second estimates smooth warping functions that map observed curves to templates. Existing approaches to registration have primarily focused on continuous functional observations, and the few approaches for discrete functional data require a pre-smoothing step; these methods are frequently computationally intensive. In contrast, we focus on the likelihood of the observed data and avoid the need for preprocessing, and we implement both steps of our algorithm in a computationally efficient way. Our motivation comes from the Baltimore Longitudinal Study on Aging, in which accelerometer data provides valuable insights into the timing of sedentary behavior. We analyze binary functional data with observations each minute over 24 hours for 592 participants, where values represent activity and inactivity. Diurnal patterns of activity are obscured due to misalignment in the original data but are clear after curves are aligned. Simulations designed to mimic the application indicate that the proposed methods outperform competing approaches in terms of estimation accuracy and computational efficiency. Code for our method and simulations is publicly available.
Affiliation(s)
- Julia Wrobel
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York, U.S.A
| | - Vadim Zipunnikov
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, U.S.A
| | - Jennifer Schrack
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, U.S.A
- Longitudinal Studies Section, Translational Gerontology Branch, National Institute on Aging, National Institutes of Health, Bethesda, Maryland, U.S.A
| | - Jeff Goldsmith
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York, U.S.A
| |
|
28
|
Happ C, Scheipl F, Gabriel A, Greven S. A general framework for multivariate functional principal component analysis of amplitude and phase variation. Stat (Int Stat Inst) 2019. [DOI: 10.1002/sta4.220] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Affiliation(s)
- Clara Happ
- Department of Statistics, LMU Munich, Munich, Germany
| | | | | | - Sonja Greven
- Department of Statistics, LMU Munich, Munich, Germany
| |
|
29
|
A divisive clustering method for functional data with special consideration of outliers. ADV DATA ANAL CLASSI 2017. [DOI: 10.1007/s11634-017-0290-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/26/2022]
|
30
|
Abstract
We congratulate the authors for their excellent work that provides a clear overview of the large and now mature field of regression models for functional data. We here complement their discussion indicating some directions of further research that we deem particularly important.
|
31
|
|
32
|
Wu Z, Hitchcock DB. A Bayesian method for simultaneous registration and clustering of functional observations. Comput Stat Data Anal 2016. [DOI: 10.1016/j.csda.2016.02.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
|
33
|
Wall shear stress at the initiation site of cerebral aneurysms. Biomech Model Mechanobiol 2016; 16:97-115. [DOI: 10.1007/s10237-016-0804-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 12/23/2015] [Accepted: 06/24/2016] [Indexed: 11/30/2022]
|
34
|
Genolini C, Ecochard R, Benghezal M, Driss T, Andrieu S, Subtil F. kmlShape: An Efficient Method to Cluster Longitudinal Data (Time-Series) According to Their Shapes. PLoS One 2016; 11:e0150738. [PMID: 27258355 PMCID: PMC4892497 DOI: 10.1371/journal.pone.0150738] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 01/27/2015] [Accepted: 02/18/2016] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Longitudinal data are data in which each variable is measured repeatedly over time. One possibility for the analysis of such data is to cluster them. The majority of clustering methods group together individuals that have close trajectories at given time points. These methods group trajectories that are locally close but not necessarily those that have similar shapes. However, in several circumstances, the progress of a phenomenon may be more important than the moment at which it occurs. One would thus like to achieve a partitioning where each group gathers individuals whose trajectories have similar shapes whatever the time lag between them. METHOD In this article, we present a longitudinal data partitioning algorithm based on the shapes of the trajectories rather than on classical distances. Because this algorithm is time-consuming, we also propose two data simplification procedures that make it applicable to high-dimensional datasets. RESULTS In an application to Alzheimer disease, this algorithm revealed a "rapid decline" patient group that was not found by the classical methods. In another application to the feminine menstrual cycle, the algorithm showed, contrary to the current literature, that the luteinizing hormone presents two peaks in an important proportion of women (22%).
Affiliation(s)
- Christophe Genolini
- Inserm UMR 1027, University of Toulouse III, Toulouse, France
- CeRSM (EA 2931), UFR STAPS, University Paris Ouest-Nanterre-La Défense, Nanterre, France
| | - René Ecochard
- Service de Biostatistique, Université Lyon 1, Villeurbanne, France
- CNRS, UMR5558, Equipe Biostatistique-Santé, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, France
| | | | - Tarak Driss
- CeRSM (EA 2931), UFR STAPS, University Paris Ouest-Nanterre-La Défense, Nanterre, France
| | - Sandrine Andrieu
- Inserm UMR 1027, University of Toulouse III, Toulouse, France
- Department of Epidemiology and Public Health, CHU Toulouse, Toulouse, France
| | - Fabien Subtil
- Service de Biostatistique, Université Lyon 1, Villeurbanne, France
- CNRS, UMR5558, Equipe Biostatistique-Santé, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, France
| |
|
35
|
Park J, Ahn J. Clustering multivariate functional data with phase variation. Biometrics 2016; 73:324-333. [PMID: 27218696 DOI: 10.1111/biom.12546] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 08/01/2015] [Revised: 03/01/2016] [Accepted: 04/01/2016] [Indexed: 11/27/2022]
Abstract
When functional data come as multiple curves per subject, characterizing the source of variations is not a trivial problem. The complexity of the problem goes deeper when there is phase variation in addition to amplitude variation. We consider the clustering problem for multivariate functional data that have phase variations among the functional variables. We propose a conditional subject-specific warping framework in order to extract relevant features for clustering. Using multivariate growth curves of various parts of the body as a motivating example, we demonstrate the effectiveness of the proposed approach. The resulting clusters contain individuals who show different relative growth patterns among different parts of the body.
Affiliation(s)
- Juhyun Park
- Department of Mathematics and Statistics, Lancaster University, Lancaster LA1 4YF, UK
| | - Jeongyoun Ahn
- Department of Statistics, University of Georgia, Athens, Georgia 30602-1952, U.S.A
| |
|
36
|
Menafoglio A, Petris G. Kriging for Hilbert-space valued random fields: The operatorial point of view. J MULTIVARIATE ANAL 2016. [DOI: 10.1016/j.jmva.2015.06.012] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Indexed: 11/25/2022]
|
37
|
Marron JS, Ramsay JO, Sangalli LM, Srivastava A. Functional Data Analysis of Amplitude and Phase Variation. Stat Sci 2015. [DOI: 10.1214/15-sts524] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Indexed: 11/19/2022]
|
38
|
Secchi P, Vantini S, Vitelli V. Analysis of spatio-temporal mobile phone data: a case study in the metropolitan area of Milan. STAT METHOD APPL-GER 2015. [DOI: 10.1007/s10260-014-0294-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 10/24/2022]
|
39
|
Chen H, Zeng D. Comment. J Am Stat Assoc 2014. [DOI: 10.1080/01621459.2014.972158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
|
40
|
Chung B, Cebral JR. CFD for Evaluation and Treatment Planning of Aneurysms: Review of Proposed Clinical Uses and Their Challenges. Ann Biomed Eng 2014; 43:122-38. [DOI: 10.1007/s10439-014-1093-6] [Citation(s) in RCA: 81] [Impact Index Per Article: 8.1] [Received: 05/07/2014] [Accepted: 08/08/2014] [Indexed: 11/29/2022]
|
41
|
Peng J, Paul D, Müller HG. Time-warped growth processes, with applications to the modeling of boom–bust cycles in house prices. Ann Appl Stat 2014. [DOI: 10.1214/14-aoas740] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022]
|
42
|
Gervini D, Carter PA. Warped functional analysis of variance. Biometrics 2014; 70:526-35. [PMID: 24779611 DOI: 10.1111/biom.12171] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/01/2013] [Revised: 03/01/2014] [Accepted: 03/01/2014] [Indexed: 11/30/2022]
Abstract
This article presents an Analysis of Variance model for functional data that explicitly incorporates phase variability through a time-warping component, allowing for a unified approach to estimation and inference in the presence of amplitude and time variability. The focus is on single-random-factor models but the approach can be easily generalized to more complex ANOVA models. The behavior of the estimators is studied by simulation, and an application to the analysis of growth curves of flour beetles is presented. Although the model assumes a smooth latent process behind the observed trajectories, smoothness of the observed data is not required; the method can be applied to irregular time grids, which are common in longitudinal studies.
Affiliation(s)
- Daniel Gervini
- Department of Mathematical Sciences, University of Wisconsin-Milwaukee, PO Box 413, Milwaukee, Wisconsin 53201, U.S.A
| | - Patrick A Carter
- School of Biological Sciences, Washington State University, PO Box 644236, Pullman, Washington 99164, U.S.A
| |
|
43
|
Sangalli LM, Secchi P, Vantini S. Object Oriented Data Analysis: A few methodological challenges. Biom J 2014; 56:774-7. [PMID: 24753126 DOI: 10.1002/bimj.201300217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/01/2013] [Revised: 01/16/2014] [Accepted: 01/17/2014] [Indexed: 11/11/2022]
Abstract
This is a discussion of the paper "Overview of object oriented data analysis" by J. Steve Marron and Andrés M. Alonso.
Affiliation(s)
- Laura M Sangalli
- MOX-Department of Mathematics, Politecnico di Milano, Piazza Leonardo Da Vinci, 32, 20133, Milano, Italy
| | - Piercesare Secchi
- MOX-Department of Mathematics, Politecnico di Milano, Piazza Leonardo Da Vinci, 32, 20133, Milano, Italy
| | - Simone Vantini
- MOX-Department of Mathematics, Politecnico di Milano, Piazza Leonardo Da Vinci, 32, 20133, Milano, Italy
| |
|
44
|
|
45
|
Dimeglio C, Gallón S, Loubes JM, Maza E. A robust algorithm for template curve estimation based on manifold embedding. Comput Stat Data Anal 2014. [DOI: 10.1016/j.csda.2013.09.030] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
|
46
|
|
47
|
|
48
|
Patriarca M, Sangalli LM, Secchi P, Vantini S. Analysis of spike train data: An application of $k$-mean alignment. Electron J Stat 2014. [DOI: 10.1214/14-ejs865a] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022]
|
49
|
Marron JS, Ramsay JO, Sangalli LM, Srivastava A. Statistics of time warpings and phase variations. Electron J Stat 2014. [DOI: 10.1214/14-ejs901] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/19/2022]
|
50
|
Sangalli LM, Secchi P, Vantini S. Analysis of AneuRisk65 data: $k$-mean alignment. Electron J Stat 2014. [DOI: 10.1214/14-ejs938a] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/19/2022]
|