1
Manouchehri N, Bouguila N. Human Activity Recognition with an HMM-Based Generative Model. Sensors (Basel) 2023;23:1390. PMID: 36772428; PMCID: PMC9920173; DOI: 10.3390/s23031390.
Abstract
Human activity recognition (HAR) has become an important topic in healthcare, with applications in domains such as health monitoring, elder care, and disease diagnosis. Given the increasing capabilities of smart devices, large amounts of activity data are generated in our daily lives. In this work, we propose unsupervised, scaled-Dirichlet-based hidden Markov models to analyze human activities. Our motivation is that human activities follow sequential patterns, and hidden Markov models (HMMs) are among the strongest statistical models for sequential data. In this paper, we assume that the emission probabilities of the HMM follow a bounded scaled Dirichlet distribution, an appropriate choice for modeling proportional data. We learn the model via variational inference and evaluate its performance on a publicly available dataset.
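The core computation in such a model is the same as in a standard HMM; only the emission density changes. As an illustrative sketch (not the authors' code), the forward algorithm below scores a sequence of proportion vectors under a toy two-state HMM, using an ordinary Dirichlet emission density from SciPy as a stand-in for the paper's bounded scaled Dirichlet; all parameter values are made up for the example.

```python
import numpy as np
from scipy.stats import dirichlet

# Hypothetical 2-state HMM over 3-part compositional observations.
pi = np.array([0.6, 0.4])                  # initial state probabilities
A = np.array([[0.7, 0.3], [0.2, 0.8]])     # transition matrix
alphas = [np.array([5.0, 2.0, 2.0]),       # Dirichlet parameters, state 0
          np.array([2.0, 2.0, 5.0])]       # Dirichlet parameters, state 1

def forward_loglik(X, pi, A, alphas):
    """Log-likelihood of a sequence of proportion vectors via the
    scaled forward algorithm."""
    T, K = len(X), len(pi)
    # Emission densities B[t, k] = p(x_t | state k).
    B = np.array([[dirichlet.pdf(x, a) for a in alphas] for x in X])
    alpha = pi * B[0]
    logz = np.log(alpha.sum())
    alpha /= alpha.sum()
    for t in range(1, T):
        alpha = (alpha @ A) * B[t]
        s = alpha.sum()
        logz += np.log(s)
        alpha /= s                          # rescale to avoid underflow
    return logz

X = [np.array([0.6, 0.2, 0.2]), np.array([0.5, 0.3, 0.2]),
     np.array([0.2, 0.2, 0.6])]
print(forward_loglik(X, pi, A, alphas))
```

The same recursion underlies the E-step of any (variational or EM) learning scheme for the model; only the emission density and the parameter updates differ.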
Affiliation(s)
- Narges Manouchehri
- Algorithmic Dynamics Lab, Unit of Computational Medicine, Karolinska Institute, 171 77 Stockholm, Sweden
- Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC H3G1T7, Canada
- Nizar Bouguila
- Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC H3G1T7, Canada
2
Ge S, Wang S, Nathoo FS, Wang L. Online Bayesian learning for mixtures of spatial spline regressions with mixed effects. J Stat Comput Simul 2021. DOI: 10.1080/00949655.2021.2002329.
Affiliation(s)
- Shufei Ge
- Institute of Mathematical Sciences, ShanghaiTech University, Shanghai, People's Republic of China
- Shijia Wang
- School of Statistics and Data Science, LPMC& KLMDASR, Nankai University, Tianjin, People's Republic of China
- Farouk S. Nathoo
- Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Liangliang Wang
- Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, Canada
3
Durif G, Modolo L, Mold JE, Lambert-Lacroix S, Picard F. Probabilistic count matrix factorization for single cell expression data analysis. Bioinformatics 2019;35:4011-4019. PMID: 30865271; DOI: 10.1093/bioinformatics/btz177.
Abstract
MOTIVATION: The development of high-throughput single-cell sequencing technologies now allows the investigation of the population diversity of cellular transcriptomes. Expression dynamics (gene-to-gene variability) can be quantified more accurately thanks to the measurement of lowly expressed genes. In addition, cell-to-cell variability is high, with a low proportion of cells expressing the same genes at the same time and level. These emerging patterns are very challenging from a statistical point of view, especially for producing a summarized view of single-cell expression data. Principal component analysis (PCA) is one of the most powerful tools for high-dimensional data representation, searching for latent directions that capture the most variability in the data. Unfortunately, classical PCA is based on Euclidean distances and projections that work poorly in the presence of over-dispersed count data with dropout events, such as single-cell expression data. RESULTS: We propose a probabilistic count matrix factorization (pCMF) approach for single-cell expression data analysis that relies on a sparse Gamma-Poisson factor model. This hierarchical model is inferred using a variational EM algorithm and jointly builds a low-dimensional representation of cells and genes. We show how this probabilistic framework induces a geometry suitable for single-cell data visualization and produces a compression of the data that is very powerful for clustering purposes. We compare our method against other standard representation methods, such as t-SNE, and illustrate its performance for the representation of single-cell expression data. AVAILABILITY AND IMPLEMENTATION: Our work is implemented in the pCMF R package (https://github.com/gdurif/pCMF). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
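The pCMF package itself is in R, but the flavour of a Gamma-Poisson factorization can be sketched with off-the-shelf tools. The snippet below is only an analogy, not the pCMF model: NMF with Kullback-Leibler loss and multiplicative updates maximizes a Poisson likelihood for a low-rank rate matrix, without pCMF's sparse Gamma priors, dropout modelling, or variational EM. The synthetic data and dimensions are arbitrary.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# Synthetic counts from a rank-2 Gamma-Poisson model (cells x genes).
U = rng.gamma(2.0, 1.0, size=(100, 2))
V = rng.gamma(2.0, 1.0, size=(2, 50))
X = rng.poisson(U @ V)

# KL-divergence NMF with multiplicative updates maximizes the Poisson
# likelihood of X under a low-rank rate matrix W @ H.
model = NMF(n_components=2, beta_loss='kullback-leibler',
            solver='mu', max_iter=500, random_state=0)
W = model.fit_transform(X)   # low-dimensional cell representation
H = model.components_        # gene loadings
print(W.shape, H.shape)      # (100, 2) (2, 50)
```

The rows of W play the role of the cell coordinates that pCMF uses for visualization and clustering; the probabilistic model adds priors and an inference scheme on top of this factorization structure.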
Affiliation(s)
- Ghislain Durif
- Univ Lyon, Université Lyon 1, CNRS, LBBE UMR 5558, Villeurbanne, France; Université Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK UMR 5224, Grenoble, France; Université de Montpellier, CNRS, IMAG UMR 5149, Montpellier, France
- Laurent Modolo
- Univ Lyon, Université Lyon 1, CNRS, LBBE UMR 5558, Villeurbanne, France; Univ Lyon, ENS Lyon, Université Lyon 1, CNRS, LBMC UMR 5239, Lyon, France; Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- Jeff E Mold
- Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- Franck Picard
- Univ Lyon, Université Lyon 1, CNRS, LBBE UMR 5558, Villeurbanne, France
4
Vranckx M, Neyens T, Faes C. Comparison of different software implementations for spatial disease mapping. Spat Spatiotemporal Epidemiol 2019;31:100302. PMID: 31677763; DOI: 10.1016/j.sste.2019.100302.
Abstract
Disease mapping is a scientific field that aims to understand and predict disease risk based on counts of observed cases within small regions of a study area of interest. Hierarchical model-based approaches that borrow information from neighbouring areas via conditional autoregressive (CAR) random effects on the local disease rates have gained considerable popularity, thanks to readily implemented Markov chain Monte Carlo methods. Many software implementations for modelling risk distributions now exist, and they differ, to varying degrees, in their underlying methodology. This paper provides an in-depth comparison of analysis results from the R packages CARBayes, R2OpenBUGS, NIMBLE, R2BayesX, R-INLA, and RStan. We investigate CAR models typically used in disease mapping for spatially discrete count data. Data on diabetes in children and young adults in Belgium are used in a case study, while simulation studies are undertaken to assess software performance in different settings.
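As a minimal illustration of the CAR structure these packages implement (a hand-rolled sketch, not drawn from any of them), the snippet below builds a proper CAR precision matrix Q = tau(D - rho*W) on a small lattice, samples a spatial random effect from N(0, Q^-1), and generates Poisson disease counts; all parameter values are illustrative.

```python
import numpy as np

# Proper CAR prior on a 4x4 lattice: Q = tau * (D - rho * W), where W is
# the 0/1 adjacency matrix, D = diag(number of neighbours), tau is a
# precision parameter, and |rho| < 1 keeps Q positive definite.
n = 4
idx = lambda i, j: i * n + j
W = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            a, b = i + di, j + dj
            if 0 <= a < n and 0 <= b < n:
                W[idx(i, j), idx(a, b)] = 1
D = np.diag(W.sum(axis=1))
tau, rho = 1.0, 0.9
Q = tau * (D - rho * W)

# One draw of the spatial random effect phi ~ N(0, Q^{-1}),
# via the Cholesky factor of the precision matrix.
rng = np.random.default_rng(1)
L = np.linalg.cholesky(Q)
phi = np.linalg.solve(L.T, rng.standard_normal(n * n))

# Poisson counts: expected counts E times relative risk exp(phi).
E = np.full(n * n, 10.0)
y = rng.poisson(E * np.exp(phi))
print(y)
```

Each package in the comparison fits essentially this likelihood-plus-prior structure, differing in the sampler (MCMC, HMC) or approximation (INLA) used for posterior inference.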
Affiliation(s)
- M Vranckx
- I-BioStat, Hasselt University, Diepenbeek, Belgium
- T Neyens
- I-BioStat, Hasselt University, Diepenbeek, Belgium
- C Faes
- I-BioStat, Hasselt University, Diepenbeek, Belgium
5
Teng M, Johnson TD, Nathoo FS. Time series analysis of fMRI data: Spatial modelling and Bayesian computation. Stat Med 2018;37:2753-2770. PMID: 29717508; DOI: 10.1002/sim.7680.
Abstract
Time series analysis of fMRI data is an important area of medical statistics for neuroimaging data. Spatial models and Bayesian approaches for inference in such models have advantages over more traditional mass univariate approaches; however, a major challenge for such analyses is the required computation. As a result, the neuroimaging community has embraced approximate Bayesian inference based on mean-field variational Bayes (VB) approximations, which are implemented in standard software packages such as the popular statistical parametric mapping software. While computationally efficient, the quality of VB approximations remains unclear even though they are commonly used in the analysis of neuroimaging data. For reliable statistical inference, it is important that these approximations be accurate and that users understand the scenarios under which they may not be. We consider this issue for a particular model that includes spatially varying coefficients. To examine the accuracy of the VB approximation, we derive Hamiltonian Monte Carlo (HMC) for this model and conduct simulation studies to compare its performance with VB in terms of estimation accuracy, posterior variability, the spatial smoothness of estimated images, and computation time. As expected, we find that the computation time required for VB is considerably less than that for HMC. In settings with a high or moderate signal-to-noise ratio (SNR), the two approaches produce very similar results, suggesting that the VB approximation is useful there. At low SNR, however, substantial differences emerge, suggesting that the approximation may not be accurate in such cases, and we demonstrate that VB produces Bayes estimators with larger mean squared error. We also compare the two computational approaches in an application examining the hemodynamic response to face perception, alongside the traditional mass univariate approach. Overall, our work clarifies the usefulness of VB for the spatiotemporal analysis of fMRI data, while also pointing out its limitation at low SNR and the utility of HMC in that case.
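To make the HMC side of such comparisons concrete, here is a minimal, self-contained HMC sampler with leapfrog integration, targeting a standard 2-D Gaussian. This is a toy stand-in, not the paper's spatially-varying-coefficient model; the step size, trajectory length, and target are all illustrative.

```python
import numpy as np

def hmc(logp, logp_grad, x0, n_samples=2000, eps=0.2, n_leap=10, seed=0):
    """Hamiltonian Monte Carlo with a leapfrog integrator."""
    rng = np.random.default_rng(seed)
    x, samples = np.array(x0, float), []
    for _ in range(n_samples):
        p = rng.standard_normal(x.shape)            # resample momentum
        x_new, p_new = x.copy(), p.copy()
        p_new += 0.5 * eps * logp_grad(x_new)       # initial half step
        for step in range(n_leap):
            x_new += eps * p_new                    # full position step
            if step != n_leap - 1:
                p_new += eps * logp_grad(x_new)     # full momentum step
        p_new += 0.5 * eps * logp_grad(x_new)       # final half step
        # Metropolis correction on the joint (position, momentum) energy.
        log_accept = (logp(x_new) - 0.5 * p_new @ p_new
                      - logp(x) + 0.5 * p @ p)
        if np.log(rng.uniform()) < log_accept:
            x = x_new
        samples.append(x.copy())
    return np.array(samples)

logp = lambda x: -0.5 * x @ x      # standard Gaussian log-density
grad = lambda x: -x                # its gradient
draws = hmc(logp, grad, [3.0, -3.0])
print(draws[1000:].mean(axis=0))   # should be near [0, 0]
```

A mean-field VB approximation to the same target would instead optimize a factorized Gaussian, trading the sampling loop for a much cheaper deterministic optimization, which is the computational trade-off the paper quantifies.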
Affiliation(s)
- Ming Teng
- Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, MI, 48109, USA
- Timothy D Johnson
- Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, MI, 48109, USA
- Farouk S Nathoo
- Department of Mathematics and Statistics, University of Victoria, Victoria, BC V8W 3P4, Canada
6
Teng M, Nathoo FS, Johnson TD. Bayesian Computation for Log-Gaussian Cox Processes: A Comparative Analysis of Methods. J Stat Comput Simul 2017;87:2227-2252. PMID: 29200537; DOI: 10.1080/00949655.2017.1326117.
Abstract
The log-Gaussian Cox process is a commonly used model for the analysis of spatial point pattern data. Fitting this model is difficult because of its doubly stochastic structure: it is a hierarchical combination of a Poisson process at the first level and a Gaussian process at the second. Various methods have been proposed to estimate such a process, including traditional likelihood-based approaches as well as Bayesian methods. We focus here on Bayesian methods and several approaches that have been considered for model fitting within this framework, including Hamiltonian Monte Carlo, the integrated nested Laplace approximation (INLA), and variational Bayes. We compare these approaches with respect to statistical and computational efficiency through several simulation studies as well as two applications, the first examining ecological data and the second involving neuroimaging data.
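The doubly stochastic structure is easy to see in simulation. The sketch below (illustrative parameters throughout, not taken from the paper) generates a log-Gaussian Cox process on a one-dimensional grid: a latent Gaussian process with squared-exponential covariance is drawn first, and cell counts are then drawn from a Poisson distribution whose intensity is its exponential.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 50
s = np.linspace(0, 1, m)                       # grid cell centres
# Squared-exponential GP covariance, length-scale 0.1 (illustrative).
C = np.exp(-0.5 * (s[:, None] - s[None, :]) ** 2 / 0.1 ** 2)
mu = np.log(20.0)                              # mean log-intensity

# First level of randomness: the latent Gaussian field.
L = np.linalg.cholesky(C + 1e-6 * np.eye(m))
g = mu + L @ rng.standard_normal(m)
lam = np.exp(g)                                # intensity surface

# Second level of randomness: Poisson counts given the intensity,
# one count per grid cell of width 1/m.
counts = rng.poisson(lam / m)
print(counts.sum())
```

Inference reverses this generative process: given the counts, HMC, INLA, and variational Bayes all target the posterior over the latent field g, and the paper's comparison is about how accurately and how quickly each recovers it.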
Affiliation(s)
- Ming Teng
- Department of Biostatistics, University of Michigan
- Farouk S Nathoo
- Department of Mathematics and Statistics, University of Victoria
7
Redolfi A, Bosco P, Manset D, Frisoni GB. Brain investigation and brain conceptualization. Funct Neurol 2013;28:175-190. PMID: 24139654; DOI: 10.11138/fneur/2013.28.3.175.
Abstract
The brain of a patient with Alzheimer's disease (AD) undergoes changes starting many years before the development of the first clinical symptoms. The recent availability of large prospective datasets makes it possible to create sophisticated brain models of healthy subjects and patients with AD, showing pathophysiological changes occurring over time. However, these models are still inadequate: representations are mainly single-scale, and they do not account for the complexity and interdependence of brain changes. Brain changes in AD patients occur at different levels and for different reasons: at the molecular level, they are due to amyloid deposition; at the cellular level, to the loss of neuronal synapses; and at the tissue level, to connectivity disruption. Together these cause extensive atrophy of the whole brain. Initiatives aiming to model the whole human brain have been launched in Europe and the US with the goal of reducing the burden of brain diseases. In this work, we describe a new approach to earlier diagnosis based on a multimodal and multiscale brain concept, built upon existing and well-characterized single modalities.