1
Rikhtehgaran R, Shamsi K, Renani EM, Arab A, Nouri F, Mohammadifard N, Marateb HR, Mansourian M, Sarrafzadegan N. Population food intake clusters and cardiovascular disease incidence: a Bayesian quantifying of a prospective population-based cohort study in a low and middle-income country. Front Nutr 2023; 10:1150481. [PMID: 37521422 PMCID: PMC10374205 DOI: 10.3389/fnut.2023.1150481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 06/26/2023] [Indexed: 08/01/2023] [Open access]
Abstract
Aims: This study explored the relationship between cardiovascular disease incidence and population clusters defined by daily food intake.
Methods: The study examined 5,396 Iranian adults (2,627 males and 2,769 females) aged 35 years and older who participated in a 10-year longitudinal population-based study that began in 2001. The frequency of food group consumption over the preceding year (daily, weekly, or monthly) was assessed using a 49-item qualitative food frequency questionnaire (FFQ) administered in a face-to-face interview by an expert dietitian. Participants were clustered by dietary intake using the semi-parametric Bayesian approach of the Dirichlet process, in which individuals with the same multivariate distribution of dietary intake are assigned to the same cluster. The association between the extracted population clusters and the incidence of cardiovascular diseases was examined using Cox proportional hazards models.
Results: Over the 10-year follow-up, 741 participants (401 men and 340 women) were diagnosed with cardiovascular diseases. Individuals were categorized into three primary dietary clusters: healthy, unhealthy, and mixed. After adjusting for potential confounders, subjects in the unhealthy cluster had a higher risk of cardiovascular diseases than those in the healthy cluster [hazard ratio (HR): 2.059; 95% CI: 1.013, 4.184]. In the unadjusted model, individuals in the mixed cluster had a higher risk of cardiovascular disease than those in the healthy cluster (HR: 1.515; 95% CI: 1.097, 2.092). However, this association was attenuated after adjusting for potential confounders (HR: 1.145; 95% CI: 0.769, 1.706).
Conclusion: Individuals in the unhealthy cluster had about twice the risk of incident cardiovascular disease compared with those in the healthy cluster. However, these associations need to be confirmed through further prospective investigations.
Affiliation(s)
- Reyhaneh Rikhtehgaran: Department of Mathematical Sciences, Isfahan University of Technology, Isfahan, Iran
- Khadijeh Shamsi: Student Research Committee, Department of Epidemiology and Biostatistics, School of Health, Isfahan University of Medical Sciences, Isfahan, Iran
- Elnaz Mojoudi Renani: Department of Mathematical Sciences, Isfahan University of Technology, Isfahan, Iran
- Arman Arab: Department of Community Nutrition, School of Nutrition and Food Sciences, Isfahan University of Medical Sciences, Isfahan, Iran
- Fatemeh Nouri: Isfahan Cardiovascular Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Noushin Mohammadifard: Hypertension Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Hamid Reza Marateb: Department of Biomedical Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran
- Marjan Mansourian: Pediatric Cardiovascular Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Nizal Sarrafzadegan: Isfahan Cardiovascular Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
2
Vaičiulytė J, Sakalauskas L. Recursive parameter estimation algorithm of the Dirichlet hidden Markov model. J Stat Comput Simul 2019. [DOI: 10.1080/00949655.2019.1679144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Affiliation(s)
- Jūratė Vaičiulytė: Institute of Data Science and Digital Technologies, Vilnius University, Vilnius, Lithuania
- Leonidas Sakalauskas: Department of Informatics and Statistics, Klaipėda University, Klaipėda, Lithuania
3
Kidando E, Moses R, Ozguven EE, Sando T. Incorporating travel time reliability in predicting the likelihood of severe crashes on arterial highways using non-parametric random-effect regression. Journal of Traffic and Transportation Engineering (English Ed. Online) 2019. [DOI: 10.1016/j.jtte.2018.04.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
4
Nunes C, Moreira E, Ferreira SS, Ferreira D, Mexia JT. Considering the sample sizes as truncated Poisson random variables in mixed effects models. J Appl Stat 2019; 47:2641-2657. [DOI: 10.1080/02664763.2019.1641188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
Affiliation(s)
- Célia Nunes: Department of Mathematics and Center of Mathematics and Applications, University of Beira Interior, Covilhã, Portugal
- Elsa Moreira: CMA – Center of Mathematics and its Applications, Faculty of Science and Technology, New University of Lisbon, Lisbon, Portugal
- Sandra S. Ferreira: Department of Mathematics and Center of Mathematics and Applications, University of Beira Interior, Covilhã, Portugal
- Dário Ferreira: Department of Mathematics and Center of Mathematics and Applications, University of Beira Interior, Covilhã, Portugal
- João T. Mexia: CMA – Center of Mathematics and its Applications, Faculty of Science and Technology, New University of Lisbon, Lisbon, Portugal
5
Xu P, Peng H, Huang T. Unsupervised learning of mixture regression models for longitudinal data. Comput Stat Data Anal 2018. [DOI: 10.1016/j.csda.2018.03.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/10/2023]
6
Affiliation(s)
- Moritz Berger: Institut für Medizinische Biometrie, Informatik und Epidemiologie, Universitätsklinikum Bonn, Bonn, Germany
- Gerhard Tutz: Ludwig-Maximilians-Universität München, München, Germany
7
Lim KL, Wang H. Fast approximation of variational Bayes Dirichlet process mixture using the maximization–maximization algorithm. Int J Approx Reason 2018. [DOI: 10.1016/j.ijar.2017.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
8
Rikhtehgaran R. An application of Dirichlet process in clustering subjects via variance shift models: A course-evaluation study. Stat Model 2017. [DOI: 10.1177/1471082x17699299] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
In this article, the Dirichlet process (DP) is applied to cluster subjects with longitudinal observations. The basis of clustering is the ability of subjects to adapt themselves to new circumstances. Indeed, the basis of clustering depends on the time of changing response variability. This is done by providing a random change-point time in the variance structure of mixed-effects models. The DP is assumed as a prior for the distribution of the random change point. The discrete nature of the DP is utilized to cluster subjects according to the time of adaption. The proposed model is useful to identify groups of subjects with distinctive time-based progressions or declines. Transition mixed-effects models are also used to account for the serial correlation among observations over time. A joint modelling approach is utilized to handle the bias created in these models. The Gibbs sampling technique is adopted to achieve parameter estimates. Performance of the proposed method is evaluated via conducting a simulation study. The usefulness of the proposed model is assessed on a course-evaluation dataset.
Affiliation(s)
- Reyhaneh Rikhtehgaran: Department of Mathematical Sciences, Isfahan University of Technology, Isfahan, Iran
9
Niklitschek EJ, Darnaude AM. Performance of maximum likelihood mixture models to estimate nursery habitat contributions to fish stocks: a case study on sea bream Sparus aurata. PeerJ 2016; 4:e2415. [PMID: 27761305 PMCID: PMC5068389 DOI: 10.7717/peerj.2415] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/03/2016] [Accepted: 08/05/2016] [Indexed: 11/21/2022] [Open access]
Abstract
Background: Mixture models (MM) can be used to describe mixed stocks considering three sets of parameters: the total number of contributing sources, their chemical baseline signatures and their mixing proportions. When all nursery sources have been previously identified and sampled for juvenile fish to produce baseline nursery-signatures, mixing proportions are the only unknown set of parameters to be estimated from the mixed-stock data. Otherwise, the number of sources, as well as some or all nursery-signatures, may also need to be estimated from the mixed-stock data. Our goal was to assess bias and uncertainty in these MM parameters when estimated using unconditional maximum likelihood approaches (ML-MM), under several incomplete sampling and nursery-signature separation scenarios.
Methods: We used a comprehensive dataset containing otolith elemental signatures of 301 juvenile Sparus aurata, sampled in three contrasting years (2008, 2010, 2011) from four distinct nursery habitats (Mediterranean lagoons). Artificial nursery-source and mixed-stock datasets were produced considering five different sampling scenarios, where 0–4 lagoons were excluded from the nursery-source dataset, and six nursery-signature separation scenarios that simulated data separated by 0.5, 1.5, 2.5, 3.5, 4.5 and 5.5 standard deviations among nursery-signature centroids. Bias (BI) and uncertainty (SE) were computed to assess reliability for each of the three sets of MM parameters.
Results: Both bias and uncertainty in mixing proportion estimates were low (BI ≤ 0.14, SE ≤ 0.06) when all nursery-sources were sampled, but exhibited large variability among cohorts and increased with the number of non-sampled sources, up to BI = 0.24 and SE = 0.11. Bias and variability in baseline signature estimates also increased with the number of non-sampled sources, but these estimates tended to be less biased and more uncertain than mixing proportion estimates across all sampling scenarios (BI < 0.13, SE < 0.29). Increasing separation among nursery signatures improved reliability of mixing proportion estimates, but led to non-linear responses in baseline signature parameters. The estimated number of nursery sources showed low uncertainty but a consistent underestimation bias across all incomplete sampling scenarios.
Discussion: ML-MM produced reliable estimates of mixing proportions and nursery-signatures under an important range of incomplete sampling and nursery-signature separation scenarios. This method failed, however, in estimating the true number of nursery sources, reflecting a pervasive issue affecting mixture models, within and beyond the ML framework. Large differences in bias and uncertainty found among cohorts were linked to differences in separation of chemical signatures among nursery habitats. Simulation approaches, such as those presented here, could be useful to evaluate sensitivity of MM results to separation and variability in nursery-signatures for other species, habitats or cohorts.
Affiliation(s)
- Audrey M Darnaude: Center for Marine Biodiversity, Exploitation & Conservation, Centre National de la Recherche Scientifique, Montpellier, France
10
Rikhtehgaran R, Kazemi I. The determination of uncertainty levels in robust clustering of subjects with longitudinal observations using the Dirichlet process mixture. Adv Data Anal Classif 2016. [DOI: 10.1007/s11634-016-0262-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
11
Shirazi M, Lord D, Dhavala SS, Geedipally SR. A semiparametric negative binomial generalized linear model for modeling over-dispersed count data with a heavy tail: Characteristics and applications to crash data. Accident; Analysis and Prevention 2016; 91:10-18. [PMID: 26945472 DOI: 10.1016/j.aap.2016.02.020] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 11/09/2015] [Revised: 02/21/2016] [Accepted: 02/22/2016] [Indexed: 06/05/2023]
Abstract
Crash data can often be characterized by over-dispersion, a heavy (long) tail and many observations with the value zero. Over the last few years, a small number of researchers have started developing and applying novel multi-parameter models to analyze such data. These multi-parameter models have been proposed to overcome the limitations of the traditional negative binomial (NB) model, which cannot handle this kind of data efficiently. The research documented in this paper continues the work related to multi-parameter models. The objective of this paper is to document the development and application of a flexible NB generalized linear model with randomly distributed mixed effects characterized by the Dirichlet process (NB-DP) to model crash data. The objective of the study was accomplished using two datasets. The new model was compared to the NB and the recently introduced model based on the mixture of the NB and Lindley (NB-L) distributions. Overall, the research study shows that the NB-DP model offers better performance than the NB model when data are over-dispersed and have a heavy tail. The NB-DP performed better than the NB-L when the dataset had a heavy tail but a smaller percentage of zeros. However, both models performed similarly when the dataset contained a large number of zeros. In addition to greater flexibility, the NB-DP provides a clustering by-product that allows the safety analyst to better understand the characteristics of the data, such as the identification of outliers and sources of dispersion.
Affiliation(s)
- Mohammadali Shirazi: Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States
- Dominique Lord: Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States
12
Abstract
In the last two decades, regularization techniques, in particular penalty-based methods, have become very popular in statistical modelling. Driven by technological developments, most approaches have been designed for high-dimensional problems with metric variables, whereas categorical data has largely been neglected. In recent years, however, it has become clear that regularization is also very promising when modelling categorical data. A specific trait of categorical data is that many parameters are typically needed to model the underlying structure. This results in complex estimation problems that call for structured penalties which are tailored to the categorical nature of the data. This article gives a systematic overview of penalty-based methods for categorical data developed so far and highlights some issues where further research is needed. We deal with categorical predictors as well as models for categorical response variables. The primary interest of this article is to give insight into basic properties of and differences between methods that are important with respect to statistical modelling in practice, without going into technical details or extensive discussion of asymptotic properties.
Affiliation(s)
- Gerhard Tutz: Department of Statistics, Ludwig-Maximilians-Universität Munich, Germany
- Jan Gertheiss: Institute of Applied Stochastics and Operations Research, Clausthal University of Technology, Germany
13
|
|
14
Grilli L, Rampichini C. Specification of random effects in multilevel models: a review. Qual Quant 2014. [DOI: 10.1007/s11135-014-0060-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/30/2022]
15
Heinzl F, Tutz G. Clustering in linear-mixed models with a group fused lasso penalty. Biom J 2013; 56:44-68. [DOI: 10.1002/bimj.201200111] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 05/11/2012] [Revised: 12/04/2012] [Accepted: 08/18/2013] [Indexed: 11/06/2022]
Affiliation(s)
- Felix Heinzl: Department of Statistics, Ludwig-Maximilians-University Munich, Akademiestr. 1, 80799 Munich, Germany
- Gerhard Tutz: Department of Statistics, Ludwig-Maximilians-University Munich, Akademiestr. 1, 80799 Munich, Germany