1. Kundu D, Krishnan S, Gogoi MP, Das K. A Bayesian quantile joint modeling of multivariate longitudinal and time-to-event data. Lifetime Data Analysis 2024; 30:680-699. [PMID: 38427151 DOI: 10.1007/s10985-024-09622-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Accepted: 02/03/2024] [Indexed: 03/02/2024]
Abstract
Linear mixed models are traditionally used for jointly modeling (multivariate) longitudinal outcomes and event-time(s). However, when the outcomes are non-Gaussian, a quantile regression model is more appropriate. In addition, in the presence of time-varying covariates, it may be of interest to see how the effects of different covariates vary from one quantile level (of the outcomes) to another, and consequently how the event-time changes across quantiles. For such analyses linear quantile mixed models can be used, and an efficient computational algorithm can be developed. We analyze a dataset from the Acute Lymphocytic Leukemia (ALL) maintenance study conducted by Tata Medical Center, Kolkata. In this study, patients suffering from ALL were treated with two standard drugs (6MP and MTx) for the first two years, and three biomarkers (lymphocyte count, neutrophil count and platelet count) were measured longitudinally. After treatment the patients were followed for nearly three more years, and the relapse-time (if any) of each patient was recorded. For this dataset we develop a Bayesian quantile joint model for the three longitudinal biomarkers and time-to-relapse. We consider an Asymmetric Laplace Distribution (ALD) for each outcome, and exploit the mixture representation of the ALD to develop a Gibbs sampler for estimating the regression coefficients. Our proposed model allows different quantile levels for different biomarkers, but still simultaneously estimates the regression coefficients corresponding to a particular quantile combination. We infer that a higher lymphocyte count increases the chance of a relapse, while a higher neutrophil count and a higher platelet count (jointly) reduce it. We also infer that across (almost) all quantiles 6MP reduces the lymphocyte count, while MTx increases the neutrophil count. Simulation studies are performed to assess the effectiveness of the proposed approach.
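As a rough illustration of the mixture representation mentioned in this abstract, the sketch below draws from an asymmetric Laplace distribution written as a normal-exponential mixture, which is the standard location-scale mixture form of the ALD. The function name `sample_ald` and all parameter values are hypothetical choices for illustration and are not taken from the paper. The sketch only shows that the location parameter sits at the chosen quantile level; it is not the authors' Gibbs sampler.

```python
import math
import random


def sample_ald(mu, sigma, p, size, rng):
    """Draw `size` values from ALD(mu, sigma, p) via a normal-exponential mixture.

    Conditional on z ~ Exp(1), the error is normal with mean theta * z and
    variance tau^2 * z, so each conditional update in a sampler can stay Gaussian.
    """
    theta = (1 - 2 * p) / (p * (1 - p))
    tau = math.sqrt(2 / (p * (1 - p)))
    draws = []
    for _ in range(size):
        z = rng.expovariate(1.0)   # latent exponential mixing variable
        u = rng.gauss(0.0, 1.0)    # standard normal noise
        draws.append(mu + sigma * (theta * z + tau * math.sqrt(z) * u))
    return draws


# Illustrative (assumed) values: location 2.0, scale 1.5, quantile level 0.25.
rng = random.Random(0)
draws = sample_ald(mu=2.0, sigma=1.5, p=0.25, size=100_000, rng=rng)

# The share of draws at or below mu should be close to the quantile level p.
frac_below = sum(d <= 2.0 for d in draws) / len(draws)
print(f"share of draws at or below mu: {frac_below:.3f}")
```

Because the error is Gaussian once the latent exponential variable is fixed, a sampler can alternate between updating the latent variables and the regression coefficients, which is the general idea behind exploiting this representation.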
Affiliation(s)
- Damitri Kundu
  - Applied Statistics Division, Indian Statistical Institute, Kolkata, India
- Shekhar Krishnan
  - Tata Translational Cancer Research Center, Tata Medical Center, Kolkata, India
- Manash Pratim Gogoi
  - Tata Translational Cancer Research Center, Tata Medical Center, Kolkata, India
- Kiranmoy Das
  - Applied Statistics Division, Indian Statistical Institute, Kolkata, India
  - Beijing Institute of Mathematical Sciences and Applications, Beijing, China
2. Das K, Pareek B, Brown S, Ghosh P. A semi-parametric Bayesian dynamic hurdle model with an application to the health and retirement study. Comput Stat 2021. [DOI: 10.1007/s00180-021-01143-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
3. Das K, Ghosh P, Daniels MJ. Modeling Multiple Time-Varying Related Groups: A Dynamic Hierarchical Bayesian Approach With an Application to the Health and Retirement Study. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2021.1886105] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/22/2022]
Affiliation(s)
- Kiranmoy Das
  - Applied Statistics Division, Indian Statistical Institute, Kolkata, India
- Pulak Ghosh
  - Decision Sciences & Center of Public Policy, Indian Institute of Management, Bangalore, India
4. Kang T, Gaskins J, Levy S, Datta S. A longitudinal Bayesian mixed effects model with hurdle Conway-Maxwell-Poisson distribution. Stat Med 2021; 40:1336-1356. [PMID: 33368533 PMCID: PMC9167575 DOI: 10.1002/sim.8844] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2020] [Revised: 09/29/2020] [Accepted: 11/21/2020] [Indexed: 11/06/2022]
Abstract
Dental caries (i.e., cavities) is one of the most common chronic childhood diseases and may continue to progress throughout a person's lifetime. The Iowa Fluoride Study (IFS) was designed to investigate the effects of various fluoride, dietary and nondietary factors on the progression of dental caries among a cohort of Iowa school children. We develop a mixed effects model to perform a comprehensive analysis of the longitudinal clustered data of IFS at ages 5, 9, 13, and 17. We combine a Bayesian hurdle framework with the Conway-Maxwell-Poisson regression model, which can account for both excessive zeros and various levels of dispersion. A hierarchical shrinkage prior distribution is used to share the temporal information for predictors in the fixed-effects model. The dependence among teeth of each individual child is modeled through a sparse covariance structure of the random effects across time. Moreover, we obtain the parameter estimates and credible intervals from a Gibbs sampler. Simulation studies are conducted to assess the accuracy and effectiveness of our statistical methodology. The results of this article provide novel tools to statistical practitioners and offer fresh insights to dental researchers on effects of various risk and protective factors on caries progression.
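To make the combination of a hurdle component with the Conway-Maxwell-Poisson (CMP) distribution more concrete, here is a minimal sketch of a hurdle CMP probability mass function. The function names, the truncation point `max_count`, and the parameter values are hypothetical and chosen only for illustration; the paper's actual model adds regression structure, random effects and shrinkage priors that are not shown here.

```python
import math


def cmp_pmf(y, lam, nu, max_count=200):
    """CMP probability of count y, with the normalizing series truncated at max_count."""
    # Unnormalized log-weights: y * log(lam) - nu * log(y!)
    logw = [k * math.log(lam) - nu * math.lgamma(k + 1) for k in range(max_count + 1)]
    m = max(logw)
    log_z = m + math.log(sum(math.exp(w - m) for w in logw))  # log-sum-exp for stability
    return math.exp(logw[y] - log_z)


def hurdle_cmp_pmf(y, p_zero, lam, nu, max_count=200):
    """Hurdle model: a point mass p_zero at zero, zero-truncated CMP for positive counts."""
    if y == 0:
        return p_zero
    p0 = cmp_pmf(0, lam, nu, max_count)
    return (1 - p_zero) * cmp_pmf(y, lam, nu, max_count) / (1 - p0)


# Illustrative (assumed) values: 40% structural zeros, nu < 1 gives over-dispersion.
probs = [hurdle_cmp_pmf(y, p_zero=0.4, lam=2.0, nu=0.7) for y in range(201)]
total = sum(probs)
print(f"P(Y=0) = {probs[0]:.2f}, P(Y=1) = {probs[1]:.4f}, total mass = {total:.6f}")
```

The dispersion parameter `nu` controls the spread: `nu = 1` recovers the Poisson distribution, `nu < 1` gives over-dispersion and `nu > 1` gives under-dispersion, while the hurdle part handles the excess zeros separately.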
Affiliation(s)
- Tong Kang
  - Department of Biostatistics, University of Florida, Gainesville, Florida, USA
- Jeremy Gaskins
  - Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, Kentucky, USA
- Steven Levy
  - Department of Preventive and Community Dentistry, University of Iowa, Iowa City, Iowa, USA
- Somnath Datta
  - Department of Biostatistics, University of Florida, Gainesville, Florida, USA
5. Zhao L, Chen T, Novitsky V, Wang R. Joint penalized spline modeling of multivariate longitudinal data, with application to HIV-1 RNA load levels and CD4 cell counts. Biometrics 2020; 77:1061-1074. [PMID: 32683682 DOI: 10.1111/biom.13339] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/09/2019] [Revised: 06/21/2020] [Accepted: 07/09/2020] [Indexed: 12/01/2022]
Abstract
Motivated by the need to jointly model the longitudinal trajectories of HIV viral load levels and CD4 counts during the primary infection stage, we propose a joint penalized spline modeling approach that can be used to model the repeated measurements from multiple biomarkers of various types (e.g., continuous, binary) simultaneously. This approach allows for flexible trajectories for each marker, accounts for potentially time-varying correlation between markers, and is robust to misspecification of knots. Despite these advantages, the application of multivariate penalized spline models, especially when biomarkers may be of different data types, has been limited, in part because of their seeming complexity of implementation. To overcome this, we describe a procedure that transforms the multivariate setting to the univariate one, and then makes use of the generalized linear mixed effect model representation of a penalized spline model to facilitate its implementation with standard statistical software. We performed simulation studies to evaluate the validity and efficiency of joint modeling of correlated biomarkers measured longitudinally compared with the univariate modeling approach. We applied this modeling approach to longitudinal HIV-1 RNA load and CD4 count data from Southern African cohorts to estimate features of the joint distributions such as the correlation and the proportion of subjects with high viral load levels and high CD4 cell counts over time.
Affiliation(s)
- Lihui Zhao
  - Department of Preventive Medicine, Northwestern University, Chicago, Illinois
- Tom Chen
  - Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts
- Vladimir Novitsky
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Rui Wang
  - Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
6. Biswas J, Das K. A Bayesian quantile regression approach to multivariate semi-continuous longitudinal data. Comput Stat 2020. [DOI: 10.1007/s00180-020-01002-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/25/2022]
7. Kulkarni H, Biswas J, Das K. A joint quantile regression model for multiple longitudinal outcomes. AStA Advances in Statistical Analysis 2019. [DOI: 10.1007/s10182-018-00339-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
8. Kunihama T, Halpern CT, Herring AH. Non‐parametric Bayes models for mixed scale longitudinal surveys. J R Stat Soc Ser C Appl Stat 2019. [DOI: 10.1111/rssc.12348] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/27/2022]
9. Biswas J, Das K. A Bayesian approach of analysing semi-continuous longitudinal data with monotone missingness. Stat Model 2019. [DOI: 10.1177/1471082x18810119] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/17/2022]
Abstract
There is a rich literature on the analysis of longitudinal data with missing values. However, the analysis becomes complex for a semi-continuous (zero-inflated) longitudinal response with missingness. In this article, we propose a partially varying coefficients regression model for analysing such data. We use a two-part model: in the first part we propose a latent dynamic model to account for a ‘zero’ or a ‘non-zero’ response, and in the second part we use another dynamic model to estimate the mean trajectories of the non-zero responses. The two dynamic models are linked through subject-specific random effects. The missing covariates are imputed repeatedly based on their respective posterior predictive distributions, and the missing responses are imputed using the working model under different identifying restrictions. We analyse data from the Health and Retirement Study (HRS) for aged individuals and develop a dynamic model for predicting out-of-pocket medical expenditures (OOPME) containing excess zeros. The operating characteristics of the proposed model are investigated through extensive simulation studies.
Affiliation(s)
- Jayabrata Biswas
  - Interdisciplinary Statistical Research Unit, Applied Statistics Division, Indian Statistical Institute, Kolkata, West Bengal, India
- Kiranmoy Das
  - Interdisciplinary Statistical Research Unit, Applied Statistics Division, Indian Statistical Institute, Kolkata, West Bengal, India
10. Li H, Staudenmayer J, Wang T, Keadle SK, Carroll RJ. Three-part joint modeling methods for complex functional data mixed with zero-and-one-inflated proportions and zero-inflated continuous outcomes with skewness. Stat Med 2018; 37:611-626. [PMID: 29052239 DOI: 10.1002/sim.7534] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/01/2016] [Revised: 08/24/2017] [Accepted: 09/25/2017] [Indexed: 11/12/2022]
Abstract
We take a functional data approach to longitudinal studies with complex bivariate outcomes. This work is motivated by data from a physical activity study that measured 2 responses over time in 5-minute intervals. One response is the proportion of time active in each interval, a continuous proportion with excess zeros and ones. The other response, energy expenditure rate in the interval, is a continuous variable with excess zeros and skewness. This outcome is complex because there are 3 possible activity patterns in each interval (inactive, partially active, and completely active), and those patterns, which are observed, induce both nonrandom and random associations between the responses. More specifically, the inactive pattern requires a zero value in both the proportion for active behavior and the energy expenditure rate; a partially active pattern means that the proportion of activity is strictly between zero and one and that the energy expenditure rate is greater than zero and likely to be moderate; and the completely active pattern means that the proportion of activity is exactly one and the energy expenditure rate is greater than zero and likely to be higher. To address these challenges, we propose a 3-part functional data joint modeling approach. The first part is a continuation-ratio model to reorder the ordinal-valued 3 activity patterns. The second part models the proportions when they are in the interval (0,1). The last component specifies the skewed continuous energy expenditure rate with Box-Cox transformations when it is greater than zero. In this 3-part model, the regression structures are specified as smooth curves measured at various time points with random effects that have a correlation structure. The smoothed random curves for each variable are summarized using a few important principal components, and the association of the 3 longitudinal components is modeled through the association of the principal component scores. The difficulties in handling the ordinal and proportional variables are addressed using a quasi-likelihood type approximation. We develop an efficient algorithm to fit the model that also involves the selection of the number of principal components. The method is applied to physical activity data and is evaluated empirically by a simulation study.
Affiliation(s)
- Haocheng Li
  - Departments of Oncology and Community Health Sciences, University of Calgary, Calgary, Canada
- John Staudenmayer
  - Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA, USA
- Tianying Wang
  - Department of Statistics, Texas A&M University, College Station, TX, USA
- Sarah Kozey Keadle
  - Kinesiology Department, California Polytechnic State University, San Luis Obispo, CA, USA
- Raymond J Carroll
  - Department of Statistics, Texas A&M University, College Station, TX, USA
  - Department of Mathematics and Statistics, University of Technology Sydney, Ultimo, NSW, Australia
11. Das K. A semiparametric Bayesian approach for joint modeling of longitudinal trait and event time. J Appl Stat 2016. [DOI: 10.1080/02664763.2016.1155108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Affiliation(s)
- Kiranmoy Das
  - Interdisciplinary Statistical Research Unit, Indian Statistical Institute, Kolkata, India
12. Das K, Afriyie P, Spirko L. A Semiparametric Bayesian Approach for Analyzing Longitudinal Data from Multiple Related Groups. Int J Biostat 2015; 11:273-84. [PMID: 26565556 DOI: 10.1515/ijb-2015-0002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
Biological and clinical experiments often result in longitudinal data from multiple related groups. The analysis of such data is challenging because the groups might share information on the mean and/or covariance functions. In this article, we consider a Bayesian semiparametric approach to modeling the mean trajectories of a longitudinal response coming from multiple related groups. We place matrix stick-breaking process priors on the group mean parameters, which allow information sharing on the mean trajectories across the groups. Simulation studies are performed to demonstrate the effectiveness of the proposed approach compared to more traditional approaches. We analyze data from a one-year follow-up of nutrition education for hypercholesterolemic children with three different treatments, where the children are from different age groups. Our analysis provides more clinically useful information than the previous analysis of the same dataset. The proposed approach will be a very powerful tool for analyzing data from clinical trials and other medical experiments.