1
|
Cho S, Psioda MA, Ibrahim JG. Bayesian joint modeling of multivariate longitudinal and survival outcomes using Gaussian copulas. Biostatistics 2024:kxae009. [PMID: 38669589 DOI: 10.1093/biostatistics/kxae009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Revised: 03/06/2024] [Accepted: 03/11/2024] [Indexed: 04/28/2024] Open
Abstract
There is an increasing interest in the use of joint models for the analysis of longitudinal and survival data. While random effects models have been extensively studied, these models can be hard to implement and the fixed effect regression parameters must be interpreted conditional on the random effects. Copulas provide a useful alternative framework for joint modeling. One advantage of using copulas is that practitioners can directly specify marginal models for the outcomes of interest. We develop a joint model using a Gaussian copula to characterize the association between multivariate longitudinal and survival outcomes. Rather than using an unstructured correlation matrix in the copula model to characterize dependence structure as is common, we propose a novel decomposition that allows practitioners to impose structure (e.g., auto-regressive) which provides efficiency gains in small to moderate sample sizes and reduces computational complexity. We develop a Markov chain Monte Carlo model fitting procedure for estimation. We illustrate the method's value using a simulation study and present a real data analysis of longitudinal quality of life and disease-free survival data from an International Breast Cancer Study Group trial.
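As a rough illustration of the copula idea in this abstract, the sketch below draws correlated latent normals under an auto-regressive (AR(1)) correlation matrix and pushes them through user-chosen marginals. It is not the authors' decomposition or their MCMC procedure; the function names, the AR(1) parameter 0.6, the normal longitudinal margins, and the exponential survival margin are all assumptions made for the example.

```python
import math
import numpy as np


def ar1_correlation(dim, rho):
    """AR(1) correlation matrix: entry (i, j) equals rho ** |i - j|."""
    idx = np.arange(dim)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def sample_gaussian_copula(n, corr, rng):
    """Draw n latent normal vectors and their copula (uniform) scores."""
    z = rng.multivariate_normal(np.zeros(corr.shape[0]), corr, size=n)
    # Standard normal CDF written with math.erf to avoid extra dependencies.
    std_normal_cdf = np.vectorize(
        lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    )
    return z, std_normal_cdf(z)


rng = np.random.default_rng(0)
corr = ar1_correlation(4, 0.6)  # 3 longitudinal visits + 1 survival time
z, u = sample_gaussian_copula(5000, corr, rng)

# Hypothetical marginals: normal quality-of-life scores at three visits,
# and an exponential event time obtained by inverting its CDF.
longitudinal = 50.0 + 10.0 * z[:, :3]
survival_time = -np.log1p(-u[:, 3]) / 0.1
```

Because the marginals are specified directly, changing the survival margin (for example to a Weibull) only changes the last line, while the dependence structure stays in `corr`.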
|
2
|
Chen X, Nifong B, Alt EM, Psioda MA, Ibrahim JG. Bayesian design of clinical trials using the scale transformed power prior. J Biopharm Stat 2024:1-20. [PMID: 38639571 DOI: 10.1080/10543406.2024.2330205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Accepted: 03/01/2024] [Indexed: 04/20/2024]
Abstract
There are many Bayesian design methods allowing for the incorporation of historical data for sample size determination (SSD) in situations where the outcome in the historical data is the same as the outcome of a new study. However, there is a dearth of methods supporting the incorporation of data from a previously completed clinical trial that investigated the same or similar treatment as the new trial but had a primary outcome that is different. We propose a simulation-based Bayesian SSD framework using the partial-borrowing scale transformed power prior (straPP). The partial-borrowing straPP is developed by applying a novel scale transformation to a traditional power prior on the parameters from the historical data model to make the information better align with the new data model. The scale transformation is based on the assumption that the standardized parameters (i.e., parameters multiplied by the square roots of their respective Fisher information matrices) are equal. To illustrate the method, we present results from simulation studies that use real data from a previously completed clinical trial to design a new clinical trial with a primary time-to-event endpoint.
|
3
|
Tan X, Wang W, Zeng D, Liu GF, Diao G, Jafari N, Alt EM, Ibrahim JG. Safety signal detection with control of latent factors. Stat Med 2024; 43:1397-1418. [PMID: 38297431 DOI: 10.1002/sim.10015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 10/26/2023] [Accepted: 12/27/2023] [Indexed: 02/02/2024]
Abstract
Postmarket drug safety databases such as the Vaccine Adverse Event Reporting System (VAERS) collect thousands of spontaneous reports annually, with each report recording occurrences of any adverse events (AEs) and use of vaccines. We hope to identify signal vaccine-AE pairs, for which certain vaccines are statistically associated with certain AEs, using such data. Thus, the outcomes of interest are multiple AEs, which are binary outcomes and could be correlated because they might share certain latent factors; the primary covariates are vaccines. Appropriately accounting for the complex correlation among AEs could improve the sensitivity and specificity of identifying signal vaccine-AE pairs. We propose a two-step approach in which we first estimate the shared latent factors among AEs using a working multivariate logistic regression model, and then use a univariate logistic regression model to examine the vaccine-AE associations after controlling for the latent factors. Our simulation studies show that this approach outperforms current approaches in terms of sensitivity and specificity. We apply our approach in analyzing VAERS data and report our findings.
|
4
|
Weideman AMK, Wang R, Ibrahim JG, Jiang Y. Canopy2: tumor phylogeny inference by bulk DNA and single-cell RNA sequencing. bioRxiv 2024:2024.03.18.585595. [PMID: 38562795 PMCID: PMC10983938 DOI: 10.1101/2024.03.18.585595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Tumors are comprised of a mixture of distinct cell populations that differ in terms of genetic makeup and function. Such heterogeneity plays a role in the development of drug resistance and the ineffectiveness of targeted cancer therapies. Insight into this complexity can be obtained through the construction of a phylogenetic tree, which illustrates the evolutionary lineage of tumor cells as they acquire mutations over time. We propose Canopy2, a Bayesian framework that uses single nucleotide variants derived from bulk DNA and single-cell RNA sequencing to infer tumor phylogeny and conduct mutational profiling of tumor subpopulations. Canopy2 uses Markov chain Monte Carlo methods to sample from a joint probability distribution involving a mixture of binomial and beta-binomial distributions, specifically chosen to account for the sparsity and stochasticity of the single-cell data. Canopy2 demystifies the sources of zeros in the single-cell data and separates zeros categorized as non-cancerous (cells without mutations), stochastic (mutations not expressed due to bursting), and technical (expressed mutations not picked up by sequencing). Simulations demonstrate that Canopy2 consistently outperforms competing methods and reconstructs the clonal tree with high fidelity, even in situations involving low sequencing depth, poor single-cell yield, and highly-advanced and polyclonal tumors. We further assess the performance of Canopy2 through application to breast cancer and glioblastoma data, benchmarking against existing methods. Canopy2 is an open-source R package available at https://github.com/annweideman/canopy2.
|
5
|
Heiling HM, Rashid NU, Li Q, Peng XL, Yeh JJ, Ibrahim JG. Efficient computation of high-dimensional penalized generalized linear mixed models by latent factor modeling of the random effects. Biometrics 2024; 80:ujae016. [PMID: 38497825 PMCID: PMC10946237 DOI: 10.1093/biomtc/ujae016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 11/22/2023] [Accepted: 02/16/2024] [Indexed: 03/19/2024]
Abstract
Modern biomedical datasets are increasingly high-dimensional and exhibit complex correlation structures. Generalized linear mixed models (GLMMs) have long been employed to account for such dependencies. However, proper specification of the fixed and random effects in GLMMs is increasingly difficult in high dimensions, and computational complexity grows with increasing dimension of the random effects. We present a novel reformulation of the GLMM using a factor model decomposition of the random effects, enabling scalable computation of GLMMs in high dimensions by reducing the latent space from a large number of random effects to a smaller set of latent factors. We also extend our prior work to estimate model parameters using a modified Monte Carlo Expectation Conditional Minimization algorithm, allowing us to perform variable selection on both the fixed and random effects simultaneously. We show through simulation that through this factor model decomposition, our method can fit high-dimensional penalized GLMMs faster than comparable methods and more easily scale to larger dimensions not previously seen in existing approaches.
|
6
|
LaVange LM, Alt EM, Ibrahim JG. Discussion of "Optimal test procedures for multiple hypotheses controlling the familywise expected loss" by Willi Maurer, Frank Bretz, and Xiaolei Xun. Biometrics 2023; 79:2802-2805. [PMID: 37488695 DOI: 10.1111/biom.13910] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Accepted: 01/30/2023] [Indexed: 07/26/2023]
Abstract
We provide commentary on the paper by Willi Maurer, Frank Bretz, and Xiaolei Xun entitled, "Optimal test procedures for multiple hypotheses controlling for the familywise expected loss." The authors provide an excellent discussion of the multiplicity problem in clinical trials and propose a novel approach based on a decision-theoretic framework that incorporates loss functions that can vary across multiple hypotheses in a family. We provide some considerations for the practical use of the authors' proposed methods as well as some alternative methods that may also be of interest in this setting.
|
7
|
Heiling HM, Rashid NU, Li Q, Ibrahim JG. glmmPen: High Dimensional Penalized Generalized Linear Mixed Models. The R Journal 2023; 15:106-128. [PMID: 38818017 PMCID: PMC11138212 DOI: 10.32614/rj-2023-086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Generalized linear mixed models (GLMMs) are widely used in research for their ability to model correlated outcomes with non-Gaussian conditional distributions. The proper selection of fixed and random effects is a critical part of the modeling process, where model misspecification may lead to significant bias. However, the joint selection of fixed and random effects has historically been limited to lower dimensional GLMMs, largely due to the use of criterion-based model selection strategies. Here we present the R package glmmPen, one of the first to select fixed and random effects in higher dimension using a penalized GLMM modeling framework. Model parameters are estimated using a Monte Carlo expectation conditional minimization (MCECM) algorithm, which leverages Stan and RcppArmadillo for increased computational efficiency. Our package supports the Binomial, Gaussian, and Poisson families and multiple penalty functions. In this manuscript we discuss the modeling procedure, estimation scheme, and software implementation through application to a pancreatic cancer subtyping study. Simulation results show our method has good performance in selecting both the fixed and random effects in high dimensional GLMMs.
|
8
|
Xu J, Psioda MA, Ibrahim JG. Bayesian design of clinical trials using joint models for recurrent and terminating events. Biostatistics 2023; 24:866-884. [PMID: 35851911 DOI: 10.1093/biostatistics/kxac025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2021] [Revised: 06/19/2022] [Accepted: 06/22/2022] [Indexed: 10/19/2023] Open
Abstract
Joint models for recurrent event and terminating event data are increasingly used for the analysis of clinical trials. However, few methods have been proposed for designing clinical trials using these models. In this article, we develop a Bayesian clinical trial design methodology focused on evaluating the effect of an investigational product (IP) on both recurrent event and terminating event processes considered as multiple primary endpoints, using a multifrailty joint model. Dependence between the recurrent and terminating event processes is accounted for using a shared frailty. Inferences for the multiple primary outcomes are based on posterior model probabilities corresponding to mutually exclusive hypotheses regarding the benefit of IP with respect to the recurrent and terminating event processes. We propose an approach for sample size determination to ensure the trial design has a high power and a well-controlled type I error rate, with both operating characteristics defined from a Bayesian perspective. We also consider a generalization of the proposed parametric model that uses a nonparametric mixture of Dirichlet processes to model the frailty distributions and compare its performance to the proposed approach. We demonstrate the methodology by designing a colorectal cancer clinical trial with a goal of demonstrating that the IP causes a favorable effect on at least one of the two outcomes but no harm on either.
|
9
|
Bean NW, Ibrahim JG, Psioda MA. Bayesian joint models for multi-regional clinical trials. Biostatistics 2023:kxad023. [PMID: 37669215 DOI: 10.1093/biostatistics/kxad023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 08/09/2023] [Accepted: 08/10/2023] [Indexed: 09/07/2023] Open
Abstract
In recent years, multi-regional clinical trials (MRCTs) have increased in popularity in the pharmaceutical industry due to their ability to accelerate the global drug development process. To address potential challenges with MRCTs, the International Council for Harmonisation released the E17 guidance document which suggests the use of statistical methods that utilize information borrowing across regions if regional sample sizes are small. We develop an approach that allows for information borrowing via Bayesian model averaging in the context of a joint analysis of survival and longitudinal data from MRCTs. In this novel application of joint models to MRCTs, we use Laplace's method to integrate over subject-specific random effects and to approximate posterior distributions for region-specific treatment effects on the time-to-event outcome. Through simulation studies, we demonstrate that the joint modeling approach can result in an increased rejection rate when testing the global treatment effect compared with methods that analyze survival data alone. We then apply the proposed approach to data from a cardiovascular outcomes MRCT.
|
10
|
Lee E, Ibrahim JG, Zhu H. Bayesian bi-level variable selection for genome-wide survival study. Genomics Inform 2023; 21:e28. [PMID: 37813624 PMCID: PMC10584651 DOI: 10.5808/gi.23047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 06/26/2023] [Accepted: 06/27/2023] [Indexed: 10/11/2023] Open
Abstract
Mild cognitive impairment (MCI) is a clinical syndrome characterized by the onset and evolution of cognitive impairments, often considered a transitional stage to Alzheimer's disease (AD). The genetic traits of MCI patients who experience a rapid progression to AD can enhance early diagnosis capabilities and facilitate drug discovery for AD. While a genome-wide association study (GWAS) is a standard tool for identifying single nucleotide polymorphisms (SNPs) related to a disease, it fails to detect SNPs with small effect sizes due to stringent control for multiple testing. Additionally, the method does not consider the group structures of SNPs, such as genes or linkage disequilibrium blocks, which can provide valuable insights into the genetic architecture. To address the limitations, we propose a Bayesian bi-level variable selection method that detects SNPs associated with time of conversion from MCI to AD. Our approach integrates group inclusion indicators into an accelerated failure time model to identify important SNP groups. Additionally, we employ data augmentation techniques to impute censored time values using a predictive posterior. We adapt Dirichlet-Laplace shrinkage priors to incorporate the group structure for SNP-level variable selection. In the simulation study, our method outperformed other competing methods regarding variable selection. The analysis of Alzheimer's Disease Neuroimaging Initiative (ADNI) data revealed several genes directly or indirectly related to AD, whereas a classical GWAS did not identify any significant SNPs.
|
11
|
Vincent BG, File DM, McKinnon KP, Moore DT, Frelinger JA, Collins EJ, Ibrahim JG, Bixby L, Reisdorf S, Laurie SJ, Park YA, Anders CK, Collichio FA, Muss HB, Carey LA, van Deventer HW, Dees EC, Serody JS. Efficacy of a Dual-Epitope Dendritic Cell Vaccine as Part of Combined Immunotherapy for HER2-Expressing Breast Tumors. J Immunol 2023:263816. [PMID: 37204246 DOI: 10.4049/jimmunol.2300077] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/30/2023] [Accepted: 05/02/2023] [Indexed: 05/20/2023]
Abstract
Previous work from our group and others has shown that patients with breast cancer can generate a T cell response against specific human epidermal growth factor 2 (HER2) epitopes. In addition, preclinical work has shown that this T cell response can be augmented by Ag-directed mAb therapy. This study evaluated the activity and safety of a combination of dendritic cell (DC) vaccination given with mAb and cytotoxic therapy. We performed a phase I/II study using autologous DCs pulsed with two different HER2 peptides given with trastuzumab and vinorelbine to a study cohort of patients with HER2-overexpressing and a second with HER2 nonoverexpressing metastatic breast cancer. Seventeen patients with HER2-overexpressing and seven with nonoverexpressing disease were treated. Treatment was well tolerated, with one patient removed from therapy because of toxicity and no deaths. Forty-six percent of patients had stable disease after therapy, with 4% achieving a partial response and no complete responses. Immune responses were generated in the majority of patients but did not correlate with clinical response. However, in one patient, who has survived >14 y since treatment in the trial, a robust immune response was demonstrated, with 25% of her T cells specific to one of the peptides in the vaccine at the peak of her response. These data suggest that autologous DC vaccination when given with anti-HER2-directed mAb therapy and vinorelbine is safe and can induce immune responses, including significant T cell clonal expansion, in a subset of patients.
|
12
|
Hauser P, Tan X, Chen F, Ibrahim JG. Bayesian generalized linear low rank regression models for the detection of vaccine-adverse event associations. Stat Med 2023; 42:2009-2026. [PMID: 36974659 DOI: 10.1002/sim.9711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Revised: 02/27/2023] [Accepted: 03/07/2023] [Indexed: 03/29/2023]
Abstract
We propose a generalized linear low-rank mixed model (GLLRM) for the analysis of both high-dimensional and sparse responses and covariates where the responses may be binary, counts, or continuous. This development is motivated by the problem of identifying vaccine-adverse event associations in post-market drug safety databases, where an adverse event is any untoward medical occurrence or health problem that occurs during or following vaccination. The GLLRM is a generalization of a generalized linear mixed model in that it integrates a factor analysis model to describe the dependence among responses and a low-rank matrix to approximate the high-dimensional regression coefficient matrix. A sampling procedure combining the Gibbs sampler and Metropolis and Gamerman algorithms is employed to obtain posterior estimates of the regression coefficients and other model parameters. Testing of response-covariate pair associations is based on the posterior distribution of the corresponding regression coefficients. Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures on binary and count outcomes. We further illustrate the GLLRM via a real data example based on the Vaccine Adverse Event Reporting System.
|
13
|
Alt EM, Psioda MA, Ibrahim JG. A Bayesian approach to study design and analysis with type I error rate control for response variables of mixed types. Stat Med 2023; 42:1722-1740. [PMID: 36929939 DOI: 10.1002/sim.9696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Revised: 11/29/2022] [Accepted: 02/20/2023] [Indexed: 03/18/2023]
Abstract
There has been increased interest in the design and analysis of studies consisting of multiple response variables of mixed types. For example, in clinical trials, it is desirable to establish efficacy for a treatment effect in primary and secondary outcomes. In this article, we develop Bayesian approaches for hypothesis testing and study planning for data consisting of multiple response variables of mixed types with covariates. We assume that the responses are correlated via a Gaussian copula, and that the model for each response is, marginally, a generalized linear model (GLM). Taking a fully Bayesian approach, the proposed method enables inference based on the joint posterior distribution of the parameters. Under some mild conditions, we show that the joint distribution of the posterior probabilities under any Bayesian analysis converges to a Gaussian copula distribution as the sample size tends to infinity. Using this result, we develop an approach to control the type I error rate under multiple testing. Simulation results indicate that the method is more powerful than conducting marginal regression models and correcting for multiplicity using the Bonferroni-Holm Method. We also develop a Bayesian approach to sample size determination in the presence of response variables of mixed types, extending the concept of probability of success (POS) to multiple response variables of mixed types.
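For readers unfamiliar with the comparator named in this abstract, the following is a minimal sketch of the standard Holm step-down (Bonferroni-Holm) procedure applied to a set of marginal p-values. It is the baseline correction only, not the authors' copula-based method; the function name `holm_reject` and the example p-values are made up for illustration.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: returns one reject/keep flag per hypothesis."""
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared to alpha / (m - k + 1).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


# Hypothetical p-values for four endpoints of mixed types.
decisions = holm_reject([0.01, 0.04, 0.03, 0.005], alpha=0.05)
```

Holm controls the familywise type I error rate without using the correlation among endpoints, which is the source of the power gap the abstract reports.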
|
14
|
Shen Y, Psioda MA, Ibrahim JG. BayesPPD: An R Package for Bayesian Sample Size Determination Using the Power and Normalized Power Prior for Generalized Linear Models. The R Journal 2023. [DOI: 10.32614/rj-2023-016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/04/2023]
|
15
|
Sheikh MT, Chen MH, Gelfond JA, Sun W, Ibrahim JG. New C-indices for assessing importance of longitudinal biomarkers in fitting competing risks survival data in the presence of partially masked causes. Stat Med 2023; 42:1308-1322. [PMID: 36696954 DOI: 10.1002/sim.9671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2021] [Revised: 12/20/2022] [Accepted: 01/13/2023] [Indexed: 01/27/2023]
Abstract
Competing risks survival data in the presence of partially masked causes are frequently encountered in medical research or clinical trials. When longitudinal biomarkers are also available, it is of great clinical importance to examine associations between the longitudinal biomarkers and the cause-specific survival outcomes. In this article, we propose a cause-specific C-index for joint models of longitudinal and competing risks survival data accounting for masked causes. We also develop a posterior predictive algorithm for computing the out-of-sample cause-specific C-index using Markov chain Monte Carlo samples from the joint posterior of the in-sample longitudinal and competing risks survival data. We further construct the ΔC-index to quantify the strength of association between the longitudinal and cause-specific survival data, or between the out-of-sample longitudinal and survival data. Empirical performance of the proposed assessment criteria is examined through an extensive simulation study. An in-depth analysis of the real data from large cancer prevention trials is carried out to demonstrate the usefulness of the proposed methodology.
|
16
|
Alt EM, Nifong B, Chen X, Psioda MA, Ibrahim JG. The scale transformed power prior for use with historical data from a different outcome model. Stat Med 2023; 42:1-14. [PMID: 36318875 PMCID: PMC9789178 DOI: 10.1002/sim.9598] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/09/2021] [Revised: 08/26/2022] [Accepted: 10/06/2022] [Indexed: 11/05/2022]
Abstract
We develop the scale transformed power prior for settings where historical and current data involve different data types, such as binary and continuous data. This situation arises often in clinical trials, for example, when historical data involve binary responses and the current data involve some other type of continuous or discrete outcome. The power prior, proposed by Ibrahim and Chen, does not address the issue of different data types. Herein, we develop a new type of power prior, which we call the scale transformed power prior (straPP). The straPP is constructed by transforming the power prior for the historical data by rescaling the parameter using a function of the Fisher information matrices for the historical and current data models, thereby shifting the scale of the parameter vector from that of the historical to that of the current data. Examples are presented to motivate the need for such a transformation, and simulation studies are presented to illustrate the performance advantages of the straPP over the power prior and other informative and noninformative priors. A real dataset from a clinical trial undertaken to study a novel transitional care model for stroke survivors is used to illustrate the methodology.
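To make the starting point of this abstract concrete, here is a small sketch of the ordinary power prior of Ibrahim and Chen in the simplest conjugate case, where historical and current data are both binomial. The historical likelihood is raised to a discounting power `a0` between 0 and 1 before being combined with the current data. This does not implement the straPP rescaling; the function name, the Beta(1, 1) initial prior, and the counts are assumptions chosen for illustration.

```python
def power_prior_posterior(y0, n0, a0, y, n, alpha0=1.0, beta0=1.0):
    """Beta posterior parameters for a binomial rate under a power prior.

    Historical data (y0 successes in n0 trials) are discounted by a0,
    then updated with the current data (y successes in n trials).
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    alpha = alpha0 + a0 * y0 + y
    beta = beta0 + a0 * (n0 - y0) + (n - y)
    return alpha, beta


# Hypothetical trial: 30/100 historical responders, 12/40 current responders.
half_borrow = power_prior_posterior(30, 100, 0.5, 12, 40)
no_borrow = power_prior_posterior(30, 100, 0.0, 12, 40)
full_pool = power_prior_posterior(30, 100, 1.0, 12, 40)
posterior_mean = half_borrow[0] / (half_borrow[0] + half_borrow[1])
```

Setting `a0 = 0` ignores the historical trial and `a0 = 1` pools it fully. This construction assumes both datasets share the same parameter on the same scale, which is the limitation the abstract sets out to address.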
|
17
|
Zhang Z, Wu Y, Xiong D, Ibrahim JG, Srivastava A, Zhu H. Rejoinder: LESA: Longitudinal Elastic Shape Analysis of Brain Subcortical Structures. J Am Stat Assoc 2023. [DOI: 10.1080/01621459.2022.2139264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
|
18
|
Xu J, Psioda MA, Ibrahim JG. Bayesian Design of Clinical Trials Using Joint Cure Rate Models for Longitudinal and Time-to-Event Data. Lifetime Data Anal 2023; 29:213-233. [PMID: 36357647 DOI: 10.1007/s10985-022-09581-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Accepted: 10/24/2022] [Indexed: 06/16/2023]
Abstract
For clinical trial design and analysis, there has been extensive work related to using joint models for longitudinal and time-to-event data without a cure fraction (i.e., when all patients are at risk for the event of interest), but comparatively little treatment has been given to design and analysis of clinical trials using joint models that incorporate a cure fraction. In this paper, we develop a Bayesian clinical trial design methodology focused on evaluating the treatment's effect on a time-to-event endpoint using a promotion time cure rate model, where the longitudinal process is incorporated into the hazard model for the promotion times. A piecewise linear hazard model for the period after assessment of the longitudinal measure ends is proposed as an alternative to extrapolating the longitudinal trajectory. This may be advantageous in scenarios where the period of time from the end of longitudinal measurements until the end of observation is substantial. Inference for the time-to-event endpoint is based on a novel estimand which combines the treatment's effect on the probability of cure and its effect on the promotion time distribution, mediated by the longitudinal outcome. We propose an approach for sample size determination such that the design has a high power and a well-controlled type I error rate with both operating characteristics defined from a Bayesian perspective. We demonstrate the methodology by designing a breast cancer clinical trial with a primary time-to-event endpoint where longitudinal outcomes are measured periodically during follow up.
|
19
|
Lim D, Chen MH, Ibrahim JG, Kim S, Shah AK, Lin J. metapack: An R Package for Bayesian Meta-Analysis and Network Meta-Analysis with a Unified Formula Interface. The R Journal 2022; 14:142-161. [PMID: 37168034 PMCID: PMC10168678 DOI: 10.32614/rj-2022-047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Meta-analysis, a statistical procedure that compares, combines, and synthesizes research findings from multiple studies in a principled manner, has become popular in a variety of fields. Meta-analyses using study-level (or equivalently aggregate) data are of particular interest due to data availability and modeling flexibility. In this paper, we describe an R package metapack that introduces a unified formula interface for both meta-analysis and network meta-analysis. The user interface, and therefore the package, allows flexible variance-covariance modeling for multivariate meta-analysis models and univariate network meta-analysis models. Complicated computing for these models has prevented their widespread adoption. The package also provides functions to generate relevant plots and perform statistical inferences like model assessments. Use cases are demonstrated using two real data sets contained in metapack.
|
20
|
Zhang Z, Wu Y, Xiong D, Ibrahim JG, Srivastava A, Zhu H. LESA: Longitudinal Elastic Shape Analysis of Brain Subcortical Structures. J Am Stat Assoc 2022; 118:3-17. [PMID: 37153845 PMCID: PMC10162479 DOI: 10.1080/01621459.2022.2102984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2021] [Revised: 07/01/2022] [Accepted: 07/09/2022] [Indexed: 10/17/2022]
Abstract
Over the past 30 years, magnetic resonance imaging has become a ubiquitous tool for accurately visualizing the change and development of the brain's subcortical structures (e.g., hippocampus). Although subcortical structures act as information hubs of the nervous system, their quantification is still in its infancy due to many challenges in shape extraction, representation, and modeling. Here, we develop a simple and efficient framework of longitudinal elastic shape analysis (LESA) for subcortical structures. Integrating ideas from elastic shape analysis of static surfaces and statistical modeling of sparse longitudinal data, LESA provides a set of tools for systematically quantifying changes of longitudinal subcortical surface shapes from raw structure MRI data. The key novelties of LESA include: (i) it can efficiently represent complex subcortical structures using a small number of basis functions and (ii) it can accurately delineate the spatiotemporal shape changes of the human subcortical structures. We applied LESA to analyze three longitudinal neuroimaging data sets and showcase its wide applications in estimating continuous shape trajectories, building life-span growth patterns, and comparing shape differences among different groups. In particular, with the Alzheimer's Disease Neuroimaging Initiative (ADNI) data, we found that the Alzheimer's Disease (AD) can significantly speed the shape change of ventricle and hippocampus from 60 to 75 years old compared with normal aging.
|
21
|
Diao G, Ma H, Zeng D, Ke C, Ibrahim JG. Synthesizing studies for comparing different treatment sequences in clinical trials. Stat Med 2022; 41:5134-5149. [PMID: 36005293 DOI: 10.1002/sim.9559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2021] [Revised: 07/29/2022] [Accepted: 08/02/2022] [Indexed: 11/09/2022]
Abstract
With advances in cancer treatments and improved patient survival, more patients may go through multiple lines of treatment. It is of clinical importance to choose a sequence of effective treatments (eg, lines of treatment) for individual patients with the goal of optimizing their long-term clinical outcome (eg, survival). Several important issues arise in cancer studies. First, cancer clinical trials are usually conducted by each line of treatment. For a treatment sequence, we may have first line and second line treatment data from two different studies. Second, there is typically a treatment initiation period varying from patient to patient between progression of disease and the start of the second line treatment due to administrative reasons. Additionally, the choice of the second line treatment for patients with progression of disease may depend on their characteristics. We address all these issues and develop semiparametric methods under the potential outcome framework for the estimation of the overall survival probability for a treatment sequence and for comparing different treatment sequences. We establish the large sample properties of the proposed inferential procedures. Simulation studies and an application to a colorectal clinical trial are provided.
|
22
|
Jia B, Zeng D, Liao JJZ, Liu GF, Tan X, Diao G, Ibrahim JG. Mixture survival trees for cancer risk classification. Lifetime Data Anal 2022; 28:356-379. [PMID: 35486260 PMCID: PMC10402927 DOI: 10.1007/s10985-022-09552-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2021] [Accepted: 03/04/2022] [Indexed: 06/14/2023]
Abstract
In oncology studies, it is important to understand and characterize disease heterogeneity among patients so that patients can be classified into different risk groups and one can identify high-risk patients at the right time. This information can then be used to identify a more homogeneous patient population for developing precision medicine. In this paper, we propose a mixture survival tree approach for direct risk classification. We assume that the patients can be classified into a pre-specified number of risk groups, where each group has a distinct survival profile. Our proposed tree-based methods are devised to estimate latent group membership using an EM algorithm. The observed data log-likelihood function is used as the splitting criterion in recursive partitioning. The finite sample performance is evaluated by extensive simulation studies and the proposed method is illustrated by a case study in breast cancer.
|
23
|
Alt EM, Psioda MA, Ibrahim JG. A hierarchical prior for generalized linear models based on predictions for the mean response. Biostatistics 2022; 23:1165-1181. [PMID: 35770800 DOI: 10.1093/biostatistics/kxac022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Revised: 05/03/2022] [Accepted: 06/09/2022] [Indexed: 11/14/2022] Open
Abstract
There has been increased interest in using prior information in statistical analyses. For example, in rare diseases, it can be difficult to establish treatment efficacy based solely on data from a prospective study due to low sample sizes. To overcome this issue, an informative prior for the treatment effect may be elicited. We develop a novel extension of the conjugate prior of Chen and Ibrahim (2003) that enables practitioners to elicit a prior prediction for the mean response for generalized linear models, treating the prediction as random. We refer to the hierarchical prior as the hierarchical prediction prior (HPP). For independent and identically distributed settings and the normal linear model, we derive cases for which the hyperprior is a conjugate prior. We also develop an extension of the HPP in situations where summary statistics from a previous study are available. The HPP allows for discounting based on the quality of individual level predictions, and simulation results suggest that, compared to the conjugate prior and the power prior, the HPP yields efficiency gains (e.g., lower mean squared error) where predictions are incompatible with the data. An efficient Markov chain Monte Carlo algorithm is developed. Applications illustrate that inferences under the HPP are more robust to prior-data conflict compared to selected nonhierarchical priors.
|
24
|
Gelfond JA, Hernandez B, Goros M, Ibrahim JG, Chen MH, Sun W, Leach RJ, Kattan MW, Thompson IM, Ankerst DP, Liss M. Prediction of future risk of any and higher-grade prostate cancer based on the PLCO and SELECT trials. BMC Urol 2022; 22:45. [PMID: 35351104 PMCID: PMC8966358 DOI: 10.1186/s12894-022-00986-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/07/2021] [Accepted: 03/01/2022] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND: A model was built that characterized effects of individual factors on five-year prostate cancer (PCa) risk in the Prostate, Lung, Colon, and Ovarian Cancer Screening Trial (PLCO) and the Selenium and Vitamin E Cancer Prevention Trial (SELECT). This model was validated in a third cohort, the San Antonio Biomarkers of Risk (SABOR) screening cohort. METHODS: A prediction model for 1- to 5-year risk of developing PCa and Gleason > 7 PCa (HGPCa) was built on PLCO and SELECT using the Cox proportional hazards model adjusting for patient baseline characteristics. Random forests and neural networks were compared to Cox proportional hazards survival models, using the trial datasets for model building and the SABOR cohort for model evaluation. The most accurate prediction model is included in an online calculator. RESULTS: The respective rates of PCa were 8.9%, 7.2%, and 11.1% in PLCO (n = 31,495), SELECT (n = 35,507), and SABOR (n = 1790) over median follow-up of 11.7, 8.1, and 9.0 years. The Cox model showed higher prostate-specific antigen (PSA), BMI, and age, and African American race to be associated with PCa and HGPCa. Five-year risk predictions from the combined SELECT and PLCO model effectively discriminated risk in the SABOR cohort with C-index 0.76 (95% CI [0.72, 0.79]) for PCa, and 0.74 (95% CI [0.65, 0.83]) for HGPCa. CONCLUSIONS: A 1- to 5-year PCa risk prediction model developed from PLCO and SELECT was validated with SABOR and implemented online. This model can individualize and inform shared screening decisions.
|
25
|
Alt EM, Psioda MA, Ibrahim JG. Bayesian multivariate probability of success using historical data with type I error rate control. Biostatistics 2022; 24:17-31. [PMID: 34981114 PMCID: PMC9748585 DOI: 10.1093/biostatistics/kxab050] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/18/2020] [Revised: 12/09/2021] [Accepted: 12/14/2021] [Indexed: 01/05/2023] Open
Abstract
In clinical trials, it is common to have multiple clinical outcomes (e.g., coprimary endpoints or a primary and multiple secondary endpoints). It is often desirable to establish efficacy in at least one of multiple clinical outcomes, which leads to a multiplicity problem. In the frequentist paradigm, the most popular methods to correct for multiplicity are typically conservative. Moreover, despite guidance from regulators, it is difficult to determine the sample size of a future study with multiple clinical outcomes. In this article, we introduce a Bayesian methodology for multiple testing that asymptotically guarantees type I error control. Using a seemingly unrelated regression model, correlations between outcomes are specifically modeled, which enables inference on the joint posterior distribution of the treatment effects. Simulation results suggest that the proposed Bayesian approach is more powerful than the method of Holm (1979), which is commonly utilized in practice as a more powerful alternative to the ubiquitous Bonferroni correction. We further develop multivariate probability of success, a Bayesian method to robustly determine sample size in the presence of multiple outcomes.
|