301
|
Ibrahim JG, Chen MH, MacEachern SN. Bayesian variable selection for proportional hazards models. Can J Stat 1999. [DOI: 10.2307/3316126] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.7] [Indexed: 11/10/2022]
|
302
|
Ibrahim JG. The Practice of Bayesian Analysis. S. French and J. Q. Smith (eds), Wiley, New York, 1997. No. of pages: ix+284. Price: £29.99. ISBN 0-340-66240-9. Stat Med 1999. [DOI: 10.1002/(sici)1097-0258(19991030)18:20<2813::aid-sim289>3.0.co;2-g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
303
|
Abstract
We propose a likelihood method for estimating parameters in generalized linear models with missing covariates and a non-ignorable missing data mechanism. In this paper, we focus on one missing covariate. We use a logistic model for the probability that the covariate is missing, and allow this probability to depend on the incomplete covariate. We allow the covariates, including the incomplete covariate, to be either categorical or continuous. We propose an EM algorithm in this case. For a missing categorical covariate, we derive a closed form expression for the E- and M-steps of the EM algorithm for obtaining the maximum likelihood estimates (MLEs). For a missing continuous covariate, we use a Monte Carlo version of the EM algorithm to obtain the MLEs via the Gibbs sampler. The methodology is illustrated using an example from a breast cancer clinical trial in which time to disease progression is the outcome, and the incomplete covariate is a quality of life physical well-being score taken after the start of therapy. This score may be missing because the patients are sicker, so this covariate could be non-ignorably missing.
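As a rough illustration of the missingness model described in this abstract, the sketch below writes a logistic probability of missingness that depends on the incomplete covariate, and the observed-data likelihood contribution of a subject whose covariate is missing. It is a simplification and not the authors' implementation: the covariate and response are taken to be binary, and the names `phi0`, `phi1`, `p_y_given_x`, and `p_x` are hypothetical.

```python
import math

def missing_prob(x, phi0, phi1):
    """P(covariate is missing | x) under a logistic model.

    phi1 != 0 lets missingness depend on the possibly unobserved x,
    which is what makes the mechanism non-ignorable.
    """
    return 1.0 / (1.0 + math.exp(-(phi0 + phi1 * x)))

def obs_lik_missing(y, p_y_given_x, p_x, phi0, phi1):
    """Likelihood contribution of a subject with binary response y and missing x.

    Sums p(y | x) * p(x) * P(missing | x) over the two possible values of x.
    """
    total = 0.0
    for x in (0, 1):
        py = p_y_given_x[x] if y == 1 else 1.0 - p_y_given_x[x]
        total += py * p_x[x] * missing_prob(x, phi0, phi1)
    return total

# Hypothetical parameter values, chosen only for illustration.
p_y_given_x = {0: 0.2, 1: 0.7}   # P(y = 1 | x)
p_x = {0: 0.6, 1: 0.4}           # marginal distribution of x
contribution = obs_lik_missing(1, p_y_given_x, p_x, phi0=-1.0, phi1=1.5)
```

When `phi1` is zero the missingness term factors out of the sum, so it no longer carries information about the regression parameters; that is the ignorable case.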
|
304
|
Chen MH, Ibrahim JG, Sinha D. A New Bayesian Model for Survival Data with a Surviving Fraction. J Am Stat Assoc 1999. [DOI: 10.1080/01621459.1999.10474196] [Citation(s) in RCA: 170] [Impact Index Per Article: 6.8] [Indexed: 10/28/2022]
|
305
|
Abstract
We propose a method for estimating parameters for general parametric regression models with an arbitrary number of missing covariates. We allow any pattern of missing data and assume that the missing data mechanism is ignorable throughout. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). We extend this method to continuous or mixed categorical and continuous covariates, and for arbitrary parametric regression models, by adapting a Monte Carlo version of the EM algorithm as discussed by Wei and Tanner (1990, Journal of the American Statistical Association 85, 699-704). In addition, we discuss the Gibbs sampler for sampling from the conditional distribution of the missing covariates given the observed data and show that the appropriate complete conditionals are log-concave. The log-concavity property of the conditional distributions will facilitate a straightforward implementation of the Gibbs sampler via the adaptive rejection algorithm of Gilks and Wild (1992, Applied Statistics 41, 337-348). We assume the model for the response given the covariates is an arbitrary parametric regression model, such as a generalized linear model, a parametric survival model, or a nonlinear model. We model the marginal distribution of the covariates as a product of one-dimensional conditional distributions. This allows us a great deal of flexibility in modeling the distribution of the covariates and reduces the number of nuisance parameters that are introduced in the E-step. We present examples involving both simulated and real data.
|
306
|
Lipsitz SR, Ibrahim JG, Fitzmaurice GM. Likelihood methods for incomplete longitudinal binary responses with incomplete categorical covariates. Biometrics 1999; 55:214-23. [PMID: 11318157 DOI: 10.1111/j.0006-341x.1999.00214.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]
Abstract
We consider longitudinal studies in which the outcome observed over time is binary and the covariates of interest are categorical. With no missing responses or covariates, one specifies a multinomial model for the responses given the covariates and uses maximum likelihood to estimate the parameters. Unfortunately, incomplete data in the responses and covariates are a common occurrence in longitudinal studies. Here we assume the missing data are missing at random (Rubin, 1976, Biometrika 63, 581-592). Since all of the missing data (responses and covariates) are categorical, a useful technique for obtaining maximum likelihood parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). In using the EM algorithm with missing responses and covariates, one specifies the joint distribution of the responses and covariates. Here we consider the parameters of the covariate distribution as a nuisance. In data sets where the percentage of missing data is high, the estimates of the nuisance parameters can lead to highly unstable estimates of the parameters of interest. We propose a conditional model for the covariate distribution that has several modeling advantages for the EM algorithm and provides a reduction in the number of nuisance parameters, thus providing more stable estimates in finite samples.
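The idea behind the EM algorithm by the method of weights can be sketched on a toy problem. The example below is an assumption-laden simplification, not the model of this paper: one binary response `y`, one binary covariate `x` that is sometimes missing (coded `None`), a saturated model for `P(y = 1 | x)`, and a Bernoulli model for `x`. Each subject with missing `x` is expanded into one row per possible value of `x`, weighted by the current conditional probability of that value.

```python
def em_method_of_weights(data, n_iter=200):
    """Toy EM by weights for binary y and a partly missing binary x.

    data: list of (y, x) pairs with x in {0, 1} or None when missing.
    Returns (pi, theta) with pi = P(x = 1) and theta[x] = P(y = 1 | x).
    """
    pi = 0.5
    theta = {0: 0.5, 1: 0.5}
    for _ in range(n_iter):
        # E-step: build weighted, augmented rows (y, x, weight).
        rows = []
        for y, x in data:
            if x is not None:
                rows.append((y, x, 1.0))
                continue
            joint = []
            for xv in (0, 1):
                px = pi if xv == 1 else 1.0 - pi
                py = theta[xv] if y == 1 else 1.0 - theta[xv]
                joint.append(px * py)
            total = sum(joint)
            for xv in (0, 1):
                rows.append((y, xv, joint[xv] / total))
        # M-step: weighted maximum likelihood, closed form in this toy model.
        n = sum(w for _, _, w in rows)
        pi = sum(w for _, x, w in rows if x == 1) / n
        for xv in (0, 1):
            denom = sum(w for _, x, w in rows if x == xv)
            theta[xv] = sum(w for y, x, w in rows if x == xv and y == 1) / denom
    return pi, theta

complete = [(1, 1), (1, 1), (0, 1), (1, 0), (0, 0), (0, 0), (0, 0)]
pi_hat, theta_hat = em_method_of_weights(complete + [(1, None), (0, None)])
```

With fully observed data the weights are all 1 and the algorithm returns the usual sample proportions after a single iteration.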
|
307
|
Abstract
Because of current techniques of determining gene mutation, investigators are now interested in estimating the odds ratio between genetic status (mutation, no mutation) and an outcome variable such as disease cell type (A, B). In this paper we consider the mutation of the RAS genetic family. To determine if the genes have mutated, investigators look at five specific locations on the RAS gene. RAS mutated is a mutation in at least one of the five gene locations and RAS non-mutated is no mutation in any of the five locations. Owing to limited time and financial resources, one cannot obtain a complete genetic evaluation of all five locations on the gene for all patients. We propose the use of maximum likelihood (ML) with a 2^6 multinomial distribution formed by cross-classifying the binary mutation status at five locations by binary disease cell type. This ML method includes all patients regardless of completeness of data, treats the locations not evaluated as missing data, and uses the EM algorithm to estimate the odds ratio between genetic mutation status and the disease type. We compare the ML method to complete case estimates, and a method used by clinical investigators, which excludes patients with data on fewer than five locations who have no mutations on these sites.
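To make the cell structure concrete, the sketch below enumerates the 64 cells obtained by crossing five binary mutation sites with a binary disease type, and collapses a set of cell probabilities into the 2 x 2 table of RAS status by disease type to obtain the odds ratio. The function name and the uniform probabilities are hypothetical; the EM fit that would produce the cell probabilities from incomplete data is not shown.

```python
from itertools import product

def collapsed_odds_ratio(cell_probs):
    """Odds ratio between RAS status (any site mutated) and disease type.

    cell_probs maps a 6-tuple (site1, ..., site5, disease) of 0/1 values
    to its multinomial cell probability.
    """
    table = {(m, d): 0.0 for m in (0, 1) for d in (0, 1)}
    for cell, p in cell_probs.items():
        *sites, disease = cell
        mutated = int(any(sites))  # mutated if at least one site is mutated
        table[(mutated, disease)] += p
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])

# All 2^6 = 64 cells of the cross-classification.
cells = list(product((0, 1), repeat=6))
uniform = {cell: 1.0 / len(cells) for cell in cells}
odds_ratio = collapsed_odds_ratio(uniform)
```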
|
308
|
Ibrahim JG, Ryan LM, Chen MH. Using Historical Controls to Adjust for Covariates in Trend Tests for Binary Data. J Am Stat Assoc 1998. [DOI: 10.1080/01621459.1998.10473789] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.7] [Indexed: 10/28/2022]
|
309
|
Abstract
The linear mixed effects model with normal errors is a popular model for the analysis of repeated measures and longitudinal data. The generalized linear model is useful for data that have non-normal errors but where the errors are uncorrelated. A descendant of these two models generates a model for correlated data with non-normal errors, called the generalized linear mixed model (GLMM). Frequentist attempts to fit these models generally rely on approximate results and inference relies on asymptotic assumptions. Recent advances in computing technology have made Bayesian approaches to this class of models computationally feasible. Markov chain Monte Carlo methods can be used to obtain 'exact' inference for these models, as demonstrated by Zeger and Karim. In the linear or generalized linear mixed model, the random effects are typically taken to have a fully parametric distribution, such as the normal distribution. In this paper, we extend the GLMM by allowing the random effects to have a non-parametric prior distribution. We do this using a Dirichlet process prior for the general distribution of the random effects. The approach easily extends to more general population models. We perform computations for the models using the Gibbs sampler.
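For readers unfamiliar with Dirichlet process priors, the sketch below draws one truncated realization of a random distribution using the stick-breaking representation. This is a different device from the Gibbs sampler used in the paper and is shown only to convey what a draw from the prior looks like; the concentration parameter `alpha`, the normal base measure, and the truncation level `n_atoms` are assumptions.

```python
import numpy as np

def stick_breaking_draw(alpha, base_sampler, n_atoms, rng):
    """Truncated stick-breaking draw from a Dirichlet process.

    Returns atom locations drawn from the base measure and their weights.
    The weights sum to slightly less than 1 because of truncation.
    """
    betas = rng.beta(1.0, alpha, size=n_atoms)
    # Length of stick remaining before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    weights = betas * remaining
    atoms = base_sampler(rng, n_atoms)
    return atoms, weights

rng = np.random.default_rng(0)
atoms, weights = stick_breaking_draw(
    alpha=2.0,
    base_sampler=lambda r, n: r.normal(0.0, 1.0, size=n),  # assumed base measure
    n_atoms=50,
    rng=rng,
)
```

Smaller values of `alpha` concentrate the weight on a few atoms, which corresponds to random effects that cluster on a small number of distinct values.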
|
310
|
Kleinman KP, Ibrahim JG. A Semiparametric Bayesian Approach to the Random Effects Model. Biometrics 1998. [DOI: 10.2307/2533846] [Citation(s) in RCA: 118] [Impact Index Per Article: 4.5] [Indexed: 11/10/2022]
|
311
|
Kleinman KP, Ibrahim JG. A semiparametric Bayesian approach to the random effects model. Biometrics 1998; 54:921-38. [PMID: 9750242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
Abstract
In longitudinal random effects models, the random effects are typically assumed to have a normal distribution in both Bayesian and classical models. We provide a Bayesian model that allows the random effects to have a nonparametric prior distribution. We propose a Dirichlet process prior for the distribution of the random effects; computation is made possible by the Gibbs sampler. An example using marker data from an AIDS study is given to illustrate the methodology.
|
312
|
Lipsitz SR, Ibrahim JG. Estimating equations with incomplete categorical covariates in the Cox model. Biometrics 1998; 54:1002-13. [PMID: 9750248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
Abstract
Incomplete covariate data is a common occurrence in many studies in which the outcome is survival time. When a full likelihood is specified, a useful technique for obtaining parameter estimates is the EM algorithm. We propose a set of estimating equations to estimate the parameters of Cox's proportional hazards model when some covariate values are missing. These estimating equations can be solved by an algorithm similar to the EM algorithm. Because of the computational burden of finding a solution to these estimating equations, we propose obtaining parameter estimates via Monte Carlo methods. Asymptotic variances of the parameter estimates are also derived. We present a clinical trials example with three covariates, two of which have some missing values.
|
313
|
Lipsitz SR, Ibrahim JG. Estimating Equations with Incomplete Categorical Covariates in the Cox Model. Biometrics 1998. [DOI: 10.2307/2533852] [Citation(s) in RCA: 46] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022]
|
314
|
Hoeting JA, Ibrahim JG. Bayesian predictive simultaneous variable and transformation selection in the linear model. Comput Stat Data Anal 1998. [DOI: 10.1016/s0167-9473(98)00028-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]
|
315
|
Kleinman KP, Ibrahim JG, Laird NM. A Bayesian Framework for Intent-to-Treat Analysis with Missing Data. Biometrics 1998. [DOI: 10.2307/2534013] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
|
316
|
Kleinman KP, Ibrahim JG, Laird NM. A Bayesian framework for intent-to-treat analysis with missing data. Biometrics 1998; 54:265-78. [PMID: 9544521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
In longitudinal clinical trials, one analysis of interest is an intention-to-treat analysis, which groups subjects according to the randomized treatment regardless of whether they stayed on that treatment or not. When in addition to going off the randomized treatment subjects may also drop out of the study and be lost to follow-up, it is unclear what an intention-to-treat analysis should be. If measurements are made after treatment drop-out on a random sample of subjects who drop the treatment, then Hogan and Laird (1996, Biometrics 52, 1002-1017) present a random effects model, well suited to this type of analysis, which fits a two-piece linear spline to the data with the knot at the time the assigned treatment is dropped. This article presents a Bayesian approach to fitting a similar two-piece linear spline model and shows how the model can be applied to data that have no off-treatment observations.
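The mean structure of a two-piece linear spline with a knot at the time treatment is dropped can be sketched as follows. This is only the fixed-effects design fitted by ordinary least squares on noiseless, made-up data; the random effects and the Bayesian fitting described in the abstract are not reproduced, and the names `two_piece_design`, `knot`, and `true_coef` are hypothetical.

```python
import numpy as np

def two_piece_design(t, knot):
    """Design matrix for a two-piece linear spline in time.

    Columns: intercept, slope before the knot, change in slope after the knot.
    """
    t = np.asarray(t, dtype=float)
    return np.column_stack([np.ones_like(t), t, np.maximum(t - knot, 0.0)])

# Made-up visit times and coefficients, for illustration only.
t = np.arange(0.0, 10.0)
knot = 4.0                      # time at which the assigned treatment is dropped
true_coef = np.array([1.0, 0.5, -0.8])
y = two_piece_design(t, knot) @ true_coef

# Least-squares fit recovers the coefficients because y has no noise.
coef, *_ = np.linalg.lstsq(two_piece_design(t, knot), y, rcond=None)
```

The on-treatment slope is `coef[1]` and the off-treatment slope is `coef[1] + coef[2]`.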
|
317
|
|
318
|
Abstract
One hundred and twenty-two sheet metal workers in New England were examined over a 10-year interval for loss of pulmonary function and the development of asbestosis or asbestos-related pleural fibrosis. Regression models using the generalized estimating equation (GEE) approach were created to investigate the relationship between exposure and pulmonary function after adjusting for smoking status, age, height, and asbestos-related x-ray changes. A history of shipyard work was a significant contributor to the loss of forced vital capacity (FVC). Among smokers, loss in forced expiratory volume at 1 sec (FEV1) also had a significant relationship to prior shipyard work. There was a borderline significant relationship between percentage predicted FEV1 and cumulative years of asbestos exposure in smokers, as well as years-since-initial-exposure in never-smokers. This study supports previous findings of obstructive airway changes in asbestos-exposed workers and identifies shipboard work as an important predictor of loss in pulmonary function even years after shipyard exposure to asbestos has ceased.
|
319
|
Ibrahim JG. On Properties of Predictive Priors in Linear Models. Am Stat 1997. [DOI: 10.1080/00031305.1997.10474408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
|
320
|
Ibrahim JG, Chen MH. Predictive Variable Selection for the Multivariate Linear Model. Biometrics 1997. [DOI: 10.2307/2533950] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
|
321
|
Ibrahim JG, Chen MH. Predictive variable selection for the multivariate linear model. Biometrics 1997; 53:465-78. [PMID: 9192446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
We develop a predictive Bayesian approach to variable selection in the multivariate linear model. A criterion derived from the Bayesian predictive density is proposed and a calibration is provided for it. Reference and informative priors are discussed, and an automated method that focuses on the response variable is proposed for specifying informative priors for the regression parameters. Relationships between the proposed criterion and several other well-known criteria are examined. Illustrative examples involving real data are given to demonstrate the methodology.
|
322
|
Weiss RE, Wang Y, Ibrahim JG. Predictive Model Selection for Repeated Measures Random Effects Models Using Bayes Factors. Biometrics 1997. [DOI: 10.2307/2533960] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
|
323
|
Weiss RE, Wang Y, Ibrahim JG. Predictive model selection for repeated measures random effects models using Bayes factors. Biometrics 1997; 53:592-602. [PMID: 9192454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
The random effects model fit to repeated measures data is an extremely common model and data structure in current biostatistical practice. Modern data analysis often involves the selection of models within broad classes of prespecified models, but for models beyond the generalized linear model, few model-selection tools have been actively studied. In a Bayesian analysis, Bayes factors are the natural tool to use to explore these classes of models. In this paper, we develop a predictive approach for specifying the priors of a repeated measures random effects model with emphasis on selecting the fixed effects. The advantage of the predictive approach is that a single predictive specification is used to specify priors for all models considered. The methodology is applied to a pediatric pain data analysis.
|
324
|
Ewell M, Ibrahim JG. The large sample distribution of the weighted log rank statistic under general local alternatives. Lifetime Data Anal 1997; 3:5-12. [PMID: 9384622 DOI: 10.1023/a:1009690200504] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.0] [Indexed: 05/22/2023]
Abstract
We derive the large sample distribution of the weighted log rank statistic under a general class of local alternatives in which both the cure rates and the conditional distribution of time to failure among those who fail are assumed to vary in the two treatment arms. The analytic result presented here is important to data analysts who are designing clinical trials for diseases such as non-Hodgkin's lymphoma, leukemia and melanoma, where a significant proportion of patients are cured. We present a numerical illustration comparing powers obtained from the analytic result to those obtained from simulations.
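For reference, a two-sample weighted log rank statistic can be computed as in the sketch below. It uses the standard observed-minus-expected form with a hypergeometric variance and a user-supplied weight that is a function of time only, which is an assumption made here for simplicity. It does not implement the local-alternative power calculations of the paper, and the toy data are invented.

```python
import math

def weighted_log_rank(times, events, groups, weight=lambda t: 1.0):
    """Two-sample weighted log rank statistic.

    times: follow-up times; events: 1 = failure, 0 = censored;
    groups: 0/1 arm labels. Returns (U, V, z) with z = U / sqrt(V).
    """
    U, V = 0.0, 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for tj in event_times:
        n = sum(1 for t in times if t >= tj)                      # at risk, both arms
        n1 = sum(1 for t, g in zip(times, groups) if t >= tj and g == 1)
        d = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        d1 = sum(1 for t, e, g in zip(times, events, groups)
                 if t == tj and e == 1 and g == 1)
        w = weight(tj)
        U += w * (d1 - d * n1 / n)                                # observed - expected
        if n > 1:
            V += w ** 2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    z = U / math.sqrt(V) if V > 0 else 0.0
    return U, V, z

# Invented data: both arms have identical failure times, so U should be 0.
times = [1, 2, 3, 1, 2, 3]
events = [1, 1, 1, 1, 1, 1]
groups = [0, 0, 0, 1, 1, 1]
U, V, z = weighted_log_rank(times, events, groups)
```

Passing `weight=lambda t: 1.0` gives the ordinary log rank statistic; other weight functions emphasize early or late differences between arms.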
|
325
|
Ibrahim JG, Ryan LM. Use of historical controls in time-adjusted trend tests for carcinogenicity. Biometrics 1996; 52:1478-85. [PMID: 8962464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023]
Abstract
We develop a method for incorporating historical control information into time-adjusted tests for dose effects in carcinogenicity studies. After discretizing the time scale, we use a multinomial distribution to model the number of animals dying with tumor in each interval. Data from past studies are used to estimate the parameters characterizing the prior. A score test derived from the resulting Dirichlet-multinomial generalizes the test of Tarone (1982, Biometrics 38, 215-220) and reduces, in the limit, to the log-rank test in the case of a diffuse prior. The methodology is illustrated with data from a study of the fire retardant 2,2-Bis(bromomethyl)-1,3-propanediol.
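The Dirichlet-multinomial distribution that arises when a Dirichlet prior is placed on multinomial interval probabilities has a closed-form probability mass function, sketched below. This shows only the marginal distribution of the counts; the score test and the estimation of prior parameters from historical studies are not reproduced, and the function name is hypothetical.

```python
from math import lgamma, exp

def dirichlet_multinomial_logpmf(counts, alpha):
    """Log probability of multinomial counts after integrating out a
    Dirichlet(alpha) prior on the cell probabilities."""
    n = sum(counts)
    a0 = sum(alpha)
    out = lgamma(n + 1) + lgamma(a0) - lgamma(n + a0)
    for x, a in zip(counts, alpha):
        out += lgamma(x + a) - lgamma(a) - lgamma(x + 1)
    return out

# With two cells and a uniform prior, every split of n = 2 is equally likely.
probs = [exp(dirichlet_multinomial_logpmf(c, (1.0, 1.0)))
         for c in [(2, 0), (1, 1), (0, 2)]]
```

Larger values of `alpha` correspond to a more informative prior, so the counts behave more like an ordinary multinomial.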
|