1
Chen Q, Zhang F, Chen MH, Cong XJ. Estimation of treatment effects and model diagnostics with two-way time-varying treatment switching: an application to a head and neck study. Lifetime Data Anal 2020; 26:685-707. [PMID: 32125594 PMCID: PMC7483904 DOI: 10.1007/s10985-020-09495-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/28/2018] [Accepted: 02/15/2020] [Indexed: 06/10/2023]
Abstract
Treatment switching frequently occurs in clinical trials due to ethical reasons. Intent-to-treat analysis without adjusting for switching yields biased and inefficient estimates of the treatment effects. In this paper, we propose a class of semiparametric semi-competing risks transition survival models to accommodate two-way time-varying switching. Theoretical properties of the proposed method are examined. An efficient expectation-maximization algorithm is derived to obtain maximum likelihood estimates and model diagnostic tools. Existing software is used to implement the algorithm. Simulation studies are conducted to demonstrate the validity of the model. The proposed method is further applied to data from a clinical trial with patients having recurrent or metastatic squamous-cell carcinoma of head and neck.
Affiliation(s)
- Qingxia Chen
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA.
- Ming-Hui Chen
- Department of Statistics, University of Connecticut, 215 Glenbrook Road, U-4120, Storrs, CT, 06269, USA
2
Cheng C, Wang R, Zhang H. Surrogate Residuals for Discrete Choice Models. J Comput Graph Stat 2020; 30:67-77. [PMID: 33814875 DOI: 10.1080/10618600.2020.1775618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Abstract
Discrete choice models (DCMs) are a class of models for response variables that take values from a set of alternatives. Examples include logistic regression, probit regression, and multinomial logistic regression. These models are also collectively referred to as generalized linear models. Although methods exist for assessing the goodness of fit of DCMs, defining intuitive residuals for such models has been difficult because the responses are categorical values rather than continuous numbers. In this article, we propose the surrogate residual for DCMs based on the surrogate approach (Liu and Zhang 2018), which deals with an ordinal response. We consider categorical responses that may or may not be ordered. We show that our residual can be used to diagnose misspecification of the mean structure, individual-specific coefficients, and interaction effects. Supplementary materials for this article are available online.
Affiliation(s)
- Chao Cheng
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT
- Rui Wang
- Department of Public Economics, School of Economics, Xiamen University, Xiamen, China
- Heping Zhang
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT
3
Feng C, Li L, Sadeghpour A. A comparison of residual diagnosis tools for diagnosing regression models for count data. BMC Med Res Methodol 2020; 20:175. [PMID: 32611379 PMCID: PMC7329451 DOI: 10.1186/s12874-020-01055-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/17/2020] [Accepted: 06/17/2020] [Indexed: 11/10/2022]
Abstract
Background: Examining residuals is a crucial step in statistical analysis to identify the discrepancies between models and data, and assess the overall model goodness-of-fit. In diagnosing normal linear regression models, both Pearson and deviance residuals are often used, which are equivalently and approximately standard normally distributed when the model fits the data adequately. However, when the response variable is discrete, these residuals are distributed far from normality and have nearly parallel curves according to the distinct discrete response values, imposing great challenges for visual inspection.
Methods: Randomized quantile residuals (RQRs) were proposed in the literature by Dunn and Smyth (1996) to circumvent the problems in traditional residuals. However, this approach has not gained popularity, partly due to the lack of investigation of its performance for count regression, including zero-inflated models, through simulation studies. Therefore, we assessed the normality of the RQRs and compared their performance with traditional residuals for diagnosing count regression models through a series of simulation studies. A real data analysis in a health care utilization study modeling the number of repeated emergency department visits was also presented.
Results: Our simulation studies demonstrated that RQRs have low type I error and great statistical power in comparison to other residuals for detecting many forms of model misspecification for count regression models (non-linearity in covariate effect, over-dispersion, and zero inflation). Our real data analysis also showed that RQRs are effective in detecting misspecified distributional assumptions for count regression models.
Conclusions: Our results for evaluating RQRs in comparison with traditional residuals provide further evidence of their advantages for diagnosing count regression models.
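As an illustration of the residual discussed in this abstract, the following is a minimal sketch of a randomized quantile residual for a single count observation. It assumes a plain Poisson model with an already fitted mean; the function names (`poisson_cdf`, `randomized_quantile_residual`) and the parameter `mu` are hypothetical and are not taken from the article. The idea is to draw a uniform value between F(y - 1) and F(y), where F is the fitted distribution function, and map it through the standard normal quantile function.

```python
import math
import random
from statistics import NormalDist


def poisson_cdf(k, mu):
    """Return P(Y <= k) for Y ~ Poisson(mu); 0.0 when k is negative."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(Y = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i    # P(Y = i) from P(Y = i - 1)
        total += term
    return min(total, 1.0)


def randomized_quantile_residual(y, mu, rng):
    """Randomized quantile residual for one count y under Poisson(mu).

    Assumption for this sketch: mu is the fitted mean for this observation.
    """
    lower = poisson_cdf(y - 1, mu)  # F(y - 1)
    upper = poisson_cdf(y, mu)      # F(y)
    u = rng.uniform(lower, upper)   # randomize within the probability jump
    eps = 1e-12                     # keep u strictly inside (0, 1)
    u = min(max(u, eps), 1.0 - eps)
    return NormalDist().inv_cdf(u)


if __name__ == "__main__":
    rng = random.Random(2020)
    # Hypothetical observed counts and fitted means, for illustration only.
    for y, mu in [(0, 2.0), (3, 2.0), (9, 2.0)]:
        print(y, mu, round(randomized_quantile_residual(y, mu, rng), 3))
```

If the fitted model is adequate, residuals computed this way should look approximately standard normal, so a normal Q-Q plot is a natural check; an observation far in the upper tail of its fitted distribution (such as 9 with mean 2 above) yields a large positive residual.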
Affiliation(s)
- Cindy Feng
- School of Epidemiology and Public Health, Faculty of Medicine, University of Ottawa, 600 Peter Morand, Ottawa, K1G5Z3, Canada; School of Public Health, University of Saskatchewan, 104 Clinic Place, Saskatoon, S7N2Z4, Canada.
- Longhai Li
- Department of Mathematics and Statistics, University of Saskatchewan, 106 Wiggins Road, Saskatoon, S7N5E6, Canada
- Alireza Sadeghpour
- Department of Mathematics and Statistics, University of Saskatchewan, 106 Wiggins Road, Saskatoon, S7N5E6, Canada
4
Lee CH, Ning J, Shen Y. Model diagnostics for the proportional hazards model with length-biased data. Lifetime Data Anal 2019; 25:79-96. [PMID: 29450809 PMCID: PMC6095831 DOI: 10.1007/s10985-018-9422-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/07/2017] [Accepted: 02/05/2018] [Indexed: 06/08/2023]
Abstract
Length-biased data are frequently encountered in prevalent cohort studies. Many statistical methods have been developed to estimate the covariate effects on the survival outcomes arising from such data while properly adjusting for length-biased sampling. Among them, regression methods based on the proportional hazards model have been widely adopted. However, little work has focused on checking the proportional hazards model assumptions with length-biased data, which is essential to ensure the validity of inference. In this article, we propose a statistical tool for testing the assumed functional form of covariates and the proportional hazards assumption graphically and analytically under the setting of length-biased sampling, through a general class of multiparameter stochastic processes. The finite sample performance is examined through simulation studies, and the proposed methods are illustrated with the data from a cohort study of dementia in Canada.
Affiliation(s)
- Chi Hyun Lee
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, 1400 Pressler Street Unit 1411, Houston, TX, 77030, USA.
- Jing Ning
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, 1400 Pressler Street Unit 1411, Houston, TX, 77030, USA
- Yu Shen
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, 1400 Pressler Street Unit 1411, Houston, TX, 77030, USA
5
Hertel J, Rotter M, Frenzel S, Zacharias HU, Krumsiek J, Rathkolb B, Hrabe de Angelis M, Rabstein S, Pallapies D, Brüning T, Grabe HJ, Wang-Sattler R. Dilution correction for dynamically influenced urinary analyte data. Anal Chim Acta 2018; 1032:18-31. [PMID: 30143216 DOI: 10.1016/j.aca.2018.07.068] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/28/2018] [Revised: 06/29/2018] [Accepted: 07/25/2018] [Indexed: 01/03/2023]
Abstract
Urinary analyte data has to be corrected for the sample-specific dilution because the dilution varies dramatically within and between individuals, leading to non-comparable concentration measures. Most methods of dilution correction in current use, such as probabilistic quotient normalization or total spectra normalization, result in a division of the raw data by a dilution correction factor. Here, however, we show that the implicit assumption behind the application of division, log-linearity between the urinary flow rate and the raw urinary concentration, does not hold for analytes which are not in steady state in blood. We explicate the physiological reason for this shortcoming in mathematical terms and demonstrate the empirical consequences via simulations and on multiple time-point metabolomic data, showing the insufficiency of division-based normalization procedures to account for the complex non-linear analyte-specific dependencies on the urinary flow rate. By reformulating normalization as a regression problem, we propose an analyte-specific way to remove the dilution variance via a flexible non-linear regression methodology, which was then shown to be more effective than division-based normalization procedures. In the process, we developed several easily applicable methods of normalization diagnostics to decide on the method of dilution correction in a given sample. Along the way, we also identified the time-span since last urination as an important variance factor in urinary metabolome data which has until now been completely neglected. In conclusion, we present strong theoretical and empirical evidence that normalization has to be analyte-specific in dynamically influenced data. Accordingly, we developed a normalization methodology for removing the dilution variance in urinary data respecting the single analyte kinetics.
Affiliation(s)
- Johannes Hertel
- Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Germany.
- Markus Rotter
- Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, Germany; Institute of Epidemiology, Helmholtz Zentrum München, Germany
- Stefan Frenzel
- Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Germany
- Jan Krumsiek
- Institute of Computational Biology, Helmholtz Zentrum München, Germany; Institute for Computational Biomedicine, Englander Institute for Precision Medicine, Department of Physiology and Biophysics, Weill Cornell Medicine, New York, USA
- Birgit Rathkolb
- German Center for Diabetes Research (DZD), München, Germany; Chair for Molecular Animal Breeding and Biotechnology, Gene Center and Department of Veterinary Sciences, And Center for Innovative Medical Models (CiMM), Ludwig Maximilian University of Munich, Germany; German Mouse Clinic (GMC), Institute of Experimental Genetics, Helmholtz Zentrum München, Germany
- Martin Hrabe de Angelis
- German Center for Diabetes Research (DZD), München, Germany; Institute of Experimental Genetics, Helmholtz Zentrum München, Germany; Chair of Experimental Genetics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, Germany
- Sylvia Rabstein
- Institute for Prevention and Occupational Medicine of the German Social Accident Insurance, Institute of the Ruhr-Universität Bochum (IPA), Germany
- Dirk Pallapies
- Institute for Prevention and Occupational Medicine of the German Social Accident Insurance, Institute of the Ruhr-Universität Bochum (IPA), Germany
- Thomas Brüning
- Institute for Prevention and Occupational Medicine of the German Social Accident Insurance, Institute of the Ruhr-Universität Bochum (IPA), Germany
- Hans J Grabe
- Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Germany; German Center for Neurodegenerative Diseases (DZNE), Site Rostock/ Greifswald, Germany
- Rui Wang-Sattler
- Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, Germany; Institute of Epidemiology, Helmholtz Zentrum München, Germany; German Center for Diabetes Research (DZD), München, Germany
6
Lim CN, Liang S, Feng K, Chittenden J, Henry A, Mouksassi S, Birnbaum AK. Phxnlme: An R package that facilitates pharmacometric workflow of Phoenix NLME analyses. Comput Methods Programs Biomed 2017; 140:121-129. [PMID: 28254068 DOI: 10.1016/j.cmpb.2016.12.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/03/2016] [Revised: 10/27/2016] [Accepted: 12/06/2016] [Indexed: 06/06/2023]
Abstract
BACKGROUND AND OBJECTIVE: Pharmacometric analyses are integral components of the drug development process, and Phoenix NLME is one of the popular software packages used to conduct such analyses. To address current limitations with model diagnostic graphics and efficiency of the workflow for this software, we developed an R package, Phxnlme, to facilitate its workflow and provide improved graphical diagnostics.
METHODS: Phxnlme was designed to provide functionality for the major tasks that are usually performed in pharmacometric analyses (i.e. nonlinear mixed effects modeling, basic model diagnostics, visual predictive checks and bootstrap). Various estimation methods for modeling using the R package are made available through the Phoenix NLME engine. The Phxnlme R package utilizes other packages such as ggplot2 and lattice to produce the graphical output, and various features were included to allow customizability of the output. Interactive features for some plots were also added using the manipulate R package.
RESULTS: Phxnlme provides enhanced capabilities for nonlinear mixed effects modeling that can be accessed using the phxnlme() command. Output from the model can be graphed to assess the adequacy of model fits and further explore relationships in the data using various functions included in this R package, such as phxplot() and phxvpc.plot(). Bootstraps, stratified by up to three variables, can also be performed to obtain confidence intervals around the model estimates. With the use of an R interface, different R projects can be created to allow multi-tasking, which addresses the current limitation of the Phoenix NLME desktop software. In addition, there is a wide selection of diagnostic and exploratory plots in the Phxnlme package, with improvements in the customizability of plots, compared to Phoenix NLME.
CONCLUSIONS: The Phxnlme package is a flexible tool that allows implementation of the analytical workflow of Phoenix NLME with R, with features for greater overall efficiency and improved customizable graphics. Phxnlme is freely available for download on the CRAN repository (https://cran.r-project.org/web/packages/Phxnlme/).
Affiliation(s)
- Chay Ngee Lim
- Department of Experimental & Clinical Pharmacology, College of Pharmacy, University of Minnesota, 717 Delaware St SE Room 463, Minneapolis, MN, USA
- Shuang Liang
- Department of Experimental & Clinical Pharmacology, College of Pharmacy, University of Minnesota, 717 Delaware St SE Room 463, Minneapolis, MN, USA
- Kevin Feng
- Pharsight, a Certara Company, Princeton, NJ, USA
- Ana Henry
- Pharsight, a Certara Company, Princeton, NJ, USA
- Angela K Birnbaum
- Department of Experimental & Clinical Pharmacology, College of Pharmacy, University of Minnesota, 717 Delaware St SE Room 463, Minneapolis, MN, USA.
7
Dosne AG, Niebecker R, Karlsson MO. dOFV distributions: a new diagnostic for the adequacy of parameter uncertainty in nonlinear mixed-effects models applied to the bootstrap. J Pharmacokinet Pharmacodyn 2016; 43:597-608. [PMID: 27730481 PMCID: PMC5110608 DOI: 10.1007/s10928-016-9496-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 06/04/2016] [Accepted: 09/28/2016] [Indexed: 12/01/2022]
Abstract
Knowledge of the uncertainty in model parameters is essential for decision-making in drug development. Contrary to other aspects of nonlinear mixed effects models (NLMEM), scrutiny towards assumptions around parameter uncertainty is low, and no diagnostic exists to judge whether the estimated uncertainty is appropriate. This work aims at introducing a diagnostic capable of assessing the appropriateness of a given parameter uncertainty distribution. The new diagnostic was applied to case bootstrap examples in order to investigate for which dataset sizes case bootstrap is appropriate for NLMEM. The proposed diagnostic is a plot comparing the distribution of differences in objective function values (dOFV) of the proposed uncertainty distribution to a theoretical chi-square distribution with degrees of freedom equal to the number of estimated model parameters. The uncertainty distribution was deemed appropriate if its dOFV distribution was overlaid with or below the theoretical distribution. The diagnostic was applied to the bootstrap of two real data and two simulated data examples, featuring pharmacokinetic and pharmacodynamic models and datasets of 20–200 individuals with between 2 and 5 observations on average per individual. In the real data examples, the diagnostic indicated that case bootstrap was unsuitable for NLMEM analyses with around 70 individuals. A measure of parameter-specific “effective” sample size was proposed as a potentially better indicator of bootstrap adequacy than overall sample size. In the simulation examples, bootstrap confidence intervals were shown to underestimate inter-individual variability at low sample sizes. The proposed diagnostic proved a relevant tool for assessing the appropriateness of a given parameter uncertainty distribution and as such it should be routinely used.
Affiliation(s)
- Anne-Gaëlle Dosne
- Department of Pharmaceutical Biosciences, Uppsala University, P.O. Box 591, 751 24, Uppsala, Sweden.
- Ronald Niebecker
- Department of Pharmaceutical Biosciences, Uppsala University, P.O. Box 591, 751 24, Uppsala, Sweden
- Mats O Karlsson
- Department of Pharmaceutical Biosciences, Uppsala University, P.O. Box 591, 751 24, Uppsala, Sweden
8
Abstract
Reduced-rank methods are very popular in high-dimensional multivariate analysis for conducting simultaneous dimension reduction and model estimation. However, the commonly-used reduced-rank methods are not robust, as the underlying reduced-rank structure can be easily distorted by only a few data outliers. Anomalies are bound to exist in big data problems, and in some applications they themselves could be of the primary interest. While naive residual analysis is often inadequate for outlier detection due to potential masking and swamping, robust reduced-rank estimation approaches could be computationally demanding. Under Stein's unbiased risk estimation framework, we propose a set of tools, including leverage score and generalized information score, to perform model diagnostics and outlier detection in large-scale reduced-rank estimation. The leverage scores give an exact decomposition of the so-called model degrees of freedom to the observation level, which lead to exact decomposition of many commonly-used information criteria; the resulting quantities are thus named information scores of the observations. The proposed information score approach provides a principled way of combining the residuals and leverage scores for anomaly detection. Simulation studies confirm that the proposed diagnostic tools work well. A pattern recognition example with hand-writing digital images and a time series analysis example with monthly U.S. macroeconomic data further demonstrate the efficacy of the proposed approaches.
Affiliation(s)
- Kun Chen
- Department of Statistics, University of Connecticut, 215 Glenbrook Rd. U-4120, Storrs, CT 06269-4120
9
Abstract
We propose a new residual for regression models of ordinal outcomes, defined as E{sign(y,Y)}, where y is the observed outcome and Y is a random variable from the fitted distribution. This new residual is a single value per subject irrespective of the number of categories of the ordinal outcome, contains directional information between the observed value and the fitted distribution, and does not require the assignment of arbitrary numbers to categories. We study its properties, describe its connections with other residuals, ranks and ridits, and demonstrate its use in model diagnostics.
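To make the definition E{sign(y,Y)} concrete, the following is a minimal sketch under stated assumptions: categories are coded 0, 1, ..., K-1 in their natural order, `probs` holds the fitted category probabilities for one subject, and sign(a, b) is taken to be -1, 0, or 1 according to whether a is below, equal to, or above b. Under that convention the expectation reduces to P(Y < y) - P(Y > y). The function name `sign_residual` and the example probabilities are hypothetical.

```python
def sign_residual(y, probs):
    """Return E{sign(y, Y)} = P(Y < y) - P(Y > y) for one subject.

    y     : observed category index (0-based, ordered categories)
    probs : fitted probabilities for categories 0..K-1 (assumed to sum to 1)
    """
    if not 0 <= y < len(probs):
        raise ValueError("y must index one of the fitted categories")
    below = sum(probs[:y])       # P(Y < y)
    above = sum(probs[y + 1:])   # P(Y > y)
    return below - above


if __name__ == "__main__":
    fitted = [0.2, 0.5, 0.3]     # hypothetical fitted distribution
    for y in range(len(fitted)):
        print(y, sign_residual(y, fitted))
```

The result is a single number in [-1, 1] per subject regardless of the number of categories, and it uses only the ordering of the categories, not numeric scores assigned to them. Averaged over the fitted distribution, the residual has expectation zero.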
Affiliation(s)
- Chun Li
- Department of Biostatistics, Vanderbilt University, Nashville, Tennessee 37232, U.S.A.
10
Abstract
The use of conditional logistic regression models to analyze matched case-control data has become standard in statistical analysis. However, methods to test the fit of these models have primarily focused on influential observations and the presence of outliers, while little attention has been given to the functional form of the covariates. In this paper we present methods to test the functional form of the covariates in the conditional logistic regression model; these methods are based on nonparametric smoothers. We assess the performance of the proposed methods via simulation studies and illustrate their use on data from a community-based intervention.
Affiliation(s)
- Melody S Goodman
- Division of Public Health Sciences, Department of Surgery, Washington University in St. Louis School of Medicine, St. Louis, MO, USA