1. A flexible parametric accelerated failure time model and the extension to time-dependent acceleration factors. Biostatistics 2023; 24:811-831. PMID: 35639824; PMCID: PMC10346080; DOI: 10.1093/biostatistics/kxac009.
Abstract
Accelerated failure time (AFT) models are used widely in medical research, though to a much lesser extent than proportional hazards models. In an AFT model, covariates act to accelerate or decelerate the time to the event of interest, that is, to shorten or extend the time to event. Commonly used parametric AFT models are limited in the underlying shapes that they can capture. In this article, we propose a general parametric AFT model and, in particular, concentrate on using restricted cubic splines to model the baseline, providing substantial flexibility. We then extend the model to accommodate time-dependent acceleration factors. Delayed entry is also allowed and, hence, time-dependent covariates. We evaluate the proposed model through simulation, showing substantial improvements compared with standard parametric AFT models. We also show, analytically and through simulations, that AFT models are collapsible, suggesting that this model class will be well suited to causal inference. We illustrate the methods with a dataset of patients with breast cancer. Finally, we provide highly efficient, user-friendly Stata and R software packages.
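The time-ratio interpretation of an AFT model can be checked with a small simulation: multiplying event times by a constant acceleration factor shifts every quantile of the survival distribution by that factor. A minimal sketch assuming a Weibull baseline and a single binary treatment covariate (illustrative parameters, not taken from the paper):

```python
import random
import statistics

random.seed(42)

# In an AFT model, covariates multiply survival time itself. A time ratio
# (acceleration factor) of 2 for treatment stretches every event time,
# and hence every quantile of the survival distribution, by a factor of 2.
TIME_RATIO = 2.0
N = 100_000

# Weibull baseline: random.weibullvariate(alpha=scale, beta=shape).
control = [random.weibullvariate(10.0, 1.5) for _ in range(N)]
treated = [random.weibullvariate(10.0, 1.5) * TIME_RATIO for _ in range(N)]

ratio = statistics.median(treated) / statistics.median(control)
print(round(ratio, 2))  # close to 2.0
```

The same scaling holds whatever the baseline distribution, which is what makes the time-ratio interpretation of AFT models attractive.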
2. artbin: Extended sample size for randomized trials with binary outcomes. The Stata Journal 2023; 23:24-52. PMID: 37461744; PMCID: PMC7614770; DOI: 10.1177/1536867x231161971.
Abstract
We describe the command artbin, which offers various new facilities, not otherwise available in Stata, for calculating sample size for binary outcome variables. While artbin has been available since 2004, it has not previously been described in the Stata Journal. artbin has recently been updated to include new options for different statistical tests, methods, and study designs, improved syntax, and better handling of noninferiority trials. In this article, we describe the updated version of artbin and detail the various formulas it uses in different settings.
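artbin implements several methods; as an illustration of the kind of calculation involved, here is the classic normal-approximation formula for comparing two proportions (a textbook sketch, not necessarily artbin's default method):

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_arm(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size per arm for a two-sided comparison of two proportions,
    using the classic normal approximation with a pooled null variance."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)
    z_beta = z(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Detecting 20% vs 30% event rates with 80% power at two-sided alpha = 0.05:
print(n_per_arm(0.20, 0.30))  # 294 per arm
```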
3. artcat: Sample-size calculation for an ordered categorical outcome. The Stata Journal 2023; 23:3-23. PMID: 37155554; PMCID: PMC7614472; DOI: 10.1177/1536867x231161934.
Abstract
We describe a new command, artcat, that calculates sample size or power for a randomized controlled trial or similar experiment with an ordered categorical outcome, where analysis is by the proportional-odds model. artcat implements the method of Whitehead (1993, Statistics in Medicine 12: 2257-2271). We also propose and implement a new method that 1) allows the user to specify a treatment effect that does not obey the proportional-odds assumption, 2) offers greater accuracy for large treatment effects, and 3) allows for noninferiority trials. We illustrate the command and explore the value of an ordered categorical outcome over a binary outcome in various settings. We show by simulation that the methods perform well and that the new method is more accurate than Whitehead's method.
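Under the proportional-odds model assumed here, the treatment-arm category probabilities are fully determined by the control-arm probabilities and a single cumulative odds ratio. A sketch of that transformation, with illustrative (hypothetical) category probabilities:

```python
def proportional_odds_probs(control_probs, odds_ratio):
    """Treatment-arm category probabilities implied by control-arm
    probabilities and a common cumulative odds ratio (proportional odds).
    An odds_ratio > 1 here shifts probability towards lower categories."""
    treated, prev = [], 0.0
    cum = 0.0
    for p in control_probs[:-1]:
        cum += p
        odds = cum / (1.0 - cum) * odds_ratio
        cum_treated = odds / (1.0 + odds)
        treated.append(cum_treated - prev)
        prev = cum_treated
    treated.append(1.0 - prev)
    return treated

# Hypothetical three-category outcome (e.g. improved / stable / worse):
control = [0.2, 0.5, 0.3]
treated = proportional_odds_probs(control, 2.0)
print([round(p, 3) for p in treated])  # [0.333, 0.49, 0.176]
```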
4. Reply to U. Capitanio et al. J Clin Oncol 2023; 41:704-706. PMID: 36166721; DOI: 10.1200/jco.22.01124.
5. Personalized Model to Predict Keratoconus Progression From Demographic, Topographic, and Genetic Data. Am J Ophthalmol 2022; 240:321-329. PMID: 35469790; DOI: 10.1016/j.ajo.2022.04.004.
Abstract
PURPOSE To generate a prognostic model to predict keratoconus progression to corneal crosslinking (CXL). DESIGN Retrospective cohort study. METHODS We recruited 5025 patients (9341 eyes) with early keratoconus between January 2011 and November 2020. Genetic data were available for 926 patients. We investigated both keratometry and CXL as endpoints for progression and used the Royston-Parmar method on the proportional hazards scale to generate a prognostic model. We calculated hazard ratios (HRs) for each significant covariate, with explained variation and discrimination, and performed internal-external cross-validation by geographic region. RESULTS After exclusions, model fitting comprised 8701 eyes, of which 3232 underwent CXL. For early keratoconus, CXL provided a more robust prognostic model than keratometric progression. The final model explained 33% of the variation in time to event, with age (HR 0.90, 95% CI 0.90-0.91), maximum anterior keratometry (HR 1.08, 1.07-1.09), and minimum corneal thickness (HR 0.95, 0.93-0.96) as significant covariates. Single-nucleotide polymorphisms (SNPs) associated with keratoconus (n=28) did not significantly contribute to the model. The predicted time-to-event curves closely followed the observed curves during internal-external validation. Differences in discrimination between geographic regions were low, suggesting that the model maintained its predictive ability. CONCLUSIONS A prognostic model to predict keratoconus progression could aid patient empowerment, triage, and service provision. Age at presentation is the most significant predictor of progression risk. Candidate SNPs associated with keratoconus do not contribute to progression risk.
6. Investigating treatment-effect modification by a continuous covariate in IPD meta-analysis: an approach using fractional polynomials. BMC Med Res Methodol 2022; 22:98. PMID: 35382744; PMCID: PMC8985287; DOI: 10.1186/s12874-022-01516-w.
Abstract
BACKGROUND In clinical trials, there is considerable interest in investigating whether a treatment effect is similar in all patients or whether one or more prognostic variables indicate a differential response to treatment. To examine this, a continuous predictor is usually categorised into groups according to one or more cutpoints. Several weaknesses of categorisation are well known. To avoid the disadvantages of cutpoints and to retain the full information, it is preferable to keep continuous variables continuous in the analysis. To address this issue, the Subpopulation Treatment Effect Pattern Plot (STEPP) was proposed about two decades ago, followed by the multivariable fractional polynomial interaction (MFPI) approach. Provided individual patient data (IPD) from several studies are available, treatment-effect heterogeneity can be investigated with meta-analysis techniques. Meta-STEPP was recently proposed and used to investigate an interaction of estrogen receptors with chemotherapy in eight randomized controlled trials (RCTs) in patients with primary breast cancer. METHODS We use data from the same eight RCTs in breast cancer to illustrate issues arising from two main tasks. The first is to derive a treatment effect function (TEF), that is, a measure of the treatment effect on the continuous scale of the covariate in each individual study. The second is to conduct a meta-analysis of the continuous TEFs from the eight studies by applying pointwise averaging to obtain a mean function. We denote the method metaTEF. To improve reporting of the available data and of all steps of the analysis, we introduce a three-part profile called MethProf-MA.
RESULTS Although there are considerable differences between the studies (populations with large differences in prognosis, sample size, effective sample size, length of follow-up, and proportion of patients with very low estrogen receptor values), our results provide clear evidence of an interaction, irrespective of the choice of FP function and of random- or fixed-effects models. CONCLUSIONS In contrast to cutpoint-based analyses, metaTEF retains the full information from continuous covariates and avoids several critical issues that arise when performing IPD meta-analyses of continuous effect modifiers in randomised trials. Early experience suggests it is a promising approach. TRIAL REGISTRATION Not applicable.
7. Re: Spline-based accelerated failure time model. Stat Med 2022; 41:1314-1315. PMID: 35266574; DOI: 10.1002/sim.8964.
8. External Validation of the 2003 Leibovich Prognostic Score in Patients Randomly Assigned to SORCE, an International Phase III Trial of Adjuvant Sorafenib in Renal Cell Cancer. J Clin Oncol 2022; 40:1772-1782. PMID: 35213214; PMCID: PMC9148696; DOI: 10.1200/jco.21.01090.
Abstract
The 2003 Leibovich score guides prognostication and selection into adjuvant clinical trials for patients with locally advanced renal cell carcinoma (RCC) after nephrectomy. We provide a robust external validation of the 2003 Leibovich score using contemporary data from SORCE, an international randomized trial of sorafenib after excision of primary RCC. We show that the 2003 Leibovich score demonstrates discriminative accuracy in contemporary clear-cell and non-clear-cell RCC patient cohorts, supporting its continued use to guide discussions of patient prognosis and risk stratification in clinical trials.
9. Doug Altman: Driving critical appraisal and improvements in the quality of methodological and medical research. Biom J 2021; 63:226-246. PMID: 32639065; DOI: 10.1002/bimj.202000053.
Abstract
Doug Altman was a visionary leader and one of the most influential medical statisticians of the last 40 years. Based on a presentation in the "Invited session in memory of Doug Altman" at the 40th Annual Conference of the International Society for Clinical Biostatistics (ISCB) in Leuven, Belgium, and on our long-standing collaborations with Doug, we discuss his contributions to regression modelling, reporting, and prognosis research, as well as some more general issues, while acknowledging that we cannot cover the whole spectrum of Doug's considerable methodological output. His statement "To maximize the benefit to society, you need to not just do research but do it well" should be a driver for all researchers. To improve current and future research, we summarize Doug's messages on these three topics.
10. A simulation study comparing the power of nine tests of the treatment effect in randomized controlled trials with a time-to-event outcome. Trials 2020; 21:315. PMID: 32252820; PMCID: PMC7132898; DOI: 10.1186/s13063-020-4153-2.
Abstract
BACKGROUND The logrank test is routinely applied to design and analyse randomized controlled trials (RCTs) with time-to-event outcomes. Sample size and power calculations assume the treatment effect follows proportional hazards (PH). If the PH assumption is false, power is reduced and interpretation of the hazard ratio (HR) as the estimated treatment effect is compromised. Using statistical simulation, we investigated the type 1 error and power of the logrank (LR) test and eight alternatives. We aimed to identify tests that improve power under three types of non-proportional hazards (non-PH): early, late, or near-PH treatment effects. METHODS We investigated weighted logrank tests (early, LRE; late, LRL), the supremum logrank test (SupLR), and composite tests with two or more components (joint, J; combined, C; weighted combined, WC; versatile and modified versatile weighted logrank, VWLR, VWLR2). Weighted logrank tests are intended to be sensitive to particular non-PH patterns. Composite tests attempt to improve power across a wider range of non-PH patterns. Using extensive simulations based on real trials, we studied test size and power under PH and under simple departures from PH comprising pointwise constant HRs with a single change point at various follow-up times. We systematically investigated the influence of high or low control-arm event rates on power.
CONCLUSIONS Assuming ignorance of the likely treatment effect, the best choice is VWLR2. Several non-standard tests performed well when the correct type of treatment effect was assumed. A low control-arm event rate reduced the power of weighted logrank tests targeting early effects. Test size was generally well controlled. Further investigation of test characteristics with different types of non-proportional hazards of the treatment effect is warranted.
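The weighted logrank family used here (LRE, LRL) differs from the standard logrank test only in the weight attached to each event time. A simplified sketch with Fleming-Harrington-type weights based on the pooled Kaplan-Meier estimate, assuming no censoring for brevity (the simulation study itself used full censored-data versions):

```python
def weighted_logrank(times1, times2, rho=0.0, gamma=0.0):
    """Fleming-Harrington G(rho, gamma) weighted logrank statistic (U)
    and its variance (V) for two samples, assuming no censoring.
    rho = gamma = 0 is the ordinary logrank test; gamma > 0 up-weights
    late events, while rho > 0 (gamma = 0) up-weights early events."""
    n1, n2 = len(times1), len(times2)
    s = 1.0  # pooled Kaplan-Meier estimate just before the current time
    u = v = 0.0
    for t in sorted(set(times1) | set(times2)):
        d1, d2 = times1.count(t), times2.count(t)
        d, n = d1 + d2, n1 + n2
        w = (s ** rho) * ((1.0 - s) ** gamma)
        u += w * (d1 - d * n1 / n)  # weighted observed-minus-expected, arm 1
        if n > 1:
            v += w * w * d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
        s *= 1.0 - d / n  # update the pooled KM estimate
        n1 -= d1
        n2 -= d2
    return u, v

# With identical arms, observed equals expected at every event time.
u, v = weighted_logrank([1, 2, 3], [1, 2, 3], rho=0.0, gamma=1.0)
print(u)  # 0.0
```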
11. State of the art in selection of variables and functional forms in multivariable analysis-outstanding issues. Diagn Progn Res 2020; 4:3. PMID: 32266321; PMCID: PMC7114804; DOI: 10.1186/s41512-020-00074-3.
Abstract
BACKGROUND How to select variables and identify functional forms for continuous variables is a key concern when creating a multivariable model. Ad hoc 'traditional' approaches to variable selection have been in use for at least 50 years. Similarly, methods for determining functional forms for continuous variables were first suggested many years ago. More recently, many alternative approaches to address these two challenges have been proposed, but knowledge of their properties and meaningful comparisons between them are scarce. Many outstanding issues in multivariable modelling must be resolved before a state of the art can be defined and evidence-supported guidance provided to researchers who have only a basic level of statistical knowledge. Our main aims are to identify and illustrate such gaps in the literature and present them at a moderate technical level to the wide community of practitioners, researchers, and students of statistics. METHODS We briefly discuss general issues in building descriptive regression models, strategies for variable selection, different ways of choosing functional forms for continuous variables, and methods for combining the selection of variables and functions. We discuss two examples, taken from the medical literature, to illustrate problems in the practice of modelling. RESULTS Our overview revealed that there is not yet enough evidence on which to base recommendations for the selection of variables and functional forms in multivariable analysis. Such evidence may come from comparisons between alternative methods. In particular, we highlight seven important topics that require further investigation and make suggestions for the direction of further research. CONCLUSIONS Selection of variables and of functional forms are important topics in multivariable analysis. To define a state of the art and to provide evidence-supported guidance to researchers who have only a basic level of statistical knowledge, further comparative research is required.
12. Comments on design and monitoring of survival trials in complex scenarios. Stat Med 2019; 38:2704. DOI: 10.1002/sim.8087.
13.
Abstract
BACKGROUND The logrank test and the Cox proportional hazards model are routinely applied in the design and analysis of randomised controlled trials (RCTs) with time-to-event outcomes. Usually, sample size and power calculations assume proportional hazards (PH) of the treatment effect, i.e. the hazard ratio is constant over the entire follow-up period. If the PH assumption fails, the power of the logrank/Cox test may be reduced, sometimes severely. It is, therefore, important to understand how serious this can become in real trials, and for a proven, alternative test to be available to increase the robustness of the primary test. METHODS We performed a systematic search to identify relevant articles in four leading medical journals that publish results of phase 3 clinical trials. Altogether, 50 articles satisfied our inclusion criteria. We digitised published Kaplan-Meier curves and created approximations to the original times to event or censoring at the individual patient level. Using the reconstructed data, we tested for non-PH in all 50 trials. We compared the results from the logrank/Cox test with those from the combined test recently proposed by Royston and Parmar. RESULTS The PH assumption was checked and reported only in 28% of the studies. Evidence of non-PH at the 0.10 level was detected in 31% of comparisons. The Cox test of the treatment effect was significant at the 0.05 level in 49% of comparisons, and the combined test in 55%. In four of five trials with discordant results, the interpretation would have changed had the combined test been used. The degree of non-PH and the dominance of the p value for the combined test were strongly associated. Graphical investigation suggested that non-PH was mostly due to a treatment effect manifesting early in follow-up and disappearing later.
CONCLUSIONS The evidence for non-PH is checked (and, hence, identified) in only a small minority of RCTs, but non-PH may be present in a substantial fraction of such trials. In our reanalysis of the reconstructed data from 50 trials, the combined test outperformed the Cox test overall. The combined test is a promising approach to making trial design and analysis more robust.
14. Meta-analysis of non-linear exposure-outcome relationships using individual participant data: A comparison of two methods. Stat Med 2019; 38:326-338. PMID: 30284314; PMCID: PMC6492097; DOI: 10.1002/sim.7974.
Abstract
Non-linear exposure-outcome relationships such as between body mass index (BMI) and mortality are common. They are best explored as continuous functions using individual participant data from multiple studies. We explore two two-stage methods for meta-analysis of such relationships, where the confounder-adjusted relationship is first estimated in a non-linear regression model in each study, then combined across studies. The "metacurve" approach combines the estimated curves using multiple meta-analyses of the relative effect between a given exposure level and a reference level. The "mvmeta" approach combines the estimated model parameters in a single multivariate meta-analysis. Both methods allow the exposure-outcome relationship to differ across studies. Using theoretical arguments, we show that the methods differ most when covariate distributions differ across studies; using simulated data, we show that mvmeta gains precision but metacurve is more robust to model mis-specification. We then compare the two methods using data from the Emerging Risk Factors Collaboration on BMI, coronary heart disease events, and all-cause mortality (>80 cohorts, >18 000 events). For each outcome, we model BMI using fractional polynomials of degree 2 in each study, with adjustment for confounders. For metacurve, the powers defining the fractional polynomials may be study-specific or common across studies. For coronary heart disease, metacurve with common powers and mvmeta correctly identify a small increase in risk in the lowest levels of BMI, but metacurve with study-specific powers does not. For all-cause mortality, all methods identify a steep U-shape. The metacurve and mvmeta methods perform well in combining complex exposure-disease relationships across studies.
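The pointwise-averaging step behind metacurve can be sketched as a fixed-effect inverse-variance meta-analysis performed at each point of a common exposure grid (a deliberately minimal sketch with hypothetical numbers; the command itself supports more options):

```python
def pointwise_average(curves, variances):
    """Fixed-effect inverse-variance average of study-specific effect
    curves evaluated on a shared exposure grid. curves[i][j] is the
    effect estimate (e.g. log hazard ratio versus a reference exposure
    level) in study i at grid point j; variances[i][j] is its variance."""
    pooled = []
    for j in range(len(curves[0])):
        weights = [1.0 / v[j] for v in variances]
        estimates = [c[j] for c in curves]
        pooled.append(sum(w * e for w, e in zip(weights, estimates)) / sum(weights))
    return pooled

# Two hypothetical studies with equal precision: the pooled curve is the mean.
curves = [[0.0, 0.1, 0.4], [0.0, 0.3, 0.6]]
variances = [[0.04, 0.04, 0.04], [0.04, 0.04, 0.04]]
pooled = pointwise_average(curves, variances)
print([round(p, 3) for p in pooled])  # [0.0, 0.2, 0.5]
```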
15. Building Multivariable Regression Models with Continuous Covariates in Clinical Epidemiology. Methods Inf Med 2018. DOI: 10.1055/s-0038-1634008.
Abstract
Objectives:
In fitting regression models, data analysts must often choose a model based on several candidate predictor variables which may influence the outcome. Most analysts either assume a linear relationship for continuous predictors, or categorize them and postulate step functions. By contrast, we propose to model possible non-linearity in the relationship between the outcome and several continuous predictors by estimating smooth functions of the predictors. We aim to demonstrate that a structured approach based on fractional polynomials can give a broadly satisfactory practical solution to the problem of simultaneously identifying a subset of 'important' predictors and determining the functional relationship for continuous predictors.
Methods:
We discuss the background, and motivate and describe the multivariable fractional polynomial (MFP) approach to model selection from data which include continuous and categorical predictors. We compare our results with those from other approaches in examples. We present a small simulation study to compare the functional form of the relationship obtained by fitting fractional polynomials and splines to a single predictor variable.
Results:
We illustrate the advantages of the MFP approach over standard techniques of model construction in two real example datasets analyzed with logistic and Cox regression models, respectively. In the simulation study, fractional polynomial models had lower mean square error and more realistic behaviour than comparable spline models.
Conclusions:
In many practical situations, the MFP approach can satisfy the aim of finding models that fit the data well and also are simple, interpretable and potentially transportable to other settings.
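A degree-2 fractional polynomial with powers (p1, p2) chosen from the standard set {-2, -1, -0.5, 0, 0.5, 1, 2, 3} takes the form b0 + b1*x^p1 + b2*x^p2, with the conventions that a power of 0 denotes ln(x) and that a repeated power multiplies the second term by ln(x). A sketch of the evaluation rule (coefficients here are hypothetical):

```python
from math import e, log

def fp2(x, powers, betas):
    """Evaluate a degree-2 fractional polynomial at x > 0.
    powers = (p1, p2); betas = (b0, b1, b2). Conventions: a power of 0
    means ln(x), and a repeated power turns the second term into
    b2 * x**p * ln(x), e.g. powers (1, 1) give b1*x + b2*x*ln(x)."""
    p1, p2 = powers
    b0, b1, b2 = betas
    t1 = log(x) if p1 == 0 else x ** p1
    if p1 == p2:
        t2 = t1 * log(x)
    else:
        t2 = log(x) if p2 == 0 else x ** p2
    return b0 + b1 * t1 + b2 * t2

# Powers (0, 0) give b0 + b1*ln(x) + b2*ln(x)**2, a typical U-shape.
print(fp2(e, (0, 0), (1.0, 2.0, 3.0)))  # approximately 6.0
```

In MFP model selection, the powers themselves are chosen by a closed testing procedure over this candidate set rather than fixed in advance.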
16.
Abstract
Hazard ratios can be approximated by data extracted from published Kaplan-Meier curves. Recently, this curve approach has been extended beyond hazard-ratio approximation with the capability of constructing time-to-event data at the individual level. In this article, we introduce a command, ipdfc, to implement the reconstruction method to convert Kaplan-Meier curves to time-to-event data. We give examples to illustrate how to use the command.
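The core idea of such reconstruction can be sketched by inverting the Kaplan-Meier step formula S_k = S_{k-1}(1 - d_k/n_k) to recover event counts from digitized curve coordinates. A deliberately simplified sketch that ignores censoring and numbers-at-risk tables, both of which the published algorithm exploits:

```python
def reconstruct_events(step_times, step_surv, n_at_risk):
    """Approximate individual event times from a digitized Kaplan-Meier
    curve by inverting the KM step formula S_k = S_{k-1} * (1 - d_k/n_k).
    Simplification: no censoring; the full algorithm also uses reported
    numbers-at-risk to place censoring times within intervals."""
    times, n, s_prev = [], n_at_risk, 1.0
    for t, s in zip(step_times, step_surv):
        d = round(n * (1.0 - s / s_prev))  # events at this step
        times.extend([t] * d)
        n -= d
        s_prev = s
    return times

# A curve stepping 1.0 -> 0.8 -> 0.6 with 10 patients implies 2 events per step.
events = reconstruct_events([3.0, 7.0], [0.8, 0.6], 10)
print(events)  # [3.0, 3.0, 7.0, 7.0]
```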
17. Reference curves for the Australian/Canadian Hand Osteoarthritis Index in the middle-aged Dutch population. Rheumatology (Oxford) 2017; 56:745-752. PMID: 28077692; DOI: 10.1093/rheumatology/kew483.
Abstract
Objective The aim was to establish reference curves for the Australian/Canadian Hand Osteoarthritis Index (AUSCAN), a widely used questionnaire assessing hand complaints. Methods Analyses were performed in a population-based sample, the Netherlands Epidemiology of Obesity study (n = 6671, aged 45-65 years). Factors associated with AUSCAN scores were analysed with ordered logistic regression, because AUSCAN data were zero inflated, dividing AUSCAN into three categories (0 vs 1-5 vs >5). Age- and sex-specific reference curves for the AUSCAN (range 0-60; higher is worse) were developed using quantile regression in conjunction with fractional polynomials. Observed scores in relevant subgroups were compared with the reference curves. Results The median age was 56 years [interquartile range (IQR): 50-61]; 56% were women and 12% had hand OA according to ACR criteria. AUSCAN scores were low (median 1; IQR: 0-4). Reference curves were higher for women and increased moderately with age: 95th percentiles for the AUSCAN in men and women were, respectively, 5.0 and 12.3 points at age 45 years, and 15.2 and 33.6 points at age 65 years. Additional associated factors included hand OA, inflammatory rheumatic diseases, FM, socio-economic status, and BMI. Median AUSCAN pain subscale scores of women with hand OA lay between the 75th and 90th centiles of the general population. Conclusion AUSCAN scores in the middle-aged Dutch population were low overall, and higher in women than in men. AUSCAN reference curves could serve as a benchmark in research and clinical practice settings. However, the AUSCAN does not measure hand complaints specific to hand OA.
18.
Abstract
Since Royston and Altman's 1994 publication (Journal of the Royal Statistical Society, Series C 43: 429-467), fractional polynomials have steadily gained popularity as a tool for flexible parametric modeling of regression relationships. In this article, I present fp_select, a postestimation tool for fp that allows the user to select a parsimonious fractional polynomial model according to a closed test procedure called the fractional polynomial selection procedure or function selection procedure. I also give a brief introduction to fractional polynomial models and provide examples of using fp and fp_select to select such models with real data.
19. Life expectancy difference and life expectancy ratio: two measures of treatment effects in randomised trials with non-proportional hazards. BMJ 2017; 357:j2250. PMID: 28546261; PMCID: PMC5444092; DOI: 10.1136/bmj.j2250.
20. A combined test for a generalized treatment effect in clinical trials with a time-to-event outcome. The Stata Journal 2017; 17:405-421. PMID: 29445320; PMCID: PMC5808831.
Abstract
Most randomized controlled trials with a time-to-event outcome are designed and analyzed assuming proportional hazards of the treatment effect. The sample-size calculation is based on a log-rank test or the equivalent Cox test. Nonproportional hazards are seen increasingly in trials and are recognized as a potential threat to the power of the log-rank test. To address the issue, Royston and Parmar (2016, BMC Medical Research Methodology 16: 16) devised a new "combined test" of the global null hypothesis of identical survival curves in each trial arm. The test, which combines the conventional Cox test with a new formulation, is based on the maximal standardized difference in restricted mean survival time (rmst) between the arms. The test statistic is based on evaluations of rmst over several preselected time points. The combined test involves the minimum p-value across the Cox and rmst-based tests, appropriately standardized to have the correct null distribution. In this article, I outline the combined test and introduce a command, stctest, that implements the combined test. I point the way to additional tools currently under development for power and sample-size calculation for the combined test.
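The general min-p construction behind such combined tests can be sketched with a permutation calibration: compute each component p-value, take the minimum, then refer it to the permutation distribution of minima, which automatically accounts for the correlation between the components. The sketch below uses two illustrative statistics (difference in means and in medians), not the Cox and rmst-based statistics of stctest:

```python
import random

def min_p_combined(x, y, n_perm=500, seed=7):
    """Generic 'min-p' combined test: take the smaller of two permutation
    p-values, then calibrate that minimum against the permutation
    distribution of minima, which absorbs the correlation between the
    component tests. The component statistics here are illustrative."""
    rng = random.Random(seed)
    stats = (
        lambda a, b: abs(sum(a) / len(a) - sum(b) / len(b)),
        lambda a, b: abs(sorted(a)[len(a) // 2] - sorted(b)[len(b) // 2]),
    )
    observed = [s(x, y) for s in stats]
    pooled = list(x) + list(y)
    perm = [[], []]
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:len(x)], pooled[len(x):]
        for i, s in enumerate(stats):
            perm[i].append(s(a, b))

    def pval(i, value):  # permutation p-value for component i
        return sum(v >= value for v in perm[i]) / n_perm

    p_min_obs = min(pval(i, observed[i]) for i in range(2))
    p_min_perm = [min(pval(i, perm[i][k]) for i in range(2)) for k in range(n_perm)]
    return sum(p <= p_min_obs for p in p_min_perm) / n_perm

x = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3]
y = [2.0, 2.2, 1.8, 2.1, 1.9, 2.3]
print(min_p_combined(x, y) < 0.05)  # clearly separated groups
```

stctest itself uses an analytical correction of the minimum p-value rather than a second layer of permutation, but the logic is the same: the combined test pays only a small premium over its best component.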
21. Multivariable fractional polynomial interaction to investigate continuous effect modifiers in a meta-analysis on higher versus lower PEEP for patients with ARDS. BMJ Open 2016; 6:e011148. PMID: 27609843; PMCID: PMC5020750; DOI: 10.1136/bmjopen-2016-011148.
Abstract
OBJECTIVES A recent individual patient data (IPD) meta-analysis suggested that patients with moderate or severe acute respiratory distress syndrome (ARDS) benefit from higher positive end-expiratory pressure (PEEP) ventilation strategies. However, thresholds for continuous variables (eg, hypoxaemia) are often arbitrary and linearity assumptions in regression approaches may not hold; the multivariable fractional polynomial interaction (MFPI) approach can address both problems. The objective of this study was to apply the MFPI approach to investigate interactions between four continuous patient baseline variables and higher versus lower PEEP on clinical outcomes. SETTING Pooled data from three randomised trials in intensive care identified by a systematic review. PARTICIPANTS 2299 patients with acute lung injury requiring mechanical ventilation. INTERVENTIONS Higher (N=1136) versus lower PEEP (N=1163) ventilation strategy. OUTCOME MEASURES Prespecified outcomes included mortality, time to death and time to unassisted breathing. We examined the following continuous baseline characteristics as potential effect modifiers using MFPI: PaO2/FiO2 (arterial partial oxygen pressure/fraction of inspired oxygen), oxygenation index, respiratory system compliance (tidal volume/(inspiratory plateau pressure-PEEP)) and body mass index (BMI). RESULTS We found that for patients with PaO2/FiO2 below 150 mm Hg but above 100 mm Hg, or an oxygenation index above 12 (moderate ARDS), higher PEEP reduces hospital mortality, but the beneficial effect appears to level off for patients with very severe ARDS. Patients with mild ARDS (PaO2/FiO2 above 200 mm Hg or an oxygenation index below 10) do not seem to benefit from higher PEEP and might even be harmed. For patients with a respiratory system compliance above 40 mL/cm H2O or patients with a BMI above 35 kg/m2, we found a trend towards reduced mortality with higher PEEP, but there is very weak statistical confidence in these findings.
CONCLUSIONS MFPI analyses suggest a nonlinear effect modification of higher PEEP ventilation by PaO2/FiO2 and oxygenation index with reduced mortality for some patients suffering from moderate ARDS. STUDY REGISTRATION NUMBER CRD42012003129.
|
22
|
|
23
|
SAT0446 Reference Curves for The Australian/Canadian Hand Osteoarthritis Index (AUSCAN) in The General Population. Ann Rheum Dis 2016. [DOI: 10.1136/annrheumdis-2016-eular.3729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
|
24
|
Augmenting the logrank test in the design of clinical trials in which non-proportional hazards of the treatment effect may be anticipated. BMC Med Res Methodol 2016; 16:16. [PMID: 26869168 PMCID: PMC4751641 DOI: 10.1186/s12874-016-0110-x] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2015] [Accepted: 01/09/2016] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Most randomized controlled trials with a time-to-event outcome are designed assuming proportional hazards (PH) of the treatment effect. The sample size calculation is based on a logrank test. However, non-proportional hazards are increasingly common. At analysis, the estimated hazard ratio with a confidence interval is usually presented. The estimate is often obtained from a Cox PH model with treatment as a covariate. If non-proportional hazards are present, the logrank and equivalent Cox tests may lose power. To safeguard power, we previously suggested a 'joint test' combining the Cox test with a test of non-proportional hazards. Unfortunately, a larger sample size is needed to preserve power under PH. Here, we describe a novel test that unites the Cox test with a permutation test based on restricted mean survival time. METHODS We propose a combined hypothesis test based on a permutation test of the difference in restricted mean survival time across time. The test involves the minimum of the Cox and permutation test P-values. We approximate its null distribution and correct it for correlation between the two P-values. Using extensive simulations, we assess the type 1 error and power of the combined test under several scenarios and compare with other tests. We investigate powering a trial using the combined test. RESULTS The type 1 error of the combined test is close to nominal. Power under proportional hazards is slightly lower than for the Cox test. Enhanced power is available when the treatment difference shows an 'early effect', an initial separation of survival curves which diminishes over time. The power is reduced under a 'late effect', when little or no difference in survival curves is seen for an initial period and then a late separation occurs. We propose a method of powering a trial using the combined test.
The 'insurance premium' offered by the combined test to safeguard power under non-PH represents about a single-digit percentage increase in sample size. CONCLUSIONS The combined test increases trial power under an early treatment effect and protects power under other scenarios. Use of restricted mean survival time facilitates testing and displaying a generalized treatment effect.
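The correlation correction for the minimum of the two P-values can be illustrated with a small simulation. This is a sketch of the general idea only, not the authors' approximation: it models the two test statistics as correlated standard normals (a Gaussian copula), with the correlation `rho` an assumed input rather than an estimated quantity.

```python
import math
import numpy as np

def combined_p(p_min, rho, n_sim=100_000, seed=0):
    """Null probability that the smaller of two correlated P-values falls
    at or below p_min, modelling the dependence between the two tests
    with a Gaussian copula (rho is an assumed correlation)."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n_sim)
    # transform correlated normals into correlated uniform P-values
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return float(np.mean(u.min(axis=1) <= p_min))
```

With `rho = 0` this recovers the Šidák-type value 1 − (1 − p)²; as `rho` approaches 1 the corrected value shrinks toward the single-test P-value, which is why ignoring the correlation between the Cox and permutation P-values would be needlessly conservative.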
|
25
|
mfpa: Extension of mfp using the ACD covariate transformation for enhanced parametric multivariable modeling. THE STATA JOURNAL 2016; 16:72-87. [PMID: 29398977 PMCID: PMC5796636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
In a recent article, Royston (2015, Stata Journal 15: 275-291) introduced the approximate cumulative distribution (ACD) transformation of a continuous covariate x as a route toward modeling a sigmoid relationship between x and an outcome variable. In this article, we extend the approach to multivariable modeling by modifying the standard Stata program mfp. The result is a new program, mfpa, that has all the features of mfp plus the ability to fit a new model for user-selected covariates that we call FP1(p1, p2). The FP1(p1, p2) model comprises the best-fitting combination of a dimension-one fractional polynomial (FP1) function of x and an FP1 function of ACD(x). We describe a new model-selection algorithm, called the function-selection procedure with ACD transformation, which uses significance testing to attempt to simplify an FP1(p1, p2) model to a submodel: an FP1 or linear model in x or in ACD(x). The function-selection procedure with ACD transformation is related in concept to the FSP (FP function-selection procedure), which is an integral part of mfp and is used to simplify a dimension-two (FP2) function. We describe the mfpa command and give univariable and multivariable examples with real data to demonstrate its use.
|
26
|
Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data. BMC Med Res Methodol 2015; 15:82. [PMID: 26459415 PMCID: PMC4603804 DOI: 10.1186/s12874-015-0078-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2014] [Accepted: 10/02/2015] [Indexed: 12/12/2022] Open
Abstract
Background Prognostic studies of time-to-event data, where researchers aim to develop or validate multivariable prognostic models in order to predict survival, are commonly seen in the medical literature; however, most are performed retrospectively and few consider sample size prior to analysis. Events per variable rules are sometimes cited, but these are based on bias and coverage of confidence intervals for model terms, which are not of primary interest when developing a model to predict outcome. In this paper we aim to develop sample size recommendations for multivariable models of time-to-event data, based on their prognostic ability. Methods We derive formulae for determining the sample size required for multivariable prognostic models in time-to-event data, based on a measure of discrimination, D, developed by Royston and Sauerbrei. These formulae fall into two categories: either based on the significance of the value of D in a new study compared to a previous estimate, or based on the precision of the estimate of D in a new study in terms of confidence interval width. Using simulation we show that they give the desired power and type I error and are not affected by random censoring. Additionally, we conduct a literature review to collate published values of D in different disease areas. Results We illustrate our methods using parameters from a published prognostic study in liver cancer. The resulting sample sizes can be large, and we suggest controlling study size by expressing the desired accuracy in the new study as a relative value as well as an absolute value. To improve usability we use the values of D obtained from the literature review to develop an equation to approximately convert the commonly reported Harrell’s c-index to D. A flow chart is provided to aid decision making when using these methods. 
Conclusion We have developed a suite of sample size calculations based on the prognostic ability of a survival model, rather than the magnitude or significance of model coefficients. We have taken care to develop the practical utility of the calculations and give recommendations for their use in contemporary clinical research.
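The precision-based branch of the calculations can be sketched generically. The formula below is not the paper's; it only illustrates the standard precision argument, under the assumption that the sampling variance of the estimate of D behaves like σ₁²/e for e events, where `sigma1` is a user-supplied constant (an assumption for illustration, not a published value).

```python
import math
from statistics import NormalDist

def events_for_D_precision(sigma1, halfwidth, alpha=0.05):
    """Events needed so a (1 - alpha) confidence interval for D has the
    requested half-width, under the assumed variance model
    var(D_hat) = sigma1**2 / events (sigma1 is an illustrative input)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((z * sigma1 / halfwidth) ** 2)
```

Halving the target half-width quadruples the required number of events, which is one reason the resulting sample sizes "can be large" and why the authors suggest expressing the desired accuracy in relative as well as absolute terms.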
|
27
|
The extension of total gain (TG) statistic in survival models: properties and applications. BMC Med Res Methodol 2015; 15:50. [PMID: 26126418 PMCID: PMC4486698 DOI: 10.1186/s12874-015-0042-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2014] [Accepted: 06/12/2015] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND The results of multivariable regression models are usually summarized in the form of parameter estimates for the covariates, goodness-of-fit statistics, and the relevant p-values. These statistics do not inform us about whether covariate information will lead to any substantial improvement in prediction. Predictive ability measures can be used for this purpose since they provide important information about the practical significance of prognostic factors. R²-type indices are the most familiar forms of such measures in survival models, but they all have limitations and none is widely used. METHODS In this paper, we extend the total gain (TG) measure, proposed for a logistic regression model, to survival models and explore its properties using simulations and real data. TG is based on the binary regression quantile plot, otherwise known as the predictiveness curve. Standardised TG ranges from 0 (no explanatory power) to 1 ('perfect' explanatory power). RESULTS The results of our simulations show that unlike many of the other R²-type predictive ability measures, TG is independent of random censoring. It increases as the effect of a covariate increases and can be applied to different types of survival models, including models with time-dependent covariate effects. We also apply TG to quantify the predictive ability of multivariable prognostic models developed in several disease areas. CONCLUSIONS Overall, TG performs well in our simulation studies and can be recommended as a measure to quantify the predictive ability in survival models.
|
28
|
Meta-analysis of time-to-event outcomes from randomized trials using restricted mean survival time: application to individual participant data. Stat Med 2015; 34:2881-98. [PMID: 26099573 DOI: 10.1002/sim.6556] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2013] [Revised: 03/28/2015] [Accepted: 05/24/2015] [Indexed: 12/13/2022]
Abstract
Meta-analysis of time-to-event outcomes using the hazard ratio as a treatment effect measure has an underlying assumption that hazards are proportional. The between-arm difference in the restricted mean survival time is a measure that avoids this assumption and allows the treatment effect to vary with time. We describe and evaluate meta-analysis based on the restricted mean survival time for dealing with non-proportional hazards and present a diagnostic method for the overall proportional hazards assumption. The methods are illustrated with the application to two individual participant meta-analyses in cancer. The examples were chosen because they differ in disease severity and the patterns of follow-up, in order to understand the potential impacts on the hazards and the overall effect estimates. We further investigate the estimation methods for restricted mean survival time by a simulation study.
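A fixed-effect version of such a meta-analysis reduces to inverse-variance pooling of the per-trial restricted mean survival time differences. The sketch below shows only that pooling step; the per-trial differences and variances are invented for illustration, not taken from the paper.

```python
import numpy as np

def pool_fixed_effect(estimates, variances):
    """Fixed-effect (inverse-variance) pooling of per-trial estimates,
    e.g. between-arm differences in restricted mean survival time."""
    w = 1.0 / np.asarray(variances, dtype=float)
    est = np.asarray(estimates, dtype=float)
    pooled = float(np.sum(w * est) / np.sum(w))
    return pooled, float(1.0 / np.sum(w))

# invented per-trial RMST differences (months) and their variances
diffs = [1.8, 2.5, 0.9]
vars_ = [0.40, 0.90, 0.25]
pooled, pooled_var = pool_fixed_effect(diffs, vars_)
```

A random-effects version would add a between-trial variance component to each weight; the fixed-effect form above is just the simplest case.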
|
29
|
Combining fractional polynomial model building with multiple imputation. Stat Med 2015; 34:3298-317. [PMID: 26095614 PMCID: PMC4871237 DOI: 10.1002/sim.6553] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2014] [Revised: 05/07/2015] [Accepted: 05/19/2015] [Indexed: 01/03/2023]
Abstract
Multivariable fractional polynomial (MFP) models are commonly used in medical research. The datasets in which MFP models are applied often contain covariates with missing values. To handle the missing values, we describe methods for combining multiple imputation with MFP modelling, considering in turn three issues: first, how to impute so that the imputation model does not favour certain fractional polynomial (FP) models over others; second, how to estimate the FP exponents in multiply imputed data; and third, how to choose between models of differing complexity. Two imputation methods are outlined for different settings. For model selection, methods based on Wald‐type statistics and weighted likelihood‐ratio tests are proposed and evaluated in simulation studies. The Wald‐based method is very slightly better at estimating FP exponents. Type I error rates are very similar for both methods, although slightly less well controlled than analysis of complete records; however, there is potential for substantial gains in power over the analysis of complete records. We illustrate the two methods in a dataset from five trauma registries for which a prognostic model has previously been published, contrasting the selected models with that obtained by analysing the complete records only. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
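Both proposed selection methods pool within the standard Rubin's-rules framework; for a single scalar coefficient the pooling step is short. A minimal sketch of plain Rubin's rules (the standard combining rules, not the paper's FP-specific extensions), with toy estimates and variances:

```python
def rubin_pool(estimates, variances):
    """Rubin's rules for one scalar parameter across m imputations:
    the pooled estimate is the mean of the per-imputation estimates;
    the total variance is the mean within-imputation variance plus the
    between-imputation variance inflated by (1 + 1/m)."""
    m = len(estimates)
    qbar = sum(estimates) / m
    wbar = sum(variances) / m
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)
    return qbar, wbar + (1 + 1 / m) * b

# toy per-imputation estimates of one coefficient and their variances
qbar, total_var = rubin_pool([1.0, 1.2, 0.8, 1.1, 0.9], [0.04] * 5)
```

Wald-type statistics of the kind evaluated in the paper are then built from the pooled estimate and the total variance.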
|
30
|
Correction of the anaemia of chronic renal failure with erythropoietin: pharmacokinetic studies in patients on haemodialysis and CAPD. CONTRIBUTIONS TO NEPHROLOGY 2015; 76:122-30. [PMID: 2582777 DOI: 10.1159/000417888] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
|
31
|
Prognostic survival model for people diagnosed with invasive cutaneous melanoma. BMC Cancer 2015; 15:27. [PMID: 25637143 PMCID: PMC4328047 DOI: 10.1186/s12885-015-1024-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2014] [Accepted: 01/14/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The ability of medical practitioners to communicate risk estimates effectively to patients diagnosed with melanoma relies on accurate information about prognostic factors and their impact on survival. This study reports the development of one of the few melanoma prognostic models, called the Melanoma Severity Index (MSI), based on population-based cancer registry data. METHODS Data from the Queensland Cancer Registry were analysed for people (20-89 years) diagnosed with a single invasive melanoma between 1995 and 2008 (n = 28,654; 1,700 melanoma deaths). Additional clinical information about metastasis, ulceration and positive lymph nodes was manually extracted from pathology forms. Flexible parametric survival models were combined with multivariable fractional polynomials for selecting variables and transformations of continuous variables. Multiple imputation was used for missing covariate values. RESULTS The MSI contained the variables thickness (transformed, explained 40.6% of variation in survival), body site (additional 1.9% of variation), metastasis (1.8%), positive nodes (0.7%), ulceration (1.3%) and age (1.1%). Royston and Sauerbrei's D statistic (a measure of discrimination) was 1.50 (95% CI = 1.44, 1.56) and the corresponding R²D (a measure of explained variation) was 0.47 (0.45, 0.49), demonstrating strong explanatory performance. The Harrell C statistic was 0.88 (0.88, 0.89). Lacking an external validation dataset, we applied internal-external cross-validation to demonstrate the consistency of the prognostic information across geographically defined subsets of the cohort. CONCLUSIONS The MSI provides good ability to predict survival for melanoma patients. Beyond the immediate clinical use, the MSI may have important public health and research applications for evaluations of public health interventions aimed at reducing deaths from melanoma.
|
32
|
The estimation and use of predictions for the assessment of model performance using large samples with multiply imputed data. Biom J 2015; 57:614-32. [PMID: 25630926 PMCID: PMC4515100 DOI: 10.1002/bimj.201400004] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2014] [Revised: 10/02/2014] [Accepted: 10/13/2014] [Indexed: 01/02/2023]
Abstract
Multiple imputation can be used as a tool in the process of constructing prediction models in medical and epidemiological studies with missing covariate values. Such models can be used to make predictions for model performance assessment, but the task is made more complicated by the multiple imputation structure. We summarize various predictions constructed from covariates, including multiply imputed covariates, and either the set of imputation-specific prediction model coefficients or the pooled prediction model coefficients. We further describe approaches for using the predictions to assess model performance. We distinguish between ideal model performance and pragmatic model performance, where the former refers to the model's performance in an ideal clinical setting where all individuals have fully observed predictors and the latter refers to the model's performance in a real-world clinical setting where some individuals have missing predictors. The approaches are compared through an extensive simulation study based on the UK700 trial. We determine that measures of ideal model performance can be estimated within imputed datasets and subsequently pooled to give an overall measure of model performance. Alternative methods to evaluate pragmatic model performance are required and we propose constructing predictions either from a second set of covariate imputations which make no use of observed outcomes, or from a set of partial prediction models constructed for each potential observed pattern of covariate. Pragmatic model performance is generally lower than ideal model performance. We focus on model performance within the derivation data, but describe how to extend all the methods to a validation dataset.
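The distinction between the prediction routes is easy to demonstrate: for a linear predictor, averaging imputation-specific predictions and predicting from pooled coefficients coincide exactly, and they diverge only after a nonlinear transformation such as the logistic link. A toy sketch (invented data; `betas` stands in for imputation-specific fitted coefficients, which here are simply drawn at random rather than estimated):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))  # covariate rows for six individuals
# five sets of per-imputation coefficients (toy values, not fitted)
betas = [rng.normal(loc=[0.5, -0.3], scale=0.05) for _ in range(5)]

# route 1: average the imputation-specific linear predictors
pred_avg = np.mean([X @ b for b in betas], axis=0)
# route 2: linear predictor from the pooled (averaged) coefficients
pred_pooled = X @ np.mean(betas, axis=0)

# after a nonlinear link the two routes no longer agree in general
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
prob_avg = np.mean([sigmoid(X @ b) for b in betas], axis=0)
prob_pooled = sigmoid(X @ np.mean(betas, axis=0))
```

By Jensen's inequality the two probability vectors differ in general, though only negligibly when the coefficients vary little across imputations; this is one reason the choice of prediction construction matters for performance assessment.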
|
33
|
Interaction of treatment with a continuous variable: simulation study of power for several methods of analysis. Stat Med 2014; 33:4695-708. [PMID: 25244679 DOI: 10.1002/sim.6308] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2013] [Accepted: 08/16/2014] [Indexed: 11/07/2022]
Abstract
In a large simulation study reported in a companion paper, we investigated the significance levels of 21 methods for investigating interactions between binary treatment and a continuous covariate in a randomised controlled trial. Several of the methods were shown to have inflated type 1 errors. In the present paper, we report the second part of the simulation study in which we investigated the power of the interaction procedures for two sample sizes and with two distributions of the covariate (well and badly behaved). We studied several methods involving categorisation and others in which the covariate was kept continuous, including fractional polynomials and splines. We believe that the results provide sufficient evidence to recommend the multivariable fractional polynomial interaction procedure as a suitable approach to investigate interactions of treatment with a continuous variable. If subject-matter knowledge gives good arguments for a non-monotone treatment effect function, we propose to use a second-degree fractional polynomial approach, but otherwise a first-degree fractional polynomial (FP1) function with added flexibility (FLEX3) is the method of choice. The FP1 class includes the linear function, and the selected functions are simple, understandable, and transferable. Furthermore, software is available. We caution that investigation of interactions in one dataset can only be interpreted in a hypothesis-generating sense and needs validation in new data.
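An FP1 function chooses one power for the covariate from a small fixed set, with power 0 denoting the logarithm. The sketch below picks the best-fitting FP1 power by least squares on simulated data; this is illustrative only, since the published procedure selects functions by significance testing inside a full model-building algorithm, not by raw residual sums of squares.

```python
import numpy as np

# conventional FP1 power set; 0 denotes log(x)
FP1_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp1_transform(x, p):
    return np.log(x) if p == 0 else x ** p

def best_fp1(x, y):
    """Choose the FP1 power whose simple linear fit y ~ a + b * x^p
    minimises the residual sum of squares (x must be positive)."""
    best = None
    for p in FP1_POWERS:
        z = fp1_transform(x, p)
        A = np.column_stack([np.ones_like(z), z])
        coef = np.linalg.lstsq(A, y, rcond=None)[0]
        rss = float(np.sum((y - A @ coef) ** 2))
        if best is None or rss < best[1]:
            best = (p, rss, coef)
    return best

rng = np.random.default_rng(0)
x = rng.uniform(0.2, 10.0, size=300)
y = 2.0 + 1.5 * np.log(x) + rng.normal(scale=0.05, size=300)

p, rss, coef = best_fp1(x, y)  # should recover the log (p = 0) form
```

The FP1 class includes the linear function (p = 1), which is why the selected functions remain simple and transferable, as the abstract notes.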
|
34
|
An approach to trial design and analysis in the era of non-proportional hazards of the treatment effect. Trials 2014; 15:314. [PMID: 25098243 PMCID: PMC4133607 DOI: 10.1186/1745-6215-15-314] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2013] [Accepted: 07/17/2014] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Most randomized controlled trials with a time-to-event outcome are designed and analysed under the proportional hazards assumption, with a target hazard ratio for the treatment effect in mind. However, the hazards may be non-proportional. We address how to design a trial under such conditions, and how to analyse the results. METHODS We propose to extend the usual approach, a logrank test, to also include the Grambsch-Therneau test of proportional hazards. We test the resulting composite null hypothesis using a joint test for the hazard ratio and for time-dependent behaviour of the hazard ratio. We compute the power and sample size for the logrank test under proportional hazards, and from that we compute the power of the joint test. For the estimation of relevant quantities from the trial data, various models could be used; we advocate adopting a pre-specified flexible parametric survival model that supports time-dependent behaviour of the hazard ratio. RESULTS We present the mathematics for calculating the power and sample size for the joint test. We illustrate the methodology in real data from two randomized trials, one in ovarian cancer and the other in treating cellulitis. We show selected estimates and their uncertainty derived from the advocated flexible parametric model. We demonstrate in a small simulation study that when a treatment effect either increases or decreases over time, the joint test can outperform the logrank test in the presence of both patterns of non-proportional hazards. CONCLUSIONS Those designing and analysing trials in the era of non-proportional hazards need to acknowledge that a more complex type of treatment effect is becoming more common. Our method for the design of the trial retains the tools familiar in the standard methodology based on the logrank test, and extends it to incorporate a joint test of the null hypothesis with power against non-proportional hazards. 
For the analysis of trial data, we propose the use of a pre-specified flexible parametric model that can represent a time-dependent hazard ratio if one is present.
|
35
|
Abstract
The C statistic is a commonly reported measure of screening test performance. Optimistic estimation of the C statistic is a frequent problem because of overfitting of statistical models in small data sets, and methods exist to correct for this issue. However, many studies do not use such methods, and those that do correct for optimism use diverse methods, some of which are known to be biased. We used clinical data sets (United Kingdom Down syndrome screening data from Glasgow (1991–2003), Edinburgh (1999–2003), and Cambridge (1990–2006), as well as Scottish national pregnancy discharge data (2004–2007)) to evaluate different approaches to adjustment for optimism. We found that sample splitting, cross-validation without replication, and leave-1-out cross-validation produced optimism-adjusted estimates of the C statistic that were biased and/or associated with greater absolute error than other available methods. Cross-validation with replication, bootstrapping, and a new method (leave-pair-out cross-validation) all generated unbiased optimism-adjusted estimates of the C statistic and had similar absolute errors in the clinical data set. Larger simulation studies confirmed that all 3 methods performed similarly with 10 or more events per variable, or when the C statistic was 0.9 or greater. However, with lower events per variable or lower C statistics, bootstrapping tended to be optimistic but with lower absolute and mean squared errors than both methods of cross-validation.
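Of the unbiased approaches listed, bootstrapping is the easiest to sketch. The snippet below is a generic Harrell-style bootstrap optimism correction, not the code used in the study; an ordinary least-squares score stands in for a logistic model so the example stays self-contained, and the small-n, many-predictor setup is chosen to make overfitting visible.

```python
import numpy as np

def auc(score, y):
    """C statistic via the Mann-Whitney rank formula (continuous scores)."""
    n = len(score)
    ranks = np.empty(n)
    ranks[np.argsort(score)] = np.arange(1, n + 1)
    n1 = int(y.sum())
    n0 = n - n1
    return float((ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0))

def fit(X, y):
    # least-squares score in place of a logistic model, for brevity
    return np.linalg.lstsq(np.column_stack([np.ones(len(X)), X]), y, rcond=None)[0]

def predict(X, b):
    return np.column_stack([np.ones(len(X)), X]) @ b

rng = np.random.default_rng(2)
n, k = 80, 8                       # small sample, many predictors -> overfit
X = rng.normal(size=(n, k))
y = (X[:, 0] + rng.normal(size=n) > 0).astype(float)  # only X[:, 0] matters

b = fit(X, y)
c_apparent = auc(predict(X, b), y)

B = 200
optimism = 0.0
for _ in range(B):
    idx = rng.integers(0, n, size=n)  # bootstrap resample
    bb = fit(X[idx], y[idx])
    # (performance on the bootstrap sample) - (performance on the original)
    optimism += auc(predict(X[idx], bb), y[idx]) - auc(predict(X, bb), y)

c_adjusted = c_apparent - optimism / B
```

The adjusted value falls below the apparent one, which is exactly the optimism the abstract is concerned with; the study's point is that with few events per variable this bootstrap correction itself tends toward optimism relative to cross-validation with replication or leave-pair-out cross-validation.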
|
36
|
Investigation of continuous effect modifiers in a meta-analysis on higher versus lower PEEP in patients requiring mechanical ventilation--protocol of the ICEM study. Syst Rev 2014; 3:46. [PMID: 24887172 PMCID: PMC4035853 DOI: 10.1186/2046-4053-3-46] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/28/2014] [Accepted: 05/07/2014] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Categorizing an inherently continuous predictor in prognostic analyses raises several critical methodological issues: dependence of the statistical significance on the number and position of the chosen cut-point(s), loss of statistical power, and faulty interpretation of the results if a non-linear association is incorrectly assumed to be linear. This also applies to a therapeutic context where investigators of randomized clinical trials (RCTs) are interested in interactions between treatment assignment and one or more continuous predictors. METHODS/DESIGN Our goal is to apply the multivariable fractional polynomial interaction (MFPI) approach to investigate interactions between continuous patient baseline variables and the allocated treatment in an individual patient data meta-analysis of three RCTs (N = 2,299) from the intensive care field. For each study, MFPI will provide a continuous treatment effect function. Functions from each of the three studies will be averaged by a novel meta-analysis approach for functions. We will plot treatment effect functions separately for each study and also the averaged function. The averaged function with a related confidence interval will provide a suitable basis to assess whether a continuous patient characteristic modifies the treatment comparison and may be relevant for clinical decision-making. The compared interventions will be a higher or lower positive end-expiratory pressure (PEEP) ventilation strategy in patients requiring mechanical ventilation. The continuous baseline variables body mass index, PaO2/FiO2, respiratory compliance, and oxygenation index will be the investigated potential effect modifiers. Clinical outcomes for this analysis will be in-hospital mortality, time to death, time to unassisted breathing, and pneumothorax. DISCUSSION This project will be the first meta-analysis to combine continuous treatment effect functions derived by the MFPI procedure separately in each of several RCTs. 
Such an approach requires individual patient data (IPD). They are available from an earlier IPD meta-analysis using different methods for analysis. This new analysis strategy allows assessing whether treatment effects interact with continuous baseline patient characteristics and avoids categorization-based subgroup analyses. These interaction analyses of the present study will be exploratory in nature. However, they may help to foster future research using the MFPI approach to improve interaction analyses of continuous predictors in RCTs and IPD meta-analyses. This study is registered in PROSPERO (CRD42012003129).
|
37
|
A smooth covariate rank transformation for use in regression models with a sigmoid dose-response function. THE STATA JOURNAL 2014; 14:329-341. [PMID: 29097908 PMCID: PMC5663339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
We consider how to represent sigmoid-type regression relationships in a practical and parsimonious way. A pure sigmoid relationship has an asymptote at both ends of the range of a continuous covariate. Curves with a single asymptote are also important in practice. Many smoothers, such as fractional polynomials and restricted cubic regression splines, cannot accurately represent doubly asymptotic curves. Such smoothers may struggle even with singly asymptotic curves. Our approach to modeling sigmoid relationships involves applying a preliminary scaled rank transformation to compress the tails of the observed distribution of a continuous covariate. We include a step that provides a smooth approximation to the empirical cumulative distribution function of the covariate via the scaled ranks. The procedure defines the approximate cumulative distribution transformation of the covariate. To fit the substantive model, we apply fractional polynomial regression to the outcome with the smoothed, scaled ranks as the covariate. When the resulting fractional polynomial function is monotone, we have a sigmoid function. We demonstrate several practical applications of the approximate cumulative distribution transformation while also illustrating its ability to model some unusual functional forms. We describe a command, acd, that implements it.
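The construction described can be sketched in a few lines. This is a crude stand-in for the acd command, assuming a linear fit on the probit scale where the article uses fractional polynomial smoothing; it only illustrates the pipeline of scaled ranks, smoothing, and back-transformation.

```python
import numpy as np
from statistics import NormalDist

def acd_sketch(x):
    """Approximate cumulative distribution transformation, sketched:
    scaled ranks -> probit scale -> smooth as a function of x (here a
    linear fit, a crude stand-in for the FP1 smoothing step) -> map
    back to the probability scale."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    ranks = np.empty(n)
    ranks[np.argsort(x)] = np.arange(1, n + 1)
    u = (ranks - 0.5) / n                                # scaled ranks in (0, 1)
    z = np.array([NormalDist().inv_cdf(v) for v in u])   # probit of scaled ranks
    A = np.column_stack([np.ones(n), x])
    coef = np.linalg.lstsq(A, z, rcond=None)[0]          # smooth z against x
    return np.array([NormalDist().cdf(v) for v in A @ coef])

rng = np.random.default_rng(3)
x = rng.normal(size=200)
a = acd_sketch(x)  # monotone, bounded in (0, 1): a sigmoid in x
```

Because the output lives in (0, 1) with an asymptote at each end, a monotone fractional polynomial in this transformed covariate yields the doubly asymptotic sigmoid shapes the abstract describes.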
|
38
|
Reference intervals of spinal mobility measures in normal individuals: the mobility study. Ann Rheum Dis 2014; 74:1218-24. [DOI: 10.1136/annrheumdis-2013-204953] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2013] [Accepted: 03/01/2014] [Indexed: 11/04/2022]
Abstract
Objectives To establish reference intervals (RIs) for spinal mobility measures as recommended for patients with axial spondyloarthritis, and to determine the effect of age, height and gender on spinal mobility, in normal individuals. Methods A cross-sectional study (MOBILITY) was conducted among normal individuals aged 20–69 years. Recruitment was stratified by gender, age (10-year categories) and height (10 cm categories). Eleven spinal mobility measures were assessed. Age specific RIs and percentiles were derived for each measure. Results 393 volunteers were included. All spinal mobility measures decreased with increasing age. Therefore, age specific RIs were developed. The 95% RIs (2.5th and 97.5th percentiles), as well as the 5th, 10th, 25th, 50th, 75th and 90th percentiles for each spinal mobility measure and different ages are presented. Mobility percentile curves were also plotted for each of the measures. For instance, the 95% RI for lateral spinal flexion was 16.2–28.0 cm for a 25-year-old subject, 13.2–25.0 cm for a 45-year-old subject and 10.1–21.9 cm for a 65-year-old subject. After adjustment for age, there was no need for gender specific RIs, while RIs of some measures are height-adjusted. Conclusions Age specific RIs and percentiles were derived for each of the spinal mobility measures for normal individuals. These may guide clinicians when assessing the mobility of patients with axial spondyloarthritis. The RIs may serve as cut-off levels for ‘normal’ versus ‘abnormal’, whereas the mobility percentile curves may be used to assess the level of mobility of patients with axial spondyloarthritis.
|
39
|
Restricted mean survival time: an alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Med Res Methodol 2013; 13:152. [PMID: 24314264 PMCID: PMC3922847 DOI: 10.1186/1471-2288-13-152] [Citation(s) in RCA: 542] [Impact Index Per Article: 49.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2013] [Accepted: 11/15/2013] [Indexed: 12/28/2022] Open
Abstract
Background Designs and analyses of clinical trials with a time-to-event outcome almost invariably rely on the hazard ratio to estimate the treatment effect and implicitly, therefore, on the proportional hazards assumption. However, the results of some recent trials indicate that there is no guarantee that the assumption will hold. Here, we describe the use of the restricted mean survival time as a possible alternative tool in the design and analysis of these trials. Methods The restricted mean is a measure of average survival from time 0 to a specified time point, and may be estimated as the area under the survival curve up to that point. We consider the design of such trials according to a wide range of possible survival distributions in the control and research arm(s). The distributions are conveniently defined as piecewise exponential distributions and can be specified through piecewise constant hazards and time-fixed or time-dependent hazard ratios. Such designs can embody proportional or non-proportional hazards of the treatment effect. Results We demonstrate the use of restricted mean survival time and a test of the difference in restricted means as an alternative measure of treatment effect. We support the approach through the results of simulation studies and in real examples from several cancer trials. We illustrate the required sample size under proportional and non-proportional hazards, also the significance level and power of the proposed test. Values are compared with those from the standard approach which utilizes the logrank test. Conclusions We conclude that the hazard ratio cannot be recommended as a general measure of the treatment effect in a randomized controlled trial, nor is it always appropriate when designing a trial. Restricted mean survival time may provide a practical way forward and deserves greater attention.
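The estimator described — the area under the Kaplan-Meier survival curve up to a chosen time point — fits in a few lines. A sketch with naive tie handling, for illustration only:

```python
import numpy as np

def km_rmst(time, event, tau):
    """Restricted mean survival time to tau: area under the Kaplan-Meier
    survival curve from 0 to tau.  Ties between deaths and censorings
    are handled naively (by sorted order only)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    order = np.argsort(time)
    time, event = time[order], event[order]
    steps = [(0.0, 1.0)]                 # (time, survival) after each death
    s, at_risk = 1.0, len(time)
    for t, d in zip(time, event):
        if d:
            s *= (at_risk - 1) / at_risk
            steps.append((t, s))
        at_risk -= 1
    area, (prev_t, prev_s) = 0.0, steps[0]
    for t, s in steps[1:]:               # integrate the step function
        if t >= tau:
            break
        area += (t - prev_t) * prev_s
        prev_t, prev_s = t, s
    return area + (tau - prev_t) * prev_s
```

Hand-checking with times (1, 2, 3, 4), event indicators (1, 0, 1, 0) and tau = 4: survival steps to 3/4 at t = 1 and 3/8 at t = 3, so the area is 1·1 + 2·0.75 + 1·0.375 = 2.875. The between-arm difference in this quantity is the treatment effect measure the paper proposes in place of the hazard ratio.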
Collapse
|
40
|
|
41
|
Multiple imputation for an incomplete covariate that is a ratio. Stat Med 2013; 33:88-104. [PMID: 23922236 PMCID: PMC3920636 DOI: 10.1002/sim.5935] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2013] [Accepted: 07/11/2013] [Indexed: 11/27/2022]
Abstract
We are concerned with multiple imputation of the ratio of two variables, which is to be used as a covariate in a regression analysis. If the numerator and denominator are not missing simultaneously, it seems sensible to make use of the observed variable in the imputation model. One such strategy is to impute missing values for the numerator and denominator, or the log-transformed numerator and denominator, and then calculate the ratio of interest; we call this ‘passive’ imputation. Alternatively, missing ratio values might be imputed directly, with or without the numerator and/or the denominator in the imputation model; we call this ‘active’ imputation. In two motivating datasets, one involving body mass index as a covariate and the other involving the ratio of total to high-density lipoprotein cholesterol, we assess the sensitivity of results to the choice of imputation model and, as an alternative, explore fully Bayesian joint models for the outcome and incomplete ratio. Fully Bayesian approaches using WinBUGS were unusable in both datasets because of computational problems. In our first dataset, multiple imputation results are similar regardless of the imputation model; in the second, results are sensitive to the choice of imputation model. Sensitivity depends strongly on the coefficient of variation of the ratio's denominator. A simulation study demonstrates that passive imputation without transformation is risky because it can lead to downward bias when the coefficient of variation of the ratio's denominator is larger than about 0.1. Active imputation or passive imputation after log-transformation is preferable. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
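Why the denominator's coefficient of variation (CV) matters can be seen from the mechanism itself: for independent positive X and Y, E[X/Y] exceeds E[X]/E[Y] by roughly a factor (1 + CV_Y²), so filling in numerator and denominator on the raw scale and then forming the ratio systematically understates the mean ratio. A simplified illustration of this mechanism (illustrative numbers; not the paper's imputation simulation):

```python
import random, statistics

random.seed(1)
mu_x, n = 2.0, 100_000
results = {}
for sigma in (0.05, 0.1, 0.3):        # log-scale SD of the denominator Y
    # Lognormal denominator keeps Y positive; CV_Y grows with sigma
    ys = [random.lognormvariate(0.0, sigma) for _ in range(n)]
    xs = [random.gauss(mu_x, 0.2) for _ in range(n)]
    mean_ratio = statistics.fmean(x / y for x, y in zip(xs, ys))
    ratio_of_means = statistics.fmean(xs) / statistics.fmean(ys)
    results[sigma] = (mean_ratio, ratio_of_means)
    print(f"sigma={sigma}: mean(X/Y)={mean_ratio:.3f}  "
          f"mean(X)/mean(Y)={ratio_of_means:.3f}")
```

At sigma = 0.05 the two quantities nearly coincide; at sigma = 0.3 the plug-in ratio of means falls visibly short of the mean ratio, mirroring the abstract's warning about denominators with CV above roughly 0.1.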
Collapse
|
42
|
Exaggeration of the prognostic effect of Mammostrat: a consequence of poor reporting? J Clin Oncol 2013; 31:2760-1. [PMID: 23775953 DOI: 10.1200/jco.2013.49.2652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
|
43
|
SAT0249 Reference Intervals of Spinal Mobility Measures in Normal Individuals – The Mobility Study. Ann Rheum Dis 2013. [DOI: 10.1136/annrheumdis-2013-eular.1974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
|
44
|
Interaction of treatment with a continuous variable: simulation study of significance level for several methods of analysis. Stat Med 2013; 32:3788-803. [PMID: 23580422 DOI: 10.1002/sim.5813] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2012] [Accepted: 03/07/2013] [Indexed: 11/08/2022]
Abstract
Interactions between treatments and covariates in RCTs are a key topic. Standard methods for modelling treatment-covariate interactions with continuous covariates are categorisation or linear functions. Both approaches are easily criticised, but for different reasons. The multivariable fractional polynomial interaction (MFPI) approach, based on fractional polynomials with the linear interaction model as its simplest special case, was proposed earlier. Four variants (FLEX1-FLEX4), allowing varying flexibility in functional form, were suggested. However, their properties are unknown, and comparisons with other procedures are unavailable. Additionally, we consider various methods based on categorisation and on cubic regression splines. We present the results of a simulation study to determine the significance level (probability of a type 1 error) of various tests for interaction between a binary covariate ('treatment effect') and a continuous covariate in univariate analysis. We consider a simplified setting in which the response variable is conditionally normally distributed, given the continuous covariate. We consider two main cases, with the covariate distribution either well behaved (approximately symmetric) or badly behaved (positively skewed). We construct nine scenarios with different functional forms for the main effect. In the well-behaved case, significance levels are in general acceptably close to nominal and are slightly better for the larger sample size (n = 250 and n = 500 were investigated). In the badly behaved case, departures from nominal are more pronounced for several approaches. For a final assessment of these results and recommendations for practice, a study of power is needed.
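The kind of null simulation this abstract describes can be sketched in a few lines: generate two arms with the same covariate-outcome slope (so no interaction truly exists), test the difference in slopes, and check that the rejection rate is near the nominal 5%. This uses a plain linear interaction test, not the fractional-polynomial or spline variants studied in the paper, and all numbers are illustrative:

```python
import random, math

random.seed(7)

def slope_and_se(xs, ys):
    """OLS slope of y on x and its standard error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return b, math.sqrt(rss / (n - 2) / sxx)

n_sim, n_per_arm, rejections = 500, 125, 0   # total n = 250, as in the paper
for _ in range(n_sim):
    arms = []
    for _arm in range(2):
        xs = [random.gauss(0, 1) for _ in range(n_per_arm)]
        ys = [x + random.gauss(0, 1) for x in xs]  # same slope: null interaction
        arms.append(slope_and_se(xs, ys))
    (b0, se0), (b1, se1) = arms
    z = (b1 - b0) / math.sqrt(se0 ** 2 + se1 ** 2)
    rejections += abs(z) > 1.96                    # nominal two-sided 5% test
print(f"empirical type 1 error ≈ {rejections / n_sim:.3f}")
```

With a symmetric ("well-behaved") covariate as here, the empirical level should sit close to 0.05; the paper's harder cases replace the Gaussian covariate with a skewed one and the linear test with more flexible ones.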
Collapse
|
45
|
External validation of a Cox prognostic model: principles and methods. BMC Med Res Methodol 2013; 13:33. [PMID: 23496923 PMCID: PMC3667097 DOI: 10.1186/1471-2288-13-33] [Citation(s) in RCA: 567] [Impact Index Per Article: 51.5] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2012] [Accepted: 02/15/2013] [Indexed: 11/17/2022] Open
Abstract
Background A prognostic model should not enter clinical practice unless it has been demonstrated that it performs a useful role. External validation denotes evaluation of model performance in a sample independent of that used to develop the model. Unlike for logistic regression models, external validation of Cox models is sparsely treated in the literature. Successful validation of a model means achieving satisfactory discrimination and calibration (prediction accuracy) in the validation sample. Validating Cox models is not straightforward because event probabilities are estimated relative to an unspecified baseline function. Methods We describe statistical approaches to external validation of a published Cox model according to the level of published information, specifically (1) the prognostic index only, (2) the prognostic index together with Kaplan-Meier curves for risk groups, and (3) the first two plus the baseline survival curve (the estimated survival function at the mean prognostic index across the sample). The most challenging task, requiring level 3 information, is assessing calibration, for which we suggest a method of approximating the baseline survival function. Results We apply the methods to two comparable datasets in primary breast cancer, treating one as derivation and the other as validation sample. Results are presented for discrimination and calibration. We demonstrate plots of survival probabilities that can assist model evaluation. Conclusions Our validation methods are applicable to a wide range of prognostic studies and provide researchers with a toolkit for external validation of a published Cox model.
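With only "level 1" information (the prognostic index), one common way to assess discrimination in the validation sample is a concordance index over usable pairs, such as Harrell's C. A minimal sketch with made-up data (the paper's validation toolkit is broader than this single measure):

```python
def harrell_c(times, events, prog_index):
    """Harrell's concordance index: among usable pairs, the fraction in
    which the patient with the higher prognostic index (higher risk)
    has the shorter observed survival time. Ties in the index count 0.5.
    times: follow-up times; events: 1 = event, 0 = censored."""
    concordant, usable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable if i's time is an observed event
            # occurring before j's (possibly censored) time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if prog_index[i] > prog_index[j]:
                    concordant += 1.0
                elif prog_index[i] == prog_index[j]:
                    concordant += 0.5
    return concordant / usable

# Illustrative validation data: times, event indicators, prognostic index
times  = [5, 8, 3, 12, 9]
events = [1, 0, 1, 1, 0]
pi     = [1.0, 2.0, 2.5, 0.5, 1.5]
print(round(harrell_c(times, events, pi), 3))  # → 0.714
```

Calibration, by contrast, needs the level 3 information discussed in the abstract (a baseline survival curve), since event probabilities cannot be recovered from the prognostic index alone.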
Collapse
|
46
|
Impact of lack-of-benefit stopping rules on treatment effect estimates of two-arm multi-stage (TAMS) trials with time to event outcome. Trials 2013; 14:23. [PMID: 23343147 PMCID: PMC3599134 DOI: 10.1186/1745-6215-14-23] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2012] [Accepted: 12/03/2012] [Indexed: 11/10/2022] Open
Abstract
Background In 2011, Royston et al. described technical details of a two-arm, multi-stage (TAMS) design. The design enables a trial to be stopped part-way through recruitment if the accumulating data suggest a lack of benefit of the experimental arm. Such interim decisions can be made using data on an available ‘intermediate’ outcome. At the conclusion of the trial, the definitive outcome is analyzed. Typical intermediate and definitive outcomes in cancer might be progression-free and overall survival, respectively. In TAMS designs, the stopping rule applied at the interim stage(s) affects the sampling distribution of the treatment effect estimator, potentially inducing bias that needs addressing. Methods We quantified the bias in the treatment effect estimator in TAMS trials according to the size of the treatment effect and for different designs. We also retrospectively ‘redesigned’ completed cancer trials as TAMS trials and used the bootstrap to quantify bias. Results In trials in which the experimental treatment is better than the control and which continue to their planned end, the bias in the estimate of treatment effect is small and of no practical importance. In trials stopped for lack of benefit at an interim stage, the treatment effect estimate is biased at the time of interim assessment. This bias is markedly reduced by further patient follow-up and reanalysis at the planned ‘end’ of the trial. Conclusions Provided that all patients in a TAMS trial are followed up to the planned end of the trial, the bias in the estimated treatment effect is of no practical importance. Bias correction is then unnecessary.
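The conditional bias mechanism the authors quantify can be illustrated with a stylized normal-means version of a two-stage trial. This is a toy sketch, not the paper's time-to-event simulation: among trials stopped at the interim look for apparent lack of benefit, the interim estimate is biased away from the true effect, and re-estimating at the planned end with the additional (unselected) data reduces that bias.

```python
import random, statistics

random.seed(11)
theta, sd_half = 0.0, 0.2            # true effect; SD of each half-sample estimate
interim_ests, final_ests = [], []
for _ in range(20_000):
    est1 = random.gauss(theta, sd_half)       # estimate from first-stage data
    if est1 < 0:                              # stop for lack of benefit
        est2 = random.gauss(theta, sd_half)   # later data, still collected
        interim_ests.append(est1)
        final_ests.append((est1 + est2) / 2)  # reanalysis at the planned end
print(f"bias at interim stop: {statistics.fmean(interim_ests):+.3f}")
print(f"bias at planned end:  {statistics.fmean(final_ests):+.3f}")
```

In this toy version the bias roughly halves at the planned end, because the second-stage estimate is unaffected by the selection; the paper shows that in the TAMS time-to-event setting, full follow-up renders the residual bias practically unimportant.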
Collapse
|
47
|
Impact of lack-of-benefit stopping rules on treatment effect estimates of two-arm multi-stage (TAMS) trials with time to event outcome. Trials 2013. [PMCID: PMC3980300 DOI: 10.1186/1745-6215-14-s1-o2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022] Open
|
48
|
Comparison between splines and fractional polynomials for multivariable model building with continuous covariates: a simulation study with continuous response. Stat Med 2012; 32:2262-77. [DOI: 10.1002/sim.5639] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2011] [Accepted: 09/07/2012] [Indexed: 11/06/2022]
|
49
|
Analysing covariates with spike at zero: a modified FP procedure and conceptual issues. Biom J 2012; 54:686-700. [PMID: 22778015 DOI: 10.1002/bimj.201100263] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2011] [Revised: 05/14/2012] [Accepted: 05/14/2012] [Indexed: 11/07/2022]
Abstract
In epidemiology and in clinical research, risk factors often have special distributions. A common situation is that a proportion of individuals have exposure zero, and among those exposed, we have some continuous distribution. We call this a 'spike at zero'. Examples are smoking, duration of breastfeeding, and alcohol consumption. Furthermore, the empirical distribution of laboratory values and other measurements may have a semi-continuous distribution as a result of the lower detection limit of the measurement. To model the dose-response function, an extension of the fractional polynomial (FP) approach was recently proposed. In this paper, we propose a modification of that FP procedure. We first give the theoretical justification for the modified procedure by investigating relevant distribution classes. Here, we systematically derive the theoretical shapes of dose-response curves under given distributional assumptions (normal, log-normal, gamma) in the framework of a logistic regression model. Further, we check the performance of the procedure in a simulation study, compare it with the previously suggested method, and finally illustrate the procedures with data from a case-control study on breast cancer.
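The dose-response model this abstract describes pairs a binary "exposed" indicator with FP terms for the positive part of the covariate. A minimal sketch of how the two components enter a logistic linear predictor; the coefficients and the single power p here are arbitrary illustrative values, not estimates from the paper:

```python
import math

def spike_at_zero_terms(x, p=0.5):
    """Design terms for a spike-at-zero covariate: a binary 'exposed'
    indicator plus one fractional-polynomial term x**p among the
    exposed (set to 0 for the unexposed, by convention)."""
    exposed = 1 if x > 0 else 0
    fp = x ** p if exposed else 0.0
    return exposed, fp

def log_odds(x, b0=-1.0, b_exposed=0.3, b_fp=0.5, p=0.5):
    """Logistic linear predictor under illustrative coefficients:
    b0 for the unexposed, plus a jump at zero and an FP dose effect."""
    exposed, fp = spike_at_zero_terms(x, p)
    return b0 + b_exposed * exposed + b_fp * fp

for x in (0.0, 1.0, 4.0):
    lo = log_odds(x)
    print(f"x={x}: log-odds={lo:+.2f}  risk={1 / (1 + math.exp(-lo)):.3f}")
```

The point of the modified FP procedure is then to choose the power(s) p and decide whether the indicator and/or the FP terms are needed; this sketch only shows the structure of the resulting dose-response function.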
Collapse
|
50
|
A simulation study of predictive ability measures in a survival model II: explained randomness and predictive accuracy. Stat Med 2012; 31:2644-59. [PMID: 22764064 DOI: 10.1002/sim.5460] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2010] [Accepted: 03/13/2012] [Indexed: 11/10/2022]
Abstract
Several R²-type measures have been proposed to evaluate the predictive ability of a survival model. In Part I, we classified the measures into four categories and studied the measures in the explained-variation category. In this paper, we study the remaining measures in a similar fashion, discussing their strengths and shortcomings. Simulation studies are used to examine the performance of the measures with respect to the criteria we set out in Part I. Our simulation studies showed that, among the measures studied in this paper, Kent and O'Quigley's ρ²_W (and its approximation ρ²_W,A) and Schemper and Kaider's R²_SK perform better with respect to our criteria. However, our investigations showed that ρ²_W is adversely affected by the distribution of the covariate and by the presence of influential observations. The results show that the other measures perform poorly, primarily because they are affected either by the degree of censoring or by the follow-up period.
Collapse
|