1
|
Jiménez JL, Barrott I, Gasperoni F, Magirr D. Visualizing hypothesis tests in survival analysis under anticipated delayed effects. Pharm Stat 2024; 23:870-883. [PMID: 38708672 DOI: 10.1002/pst.2393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 12/14/2023] [Accepted: 04/04/2024] [Indexed: 05/07/2024]
Abstract
What can be considered an appropriate statistical method for the primary analysis of a randomized clinical trial (RCT) with a time-to-event endpoint when we anticipate non-proportional hazards owing to a delayed effect? This question has been the subject of much recent debate. The standard approach is a log-rank test and/or a Cox proportional hazards model. Alternative methods have been explored in the statistical literature, such as weighted log-rank tests and tests based on the Restricted Mean Survival Time (RMST). While weighted log-rank tests can achieve high power compared to the standard log-rank test, some choices of weights may lead to type I error inflation under particular conditions. In addition, they are not linked to a mathematically unambiguous summary measure. Test statistics based on the RMST, on the other hand, allow one to investigate the average difference between two survival curves up to a pre-specified time point τ, which is a mathematically unambiguous summary measure. However, by emphasizing differences prior to τ, such test statistics may not fully capture the benefit of a new treatment in terms of long-term survival. In this article, we introduce a graphical approach for direct comparison of weighted log-rank tests and tests based on the RMST. This new perspective allows a more informed choice of the analysis method, going beyond power and type I error comparison.
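To make the RMST summary measure mentioned in this abstract concrete, the following is a minimal sketch, not the authors' code: it computes a Kaplan-Meier curve and integrates it from 0 to τ. The function names `kaplan_meier` and `rmst` and the toy data are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (t, S(t)) at each distinct event time (event=1, censored=0)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def rmst(times: Sequence[float], events: Sequence[int], tau: float) -> float:
    """Area under the Kaplan-Meier step function between 0 and tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)
    return area


if __name__ == "__main__":
    # Hypothetical two-arm toy data: (time, event indicator).
    control = ([1, 2, 3, 4], [1, 1, 1, 1])
    treated = ([2, 3, 5, 6], [1, 0, 1, 1])
    tau = 4.0
    diff = rmst(*treated, tau) - rmst(*control, tau)
    print(f"RMST difference up to tau={tau}: {diff:.3f}")
```

The difference in RMST between arms is then the area between the two survival curves up to τ, which is why events after τ do not contribute to it.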
|
2
|
Schenk A, Berger M, Schmid M. Pseudo-value regression trees. Lifetime Data Analysis 2024; 30:439-471. [PMID: 38403840 PMCID: PMC11297840 DOI: 10.1007/s10985-024-09618-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Accepted: 01/19/2024] [Indexed: 02/27/2024]
Abstract
This paper presents a semi-parametric modeling technique for estimating the survival function from a set of right-censored time-to-event data. Our method, named pseudo-value regression trees (PRT), is based on the pseudo-value regression framework, modeling individual-specific survival probabilities by computing pseudo-values and relating them to a set of covariates. The standard approach to pseudo-value regression is to fit a main-effects model using generalized estimating equations (GEE). PRT extend this approach by building a multivariate regression tree with pseudo-value outcome and by successively fitting a set of regularized additive models to the data in the nodes of the tree. Due to the combination of tree learning and additive modeling, PRT are able to perform variable selection and to identify relevant interactions between the covariates, thereby addressing several limitations of the standard GEE approach. In addition, PRT include time-dependent effects in the node-wise models. Interpretability of the PRT fits is ensured by controlling the tree depth. Based on the results of two simulation studies, we investigate the properties of the PRT method and compare it to several alternative modeling techniques. Furthermore, we illustrate PRT by analyzing survival in 3,652 patients enrolled for a randomized study on primary invasive breast cancer.
Affiliation(s)
- Alina Schenk
- Institute of Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany.
- Moritz Berger
- Institute of Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany
- Matthias Schmid
- Institute of Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany
|
3
|
Ahn S, Datta S. Differential network connectivity analysis for microbiome data adjusted for clinical covariates using jackknife pseudo-values. BMC Bioinformatics 2024; 25:117. [PMID: 38500042 PMCID: PMC10946111 DOI: 10.1186/s12859-024-05689-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Accepted: 02/02/2024] [Indexed: 03/20/2024] Open
Abstract
BACKGROUND A recent breakthrough in differential network (DN) analysis of microbiome data has been realized with the advent of next-generation sequencing technologies. DN analysis disentangles the microbial co-abundance among taxa by comparing the network properties between two or more graphs under different biological conditions. However, the existing methods for DN analysis of microbiome data do not adjust for other clinical differences between subjects. RESULTS We propose a Statistical Approach via Pseudo-value Information and Estimation for Differential Network Analysis (SOHPIE-DNA) that incorporates additional covariates such as continuous age and categorical BMI. SOHPIE-DNA is a regression technique adopting jackknife pseudo-values that can be implemented readily for the analysis. We demonstrate through simulations that SOHPIE-DNA consistently reaches higher recall and F1-score, while maintaining similar precision and accuracy to existing methods (NetCoMi and MDiNE). Lastly, we apply SOHPIE-DNA to two real datasets from the American Gut Project and the Diet Exchange Study to showcase its utility. The analysis of the Diet Exchange Study shows that SOHPIE-DNA can also be used to incorporate the temporal change of connectivity of taxa with the inclusion of additional covariates. As a result, our method has found taxa that are related to the prevention of intestinal inflammation and severity of fatigue in advanced metastatic cancer patients. CONCLUSION SOHPIE-DNA is the first attempt to introduce a regression framework for DN analysis of microbiome data. This enables the prediction of characteristics of the connectivity of a network in the presence of additional covariate information in the regression. The R package with a vignette of our methodology is available through the CRAN repository (https://CRAN.R-project.org/package=SOHPIE), named SOHPIE (pronounced as Sofie). The source code and user manual can be found at https://github.com/sjahnn/SOHPIE-DNA.
Affiliation(s)
- Seungjun Ahn
- Department of Biostatistics, University of Florida, Gainesville, FL, USA
- Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Somnath Datta
- Department of Biostatistics, University of Florida, Gainesville, FL, USA.
|
4
|
Wang C, Wei K, Huang C, Yu Y, Qin G. Multiply robust estimator for the difference in survival functions using pseudo-observations. BMC Med Res Methodol 2023; 23:247. [PMID: 37872495 PMCID: PMC10591363 DOI: 10.1186/s12874-023-02065-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Accepted: 10/11/2023] [Indexed: 10/25/2023] Open
Abstract
BACKGROUND When estimating the causal effect on survival outcomes in observational studies, it is necessary to adjust for confounding factors due to unbalanced covariates between treatment and control groups. No study has addressed multiply robust methods for estimating the difference in survival functions. In this study, we propose a multiply robust (MR) estimator, allowing multiple propensity score models and outcome regression models, to provide multiple protection. METHOD Based on the previous MR estimator (Han 2014) and the pseudo-observation approach, we propose a new MR estimator for estimating the difference in survival functions. The proposed MR estimator based on the pseudo-observation approach has several advantages. First, the proposed estimator has a small bias when any propensity score (PS) or outcome regression (OR) model is correctly specified. Second, the proposed estimator takes advantage of the pseudo-observation approach, which avoids the proportional hazards assumption. A Monte Carlo simulation study was performed to evaluate the performance of the proposed estimator. The proposed estimator was also used to estimate the effect of chemotherapy on triple-negative breast cancer (TNBC) in real data. RESULTS The simulation studies showed that the bias of the proposed estimator was small, and the coverage rate was close to 95%, when any model for the propensity score or outcome regression was correctly specified, regardless of whether the proportional hazards assumption held, the sample size, and the censoring rate. The simulation results also showed that even when the propensity score models were misspecified, the bias of the proposed estimator was still small when there was a correct model among the candidate outcome regression models. We applied the proposed estimator to real data, finding that chemotherapy could improve the prognosis of TNBC.
CONCLUSIONS The proposed estimator, allowing multiple propensity score and outcome regression models, provides multiple protection for estimating the difference in survival functions. The proposed estimator provides a new choice when researchers have a "difficult time" choosing only one model for their studies.
Affiliation(s)
- Ce Wang
- Department of Biostatistics, Key Laboratory for Health Technology Assessment, National Commission of Health, Key Laboratory of Public Health Safety of Ministry of Education, School of Public Health, Fudan University, Shanghai, China
- Kecheng Wei
- Department of Biostatistics, Key Laboratory for Health Technology Assessment, National Commission of Health, Key Laboratory of Public Health Safety of Ministry of Education, School of Public Health, Fudan University, Shanghai, China
- Chen Huang
- Department of Biostatistics, Key Laboratory for Health Technology Assessment, National Commission of Health, Key Laboratory of Public Health Safety of Ministry of Education, School of Public Health, Fudan University, Shanghai, China
- Yongfu Yu
- Department of Biostatistics, Key Laboratory for Health Technology Assessment, National Commission of Health, Key Laboratory of Public Health Safety of Ministry of Education, School of Public Health, Fudan University, Shanghai, China.
- Shanghai Institute of Infectious Disease and Biosecurity, Shanghai, China.
- Guoyou Qin
- Department of Biostatistics, Key Laboratory for Health Technology Assessment, National Commission of Health, Key Laboratory of Public Health Safety of Ministry of Education, School of Public Health, Fudan University, Shanghai, China.
- Shanghai Institute of Infectious Disease and Biosecurity, Shanghai, China.
|
5
|
Clift AK, Collins GS, Lord S, Petrou S, Dodwell D, Brady M, Hippisley-Cox J. Predicting 10-year breast cancer mortality risk in the general female population in England: a model development and validation study. Lancet Digit Health 2023; 5:e571-e581. [PMID: 37625895 DOI: 10.1016/s2589-7500(23)00113-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/19/2022] [Revised: 04/06/2023] [Accepted: 06/12/2023] [Indexed: 08/27/2023]
Abstract
BACKGROUND Identifying female individuals at highest risk of developing life-threatening breast cancers could inform novel stratified early detection and prevention strategies to reduce breast cancer mortality, rather than only considering cancer incidence. We aimed to develop a prognostic model that accurately predicts the 10-year risk of breast cancer mortality in female individuals without breast cancer at baseline. METHODS In this model development and validation study, we used an open cohort study from the QResearch primary care database, which was linked to secondary care and national cancer and mortality registers in England, UK. The data extracted were from female individuals aged 20-90 years without previous breast cancer or ductal carcinoma in situ who entered the cohort between Jan 1, 2000, and Dec 31, 2020. The primary outcome was breast cancer-related death, which was assessed in the full dataset. Cox proportional hazards, competing risks regression, XGBoost, and neural network modelling approaches were used to predict the risk of breast cancer death within 10 years using routinely collected health-care data. Death due to causes other than breast cancer was the competing risk. Internal-external validation was used to evaluate prognostic model performance (using Harrell's C, calibration slope, and calibration in the large), performance heterogeneity, and transportability. Internal-external validation involved dataset partitioning by time period and geographical region. Decision curve analysis was used to assess clinical utility. FINDINGS We identified data for 11 626 969 female individuals, with 70 095 574 person-years of follow-up. There were 142 712 (1·2%) diagnoses of breast cancer, 24 043 (0·2%) breast cancer-related deaths, and 696 106 (6·0%) deaths from other causes. Meta-analysis pooled estimates of Harrell's C were highest for the competing risks model (0·932, 95% CI 0·917-0·946). 
The competing risks model was well calibrated overall (slope 1·011, 95% CI 0·978-1·044), and across different ethnic groups. Decision curve analysis suggested favourable clinical utility across all age groups. The XGBoost and neural network models had variable performance across age and ethnic groups. INTERPRETATION A model that predicts the combined risk of developing and then dying from breast cancer at the population level could inform stratified screening or chemoprevention strategies. Further evaluation of the competing risks model should comprise effect and health economic assessment of model-informed strategies. FUNDING Cancer Research UK.
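Harrell's C, the discrimination metric reported in this abstract, can be illustrated with a minimal sketch for ordinary right-censored data. This is a textbook-style version written for illustration, not the study's implementation; it ignores competing risks and tied event times, and the function name `harrell_c` is an assumption.

```python
from typing import Sequence


def harrell_c(times: Sequence[float], events: Sequence[int], risk: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when subject i has an observed event
    and a strictly shorter time than subject j. The pair is concordant
    when the subject who failed earlier has the higher predicted risk;
    tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical toy data: higher risk score should mean earlier event.
    print(harrell_c([1, 2, 3, 4], [1, 1, 0, 1], [0.9, 0.6, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the pooled estimates quoted in the abstract.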
Affiliation(s)
- Ash Kieran Clift
- Cancer Research UK Oxford Centre, University of Oxford, UK; Nuffield Department of Primary Care Health Sciences, University of Oxford, UK.
- Gary S Collins
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, UK
- Simon Lord
- Department of Oncology, University of Oxford, UK
- Stavros Petrou
- Nuffield Department of Primary Care Health Sciences, University of Oxford, UK
- David Dodwell
- Nuffield Department of Population Health, University of Oxford, UK
- Julia Hippisley-Cox
- Nuffield Department of Primary Care Health Sciences, University of Oxford, UK
|
6
|
Anyaso-Samuel S, Datta S. Adjusting for informative cluster size in pseudo-value-based regression approaches with clustered time to event data. Stat Med 2023; 42:2162-2178. [PMID: 36973919 PMCID: PMC10219850 DOI: 10.1002/sim.9716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Revised: 02/09/2023] [Accepted: 03/11/2023] [Indexed: 03/29/2023]
Abstract
Informative cluster size (ICS) arises in situations with clustered data where a latent relationship exists between the number of participants in a cluster and the outcome measures. Although this phenomenon has been sporadically reported in the statistical literature for nearly two decades now, further exploration is needed in certain statistical methodologies to avoid potentially misleading inferences. For inference about population quantities without covariates, inverse cluster size reweightings are often employed to adjust for ICS. Further, to study the effect of covariates on disease progression described by a multistate model, the pseudo-value regression technique has gained popularity in time-to-event data analysis. We seek to answer the question: "How to apply pseudo-value regression to clustered time-to-event data when cluster size is informative?" ICS adjustment by the reweighting method can be performed in two steps: estimation of marginal functions of the multistate model, and fitting the estimating equations based on pseudo-value responses, leading to four possible strategies. We present theoretical arguments and thorough simulation experiments to ascertain the correct strategy for adjusting for ICS. A further extension of our methodology is implemented to include informativeness induced by the intracluster group size. We demonstrate the methods in two real-world applications: (i) to determine predictors of tooth survival in a periodontal study and (ii) to identify indicators of ambulatory recovery in spinal cord injury patients who participated in locomotor-training rehabilitation.
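The inverse cluster size reweighting mentioned in this abstract can be sketched for the simplest case, a marginal mean without covariates. The example below is an illustrative assumption, not the paper's pseudo-value procedure: each observation is weighted by one over its cluster size so that every cluster contributes equally. The function name `ics_weighted_mean` and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def ics_weighted_mean(cluster_ids: Sequence[Hashable], values: Sequence[float]) -> float:
    """Mean in which each observation is weighted by 1 / (its cluster size).

    This equals the average of the per-cluster means, so a large cluster
    cannot dominate the estimate merely because it has more members.
    """
    sizes = defaultdict(int)
    for c in cluster_ids:
        sizes[c] += 1
    weighted_sum = sum(v / sizes[c] for c, v in zip(cluster_ids, values))
    return weighted_sum / len(sizes)


if __name__ == "__main__":
    # Hypothetical data: cluster "a" is large and has uniformly high outcomes.
    ids = ["a", "a", "a", "a", "b"]
    vals = [1.0, 1.0, 1.0, 1.0, 0.0]
    print("unweighted mean:", sum(vals) / len(vals))
    print("ICS-weighted mean:", ics_weighted_mean(ids, vals))
```

When cluster size is related to the outcome, the unweighted and reweighted estimates target different population quantities, which is why the choice of where to apply the weights matters in the strategies the abstract describes.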
Affiliation(s)
- Somnath Datta
- Department of Biostatistics, University of Florida, Gainesville, FL, U.S.A
|
7
|
Clift AK, Dodwell D, Lord S, Petrou S, Brady M, Collins GS, Hippisley-Cox J. Development and internal-external validation of statistical and machine learning models for breast cancer prognostication: cohort study. BMJ 2023; 381:e073800. [PMID: 37164379 PMCID: PMC10170264 DOI: 10.1136/bmj-2022-073800] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Accepted: 03/28/2023] [Indexed: 05/12/2023]
Abstract
OBJECTIVE To develop a clinically useful model that estimates the 10 year risk of breast cancer related mortality in women (self-reported female sex) with breast cancer of any stage, comparing results from regression and machine learning approaches. DESIGN Population based cohort study. SETTING QResearch primary care database in England, with individual level linkage to the national cancer registry, Hospital Episodes Statistics, and national mortality registers. PARTICIPANTS 141 765 women aged 20 years and older with a diagnosis of invasive breast cancer between 1 January 2000 and 31 December 2020. MAIN OUTCOME MEASURES Four model building strategies comprising two regression (Cox proportional hazards and competing risks regression) and two machine learning (XGBoost and an artificial neural network) approaches. Internal-external cross validation was used for model evaluation. Random effects meta-analysis that pooled estimates of discrimination and calibration metrics, calibration plots, and decision curve analysis were used to assess model performance, transportability, and clinical utility. RESULTS During a median 4.16 years (interquartile range 1.76-8.26) of follow-up, 21 688 breast cancer related deaths and 11 454 deaths from other causes occurred. Restricting to 10 years maximum follow-up from breast cancer diagnosis, 20 367 breast cancer related deaths occurred during a total of 688 564.81 person years. The crude breast cancer mortality rate was 295.79 per 10 000 person years (95% confidence interval 291.75 to 299.88). Predictors varied for each regression model, but both Cox and competing risks models included age at diagnosis, body mass index, smoking status, route to diagnosis, hormone receptor status, cancer stage, and grade of breast cancer. The Cox model's random effects meta-analysis pooled estimate for Harrell's C index was the highest of any model at 0.858 (95% confidence interval 0.853 to 0.864, and 95% prediction interval 0.843 to 0.873). 
It appeared acceptably calibrated on calibration plots. The competing risks regression model had good discrimination: pooled Harrell's C index 0.849 (0.839 to 0.859, and 0.821 to 0.876), and evidence of systematic miscalibration on summary metrics was lacking. The machine learning models had acceptable discrimination overall (Harrell's C index: XGBoost 0.821 (0.813 to 0.828, and 0.805 to 0.837); neural network 0.847 (0.835 to 0.858, and 0.816 to 0.878)), but had more complex patterns of miscalibration and more variable regional and stage specific performance. Decision curve analysis suggested that the Cox and competing risks regression models tested may have higher clinical utility than the two machine learning approaches. CONCLUSION In women with breast cancer of any stage, using the predictors available in this dataset, regression based methods had better and more consistent performance compared with machine learning approaches and may be worthy of further evaluation for potential clinical use, such as for stratified follow-up.
Affiliation(s)
- Ash Kieran Clift
- Cancer Research UK Oxford Centre, Oxford, UK
- Nuffield Department of Primary Care Health Sciences, Radcliffe Primary Care Building, Radcliffe Observatory Quarter, University of Oxford, Oxford OX2 6GG, UK
- David Dodwell
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Simon Lord
- Department of Oncology, University of Oxford, Oxford, UK
- Stavros Petrou
- Nuffield Department of Primary Care Health Sciences, Radcliffe Primary Care Building, Radcliffe Observatory Quarter, University of Oxford, Oxford OX2 6GG, UK
- Michael Brady
- Department of Oncology, University of Oxford, Oxford, UK
- Gary S Collins
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Julia Hippisley-Cox
- Nuffield Department of Primary Care Health Sciences, Radcliffe Primary Care Building, Radcliffe Observatory Quarter, University of Oxford, Oxford OX2 6GG, UK
|
8
|
Furberg JK, Andersen PK, Korn S, Overgaard M, Ravn H. Bivariate pseudo-observations for recurrent event analysis with terminal events. Lifetime Data Analysis 2023; 29:256-287. [PMID: 34739680 DOI: 10.1007/s10985-021-09533-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/25/2021] [Accepted: 09/04/2021] [Indexed: 06/13/2023]
Abstract
The analysis of recurrent events in the presence of terminal events requires special attention. Several approaches have been suggested for such analyses either using intensity models or marginal models. When analysing treatment effects on recurrent events in controlled trials, special attention should be paid to competing deaths and their impact on interpretation. This paper proposes a method that formulates a marginal model for recurrent events and terminal events simultaneously. Estimation is based on pseudo-observations for both the expected number of events and survival probabilities. Various relevant hypothesis tests in the framework are explored. Theoretical derivations and simulation studies are conducted to investigate the behaviour of the method. The method is applied to two real data examples. The bivariate marginal pseudo-observation model carries the strength of a two-dimensional modelling procedure and performs well in comparison with available models. Finally, an extension to a three-dimensional model, which decomposes the terminal event per death cause, is proposed and exemplified.
Affiliation(s)
- Julie K Furberg
- Biostatistics GLP-1 and CV 1, Novo Nordisk A/S, Vandtårnsvej 114, Søborg, Denmark.
- Per K Andersen
- Section of Biostatistics, University of Copenhagen, Copenhagen, Denmark
- Sofie Korn
- Biostatistics 1, LEO Pharma A/S, Ballerup, Denmark
- Morten Overgaard
- Research unit for Biostatistics, Department of Public Health, Aarhus University, Aarhus, Denmark
- Henrik Ravn
- Biostatistics GLP-1 and CV 1, Novo Nordisk A/S, Vandtårnsvej 114, Søborg, Denmark
|
9
|
Bouaziz O. Fast approximations of pseudo-observations in the context of right censoring and interval censoring. Biom J 2023; 65:e2200071. [PMID: 36843309 DOI: 10.1002/bimj.202200071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 10/11/2022] [Accepted: 11/18/2022] [Indexed: 02/28/2023]
Abstract
In the context of right-censored and interval-censored data, we develop asymptotic formulas to compute pseudo-observations for the survival function and the restricted mean survival time (RMST). These formulas are based on the original estimators and do not involve computation of the jackknife estimators. For right-censored data, Von Mises expansions of the Kaplan-Meier estimator are used to derive the pseudo-observations. For interval-censored data, a general class of parametric models for the survival function is studied. An asymptotic representation of the pseudo-observations is derived involving the Hessian matrix and the score vector. Theoretical results that justify the use of pseudo-observations in regression are also derived. The formula is illustrated on the piecewise-constant-hazard model for the RMST. The proposed approximations are extremely accurate, even for small sample sizes, as illustrated by Monte Carlo simulations and real data. We also study the gain in terms of computation time, as compared to the original jackknife method, which can be substantial for a large dataset.
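For context, the "original jackknife method" that this abstract uses as its baseline can be sketched as follows. This is a minimal leave-one-out illustration for the survival probability at a fixed time, not the paper's fast approximation; the names `km_survival_at` and `pseudo_observations` are assumptions. The i-th pseudo-observation is n times the full-sample estimate minus (n - 1) times the estimate with subject i removed.

```python
from typing import List, Sequence


def km_survival_at(times: Sequence[float], events: Sequence[int], t: float) -> float:
    """Kaplan-Meier estimate of S(t) (event=1, censored=0)."""
    surv = 1.0
    event_times = sorted({x for x, e in zip(times, events) if e == 1 and x <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)
        deaths = sum(1 for x, e in zip(times, events) if x == u and e == 1)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_observations(times: Sequence[float], events: Sequence[int], t: float) -> List[float]:
    """Jackknife pseudo-observations for S(t): n*theta - (n-1)*theta_(-i)."""
    times, events = list(times), list(events)
    n = len(times)
    full = km_survival_at(times, events, t)
    out = []
    for i in range(n):
        # Re-estimate S(t) with subject i left out.
        loo = km_survival_at(times[:i] + times[i + 1:], events[:i] + events[i + 1:], t)
        out.append(n * full - (n - 1) * loo)
    return out


if __name__ == "__main__":
    # Hypothetical toy data with one censored subject.
    print(pseudo_observations([1, 2, 3, 4, 5], [1, 0, 1, 1, 1], t=3.5))
```

Each pseudo-observation requires refitting the estimator, so this naive version makes n leave-one-out refits on top of the full-sample fit, which is the computational cost the abstract's closed-form approximations aim to avoid. Without censoring, the pseudo-observations reduce to the indicators that each subject survived beyond t.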
|
10
|
Ahn S, Grimes T, Datta S. A pseudo-value regression approach for differential network analysis of co-expression data. BMC Bioinformatics 2023; 24:8. [PMID: 36624383 PMCID: PMC9830718 DOI: 10.1186/s12859-022-05123-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/21/2022] [Accepted: 12/22/2022] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND The differential network (DN) analysis identifies changes in measures of association among genes under two or more experimental conditions. In this article, we introduce a pseudo-value regression approach for network analysis (PRANA). This is a novel method of differential network analysis that also adjusts for additional clinical covariates. We start from mutual information criteria, followed by pseudo-value calculations, which are then entered into a robust regression model. RESULTS This article assesses the model performance of PRANA in a multivariable setting, followed by a comparison to dnapath and DINGO in both univariable and multivariable settings through a variety of simulations. Performance in terms of precision, recall, and F1 score of differentially connected (DC) genes is assessed. By and large, PRANA outperformed dnapath and DINGO, neither of which is equipped to adjust for available covariates such as patient age. Lastly, we employ PRANA in a real data application from the Gene Expression Omnibus database to identify DC genes that are associated with chronic obstructive pulmonary disease to demonstrate its utility. CONCLUSION To the best of our knowledge, this is the first attempt to utilize regression modeling for DN analysis by collective gene expression levels between two or more groups with the inclusion of additional clinical covariates. By and large, adjusting for available covariates improves the accuracy of a DN analysis.
Affiliation(s)
- Seungjun Ahn
- Department of Biostatistics, University of Florida, Gainesville, USA
- Tyler Grimes
- Department of Mathematics and Statistics, University of North Florida, Jacksonville, USA
- Somnath Datta
- Department of Biostatistics, University of Florida, Gainesville, USA.
|
11
|
Wang J, Marion-Gallois R. Propensity score matching and stratification using multiparty data without pooling. Pharm Stat 2023; 22:4-19. [PMID: 35733398 DOI: 10.1002/pst.2250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2021] [Revised: 04/01/2022] [Accepted: 05/29/2022] [Indexed: 02/01/2023]
Abstract
Matching and stratification based on confounding factors or propensity scores (PS) are powerful approaches for reducing confounding bias in indirect treatment comparisons. However, implementing these approaches requires pooled individual patient data (IPD). The research presented here was motivated by an indirect comparison between a single-arm trial in acute myeloid leukemia (AML) and two external AML registries with current treatments as a control. For confidentiality reasons, IPD cannot be pooled. Common approaches to adjusting for confounding bias, such as PS matching or stratification, cannot be applied because 1) a model for PS, for example, a logistic model, cannot be fitted without pooling covariate data; and 2) pooling response data may be necessary for some statistical inference (e.g., estimating the SE of the mean difference of matched pairs) after PS matching. We propose a set of approaches that do not require pooling IPD, using a combination of methods including a linear discriminant for matching and stratification, and secure multiparty computation for estimation of within-pair sample variance and for calculations involving multiple control sources. The approaches only need to share aggregated data offline, rather than real-time secure data transfer, as required by typical secure multiparty computation for model fitting. For survival analysis, we propose an approach using restricted mean survival time. A simulation study was conducted to evaluate this approach in several scenarios, in particular, with a mixture of continuous and binary covariates. The results confirmed the robustness and efficiency of the proposed approach. A real data example is also provided for illustration.
|
12
|
Archer L, Koshiaris C, Lay-Flurrie S, Snell KIE, Riley RD, Stevens R, Banerjee A, Usher-Smith JA, Clegg A, Payne RA, Hobbs FDR, McManus RJ, Sheppard JP. Development and external validation of a risk prediction model for falls in patients with an indication for antihypertensive treatment: retrospective cohort study. BMJ 2022; 379:e070918. [PMID: 36347531 PMCID: PMC9641577 DOI: 10.1136/bmj-2022-070918] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 09/21/2022] [Indexed: 11/09/2022]
Abstract
OBJECTIVE: To develop and externally validate the STRAtifying Treatments In the multi-morbid Frail elderlY (STRATIFY)-Falls clinical prediction model to identify the risk of hospital admission or death from a fall in patients with an indication for antihypertensive treatment.
DESIGN: Retrospective cohort study.
SETTING: Primary care data from electronic health records contained within the UK Clinical Practice Research Datalink (CPRD).
PARTICIPANTS: Patients aged 40 years or older with at least one blood pressure measurement between 130 mm Hg and 179 mm Hg.
MAIN OUTCOME MEASURE: First serious fall, defined as hospital admission or death with a primary diagnosis of a fall within 10 years of the index date (12 months after cohort entry). Model development was conducted using a Fine-Gray approach in data from CPRD GOLD, accounting for the competing risk of death from other causes, with subsequent recalibration at one, five, and 10 years using pseudo values. External validation was conducted using data from CPRD Aurum, with performance assessed through calibration curves and the observed to expected ratio, C statistic, and D statistic, pooled across general practices, and clinical utility using decision curve analysis at thresholds around 10%.
RESULTS: Analysis included 1 772 600 patients (experiencing 62 691 serious falls) from CPRD GOLD used in model development, and 3 805 366 (experiencing 206 956 serious falls) from CPRD Aurum in the external validation. The final model consisted of 24 predictors, including age, sex, ethnicity, alcohol consumption, living in an area of high social deprivation, a history of falls, multiple sclerosis, and prescriptions of antihypertensives, antidepressants, hypnotics, and anxiolytics. Upon external validation, the recalibrated model showed good discrimination, with pooled C statistics of 0.833 (95% confidence interval 0.831 to 0.835) and 0.843 (0.841 to 0.844) at five and 10 years, respectively. Original model calibration was poor on visual inspection and although this was improved with recalibration, under-prediction of risk remained (observed to expected ratio at 10 years 1.839, 95% confidence interval 1.811 to 1.865). Nevertheless, decision curve analysis suggests potential clinical utility, with net benefit larger than other strategies.
CONCLUSIONS: This prediction model uses commonly recorded clinical characteristics and distinguishes well between patients at high and low risk of falls in the next 1-10 years. Although miscalibration was evident on external validation, the model still had potential clinical utility around risk thresholds of 10% and so could be useful in routine clinical practice to help identify those at high risk of falls who might benefit from closer monitoring or early intervention to prevent future falls. Further studies are needed to explore the appropriate thresholds that maximise the model's clinical utility and cost effectiveness.
Affiliation(s)
- Lucinda Archer: Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Constantinos Koshiaris: Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, OX2 6GG, UK
- Sarah Lay-Flurrie: Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, OX2 6GG, UK
- Kym I E Snell: Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Richard D Riley: Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Richard Stevens: Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, OX2 6GG, UK
- Amitava Banerjee: Institute of Health Informatics, University College London, London, UK
- Juliet A Usher-Smith: Primary Care Unit, Department of Public Health and Primary Care, University of Cambridge, UK
- Andrew Clegg: Academic Unit for Ageing and Stroke Research, Bradford Institute for Health Research, University of Leeds, UK
- Rupert A Payne: Centre for Academic Primary Care, Population Health Sciences, University of Bristol, Bristol, UK
- F D Richard Hobbs: Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, OX2 6GG, UK
- Richard J McManus: Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, OX2 6GG, UK
- James P Sheppard: Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, OX2 6GG, UK
|
13
|
Su CL, Chiou SH, Lin FC, Platt RW. Analysis of survival data with cure fraction and variable selection: A pseudo-observations approach. Stat Methods Med Res 2022; 31:2037-2053. [PMID: 35754373 PMCID: PMC9660265 DOI: 10.1177/09622802221108579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]
Abstract
In biomedical studies, survival data with a cure fraction (the proportion of subjects cured of disease) are commonly encountered. The mixture cure and bounded cumulative hazard models are two main types of cure fraction models when analyzing survival data with long-term survivors. In this article, in the framework of the Cox proportional hazards mixture cure model and bounded cumulative hazard model, we propose several estimators utilizing pseudo-observations to assess the effects of covariates on the cure rate and the risk of having the event of interest for survival data with a cure fraction. A variable selection procedure is also presented based on the pseudo-observations using penalized generalized estimating equations for proportional hazards mixture cure and bounded cumulative hazard models. Extensive simulation studies are conducted to examine the proposed methods. The proposed technique is demonstrated through applications to a melanoma study and a dental data set with high-dimensional covariates.
Affiliation(s)
- Chien-Lin Su: Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada; Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, Montréal, Québec, Canada; Peri and Post Approval Studies, Strategic and Scientific Affairs, PPD, part of Thermo Fisher Scientific, Montréal, Québec, Canada
- Sy Han Chiou: Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX, USA
- Feng-Chang Lin: Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
- Robert W Platt: Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada; Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, Montréal, Québec, Canada
|
14
|
Rong R, Ning J, Zhu H. Regression modeling of restricted mean survival time for left-truncated right-censored data. Stat Med 2022; 41:3003-3021. [PMID: 35708238 PMCID: PMC10014036 DOI: 10.1002/sim.9399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Revised: 01/27/2022] [Accepted: 03/05/2022] [Indexed: 11/10/2022]
Abstract
The restricted mean survival time (RMST) is a clinically meaningful summary measure in studies with survival outcomes. Statistical methods have been developed for regression analysis of RMST to investigate the impact of covariates on RMST, which is a useful alternative to Cox regression analysis. However, existing methods for regression modeling of RMST are not applicable to left-truncated right-censored data, which arise frequently in prevalent cohort studies and for which the sampling bias due to left truncation and the informative censoring induced by the prevalent sampling scheme must be properly addressed. The pseudo-observation (PO) approach has been used in regression modeling of RMST for right-censored data and competing-risks data. For left-truncated right-censored data, we propose to directly model RMST as a function of baseline covariates based on POs under general censoring mechanisms. We adjust for potential covariate-dependent censoring or dependent censoring by the inverse probability of censoring weighting method. We establish large-sample properties of the proposed estimators and assess their finite-sample performance by simulation studies under various scenarios. We apply the proposed methods to a prevalent cohort of women diagnosed with stage IV breast cancer identified from the Surveillance, Epidemiology, and End Results (SEER)-Medicare linked database.
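For orientation, the RMST referred to in this abstract is the area under the survival curve up to a truncation time τ. The following is a minimal sketch for the simpler right-censored case only (no left truncation, no censoring weights), so it is not the estimator proposed by the authors. The function names `kaplan_meier` and `rmst` and the toy data are illustrative assumptions.

```python
import numpy as np

def kaplan_meier(times, events):
    """Return the distinct event times and the Kaplan-Meier survival just after each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])  # sorted distinct event times
    surv = np.empty(len(event_times))
    s = 1.0
    for k, t in enumerate(event_times):
        at_risk = np.sum(times >= t)              # still under observation at t
        deaths = np.sum((times == t) & events)    # events exactly at t
        s *= 1.0 - deaths / at_risk
        surv[k] = s
    return event_times, surv

def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    event_times, surv = kaplan_meier(times, events)
    keep = event_times < tau
    # Interval boundaries: 0, each event time before tau, then tau itself.
    grid = np.concatenate(([0.0], event_times[keep], [tau]))
    # Height of the step function on each interval.
    heights = np.concatenate(([1.0], surv[keep]))
    return float(np.sum(np.diff(grid) * heights))

# Toy data (hypothetical): one subject censored at time 2.
print(rmst([1, 2, 3], [1, 0, 1], tau=3.0))
```

With no censoring, this reduces to the sample mean of min(T, τ), which is a convenient sanity check.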
Affiliation(s)
- Rong Rong: Department of Statistical Science, Southern Methodist University, Dallas, Texas, USA; Division of Biostatistics, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Jing Ning: Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Hong Zhu: Division of Biostatistics, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, USA
|
15
|
Lin J, Trinquart L. Doubly-robust estimator of the difference in restricted mean times lost with competing risks data. Stat Methods Med Res 2022; 31:1881-1903. [PMID: 35607287 DOI: 10.1177/09622802221102625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
In the context of competing risks data, the subdistribution hazard ratio has limited clinical interpretability to measure treatment effects. An alternative is the difference in restricted mean times lost (RMTL), which gives the mean time lost to a specific cause of failure between treatment groups. In non-randomized studies, the average causal effect is conventionally used for decision-making about treatment and public health policies. We show how the difference in RMTL can be estimated by contrasting the integrated cumulative incidence functions from a Fine-Gray model. We also show how the difference in RMTL can be estimated by using inverse probability of treatment weighting and contrasts between weighted non-parametric estimators of the area below the cumulative incidence. We use pseudo-observation approaches to estimate both component models and we integrate them into a doubly-robust estimator. We demonstrate that this estimator is consistent when either component is correctly specified. We conduct simulation studies to assess its finite-sample performance and demonstrate its inherited consistency property from its component models. We also examine the performance of this estimator under varying degrees of covariate overlap and under a model misspecification of nonlinearity. We apply the proposed method to assess biomarker-treatment interaction in subpopulations of the POPLAR and OAK randomized controlled trials of second-line therapy for advanced non-small-cell lung cancer.
Affiliation(s)
- Jingyi Lin: Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Ludovic Trinquart: Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA; Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA, USA; Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA
|
16
|
Ginestet PG, Gabriel EE, Sachs MC. Survival stacking with multiple data types using pseudo-observation-based-AUC loss. J Biopharm Stat 2022; 32:858-870. [DOI: 10.1080/10543406.2022.2041655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Erin E Gabriel: Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden
- Michael C Sachs: Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden
|
17
|
Riley RD, Collins GS, Ensor J, Archer L, Booth S, Mozumder SI, Rutherford MJ, van Smeden M, Lambert PC, Snell KIE. Minimum sample size calculations for external validation of a clinical prediction model with a time-to-event outcome. Stat Med 2022; 41:1280-1295. [PMID: 34915593 DOI: 10.1002/sim.9275] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 05/24/2021] [Revised: 11/15/2021] [Accepted: 11/16/2021] [Indexed: 12/23/2022]
Abstract
Previous articles in Statistics in Medicine describe how to calculate the sample size required for external validation of prediction models with continuous and binary outcomes. The minimum sample size criteria aim to ensure precise estimation of key measures of a model's predictive performance, including measures of calibration, discrimination, and net benefit. Here, we extend the sample size guidance to prediction models with a time-to-event (survival) outcome, to cover external validation in datasets containing censoring. A simulation-based framework is proposed, which calculates the sample size required to target a particular confidence interval width for the calibration slope measuring the agreement between predicted risks (from the model) and observed risks (derived using pseudo-observations to account for censoring) on the log cumulative hazard scale. Precise estimation of calibration curves, discrimination, and net-benefit can also be checked in this framework. The process requires assumptions about the validation population in terms of the (i) distribution of the model's linear predictor and (ii) event and censoring distributions. Existing information can inform this; in particular, the linear predictor distribution can be approximated using the C-index or Royston's D statistic from the model development article, together with the overall event risk. We demonstrate how the approach can be used to calculate the sample size required to validate a prediction model for recurrent venous thromboembolism. Ideally the sample size should ensure precise calibration across the entire range of predicted risks, but must at least ensure adequate precision in regions important for clinical decision-making. Stata and R code are provided.
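The "observed risks (derived using pseudo-observations to account for censoring)" mentioned in this abstract can be illustrated with the generic jackknife construction on the Kaplan-Meier estimator. This is a minimal sketch under that assumption and does not reproduce the Stata and R code the authors provide. The function names `km_survival_at` and `pseudo_observations` are hypothetical.

```python
import numpy as np

def km_survival_at(times, events, t):
    """Kaplan-Meier estimate of S(t) from right-censored data."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    s = 1.0
    # Multiply the conditional survival factors over event times up to t.
    for u in np.unique(times[events & (times <= t)]):
        at_risk = np.sum(times >= u)
        deaths = np.sum((times == u) & events)
        s *= 1.0 - deaths / at_risk
    return s

def pseudo_observations(times, events, t):
    """Jackknife pseudo-observations for S(t): n*S_hat - (n-1)*S_hat_without_i."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    n = len(times)
    full = km_survival_at(times, events, t)
    pseudo = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i  # leave subject i out
        loo = km_survival_at(times[mask], events[mask], t)
        pseudo[i] = n * full - (n - 1) * loo
    return pseudo

# Toy data (hypothetical): the subject with time 2 is censored.
po = pseudo_observations([1, 2, 3, 4], [1, 0, 1, 1], t=2.5)
print(1.0 - po)  # pseudo-observations for the risk by t = 2.5
```

Each subject gets one pseudo-observation, whether censored or not, so the values can be used as an outcome in a calibration model. Without censoring they reduce to the indicator that the subject survived beyond t.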
Affiliation(s)
- Richard D Riley: Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Gary S Collins: Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Joie Ensor: Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Lucinda Archer: Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Sarah Booth: Biostatistics Research Group, Department of Health Sciences, George Davies Centre, University of Leicester, Leicester, UK
- Sarwar I Mozumder: Biostatistics Research Group, Department of Health Sciences, George Davies Centre, University of Leicester, Leicester, UK
- Mark J Rutherford: Biostatistics Research Group, Department of Health Sciences, George Davies Centre, University of Leicester, Leicester, UK
- Maarten van Smeden: Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, University of Utrecht, Utrecht, The Netherlands
- Paul C Lambert: Biostatistics Research Group, Department of Health Sciences, George Davies Centre, University of Leicester, Leicester, UK; Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Kym I E Snell: Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
|
18
|
Conner SC, Beiser A, Benjamin EJ, LaValley MP, Larson MG, Trinquart L. A comparison of statistical methods to predict the residual lifetime risk. Eur J Epidemiol 2022; 37:173-194. [PMID: 34978669 PMCID: PMC8960348 DOI: 10.1007/s10654-021-00815-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2021] [Accepted: 10/13/2021] [Indexed: 02/03/2023]
Abstract
Lifetime risk measures the cumulative risk for developing a disease over one's lifespan. Modeling the lifetime risk must account for left truncation, the competing risk of death, and inference at a fixed age. In addition, statistical methods to predict the lifetime risk should account for covariate-outcome associations that change with age. In this paper, we review and compare statistical methods to predict the lifetime risk. We first consider a generalized linear model for the lifetime risk using pseudo-observations of the Aalen-Johansen estimator at a fixed age, allowing for left truncation. We also consider modeling the subdistribution hazard with Fine-Gray and Royston-Parmar flexible parametric models in left truncated data with time-covariate interactions, and using these models to predict lifetime risk. In simulation studies, we found the pseudo-observation approach had the least bias, particularly in settings with crossing or converging cumulative incidence curves. We illustrate our method by modeling the lifetime risk of atrial fibrillation in the Framingham Heart Study. We provide technical guidance to replicate all analyses in R.
Affiliation(s)
- Sarah C Conner: Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Alexa Beiser: Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA; Framingham Heart Study, Framingham, MA, USA; Department of Neurology, Boston University School of Medicine, Boston, MA, USA
- Emelia J Benjamin: Framingham Heart Study, Framingham, MA, USA; Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA; Section of Cardiovascular Medicine, Boston University School of Medicine, Boston, MA, USA
- Michael P LaValley: Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Martin G Larson: Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA; Framingham Heart Study, Framingham, MA, USA
- Ludovic Trinquart: Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA; Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA; Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA, USA
|
19
|
Kipourou DK, Perme MP, Rachet B, Belot A. Direct modeling of the crude probability of cancer death and the number of life years lost due to cancer without the need of cause of death: a pseudo-observation approach in the relative survival setting. Biostatistics 2022; 23:101-119. [PMID: 32374817 PMCID: PMC8759449 DOI: 10.1093/biostatistics/kxaa017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/19/2018] [Revised: 03/19/2020] [Accepted: 03/19/2020] [Indexed: 12/30/2022] Open
Abstract
In population-based cancer studies, net survival is a crucial measure for population comparison purposes. However, alternative measures, namely the crude probability of death (CPr) and the number of life years lost (LYL) due to death according to different causes, are useful as complementary measures for reflecting different dimensions in terms of prognosis, treatment choice, or development of a control strategy. When the cause of death (COD) information is available, both measures can be estimated in a competing risks setting using either cause-specific or subdistribution hazard regression models or with the pseudo-observation approach through direct modeling. We extended the pseudo-observation approach in order to model the CPr and the LYL due to different causes when information on COD is unavailable or unreliable (i.e., in the relative survival setting). In a simulation study, we assessed the performance of the proposed approach in estimating regression parameters and examined models with different link functions that can provide an easier interpretation of the parameters. We showed that the pseudo-observation approach performs well for both measures and we illustrated their use on cervical cancer data from the England population-based cancer registry. A tutorial showing how to implement the method in R software is also provided.
Affiliation(s)
- Dimitra-Kleio Kipourou: Cancer Survival Group, Faculty of Epidemiology and Population Health, Department of Non-Communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London WC1E 7HT, UK
- Maja Pohar Perme: Institute for Biostatistics and Medical Informatics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
- Bernard Rachet: Cancer Survival Group, Faculty of Epidemiology and Population Health, Department of Non-Communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London WC1E 7HT, UK
- Aurelien Belot: Cancer Survival Group, Faculty of Epidemiology and Population Health, Department of Non-Communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London WC1E 7HT, UK
|
20
|
Mittlböck M, Pötschger U, Heinzl H. Weighted pseudo-values for partly unobserved group membership in paediatric stem cell transplantation studies. Stat Methods Med Res 2021; 31:76-86. [PMID: 34812663 PMCID: PMC8721556 DOI: 10.1177/09622802211041756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Generalised pseudo-values have been suggested to evaluate the impact of allogeneic stem cell transplantation on childhood leukaemia. The approach compares long-term survival of two cohorts defined by the availability or non-availability of suitable donors for stem cell transplantation. A patient's cohort membership becomes known only after a completed donor search, with or without an identified donor. If a patient suffers an event during donor search, stem cell transplantation will no longer be indicated. In such a case, donor search will cease and cohort membership will remain unknown. The generalised pseudo-values approach considers donor identification as a binary time-dependent covariate and uses inverse-probability-of-censoring weighting to adjust for non-identified donors. The approach leads to time-consuming computations due to multiple redefinitions of the risk set for pseudo-value calculation and an explicit adjustment for waiting-time bias. Here, the problem is looked at from a different angle. By considering the probability that a donor would have been identified after donor search ceased, weights for common pseudo-values are defined. This leads to a faster alternative approach as only a single risk set is necessary. Extensive computer simulations show that both the generalised and the new weighted pseudo-values approaches provide approximately unbiased estimates. Confidence interval coverage is satisfactory for typical clinical scenarios. In situations where donor identification takes considerably longer than usual, the weighted pseudo-values approach is preferable. Both approaches complement each other as they have different potential in addressing further aspects of the underlying medical question.
Affiliation(s)
- Martina Mittlböck: Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Austria
- Harald Heinzl: Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Austria
|
21
|
Dutta S, Halabi S. A semiparametric modeling approach for analyzing clinical biomarkers restricted to limits of detection. Pharm Stat 2021; 20:1061-1073. [PMID: 33855778 DOI: 10.1002/pst.2125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2020] [Revised: 01/14/2021] [Accepted: 03/22/2021] [Indexed: 11/08/2022]
Abstract
Before biomarkers can be used in clinical trials or patients' management, the laboratory assays that measure their levels have to go through development and analytical validation. One of the most critical performance metrics for validation of any assay is related to the minimum amount that can be detected; any value below this limit is referred to as below the limit of detection (LOD). Most of the existing approaches that model such biomarkers, restricted by LOD, are parametric in nature. These parametric models, however, depend heavily on distributional assumptions and can lose precision under model or distributional misspecification. Using an example from a prostate cancer clinical trial, we show how a critical relationship between a serum androgen biomarker and a prognostic factor of overall survival is completely missed by the widely used parametric Tobit model. Motivated by this example, we implement a semiparametric approach, through a pseudo-value technique, that effectively captures the important relationship between the LOD-restricted serum androgen and the prognostic factor. Our simulations show that the pseudo-value based semiparametric model outperforms a commonly used parametric model for modeling below-LOD biomarkers by having lower mean square errors of estimation.
Affiliation(s)
- Sandipan Dutta: Department of Mathematics and Statistics, Old Dominion University, Norfolk, Virginia, USA
- Susan Halabi: Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina, USA
|
22
|
Regression analysis of doubly truncated data based on pseudo-observations. J Korean Stat Soc 2021. [DOI: 10.1007/s42952-021-00113-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
23
|
Goulart AC, Olmos RD, Santos IS, Tunes G, Alencar AP, Thomas N, Lip GY, Lotufo PA, Benseñor IM. The impact of atrial fibrillation and long-term oral anticoagulant use on all-cause and cardiovascular mortality: A 12-year evaluation of the prospective Brazilian Study of Stroke Mortality and Morbidity. Int J Stroke 2021; 17:48-58. [PMID: 33527882 DOI: 10.1177/1747493021995592] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/15/2023]
Abstract
BACKGROUND: Atrial fibrillation is a predictor of poor prognosis after stroke.
AIMS: To evaluate atrial fibrillation and all-cause and cardiovascular mortality in a stroke cohort with low socioeconomic status, taking into consideration oral anticoagulant use during 12-year follow-up.
METHODS: All-cause mortality was analyzed by Kaplan-Meier survival curves and Cox regression models to estimate hazard ratios and 95% confidence intervals (95% CI). For specific mortality causes, cumulative incidence functions were computed. A logit link function was used to calculate odds ratios (OR) with 95% CIs. Full models were adjusted for age, sex, oral anticoagulant use (as a time-dependent variable) and cardiovascular risk factors.
RESULTS: Of 1121 ischemic stroke participants, 17.8% had atrial fibrillation. Overall, 654 deaths (58.3%) were observed. Survival (median days, interquartile range [IQR]) was lower among those with atrial fibrillation (531, IQR: 46-2039) than among those without atrial fibrillation (1808, IQR: 334-3301; log-rank p < 0.0001). Over 12-year follow-up, previous atrial fibrillation was associated with increased mortality: all-cause (multivariable hazard ratio, 1.82; 95% CI: 1.43-2.31) and cardiovascular mortality (multivariable OR, 2.07; 95% CI: 1.36-3.14), but not stroke mortality. In the same multivariable models, oral anticoagulant use was inversely associated with all-cause mortality (oral anticoagulant time-dependent effect: multivariable hazard ratio, 0.47; 95% CI: 0.30-0.50, p = 0.002) and stroke mortality (oral anticoagulant time-dependent effect ≥ 6 months: multivariable OR, 0.09; 95% CI: 0.01-0.65, p = 0.02), but not cardiovascular mortality.
CONCLUSIONS: Among individuals with low socioeconomic status, atrial fibrillation was an independent predictor of poor survival, increasing all-cause and cardiovascular mortality risk. Long-term oral anticoagulant use was associated with a markedly reduced risk of all-cause and stroke mortality.
Affiliation(s)
- Alessandra C Goulart: Center for Clinical and Epidemiological Research, Hospital Universitário, Universidade de São Paulo, São Paulo, Brazil; School of Medicine, Universidade de São Paulo, São Paulo, Brazil
- Rodrigo Diaz Olmos: Center for Clinical and Epidemiological Research, Hospital Universitário, Universidade de São Paulo, São Paulo, Brazil; School of Medicine, Universidade de São Paulo, São Paulo, Brazil
- Itamar S Santos: Center for Clinical and Epidemiological Research, Hospital Universitário, Universidade de São Paulo, São Paulo, Brazil; School of Medicine, Universidade de São Paulo, São Paulo, Brazil
- Gisela Tunes: Institute of Mathematics and Statistics, Universidade de São Paulo, São Paulo, Brazil
- Airlane P Alencar: Institute of Mathematics and Statistics, Universidade de São Paulo, São Paulo, Brazil
- Neil Thomas: Institute for Applied Health Research, University of Birmingham, Birmingham, UK
- Gregory Yh Lip: Liverpool Centre for Cardiovascular Science, University of Liverpool and Liverpool Heart & Chest Hospital, Liverpool, UK; Aalborg Thrombosis Research Unit, Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
- Paulo A Lotufo: Center for Clinical and Epidemiological Research, Hospital Universitário, Universidade de São Paulo, São Paulo, Brazil; School of Medicine, Universidade de São Paulo, São Paulo, Brazil
- Isabela M Benseñor: Center for Clinical and Epidemiological Research, Hospital Universitário, Universidade de São Paulo, São Paulo, Brazil; School of Medicine, Universidade de São Paulo, São Paulo, Brazil
|
24
|
Gardiner JC. Restricted Mean Survival Time Estimation: Nonparametric and Regression Methods. JOURNAL OF STATISTICAL THEORY AND PRACTICE 2020. [DOI: 10.1007/s42519-020-00144-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/07/2023]
|
25
|
Nygård Johansen M, Lundbye-Christensen S, Thorlund Parner E. Regression models using parametric pseudo-observations. Stat Med 2020; 39:2949-2961. [PMID: 32519771 DOI: 10.1002/sim.8586] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/22/2019] [Revised: 03/01/2020] [Accepted: 05/05/2020] [Indexed: 11/07/2022]
Abstract
Pseudo-observations based on the nonparametric Kaplan-Meier estimator of the survival function have been proposed as an alternative to the widely used Cox model for analyzing censored time-to-event data. Using a spline-based estimator of the survival function has some potential benefits over the nonparametric approach in terms of less variability. We propose to define pseudo-observations based on a flexible parametric estimator and use these for analysis in regression models to estimate parameters related to the cumulative risk. We report the results of a simulation study that compares the empirical standard errors of estimates based on parametric and nonparametric pseudo-observations in various settings. Our simulations show that in some situations there is a substantial gain in terms of reduced variability using the proposed parametric pseudo-observations compared with the nonparametric pseudo-observations. The gain can be measured as a reduction of the empirical standard error by up to about one third, corresponding to the precision gained from a 125% larger sample size. We illustrate the use of the proposed method in a brief data example.
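The link between a one-third reduction in standard error and a 125% larger sample can be checked with a short calculation. It assumes, as is usual for root-n consistent estimators but not stated in the abstract, that the standard error scales as $n^{-1/2}$. The subscripts p (parametric) and np (nonparametric) and the symbol $n_{\mathrm{equiv}}$ are labels introduced here for illustration.

```latex
\[
\frac{\mathrm{SE}_{\mathrm{p}}}{\mathrm{SE}_{\mathrm{np}}} = 1 - \frac{1}{3} = \frac{2}{3}
\quad\Longrightarrow\quad
\frac{n_{\mathrm{equiv}}}{n}
  = \left(\frac{\mathrm{SE}_{\mathrm{np}}}{\mathrm{SE}_{\mathrm{p}}}\right)^{2}
  = \left(\frac{3}{2}\right)^{2}
  = 2.25
\]
```

Under this assumption, the nonparametric approach would need a sample of $2.25\,n$, that is, 125% more subjects, to match the standard error of the parametric approach with $n$ subjects.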
Affiliation(s)
- Erik Thorlund Parner: Section for Biostatistics, Department of Public Health, Aarhus University, Aarhus, Denmark
|
26
|
Su CL, Platt RW, Plante JF. Causal inference for recurrent event data using pseudo-observations. Biostatistics 2020; 23:189-206. [PMID: 32432686 DOI: 10.1093/biostatistics/kxaa020] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/19/2019] [Revised: 04/01/2020] [Accepted: 04/02/2020] [Indexed: 11/13/2022] Open
Abstract
Recurrent event data are commonly encountered in observational studies where each subject may experience a particular event repeatedly over time. In this article, we aim to compare cumulative rate functions (CRFs) of two groups when treatment assignment may depend on the unbalanced distribution of confounders. Several estimators based on pseudo-observations are proposed to adjust for the confounding effects, namely inverse probability of treatment weighting estimator, regression model-based estimators, and doubly robust estimators. The proposed marginal regression estimator and doubly robust estimators based on pseudo-observations are shown to be consistent and asymptotically normal. A bootstrap approach is proposed for the variance estimation of the proposed estimators. Model diagnostic plots of residuals are presented to assess the goodness-of-fit for the proposed regression models. A family of adjusted two-sample pseudo-score tests is proposed to compare two CRFs. Simulation studies are conducted to assess finite sample performance of the proposed method. The proposed technique is demonstrated through an application to a hospital readmission data set.
Affiliation(s)
- Chien-Lin Su
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University and Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, Montréal, Québec, Canada
- Robert W Platt
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University and Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, Montréal, Québec, Canada
|
27
|
Pavlič K, Pohar Perme M. Using pseudo-observations for estimation in relative survival. Biostatistics 2020; 20:384-399. [PMID: 29547896 DOI: 10.1093/biostatistics/kxy008] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/16/2017] [Accepted: 02/16/2018] [Indexed: 12/28/2022]
Abstract
A common goal in the analysis of the long-term survival related to a specific disease is to estimate a measure that is comparable between populations with different general population mortality. When cause of death is unavailable or unreliable, as for example in cancer registry studies, relative survival methodology is used: in addition to the mortality data of the patients, we use data on the mortality of the general population. In this article, we focus on the marginal relative survival measure that summarizes the information about the disease-specific hazard. Under additional assumptions about latent times to death of each cause, this measure equals net survival. We propose a new approach to estimation based on pseudo-observations and derive two estimators of its variance. The properties of the new approach are assessed both theoretically and with simulations, showing practically no bias and a close to nominal coverage of the confidence intervals with the precise formula for the variance. The approximate formula for the variance performs sufficiently well in large samples, where calculating the precise formula becomes computationally intensive. Using bladder cancer data and simulations, we show that the behavior of the new approach is very close to that of the Pohar Perme estimator but has the important advantage of a simpler formula that does not require numerical integration and therefore lends itself more naturally to further extensions.
Affiliation(s)
- Klemen Pavlič
- Faculty of Medicine, Institute for Biostatistics and Medical Informatics, University of Ljubljana, Vrazov trg 2, 1000 Ljubljana, Slovenia
- Maja Pohar Perme
- Faculty of Medicine, Institute for Biostatistics and Medical Informatics, University of Ljubljana, Vrazov trg 2, 1000 Ljubljana, Slovenia
|
28
|
Xia M, Murray S, Tayob N. Regression analysis of recurrent-event-free time from multiple follow-up windows. Stat Med 2020; 39:1-15. [PMID: 31663647 DOI: 10.1002/sim.8385] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/29/2019] [Revised: 07/21/2019] [Accepted: 09/12/2019] [Indexed: 11/11/2022]
Abstract
This research develops multivariable restricted time models appropriate for analysis of recurrent events data, where data are repurposed into censored longitudinal time-to-first-event outcomes in τ-length follow-up windows. We develop two approaches for addressing the censored nature of the outcomes: a pseudo-observation (PO) approach and a multiple-imputation (MI) approach. Each of these approaches allows for complete data methods, such as generalized estimating equations, to be used for the analysis of the newly constructed correlated outcomes. Through simulation, this manuscript assesses the performance of the proposed PO and MI methods. Both PO and MI approaches show attractive results with either correlated or independent gap times in an individual. We also demonstrate how to apply the proposed methods to data from the Azithromycin in Chronic Obstructive Pulmonary Disease Trial.
Affiliation(s)
- Meng Xia
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
- Susan Murray
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
- Nabihah Tayob
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, Massachusetts
|
29
|
Andersen PK, Angst J, Ravn H. Modeling marginal features in studies of recurrent events in the presence of a terminal event. LIFETIME DATA ANALYSIS 2019; 25:681-695. [PMID: 30697652 DOI: 10.1007/s10985-019-09462-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 01/04/2018] [Accepted: 01/21/2019] [Indexed: 06/09/2023]
Abstract
We study models for recurrent events with special emphasis on the situation where a terminal event acts as a competing risk for the recurrent events process and where there may be gaps between periods during which subjects are at risk for the recurrent event. We focus on marginal analysis of the expected number of events and show that an Aalen-Johansen type estimator proposed by Cook and Lawless is applicable in this situation. A motivating example deals with psychiatric hospital admissions where we supplement with analyses of the marginal distribution of time to the competing event and the marginal distribution of the time spent in hospital. Pseudo-observations are used for the latter purpose.
Affiliation(s)
- Per Kragh Andersen
- Section of Biostatistics, University of Copenhagen, Ø. Farimagsgade 5, PB 2099, 1014, Copenhagen K, Denmark
- Jules Angst
- Department of Psychiatry, Psychotherapy and Psychosomatics, Psychiatric Hospital, University of Zürich, Zurich, Switzerland
|
30
|
Overgaard M, Parner ET, Pedersen J. Pseudo-observations under covariate-dependent censoring. J Stat Plan Inference 2019. [DOI: 10.1016/j.jspi.2019.02.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 10/27/2022]
|
31
|
Sachs MC, Discacciati A, Everhov ÅH, Olén O, Gabriel EE. Ensemble prediction of time-to-event outcomes with competing risks: a case-study of surgical complications in Crohn's disease. J R Stat Soc Ser C Appl Stat 2019. [DOI: 10.1111/rssc.12367] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/25/2023]
Affiliation(s)
- Ola Olén
- Karolinska Institutet, Stockholm, Sweden
|
32
|
Sabathé C, Andersen PK, Helmer C, Gerds TA, Jacqmin-Gadda H, Joly P. Regression analysis in an illness-death model with interval-censored data: A pseudo-value approach. Stat Methods Med Res 2019; 29:752-764. [PMID: 30991888 DOI: 10.1177/0962280219842271] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/15/2022]
Abstract
Pseudo-values provide a method to perform regression analysis for complex quantities with right-censored data. A further complication, interval-censored data, appears when events such as dementia are studied in an epidemiological cohort. We propose an extension of the pseudo-value approach for interval-censored data based on a semi-parametric estimator computed using penalised likelihood and splines. This estimator takes interval-censoring and competing risks into account in an illness-death model. We apply the pseudo-value approach to three mean value parameters of interest in studies of dementia: the probability of staying alive and non-demented, the restricted mean survival time without dementia and the absolute risk of dementia. Simulation studies are conducted to examine properties of pseudo-values based on this semi-parametric estimator. The method is applied to the French cohort PAQUID, which included more than 3,000 non-demented subjects, followed for dementia for more than 25 years.
Affiliation(s)
- Camille Sabathé
- INSERM, Bordeaux Population Health Research Center, Univ. Bordeaux, Bordeaux, France
- Per K Andersen
- Section of Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Catherine Helmer
- INSERM, Bordeaux Population Health Research Center, Univ. Bordeaux, Bordeaux, France
- Thomas A Gerds
- Section of Biostatistics, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Hélène Jacqmin-Gadda
- INSERM, Bordeaux Population Health Research Center, Univ. Bordeaux, Bordeaux, France
- Pierre Joly
- INSERM, Bordeaux Population Health Research Center, Univ. Bordeaux, Bordeaux, France
|
33
|
Pavlič K, Martinussen T, Andersen PK. Goodness of fit tests for estimating equations based on pseudo-observations. LIFETIME DATA ANALYSIS 2019; 25:189-205. [PMID: 29488163 DOI: 10.1007/s10985-018-9427-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/18/2017] [Accepted: 02/15/2018] [Indexed: 06/08/2023]
Abstract
We study regression models for mean value parameters in survival analysis based on pseudo-observations. Such parameters include the survival probability and the cumulative incidence in a single point as well as the restricted mean life time and the cause-specific number of years lost. Goodness of fit techniques for such models based on cumulative sums of pseudo-residuals are derived including asymptotic results and Monte Carlo simulations. Practical examples from liver cirrhosis and bone marrow transplantation are also provided.
Affiliation(s)
- Klemen Pavlič
- Institute for Biostatistics and Medical Informatics, Faculty of Medicine, University of Ljubljana, Vrazov trg 2, 1000, Ljubljana, Slovenia
- Torben Martinussen
- Section of Biostatistics, University of Copenhagen, Ø. Farimagsgade 5, PB 2099, 1014, Copenhagen K, Denmark
- Per Kragh Andersen
- Section of Biostatistics, University of Copenhagen, Ø. Farimagsgade 5, PB 2099, 1014, Copenhagen K, Denmark
|
34
|
Wang Y, Logan BR. Testing for center effects on survival and competing risks outcomes using pseudo-value regression. LIFETIME DATA ANALYSIS 2019; 25:206-228. [PMID: 29978275 PMCID: PMC6320737 DOI: 10.1007/s10985-018-9443-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/28/2017] [Accepted: 06/29/2018] [Indexed: 06/08/2023]
Abstract
In multi-center studies, the presence of a cluster effect leads to correlation among outcomes within a center and requires different techniques to handle such correlation. Testing for a cluster effect can serve as a pre-screening step to help guide the researcher towards the appropriate analysis. With time to event data, score tests have been proposed which test for the presence of a center effect on the hazard function. However, sometimes researchers are interested in directly modeling other quantities such as survival probabilities or cumulative incidence at a fixed time. We propose a test for the presence of a center effect acting directly on the quantity of interest using pseudo-value regression, and derive the asymptotic properties of our proposed test statistic. We examine the performance of our proposed test through simulation studies in both survival and competing risks settings. The proposed test may be more powerful than tests based on the hazard function in settings where the center effect is time-varying. We illustrate the test using a multicenter registry study of survival and competing risks outcomes after hematopoietic cell transplantation.
Affiliation(s)
- Yanzhi Wang
- Division of Research Services/Department of Medicine, University of Illinois College of Medicine at Peoria, 1 Illini Dr., Peoria, IL, 61605, USA
- Brent R Logan
- Division of Biostatistics, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
|
35
|
Yokota I, Matsuyama Y. Dynamic prediction of repeated events data based on landmarking model: application to colorectal liver metastases data. BMC Med Res Methodol 2019; 19:31. [PMID: 30764772 PMCID: PMC6376774 DOI: 10.1186/s12874-019-0677-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/24/2018] [Accepted: 02/07/2019] [Indexed: 01/19/2023]
Abstract
BACKGROUND: In some clinical situations, patients experience repeated events of the same type. Among these, cancer recurrences can result in terminal events such as death. Therefore, here we dynamically predicted the risks of repeated and terminal events given longitudinal histories observed before prediction time using dynamic pseudo-observations (DPOs) in a landmarking model. METHODS: The proposed DPOs were calculated using the Aalen-Johansen estimator for the event processes described in the multi-state model. Furthermore, in the absence of a terminal event, a more convenient approach without matrix operations was described using the ordering of repeated events. Finally, generalized estimating equations were used to calculate probabilities of repeated and terminal events, which were treated as multinomial outcomes. RESULTS: Simulation studies were conducted to assess bias and investigate the efficiency of the proposed DPOs in a finite sample. Little bias was detected in DPOs even under relatively heavy censoring, and the method was applied to data from patients with colorectal liver metastases. CONCLUSIONS: The proposed method enabled intuitive interpretations of terminal event settings.
Affiliation(s)
- Isao Yokota
- Department of Biostatistics, Graduate School of Medicine, Hokkaido University, Kita 15, Nishi 7, Kita-ku, Sapporo, Hokkaido, 060-0061, Japan
- Yutaka Matsuyama
- Department of Biostatistics, School of Public Health, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
|
36
|
Wang X, Xue X, Sun L. Regression analysis of restricted mean survival time based on pseudo-observations for competing risks data. COMMUN STAT-THEOR M 2018. [DOI: 10.1080/03610926.2017.1397174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Xin Wang
- School of Science, Beijing Information Science and Technology University, Beijing, P.R. China
- Xiaoming Xue
- Institute of Applied Mathematics, Academy of Mathematical and Systems Science, Chinese Academy of Sciences, Beijing, P.R. China
- Liuquan Sun
- Institute of Applied Mathematics, Academy of Mathematical and Systems Science, Chinese Academy of Sciences, Beijing, P.R. China
|
37
|
Bluhmki T, Allignol A, Ruckly S, Timsit JF, Wolkewitz M, Beyersmann J. Estimation of adjusted expected excess length-of-stay associated with ventilation-acquired pneumonia in intensive care: A multistate approach accounting for time-dependent mechanical ventilation. Biom J 2018; 60:1135-1150. [DOI: 10.1002/bimj.201700242] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/13/2017] [Revised: 06/28/2018] [Accepted: 08/06/2018] [Indexed: 12/29/2022]
Affiliation(s)
- Jean-Francois Timsit
- UMR 1137 IAME Inserm/University Paris Diderot; Paris France
- APHP; Bichat Hospital; Intensive Care Unit; Paris France
- Martin Wolkewitz
- Institute for Medical Biometry and Statistics; Faculty of Medicine and Medical Center-University of Freiburg; Freiburg Germany
|
38
|
Grand MK, Putter H, Allignol A, Andersen PK. A note on pseudo-observations and left-truncation. Biom J 2018; 61:290-298. [DOI: 10.1002/bimj.201700274] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/21/2017] [Revised: 06/20/2018] [Accepted: 06/22/2018] [Indexed: 11/11/2022]
Affiliation(s)
- Mia K. Grand
- Section of Biostatistics; University of Copenhagen; Copenhagen Denmark
- Department of Medical Statistics and Bioinformatics; Leiden University Medical Center; Leiden the Netherlands
- Hein Putter
- Department of Medical Statistics and Bioinformatics; Leiden University Medical Center; Leiden the Netherlands
- Per K. Andersen
- Section of Biostatistics; University of Copenhagen; Copenhagen Denmark
|
39
|
Multiple imputation for competing risks survival data via pseudo-observations. COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS 2018. [DOI: 10.29220/csam.2018.25.4.385] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/17/2022]
|
40
|
Overgaard M, Parner ET, Pedersen J. Estimating the variance in a pseudo-observation scheme with competing risks. Scand Stat Theory Appl 2018. [DOI: 10.1111/sjos.12328] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
|
41
|
Nielsen LM, Maribo T, Kirkegaard H, Petersen KS, Lisby M, Oestergaard LG. Effectiveness of the "Elderly Activity Performance Intervention" on elderly patients' discharge from a short-stay unit at the emergency department: a quasi-experimental trial. Clin Interv Aging 2018; 13:737-747. [PMID: 29731615 PMCID: PMC5927350 DOI: 10.2147/cia.s162623] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 11/25/2022]
Abstract
Purpose: To examine the effectiveness of the Elderly Activity Performance Intervention on reducing the risk of readmission in elderly patients discharged from a short-stay unit at the emergency department. Patients and methods: The study was conducted as a nonrandomized, quasi-experimental trial. Three hundred and seventy-five elderly patients were included and allocated to the Elderly Activity Performance Intervention (n=144) or usual practice (n=231). The intervention consisted of 1) assessment of the patients’ performance of daily activities, 2) referral to further rehabilitation, and 3) follow-up visit the day after discharge. Primary outcome was readmission (yes/no) within 26 weeks. The study was registered in ClinicalTrials.gov (NCT02078466). Results: No between-group differences were found in readmission. Overall, 44% of the patients in the intervention group and 42% in the usual practice group were readmitted within 26 weeks (risk difference=0.02, 95% CI: [−0.08; 0.12] and risk ratio=1.05, 95% CI: [0.83; 1.33]). No between-group differences were found in any of the secondary outcomes. Conclusion: The Elderly Activity Performance Intervention showed no effectiveness in reducing the risk of readmission in elderly patients discharged from a short-stay unit at the emergency department. The study revealed that 60% of the elderly patients had a need for further rehabilitation after discharge.
Affiliation(s)
- Louise Moeldrup Nielsen
- Department of Occupational Therapy, VIA University College; Department of Physiotherapy and Occupational Therapy, Aarhus University Hospital
- Thomas Maribo
- Department of Public Health, Aarhus University, DEFACTUM
- Hans Kirkegaard
- Department of Clinical Medicine, Research Centre for Emergency Medicine, Aarhus University Hospital, Aarhus
- Marianne Lisby
- Department of Clinical Medicine, Research Centre for Emergency Medicine, Aarhus University Hospital, Aarhus
- Lisa Gregersen Oestergaard
- Department of Physiotherapy and Occupational Therapy, Aarhus University Hospital; Department of Clinical Medicine, Aarhus University and Aarhus University Hospital, Aarhus, Denmark
|
42
|
Assessing the effect of a partly unobserved, exogenous, binary time-dependent covariate on survival probabilities using generalised pseudo-values. BMC Med Res Methodol 2018; 18:14. [PMID: 29351735 PMCID: PMC5775686 DOI: 10.1186/s12874-017-0430-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/16/2017] [Accepted: 11/15/2017] [Indexed: 11/18/2022]
Abstract
Background: Investigating the impact of a time-dependent intervention on the probability of long-term survival is statistically challenging. A typical example is stem-cell transplantation performed after successful donor identification from registered donors. Here, a suggested simple analysis based on the exogenous donor availability status according to registered donors would allow the estimation and comparison of survival probabilities. As donor search is usually ceased after a patient’s event, donor availability status is incompletely observed, so that this simple comparison is not possible and the waiting time to donor identification needs to be addressed in the analysis to avoid bias. It is methodologically unclear how to directly address cumulative long-term treatment effects without relying on proportional hazards while avoiding waiting time bias. Methods: The pseudo-value regression technique is able to handle the first two issues; a novel generalisation of this technique also avoids waiting time bias. Inverse-probability-of-censoring weighting is used to account for the partly unobserved exogenous covariate donor availability. Results: Simulation studies demonstrate unbiasedness and satisfactory coverage probabilities of the new method. A real data example demonstrates that study results based on generalised pseudo-values have a clear medical interpretation which supports the clinical decision making process. Conclusions: The proposed generalisation of the pseudo-value regression technique enables comparison of survival probabilities between two independent groups where group membership becomes known over time and remains partly unknown. Hence, cumulative long-term treatment effects are directly addressed without relying on proportional hazards while avoiding waiting time bias.
|
43
|
Wang J. A simple, doubly robust, efficient estimator for survival functions using pseudo observations. Pharm Stat 2017; 17:38-48. [DOI: 10.1002/pst.1834] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/19/2016] [Revised: 06/01/2017] [Accepted: 09/18/2017] [Indexed: 11/06/2022]
Affiliation(s)
- Jixian Wang
- Celgene International Sarl; Boudry Switzerland
|
44
|
Ranstam J, Robertsson O. The Cox model is better than the Fine and Gray model when estimating relative revision risks from arthroplasty register data. Acta Orthop 2017; 88:578-580. [PMID: 28771059 PMCID: PMC5694799 DOI: 10.1080/17453674.2017.1361130] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 02/08/2023]
Abstract
Background and purpose - Analysis of the revision-free survival of knee and hip prostheses has traditionally been performed using Kaplan-Meier analysis and Cox regression. The competing risk problem that is related to patients who die during follow-up has recently been increasingly discussed, not least with regard to the problem of choosing a suitable statistical method for the analysis. We compared the results from analyses of Cox models and Fine and Gray models. Methods - We used data simulation based on parameter estimates from the Swedish Knee Arthroplasty Register and assessed hypothetical effects of the studied risk factors. Results - The Cox model provided more adequate results. Interpretation - The parameter estimates from the Fine and Gray model can be misleading if interpreted in terms of relative risk.
Affiliation(s)
- Jonas Ranstam
- Department of Clinical Sciences Lund, Orthopedics, Lund University
- Otto Robertsson
- Department of Clinical Sciences Lund, Orthopedics, Lund University, Skane University Hospital, Lund, Sweden
|
45
|
Spitoni C, Lammens V, Putter H. Prediction errors for state occupation and transition probabilities in multi-state models. Biom J 2017; 60:34-48. [PMID: 29067699 DOI: 10.1002/bimj.201600191] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/03/2016] [Revised: 07/19/2017] [Accepted: 08/17/2017] [Indexed: 11/09/2022]
Abstract
In this paper, we consider the estimation of prediction errors for state occupation probabilities and transition probabilities for multistate time-to-event data. We study prediction errors based on the Brier score and on the Kullback-Leibler score and prove their properness. In the presence of right-censored data, two classes of estimators, based on inverse probability weighting and pseudo-values, respectively, are proposed, and consistency properties of the proposed estimators are investigated. The second part of the paper is devoted to the estimation of dynamic prediction errors for state occupation probabilities for multistate models, conditional on being alive, and for transition probabilities. Cross-validated versions are proposed. Our methods are illustrated on the CSL1 randomized clinical trial comparing prednisone versus placebo for liver cirrhosis patients.
Affiliation(s)
- Cristian Spitoni
- Department of Mathematics, Budapestlaan 6, 3584 CD, Utrecht, The Netherlands
- Violette Lammens
- Department of Mathematics, Budapestlaan 6, 3584 CD, Utrecht, The Netherlands
- Hein Putter
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
|
46
|
Overgaard M, Parner ET, Pedersen J. Asymptotic theory of generalized estimating equations based on jack-knife pseudo-observations. Ann Stat 2017. [DOI: 10.1214/16-aos1516] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 11/19/2022]
|
47
|
Jin Y, Lai TL. A new approach to regression analysis of censored competing-risks data. LIFETIME DATA ANALYSIS 2017; 23:605-625. [PMID: 27502000 PMCID: PMC5299091 DOI: 10.1007/s10985-016-9378-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2014] [Accepted: 07/30/2016] [Indexed: 06/06/2023]
Abstract
An approximate likelihood approach is developed for regression analysis of censored competing-risks data. This approach models directly the cumulative incidence function, instead of the cause-specific hazard function, in terms of explanatory covariates under a proportional subdistribution hazards assumption. It uses a self-consistent iterative procedure to maximize an approximate semiparametric likelihood function, leading to an asymptotically normal and efficient estimator of the vector of regression parameters. Simulation studies demonstrate its advantages over previous methods.
Affiliation(s)
- Yuxue Jin
- Quantitative Marketing, Google, New York, NY, 10011, USA
- Tze Leung Lai
- Department of Statistics, Stanford University, Stanford, CA, 94305, USA
|
48
|
Andersen PK, Syriopoulou E, Parner ET. Causal inference in survival analysis using pseudo-observations. Stat Med 2017; 36:2669-2681. [PMID: 28384840 DOI: 10.1002/sim.7297] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 06/07/2016] [Revised: 02/28/2017] [Accepted: 03/11/2017] [Indexed: 11/09/2022]
Abstract
Causal inference for non-censored response variables, such as binary or quantitative outcomes, is often based on either (1) direct standardization ('G-formula') or (2) inverse probability of treatment assignment weights ('propensity score'). To do causal inference in survival analysis, one needs to address right-censoring, and often, special techniques are required for that purpose. We will show how censoring can be dealt with 'once and for all' by means of so-called pseudo-observations when doing causal inference in survival analysis. The pseudo-observations can be used as a replacement of the outcomes without censoring when applying 'standard' causal inference methods, such as (1) or (2) above. We study this idea for estimating the average causal effect of a binary treatment on the survival probability, the restricted mean lifetime, and the cumulative incidence in a competing risks situation. The methods will be illustrated in a small simulation study and via a study of patients with acute myeloid leukemia who received either myeloablative or non-myeloablative conditioning before allogeneic hematopoietic cell transplantation. We will estimate the average causal effect of the conditioning regime on outcomes such as the 3-year overall survival probability and the 3-year risk of chronic graft-versus-host disease. Copyright © 2017 John Wiley & Sons, Ltd.
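The idea of replacing censored outcomes by pseudo-observations can be sketched in a few lines. The block below is a minimal illustration, not the authors' implementation: it assumes the standard jackknife construction n·Ŝ(t) − (n−1)·Ŝ⁽⁻ⁱ⁾(t) applied to a Kaplan-Meier estimate of the survival probability at a fixed time t, and the function names `km_survival` and `pseudo_observations` are hypothetical.

```python
import numpy as np


def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) from right-censored data.

    time  : observed follow-up times
    event : 1/True if the event was observed, 0/False if censored
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Product of (1 - d_j / r_j) over distinct observed event times u_j <= t.
    for u in np.unique(time[event & (time <= t)]):
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & event)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_observations(time, event, t):
    """Jackknife pseudo-observations for S(t) (assumed standard form)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    full = km_survival(time, event, t)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False  # leave subject i out
        loo = km_survival(time[keep], event[keep], t)
        pseudo[i] = n * full - (n - 1) * loo
        keep[i] = True   # restore subject i
    return pseudo
```

Without censoring, each pseudo-observation reduces to the indicator that the subject survived beyond t, which is why the values can stand in for the uncensored outcome. In a weighting or standardization analysis one might then average them within groups, for example `pseudo[treated].mean()`, where `treated` is a hypothetical boolean array of treatment indicators.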
Affiliation(s)
- Per K Andersen
- Section of Biostatistics, University of Copenhagen, Ø. Farimagsgade 5, Copenhagen, PB 2099, DK-1014, Denmark
- Elisavet Syriopoulou
- Section of Biostatistics, University of Copenhagen, Ø. Farimagsgade 5, Copenhagen, PB 2099, DK-1014, Denmark
- Department of Health Sciences, College of Medicine Biological Sciences and Psychology, University of Leicester, University Road, Leicester, LE1 7RH, U.K.
- Erik T Parner
- Department of Biostatistics, University of Aarhus, Bartholins Allé 2, Aarhus C, DK-8000, Denmark
|
49
|
Dutta S, Datta S, Datta S. Temporal Prediction of Future State Occupation in a Multistate Model from High-Dimensional Baseline Covariates via Pseudo-Value Regression. J STAT COMPUT SIM 2016; 87:1363-1378. [PMID: 29217870 DOI: 10.1080/00949655.2016.1263992] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/20/2022]
Abstract
In many complex diseases such as cancer, a patient undergoes various disease stages before reaching a terminal state (say disease free or death). This fits a multistate model framework where a prognosis may be equivalent to predicting the state occupation at a future time t. With the advent of high throughput genomic and proteomic assays, a clinician may intend to use such high dimensional covariates to better predict state occupation. In this article, we offer a practical solution to this problem by combining a useful technique, called pseudo-value regression, with a latent factor or a penalized regression method such as partial least squares (PLS) or the least absolute shrinkage and selection operator (LASSO), or their variants. We explore the predictive performance of these combinations in various high dimensional settings via extensive simulation studies. Overall, this strategy works fairly well provided the models are tuned properly. PLS turns out to be slightly better than LASSO in most settings investigated by us, for the purpose of temporal prediction of future state occupation. We illustrate the utility of these pseudo-value based high dimensional regression methods using a lung cancer data set where we use the patients' baseline gene expression values.
Affiliation(s)
- Sandipan Dutta
- Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY, USA
- Susmita Datta
- Department of Biostatistics, University of Florida, Gainesville, FL, USA
- Somnath Datta
- Department of Biostatistics, University of Florida, Gainesville, FL, USA
|
50
|
Kim S, Kim YJ. Regression analysis of interval censored competing risk data using a pseudo-value approach. COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS 2016. [DOI: 10.5351/csam.2016.23.6.555] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/11/2022]
Affiliation(s)
- Sooyeon Kim
- Department of Statistics, Sookmyung Women’s University, Korea
- Yang-Jin Kim
- Department of Statistics, Sookmyung Women’s University, Korea
|