1. Chen JG, Chen HZ, Zhu J, Shen AG, Sun XY, Parkin DM. Cancer survival: left truncation and comparison of results from hospital-based cancer registry and population-based cancer registry. Front Oncol 2023; 13:1173828. [PMID: 37350938 PMCID: PMC10284078 DOI: 10.3389/fonc.2023.1173828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Accepted: 05/16/2023] [Indexed: 06/24/2023]
Abstract
Background: Cancer survival is an important indicator for evaluating cancer prognosis and cancer care outcomes. The incidence dates used in calculating survival differ between population-based registries and hospital-based registries. Studies examining the effects of the left truncation of incidence dates and delayed reporting on survival estimates are scarce in real-world applications. Methods: Cancer cases hospitalized at Nantong Tumor Hospital during the years 2002-2017 were traced to their records in the Qidong Cancer Registry. Survival was calculated using the life table method in three ways: with the first visit date recorded in the hospital-based cancer registry (HBR) as the diagnosis date (OSH), with the date registered in the population-based cancer registry (PBR) as the incidence date (OSP), and with corrected dates after the delayed report dates were calibrated (OSC). Results: Among 2,636 cases, 1,307 had incidence dates registered in PBR prior to the diagnosis dates of the first hospitalization registered in HBR, while 667 cases had incidence dates registered in PBR that were later than the diagnosis dates registered in HBR. The 5-year OSH, OSP, and OSC were 36.1%, 37.4%, and 39.0%, respectively. The "lost" proportion of 5-year survival due to the left truncation for HBR data was estimated to be between 3.5% and 7.4%, and the "delayed-report" proportion of 5-year survival for PBR data was found to be 4.1%. Conclusion: Left truncation of survival in HBR cases was demonstrated. The pseudo-left truncation in PBR should be reduced by controlling delayed reporting and maximizing completeness. Our study provides practical references and suggestions for evaluating the survival of cancer patients with HBR and PBR.
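The life table method named in the abstract above can be sketched as follows. This is a minimal illustration of the standard actuarial estimator, not the authors' code: the function name `life_table_survival`, the one-unit interval width, and the convention that patients censored within an interval count as half at risk are assumptions made here.

```python
def life_table_survival(times, events, n_intervals, width=1.0):
    """Actuarial (life table) cumulative survival at the end of each interval.

    times  : follow-up time of each patient, in the same unit as `width`
    events : 1 if the patient died at that time, 0 if censored
    """
    deaths = [0] * n_intervals
    withdrawn = [0] * n_intervals
    for t, e in zip(times, events):
        k = int(t // width)
        if k >= n_intervals:
            continue  # followed beyond the last interval of interest
        if e:
            deaths[k] += 1
        else:
            withdrawn[k] += 1

    at_risk = len(times)
    surv, curve = 1.0, []
    for d, w in zip(deaths, withdrawn):
        # Assumption: patients censored in the interval are at risk for half of it.
        effective = at_risk - w / 2.0
        if effective > 0:
            surv *= 1.0 - d / effective
        curve.append(surv)
        at_risk -= d + w
    return curve
```

Changing which date is treated as time zero (first hospital visit, registry incidence date, or a corrected date) only changes the `times` passed in, which is how the same estimator can yield different values for OSH, OSP, and OSC.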
Affiliation(s)
- Jian-Guo Chen
  - Department of Epidemiology, Nantong Tumor Hospital, Affiliated Tumor Hospital of Nantong University, Nantong, China
  - Department of Epidemiology, Qidong Liver Cancer Institute, Qidong People’s Hospital, Affiliated Qidong Hospital of Nantong University, Qidong, China
- Hai-Zhen Chen
  - Department of Epidemiology, Nantong Tumor Hospital, Affiliated Tumor Hospital of Nantong University, Nantong, China
- Jian Zhu
  - Department of Epidemiology, Qidong Liver Cancer Institute, Qidong People’s Hospital, Affiliated Qidong Hospital of Nantong University, Qidong, China
- Ai-Guo Shen
  - Department of Epidemiology, Nantong Tumor Hospital, Affiliated Tumor Hospital of Nantong University, Nantong, China
- Xiang-Yang Sun
  - Department of Epidemiology, Nantong Tumor Hospital, Affiliated Tumor Hospital of Nantong University, Nantong, China
- Donald Maxwell Parkin
  - Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom
  - Cancer Surveillance Branch, International Agency for Research on Cancer, Lyon, France
2. Linden T, Hanses F, Domingo-Fernández D, DeLong LN, Kodamullil AT, Schneider J, Vehreschild MJGT, Lanznaster J, Ruethrich MM, Borgmann S, Hower M, Wille K, Feldt T, Rieg S, Hertenstein B, Wyen C, Roemmele C, Vehreschild JJ, Jakob CEM, Stecher M, Kuzikov M, Zaliani A, Fröhlich H. Machine Learning Based Prediction of COVID-19 Mortality Suggests Repositioning of Anticancer Drug for Treating Severe Cases. Artificial Intelligence in the Life Sciences 2021; 1:100020. [PMID: 34988543 PMCID: PMC8677630 DOI: 10.1016/j.ailsci.2021.100020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/08/2021] [Revised: 11/22/2021] [Accepted: 11/22/2021] [Indexed: 02/08/2023]
Abstract
Despite available vaccinations, COVID-19 case numbers around the world are still growing, and effective medications against severe cases are lacking. In this work, we developed a machine learning model which predicts mortality for COVID-19 patients using data from the multi-center 'Lean European Open Survey on SARS-CoV-2-infected patients' (LEOSS) observational study (>100 active sites in Europe, primarily in Germany), resulting in an AUC of almost 80%. We showed that molecular mechanisms related to dementia, one of the relevant predictors in our model, intersect with those associated with COVID-19. Most notably, among these molecules was tyrosine kinase 2 (TYK2), a protein that has been patented as a drug target in Alzheimer's disease but is also genetically associated with severe COVID-19 outcomes. We experimentally verified that the anti-cancer drugs Sorafenib and Regorafenib showed a clear anti-cytopathic effect in Caco2 and VERO-E6 cells and can thus be regarded as potential treatments against COVID-19. Altogether, our work demonstrates that interpretation of machine learning based risk models can point towards drug targets and new treatment options, which are strongly needed for COVID-19.
Affiliation(s)
- Thomas Linden
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
  - University of Bonn, Bonn-Aachen International Center for IT, Friedrich Hirzebruch-Allee 6, 53115 Bonn, Germany
- Frank Hanses
  - Emergency Department, University Hospital Regensburg, 93053 Regensburg, Germany
  - Department for Infectious Diseases and Infection Control, University Hospital Regensburg, Germany
- Daniel Domingo-Fernández
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
- Lauren Nicole DeLong
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
  - University of Bonn, Bonn-Aachen International Center for IT, Friedrich Hirzebruch-Allee 6, 53115 Bonn, Germany
- Alpha Tom Kodamullil
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
- Jochen Schneider
  - Technical University of Munich, School of Medicine, University Hospital rechts der Isar, Department of Internal Medicine II, 81675 Munich, Germany
- Maria J G T Vehreschild
  - Department II of Internal Medicine, Infectious Diseases, University Hospital Frankfurt, Goethe University, 60590 Frankfurt, Germany
- Julia Lanznaster
  - Department of Internal Medicine II, Hospital Passau, Innstraße 76, 94032 Passau, Germany
- Maria Madeleine Ruethrich
  - Institute for Infection Medicine and Hospital Hygiene, University Hospital Jena, 07743 Jena, Germany
- Stefan Borgmann
  - Department of Infectious Diseases and Infection Control, Hospital Ingolstadt, 85049 Ingolstadt, Germany
- Martin Hower
  - Department of Pneumology, Infectious Diseases and Intensive Care, Klinikum Dortmund gGmbH, Hospital of University Witten / Herdecke, 44137 Dortmund, Germany
- Kai Wille
  - University Clinic for Haematology, Oncology, Haemostaseology and Palliative Care, Johannes Wesling Medical Centre Minden, 32429 Minden, Germany
- Torsten Feldt
  - Department of Gastroenterology, Hepatology and Infectious Diseases, University Hospital Düsseldorf, Medical Faculty of Heinrich Heine University Düsseldorf, Moorenstrasse 5, 40225 Düsseldorf, Germany
- Siegbert Rieg
  - Department of Medicine II, University Hospital Freiburg, 79110 Freiburg, Germany
- Bernd Hertenstein
  - Department of Medicine II, University Hospital Freiburg, 79110 Freiburg, Germany
- Christoph Wyen
  - Christoph Wyen, Praxis am Ebertplatz Cologne, 50668 Cologne, Germany
- Christoph Roemmele
  - Internal Medicine III - Gastroenterology and Infectious Diseases, University Hospital Augsburg, 86156 Augsburg, Germany
- Jörg Janne Vehreschild
  - Department II of Internal Medicine, Infectious Diseases, University Hospital Frankfurt, Goethe University, 60590 Frankfurt, Germany
- Carolin E M Jakob
  - Department I for Internal Medicine, University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany
- Melanie Stecher
  - Fraunhofer Institute for Translational Medicine and Pharmacologie (ITMP), VolksparkLabs, Schnackenburgallee 114, 22535 Hamburg, Germany
- Maria Kuzikov
  - Department for Infectious Diseases and Infection Control, University Hospital Regensburg, Germany
- Andrea Zaliani
  - Department for Infectious Diseases and Infection Control, University Hospital Regensburg, Germany
- Holger Fröhlich
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
  - University of Bonn, Bonn-Aachen International Center for IT, Friedrich Hirzebruch-Allee 6, 53115 Bonn, Germany
3. Chiou SH, Betensky RA, Balasubramanian R. The missing indicator approach for censored covariates subject to limit of detection in logistic regression models. Ann Epidemiol 2019; 38:57-64. [PMID: 31604610 PMCID: PMC6812630 DOI: 10.1016/j.annepidem.2019.07.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/20/2018] [Revised: 07/12/2019] [Accepted: 07/24/2019] [Indexed: 12/14/2022]
Abstract
Purpose: In several biomedical studies, one or more exposures of interest may be subject to nonrandom missingness because of the failure of the measurement assay at levels below its limit of detection. This issue is commonly encountered in studies of the metabolome using tandem mass spectrometry-based technologies. Owing to the large number of metabolites measured in these studies, preserving statistical power is of utmost interest. In this article, we evaluate the small sample properties of the missing indicator approach in logistic and conditional logistic regression models. Methods: For nested case-control or matched case-control study designs, we evaluate the bias, power, and type I error associated with the missing indicator method using simulation. We compare the missing indicator approach to complete case analysis and several imputation approaches. Results: We show that under a variety of settings, the missing indicator approach outperforms complete case analysis and other imputation approaches with regard to bias, mean squared error, and power. Conclusions: For nested case-control and matched study designs of modest sample sizes, the missing indicator model minimizes loss of information and thus provides an attractive alternative to the oft-used complete case analysis and other imputation approaches.
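As a rough illustration of the missing indicator idea described in this abstract, the sketch below builds the two design-matrix columns that replace a covariate subject to a limit of detection. The function name `missing_indicator_columns`, the use of `None` to mark values below the limit, and the fill value of 0.0 are assumptions for this example; the abstract does not state the exact coding the authors used.

```python
def missing_indicator_columns(values, fill=0.0):
    """Split one covariate into (filled value, below-LOD indicator) columns.

    values : measured level, or None when the assay result fell below
             the limit of detection (hypothetical encoding)
    fill   : constant substituted for unobserved values (assumed 0.0 here)
    """
    x_filled = [fill if v is None else v for v in values]
    below_lod = [1 if v is None else 0 for v in values]
    return x_filled, below_lod


# Both columns enter the regression together, so every subject is retained,
# unlike complete case analysis, which would drop the None rows.
x, m = missing_indicator_columns([1.2, None, 3.4, None])
```

With both columns in a logistic model, the coefficient on the filled column describes the association among subjects with detectable levels, while the indicator coefficient absorbs the contrast for subjects below the limit.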
Affiliation(s)
- Sy Han Chiou
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA
- Rebecca A Betensky
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA
- Raji Balasubramanian
  - Department of Biostatistics and Epidemiology, University of Massachusetts - Amherst, Amherst, MA
4. Frankel M, Fan L, Yeatts SD, Jeromin A, Vos PE, Wagner AK, Wolf BJ, Pauls Q, Lunney M, Merck LH, Hall CL, Palesch YY, Silbergleit R, Wright DW. Association of Very Early Serum Levels of S100B, Glial Fibrillary Acidic Protein, Ubiquitin C-Terminal Hydrolase-L1, and Spectrin Breakdown Product with Outcome in ProTECT III. J Neurotrauma 2019; 36:2863-2871. [PMID: 30794101 DOI: 10.1089/neu.2018.5809] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 12/24/2022]
Abstract
Rapid risk-stratification of patients with acute traumatic brain injury (TBI) would inform management decisions and prognostication. The objective of this serum biomarker study (Biomarkers of Injury and Outcome [BIO]-Progesterone for Traumatic Brain Injury, Experimental Clinical Treatment [ProTECT]) was to test the hypothesis that serum biomarkers of structural brain injury, measured at a single, very early time-point, add value beyond relevant clinical covariates when predicting unfavorable outcome 6 months after moderate-to-severe acute TBI. BIO-ProTECT utilized prospectively collected samples obtained from subjects with moderate-to-severe TBI enrolled in the ProTECT III clinical trial of progesterone. Serum samples were obtained within 4 h after injury. Glial fibrillary acidic protein (GFAP), S100B, αII-spectrin breakdown product of molecular weight 150 (SBDP150), and ubiquitin C-terminal hydrolase-L1 (UCH-L1) were measured. The association between log-transformed biomarker levels and poor outcome, defined by a Glasgow Outcome Scale-Extended (GOS-E) score of 1-4 at 6 months post-injury, was estimated via logistic regression. Prognostic models and a biomarker risk score were developed using bootstrapping techniques. Of 882 ProTECT III subjects, samples were available for 566. Each biomarker was associated with 6-month GOS-E (p < 0.001). Compared with a model containing baseline patient variables/characteristics, inclusion of S100B and GFAP significantly improved prognostic capacity (p ≤ 0.05 for both comparisons); conversely, UCH-L1 and SBDP150 did not. A final predictive model incorporating baseline patient variables/characteristics and biomarker data (S100B and GFAP) had the best prognostic capability (area under the curve [AUC] = 0.85, 95% confidence interval [CI]: 0.81-0.89). Very early measurements of brain-specific biomarkers are independently associated with 6-month outcome after moderate-to-severe TBI and enhance outcome prediction.
Affiliation(s)
- Michael Frankel
  - Department of Neurology, Emory University School of Medicine and Grady Hospital, Atlanta, Georgia
- Liqiong Fan
  - Novartis Institutes of Biomedical Research, Cambridge, Massachusetts
- Sharon D Yeatts
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
- Pieter E Vos
  - Department of Neurology, Slingeland Hospital Doetinchem, The Netherlands
- Amy K Wagner
  - Department of Physical Medicine and Rehabilitation and Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania
- Bethany J Wolf
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
- Qi Pauls
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
- Lisa H Merck
  - Division of Emergency Neurosciences and Critical Care Research, The Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island
- Casey L Hall
  - Department of Neurology, Emory University School of Medicine and Grady Hospital, Atlanta, Georgia
- Yuko Y Palesch
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
- Robert Silbergleit
  - Department of Emergency Medicine, University of Michigan, Ann Arbor, Michigan
- David W Wright
  - Department of Emergency Medicine, Emory University School of Medicine and Grady Hospital, Atlanta, Georgia
5. Atem FD, Matsouaka RA, Zimmern VE. Cox regression model with randomly censored covariates. Biom J 2019; 61:1020-1032. [PMID: 30908720 DOI: 10.1002/bimj.201800275] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/03/2018] [Revised: 02/07/2019] [Accepted: 02/07/2019] [Indexed: 11/11/2022]
Abstract
This paper deals with a Cox proportional hazards regression model, where some covariates of interest are randomly right-censored. While methods for censored outcomes have become ubiquitous in the literature, methods for censored covariates have thus far received little attention and, for the most part, dealt with the issue of limit-of-detection. For randomly censored covariates, an often-used method is the inefficient complete-case analysis (CCA), which consists of deleting censored observations in the data analysis. When censoring is not completely independent, the CCA leads to biased and spurious results. Methods for missing covariate data, including type I and type II covariate censoring as well as limit-of-detection, do not readily apply due to the fundamentally different nature of randomly censored covariates. We develop a novel method for censored covariates using a conditional mean imputation based on either Kaplan-Meier estimates or a Cox proportional hazards model to estimate the effects of these covariates on a time-to-event outcome. We evaluate the performance of the proposed method through simulation studies and show that it provides good bias reduction and statistical efficiency. Finally, we illustrate the method using data from the Framingham Heart Study to assess the relationship between offspring and parental age of onset of cardiovascular events.
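A minimal sketch of conditional mean imputation from Kaplan-Meier estimates, the first of the two variants named in this abstract, is given below. It is an illustration under assumptions, not the authors' implementation: the helper names `km_jumps` and `impute_censored` are hypothetical, the conditional mean is taken over the Kaplan-Meier probability mass above the censoring value and renormalized by that mass, and a censored value with no larger observed value is left unchanged.

```python
def km_jumps(values, observed):
    """Kaplan-Meier probability mass at each distinct fully observed value."""
    data = sorted(zip(values, observed))
    n = len(data)
    surv, jumps, i = 1.0, [], 0
    while i < n:
        t = data[i][0]
        ties = [o for v, o in data[i:] if v == t]
        d = sum(ties)        # fully observed covariate values equal to t
        at_risk = n - i      # subjects whose value is known to be >= t
        if d:
            new_surv = surv * (1.0 - d / at_risk)
            jumps.append((t, surv - new_surv))
            surv = new_surv
        i += len(ties)
    return jumps


def impute_censored(values, observed):
    """Replace each right-censored value c by an estimate of E[X | X > c]."""
    jumps = km_jumps(values, observed)
    imputed = []
    for v, o in zip(values, observed):
        if o:
            imputed.append(v)
            continue
        tail = [(t, p) for t, p in jumps if t > v]
        mass = sum(p for _, p in tail)
        # Assumption: keep the censoring value when nothing larger was observed.
        imputed.append(sum(t * p for t, p in tail) / mass if mass > 0 else v)
    return imputed
```

The imputed covariate can then be passed to any standard Cox regression routine; this sketch ignores the extra variance introduced by imputation.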
Affiliation(s)
- Folefac D Atem
  - Department of Biostatistics and Data Science, University of Texas Health Science Center at Houston, Houston, TX, USA
- Roland A Matsouaka
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
  - Program for Comparative Effectiveness Methodology, Duke Clinical Research Institute, Duke University, Durham, NC, USA
- Vincent E Zimmern
  - Department of Pediatrics, University of Texas Southwestern Medical School, Dallas, TX, USA
  - Department of Pediatrics, Children Hospital Dallas, Dallas, TX, USA
6. Ahn S, Lim J, Paik MC, Sacco RL, Elkind MS. Cox model with interval-censored covariate in cohort studies. Biom J 2018; 60:797-814. [PMID: 29775990 DOI: 10.1002/bimj.201700090] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/30/2017] [Revised: 12/19/2017] [Accepted: 02/27/2018] [Indexed: 11/07/2022]
Abstract
In cohort studies the outcome is often time to a particular event, and subjects are followed at regular intervals. Periodic visits may also monitor a secondary irreversible event influencing the event of primary interest, and a significant proportion of subjects develop the secondary event over the period of follow-up. The status of the secondary event serves as a time-varying covariate, but is recorded only at the times of the scheduled visits, generating incomplete time-varying covariates. While information on a typical time-varying covariate is missing for the entire follow-up period except at the visit times, the status of the secondary event is unavailable only between the visits where the status has changed, and is thus interval-censored. One may view the interval-censored covariate of the secondary event status as a missing time-varying covariate, yet the missingness is partial since partial information is provided throughout the follow-up period. The current practice of using the latest observed status produces biased estimators, and the existing missing covariate techniques cannot accommodate the special feature of missingness due to interval censoring. To handle interval-censored covariates in the Cox proportional hazards model, we propose an available-data estimator, a doubly robust-type estimator, as well as the maximum likelihood estimator via the EM algorithm, and present their asymptotic properties. We also present practical approaches that are valid. We demonstrate the proposed methods using our motivating example from the Northern Manhattan Study.
Affiliation(s)
- Soohyun Ahn
  - Department of Mathematics, Ajou University, Suwon, Korea
- Johan Lim
  - Department of Statistics, Seoul National University, Seoul, Korea
- Ralph L Sacco
  - Department of Neurology, Miller School of Medicine, University of Miami, Miami, FL, USA
7. Bernhardt PW, Zhang D, Wang HJ. A Fast EM Algorithm for Fitting Joint Models of a Binary Response and Multiple Longitudinal Covariates Subject to Detection Limits. Comput Stat Data Anal 2015; 85:37-53. [PMID: 25598564 PMCID: PMC4295570 DOI: 10.1016/j.csda.2014.11.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 11/23/2022]
Abstract
Joint modeling techniques have become a popular strategy for studying the association between a response and one or more longitudinal covariates. Motivated by the GenIMS study, where it is of interest to model the event of survival using censored longitudinal biomarkers, a joint model is proposed for describing the relationship between a binary outcome and multiple longitudinal covariates subject to detection limits. A fast, approximate EM algorithm is developed that reduces the dimension of integration in the E-step of the algorithm to one, regardless of the number of random effects in the joint model. Numerical studies demonstrate that the proposed approximate EM algorithm leads to satisfactory parameter and variance estimates in situations with and without censoring on the longitudinal covariates. The approximate EM algorithm is applied to analyze the GenIMS data set.
Affiliation(s)
- Paul W. Bernhardt
  - Department of Mathematics and Statistics, Villanova University, Villanova, PA, USA
- Daowen Zhang
  - Department of Statistics, North Carolina State University, Raleigh, NC, USA
- Huixia Judy Wang
  - Department of Statistics, George Washington University, Washington, DC, USA
8. Langohr K, Melis GG. Estimation and residual analysis with R for a linear regression model with an interval-censored covariate. Biom J 2014; 56:867-85. [PMID: 25103399 DOI: 10.1002/bimj.201300204] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 09/20/2013] [Revised: 04/15/2014] [Accepted: 04/17/2014] [Indexed: 11/06/2022]
Abstract
Interval-censored covariates are sometimes encountered in longitudinal studies and considered as possible predictors in a regression model. This paper, motivated by an AIDS study, proposes an implementation in R for the estimation of parameters and the assessment of the assumptions of a linear regression model with an interval-censored covariate. The properties of the parameter estimators and the behavior of three proposed residuals are addressed through two simulation studies. Also, guidelines are provided to check the goodness-of-fit of the fitted model in terms of the length of the censoring interval of the covariate. The methodology is illustrated with real data from the AIDS study. R functions and scripts are provided.
Affiliation(s)
- Klaus Langohr
  - Departament d'Estadística i Investigació Operativa, Universitat Politècnica de Catalunya/BARCELONATECH, C/Jordi Girona, 1-3, 08034, Barcelona, Spain
- Guadalupe Gómez Melis
  - Departament d'Estadística i Investigació Operativa, Universitat Politècnica de Catalunya/BARCELONATECH, C/Jordi Girona, 1-3, 08034, Barcelona, Spain
9. Bernhardt PW, Wang HJ, Zhang D. Flexible Modeling of Survival Data with Covariates Subject to Detection Limits via Multiple Imputation. Comput Stat Data Anal 2014; 69. [PMID: 24204085 DOI: 10.1016/j.csda.2013.07.027] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 11/24/2022]
Abstract
Models for survival data generally assume that covariates are fully observed. However, in medical studies it is not uncommon for biomarkers to be censored at known detection limits. A computationally-efficient multiple imputation procedure for modeling survival data with covariates subject to detection limits is proposed. This procedure is developed in the context of an accelerated failure time model with a flexible seminonparametric error distribution. The consistency and asymptotic normality of the multiple imputation estimator are established and a consistent variance estimator is provided. An iterative version of the proposed multiple imputation algorithm that approximates the EM algorithm for maximum likelihood is also suggested. Simulation studies demonstrate that the proposed multiple imputation methods work well while alternative methods lead to estimates that are either biased or more variable. The proposed methods are applied to analyze the dataset from a recently-conducted GenIMS study.
Affiliation(s)
- Paul W Bernhardt
  - Department of Statistics, North Carolina State University, Raleigh, NC, USA