Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: van der Ploeg T, Datema F, Baatenburg de Jong R, Steyerberg EW. Prediction of survival with alternative modeling techniques using pseudo values. PLoS One 2014;9:e100234. [PMID: 24950066 DOI: 10.1371/journal.pone.0100234] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2013] [Accepted: 05/24/2014] [Indexed: 11/19/2022] Open

Cited by Other Article(s)

Clift AK, Tan PS, Patone M, Liao W, Coupland C, Bashford-Rogers R, Sivakumar S, Hippisley-Cox J. Predicting the risk of pancreatic cancer in adults with new-onset diabetes: development and internal-external validation of a clinical risk prediction model. Br J Cancer 2024;130:1969-1978. [PMID: 38702436 PMCID: PMC11183048 DOI: 10.1038/s41416-024-02693-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2023] [Revised: 04/08/2024] [Accepted: 04/11/2024] [Indexed: 05/06/2024] Open

Schenk A, Berger M, Schmid M. Pseudo-value regression trees. LIFETIME DATA ANALYSIS 2024;30:439-471. [PMID: 38403840 PMCID: PMC11297840 DOI: 10.1007/s10985-024-09618-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/23/2023] [Accepted: 01/19/2024] [Indexed: 02/27/2024]

Clift AK, Collins GS, Lord S, Petrou S, Dodwell D, Brady M, Hippisley-Cox J. Predicting 10-year breast cancer mortality risk in the general female population in England: a model development and validation study. Lancet Digit Health 2023;5:e571-e581. [PMID: 37625895 DOI: 10.1016/s2589-7500(23)00113-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2022] [Revised: 04/06/2023] [Accepted: 06/12/2023] [Indexed: 08/27/2023]

Abstract

BACKGROUND

Identifying female individuals at highest risk of developing life-threatening breast cancers could inform novel stratified early detection and prevention strategies to reduce breast cancer mortality, rather than only considering cancer incidence. We aimed to develop a prognostic model that accurately predicts the 10-year risk of breast cancer mortality in female individuals without breast cancer at baseline.

METHODS

In this model development and validation study, we used an open cohort study from the QResearch primary care database, which was linked to secondary care and national cancer and mortality registers in England, UK. The data extracted were from female individuals aged 20-90 years without previous breast cancer or ductal carcinoma in situ who entered the cohort between Jan 1, 2000, and Dec 31, 2020. The primary outcome was breast cancer-related death, which was assessed in the full dataset. Cox proportional hazards, competing risks regression, XGBoost, and neural network modelling approaches were used to predict the risk of breast cancer death within 10 years using routinely collected health-care data. Death due to causes other than breast cancer was the competing risk. Internal-external validation was used to evaluate prognostic model performance (using Harrell's C, calibration slope, and calibration in the large), performance heterogeneity, and transportability. Internal-external validation involved dataset partitioning by time period and geographical region. Decision curve analysis was used to assess clinical utility.

FINDINGS

We identified data for 11 626 969 female individuals, with 70 095 574 person-years of follow-up. There were 142 712 (1·2%) diagnoses of breast cancer, 24 043 (0·2%) breast cancer-related deaths, and 696 106 (6·0%) deaths from other causes. Meta-analysis pooled estimates of Harrell's C were highest for the competing risks model (0·932, 95% CI 0·917-0·946). The competing risks model was well calibrated overall (slope 1·011, 95% CI 0·978-1·044), and across different ethnic groups. Decision curve analysis suggested favourable clinical utility across all age groups. The XGBoost and neural network models had variable performance across age and ethnic groups.

INTERPRETATION

A model that predicts the combined risk of developing and then dying from breast cancer at the population level could inform stratified screening or chemoprevention strategies. Further evaluation of the competing risks model should comprise effect and health economic assessment of model-informed strategies.

FUNDING

Cancer Research UK.
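
The Methods of this abstract name Harrell's C as the discrimination metric. As a rough illustration of what that statistic measures, the sketch below computes it for right-censored data in plain Python. The function name `harrell_c` and the toy data are assumptions made for illustration; the sketch ignores competing risks and skips pairs with tied times, so it is not the authors' implementation.

```python
from itertools import combinations


def harrell_c(times, events, risks):
    """Harrell's C for right-censored data (simplified, illustrative).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher = event expected sooner)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed
        # event; pairs with tied times are skipped in this sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four individuals, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
print(harrell_c(toy_times, toy_events, [0.9, 0.7, 0.4, 0.1]))
```

A value of 1 means every comparable pair is ranked correctly and 0.5 corresponds to uninformative predictions, which puts the pooled estimate of 0·932 reported in the Findings into context.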

Clift AK, Dodwell D, Lord S, Petrou S, Brady M, Collins GS, Hippisley-Cox J. Development and internal-external validation of statistical and machine learning models for breast cancer prognostication: cohort study. BMJ 2023;381:e073800. [PMID: 37164379 PMCID: PMC10170264 DOI: 10.1136/bmj-2022-073800] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 03/28/2023] [Indexed: 05/12/2023]

Abstract

OBJECTIVE

To develop a clinically useful model that estimates the 10 year risk of breast cancer related mortality in women (self-reported female sex) with breast cancer of any stage, comparing results from regression and machine learning approaches.

DESIGN

Population based cohort study.

SETTING

QResearch primary care database in England, with individual level linkage to the national cancer registry, Hospital Episodes Statistics, and national mortality registers.

PARTICIPANTS

141 765 women aged 20 years and older with a diagnosis of invasive breast cancer between 1 January 2000 and 31 December 2020.

MAIN OUTCOME MEASURES

Four model building strategies comprising two regression (Cox proportional hazards and competing risks regression) and two machine learning (XGBoost and an artificial neural network) approaches. Internal-external cross validation was used for model evaluation. Random effects meta-analysis that pooled estimates of discrimination and calibration metrics, calibration plots, and decision curve analysis were used to assess model performance, transportability, and clinical utility.

RESULTS

During a median 4.16 years (interquartile range 1.76-8.26) of follow-up, 21 688 breast cancer related deaths and 11 454 deaths from other causes occurred. Restricting to 10 years maximum follow-up from breast cancer diagnosis, 20 367 breast cancer related deaths occurred during a total of 688 564.81 person years. The crude breast cancer mortality rate was 295.79 per 10 000 person years (95% confidence interval 291.75 to 299.88). Predictors varied for each regression model, but both Cox and competing risks models included age at diagnosis, body mass index, smoking status, route to diagnosis, hormone receptor status, cancer stage, and grade of breast cancer. The Cox model's random effects meta-analysis pooled estimate for Harrell's C index was the highest of any model at 0.858 (95% confidence interval 0.853 to 0.864, and 95% prediction interval 0.843 to 0.873). It appeared acceptably calibrated on calibration plots. The competing risks regression model had good discrimination: pooled Harrell's C index 0.849 (0.839 to 0.859, and 0.821 to 0.876), and evidence of systematic miscalibration on summary metrics was lacking. The machine learning models had acceptable discrimination overall (Harrell's C index: XGBoost 0.821 (0.813 to 0.828, and 0.805 to 0.837); neural network 0.847 (0.835 to 0.858, and 0.816 to 0.878)), but had more complex patterns of miscalibration and more variable regional and stage specific performance. Decision curve analysis suggested that the Cox and competing risks regression models tested may have higher clinical utility than the two machine learning approaches.

CONCLUSION

In women with breast cancer of any stage, using the predictors available in this dataset, regression based methods had better and more consistent performance compared with machine learning approaches and may be worthy of further evaluation for potential clinical use, such as for stratified follow-up.
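
The crude mortality rate quoted in the Results of this abstract can be checked from the two counts given there (20 367 deaths over 688 564.81 person years). The sketch below does that arithmetic. The function name `crude_rate_per_10000` is hypothetical, and the log-scale Wald interval is an assumption: the abstract does not say how the confidence interval was computed, although this choice comes out close to the reported 291.75 to 299.88.

```python
import math


def crude_rate_per_10000(events, person_years, z=1.96):
    """Crude event rate per 10 000 person years with an approximate CI.

    The interval uses a log-scale Wald approximation for a Poisson count
    (standard error of log(rate) = 1 / sqrt(events)). This is an assumed
    method, not one stated in the abstract.
    """
    rate = events / person_years * 10_000
    se_log = 1 / math.sqrt(events)
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return rate, lower, upper


# Counts taken from the abstract (10-year maximum follow-up).
rate, lower, upper = crude_rate_per_10000(20_367, 688_564.81)
print(f"{rate:.2f} per 10 000 person years ({lower:.2f} to {upper:.2f})")
```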

Sun X, Chintakunta PK, Badachhape AA, Bhavane R, Lee H, Yang DS, Starosolski Z, Ghaghada KB, Vekilov PG, Annapragada AV, Tanifum EA. Rational Design of a Self-Assembling High Performance Organic Nanofluorophore for Intraoperative NIR-II Image-Guided Tumor Resection of Oral Cancer. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2023;10:e2206435. [PMID: 36721029 PMCID: PMC10074073 DOI: 10.1002/advs.202206435] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/22/2022] [Revised: 12/30/2022] [Indexed: 06/18/2023]

Kotevski DP, Smee RI, Field M, Broadley K, Vajdic CM. The Utility of Oncology Information Systems for Prognostic Modelling in Head and Neck Cancer. J Med Syst 2023;47:9. [PMID: 36640212 PMCID: PMC9840592 DOI: 10.1007/s10916-023-01907-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Accepted: 01/03/2023] [Indexed: 01/15/2023]

Abstract

Cancer centres rely on electronic information in oncology information systems (OIS) to guide patient care. We investigated the completeness and accuracy of routinely collected head and neck cancer (HNC) data sourced from an OIS for suitability in prognostic modelling and other research. Three hundred and fifty-three adults diagnosed from 2000 to 2017 with head and neck squamous cell carcinoma, treated with radiotherapy, were eligible. Thirteen clinically relevant variables in HNC prognosis were extracted from a single-centre OIS and compared to that compiled separately in a research dataset. These two datasets were compared for agreement using Cohen's kappa coefficient for categorical variables, and intraclass correlation coefficients for continuous variables. Research data was 96% complete compared to 84% for OIS data. Agreement was perfect for gender (κ = 1.000), high for age (κ = 0.993), site (κ = 0.992), T (κ = 0.851) and N (κ = 0.812) stage, radiotherapy dose (κ = 0.889), fractions (κ = 0.856), and duration (κ = 0.818), and chemotherapy treatment (κ = 0.871), substantial for overall stage (κ = 0.791) and vital status (κ = 0.689), moderate for grade (κ = 0.547), and poor for performance status (κ = 0.110). Thirty-one other variables were poorly captured and could not be statistically compared. Documentation of clinical information within the OIS for HNC patients is routine practice; however, OIS data was less correct and complete than data collected for research purposes. Substandard collection of routine data may hinder advancements in patient care. Improved data entry, integration with clinical activities and workflows, system usability, data dictionaries, and training are necessary for OIS data to generate robust research. Data mining from clinical documents may supplement structured data collection.
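
Agreement between the two datasets in this abstract is summarised with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch of the unweighted statistic follows; the function name `cohen_kappa` and the example labels are hypothetical and do not come from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long label sequences."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed agreement: share of records with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal label frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical example: T stage as recorded in two sources.
source_one = ["T1", "T1", "T2", "T2"]
source_two = ["T1", "T2", "T2", "T2"]
print(cohen_kappa(source_one, source_two))
```

In this toy example three of four records agree (75%), but half of that agreement is expected by chance, so kappa is lower than the raw agreement.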

Kotevski DP, Smee RI, Vajdic CM, Field M. Machine Learning and Nomogram Prognostic Modeling for 2-Year Head and Neck Cancer-Specific Survival Using Electronic Health Record Data: A Multisite Study. JCO Clin Cancer Inform 2023;7:e2200128. [PMID: 36596211 DOI: 10.1200/cci.22.00128] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Open

Abstract

PURPOSE

There is limited knowledge of the prediction of 2-year cancer-specific survival (CSS) in the head and neck cancer (HNC) population. The aim of this study is to develop and validate machine learning models and a nomogram for the prediction of 2-year CSS in patients with HNC using real-world data collected by major teaching and tertiary referral hospitals in New South Wales (NSW), Australia.

MATERIALS AND METHODS

Data collected in oncology information systems at multiple NSW Cancer Centres were extracted for 2,953 eligible adults diagnosed between 2000 and 2017 with squamous cell carcinoma of the head and neck. Death data were sourced from the National Death Index using record linkage. Machine learning and Cox regression/nomogram models were developed and internally validated in Python and R, respectively.

RESULTS

Machine learning models demonstrated highest performance (C-index) in the larynx and nasopharynx cohorts (0.82), followed by the oropharynx (0.79) and the hypopharynx and oral cavity cohorts (0.73). In the whole HNC population, C-indexes of 0.79 and 0.70 and Brier scores of 0.10 and 0.27 were reported for the machine learning and nomogram model, respectively. Cox regression analysis identified age, T and N classification, and time-corrected biologic equivalent dose in two gray fractions as independent prognostic factors for 2-year CSS. N classification was the most important feature used for prediction in the machine learning model followed by age.

CONCLUSION

Machine learning and nomogram analysis predicted 2-year CSS with high performance using routinely collected and complete clinical information extracted from oncology information systems. These models function as visual decision-making tools to guide radiotherapy treatment decisions and provide insight into the prediction of survival outcomes in patients with HNC.
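
The Results of this abstract report Brier scores alongside C-indexes. As a reminder of what the Brier score measures, here is a minimal sketch for a binary outcome at a fixed time point. The function name `brier_score` and the toy values are assumptions, and the sketch ignores censoring, which the study's own calculation may have handled differently.

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probability and outcome.

    predicted_probs : predicted probability of the event (0 to 1)
    outcomes        : 1 if the event occurred, 0 otherwise
    """
    if len(predicted_probs) != len(outcomes) or not outcomes:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predictions for four patients.
print(brier_score([0.9, 0.2, 0.7, 0.1], [1, 0, 1, 0]))
# A constant prediction of 0.5 always scores 0.25.
print(brier_score([0.5, 0.5], [0, 1]))
```

Lower is better: 0 is a perfect score, and a constant prediction of 0.5 scores 0.25 regardless of the outcomes.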

Lyu Z, Jiang H, Xiao F, Rong J, Zhang T, Wandell B, Farrell J. Simulations of fluorescence imaging in the oral cavity. BIOMEDICAL OPTICS EXPRESS 2021;12:4276-4292. [PMID: 34457414 PMCID: PMC8367257 DOI: 10.1364/boe.429995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/27/2021] [Revised: 05/31/2021] [Accepted: 06/01/2021] [Indexed: 06/13/2023]

van der Ploeg T, Gobbens R. A Comparison of Different Modelling Techniques in Predicting Mortality with the Tilburg Frailty Indicator (Preprint). JMIR Med Inform 2021;10:e31480. [PMID: 35353054 PMCID: PMC8992962 DOI: 10.2196/31480] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2021] [Revised: 11/10/2021] [Accepted: 01/08/2022] [Indexed: 11/13/2022] Open

Barroso EM, Aaboubout Y, van der Sar LC, Mast H, Sewnaik A, Hardillo JA, Ten Hove I, Nunes Soares MR, Ottevanger L, Bakker Schut TC, Puppels GJ, Koljenović S. Performance of Intraoperative Assessment of Resection Margins in Oral Cancer Surgery: A Review of Literature. Front Oncol 2021;11:628297. [PMID: 33869013 PMCID: PMC8044914 DOI: 10.3389/fonc.2021.628297] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2020] [Accepted: 03/15/2021] [Indexed: 12/15/2022] Open

Abstract

Introduction

Achieving adequate resection margins during oral cancer surgery is important to improve patient prognosis. Surgeons have the delicate task of achieving an adequate resection and safeguarding satisfactory remaining function and acceptable physical appearance, while relying on visual inspection, palpation, and preoperative imaging. Intraoperative assessment of resection margins (IOARM) is a multidisciplinary effort, which can guide towards adequate resections. Different forms of IOARM are currently used, but it is unknown how accurate these methods are in predicting margin status. Therefore, this review aims to investigate: 1) the IOARM methods currently used during oral cancer surgery, 2) their performance, and 3) their clinical relevance.

Methods

A literature search was performed in the following databases: Embase, Medline, Web of Science Core Collection, Cochrane Central Register of Controlled Trials, and Google Scholar (from inception to January 23, 2020). IOARM performance was assessed in terms of accuracy, sensitivity, and specificity in predicting margin status, and the reduction of inadequate margins. Clinical relevance (i.e., overall survival, local recurrence, regional recurrence, local recurrence-free survival, disease-specific survival, adjuvant therapy) was recorded if available.

Results

Eighteen studies were included in the review, 10 on soft tissue and 8 on bone. For soft tissue, defect-driven IOARM studies showed an average accuracy, sensitivity, and specificity of 90.9%, 47.6%, and 84.4%, and specimen-driven IOARM studies showed 91.5%, 68.4%, and 96.7%, respectively. For bone, specimen-driven IOARM studies performed better than defect-driven studies, with an average accuracy, sensitivity, and specificity of 96.6%, 81.8%, and 98%, respectively. For both soft tissue and bone, IOARM positively impacts patient outcome.

Conclusion

IOARM improves margin status, and specimen-driven IOARM in particular performs better than defect-driven IOARM. However, this conclusion is limited by the low number of studies reporting performance results for defect-driven IOARM. The current methods suffer from inherent disadvantages, namely their subjective character and the fact that only a small part of the resection surface can be assessed in a short time span, causing sampling errors. Therefore, a solution should be sought in the field of objective techniques that can rapidly assess the whole resection surface.
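
The review assesses IOARM performance by accuracy, sensitivity, and specificity in predicting margin status. The sketch below shows how those three measures follow from a single 2x2 table. The function name `diagnostic_metrics` and the counts are hypothetical and are not taken from any of the included studies.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from 2x2 table counts.

    tp : inadequate margins correctly flagged
    fn : inadequate margins missed
    tn : adequate margins correctly cleared
    fp : adequate margins wrongly flagged
    """
    total = tp + fp + tn + fn
    if total == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts: inadequate margins are rare (20 of 200 cases).
print(diagnostic_metrics(tp=10, fp=10, tn=170, fn=10))
```

With these made-up counts, accuracy is 90% even though only half of the inadequate margins are detected, which is why sensitivity is reported separately from accuracy.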

Affiliation(s)

Elisa M Barroso: Department of Pathology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands; Department of Oral and Maxillofacial Surgery, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Yassine Aaboubout: Department of Pathology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands; Department of Otorhinolaryngology and Head and Neck Surgery, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Lisette C van der Sar: Department of Pathology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Hetty Mast: Department of Oral and Maxillofacial Surgery, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Aniel Sewnaik: Department of Otorhinolaryngology and Head and Neck Surgery, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Jose A Hardillo: Department of Otorhinolaryngology and Head and Neck Surgery, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Ivo Ten Hove: Department of Oral and Maxillofacial Surgery, Leiden UMC, Leiden University Medical Center, Leiden, Netherlands
Maria R Nunes Soares: Department of Pathology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands; Department of Dermatology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Lars Ottevanger: Department of Pathology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands; Department of Dermatology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Tom C Bakker Schut: Department of Dermatology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Gerwin J Puppels: Department of Dermatology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
Senada Koljenović: Department of Pathology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands

Aaboubout Y, ten Hove I, Smits RWH, Hardillo JA, Puppels GJ, Koljenovic S. Specimen-driven intraoperative assessment of resection margins should be standard of care for oral cancer patients. Oral Dis 2021;27:111-116. [PMID: 32816373 PMCID: PMC7821253 DOI: 10.1111/odi.13619] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2020] [Accepted: 08/13/2020] [Indexed: 12/24/2022]

Huang X, Ribeiro JD, Franklin JC. The Differences Between Individuals Engaging in Nonsuicidal Self-Injury and Suicide Attempt Are Complex (vs. Complicated or Simple). Front Psychiatry 2020;11:239. [PMID: 32317991 PMCID: PMC7154073 DOI: 10.3389/fpsyt.2020.00239] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/11/2019] [Accepted: 03/11/2020] [Indexed: 12/22/2022] Open

Abstract

BACKGROUND

Why do some people engage in nonsuicidal self-injury (NSSI) while others attempt suicide? One way to advance knowledge about this question is to shed light on the differences between people who engage in NSSI and people who attempt suicide. These groups could differ in three broad ways. First, these two groups may differ in a simple way, such that one or a small set of factors is both necessary and sufficient to accurately distinguish the two groups. Second, they might differ in a complicated way, meaning that a specific set of a large number of factors is both necessary and sufficient to accurately classify them. Third, they might differ in a complex way, with no necessary factor combinations and potentially no sufficient factor combinations. In this scenario, at the group level, complicated algorithms would either be insufficient (i.e., no complicated algorithm produces good accuracy) or unnecessary (i.e., many complicated algorithms produce good accuracy) to distinguish between groups. This study directly tested these three possibilities in a sample of people with a history of NSSI and/or suicide attempt.

METHOD

A total of 954 participants who have either engaged in NSSI and/or suicide attempt in their lifetime were recruited from online forums. Participants completed a series of measures on factors commonly associated with NSSI and suicide attempt. To test for simple differences, univariate logistic regressions were conducted. One theoretically informed multiple logistic regression model with suicidal desire, capability for suicide, and their interaction term was considered as well. To examine complicated and complex differences, multiple logistic regression and machine learning analyses were conducted.

RESULTS

No simple algorithm (i.e., single factor or small set of factors) accurately distinguished between groups. Complicated algorithms constructed with cross-validation methods produced fair accuracy; complicated algorithms constructed with bootstrap optimism methods produced good accuracy, but multiple different algorithms with this method produced similar results.

CONCLUSIONS

Findings were consistent with complex differences between people who engage in NSSI and suicide attempts. Specific complicated algorithms were either insufficient (cross-validation results) or unnecessary (bootstrap optimism results) to distinguish between these groups with high accuracy.

Resteghini C, Trama A, Borgonovi E, Hosni H, Corrao G, Orlandi E, Calareso G, De Cecco L, Piazza C, Mainardi L, Licitra L. Big Data in Head and Neck Cancer. Curr Treat Options Oncol 2018;19:62. [DOI: 10.1007/s11864-018-0585-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]

Arostegui I, Gonzalez N, Fernández-de-Larrea N, Lázaro-Aramburu S, Baré M, Redondo M, Sarasqueta C, Garcia-Gutierrez S, Quintana JM. Combining statistical techniques to predict postsurgical risk of 1-year mortality for patients with colon cancer. Clin Epidemiol 2018;10:235-251. [PMID: 29563837 PMCID: PMC5846756 DOI: 10.2147/clep.s146729] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022] Open

Barroso EM, Ten Hove I, Bakker Schut TC, Mast H, van Lanschot CGF, Smits RWH, Caspers PJ, Verdijk R, Noordhoek Hegt V, Baatenburg de Jong RJ, Wolvius EB, Puppels GJ, Koljenović S. Raman spectroscopy for assessment of bone resection margins in mandibulectomy for oral cavity squamous cell carcinoma. Eur J Cancer 2018;92:77-87. [PMID: 29428867 DOI: 10.1016/j.ejca.2018.01.068] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2017] [Revised: 12/22/2017] [Accepted: 01/07/2018] [Indexed: 10/18/2022]

Affiliation(s)

Elisa M Barroso: Department of Oral and Maxillofacial Surgery, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Ivo Ten Hove: Department of Oral and Maxillofacial Surgery, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Tom C Bakker Schut: Center for Optical Diagnostics and Therapy, Department of Dermatology, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Hetty Mast: Department of Oral and Maxillofacial Surgery, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Cornelia G F van Lanschot: Department of Otorhinolaryngology and Head and Neck Surgery, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Roeland W H Smits: Department of Otorhinolaryngology and Head and Neck Surgery, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Peter J Caspers: Center for Optical Diagnostics and Therapy, Department of Dermatology, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Rob Verdijk: Department of Pathology, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Vincent Noordhoek Hegt: Department of Pathology, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Robert J Baatenburg de Jong: Department of Otorhinolaryngology and Head and Neck Surgery, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Eppo B Wolvius: Department of Oral and Maxillofacial Surgery, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Gerwin J Puppels: Center for Optical Diagnostics and Therapy, Department of Dermatology, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands
Senada Koljenović: Department of Pathology, Erasmus University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, The Netherlands

Modern modeling techniques had limited external validity in predicting mortality from traumatic brain injury. J Clin Epidemiol 2016;78:83-89. [PMID: 26987507 DOI: 10.1016/j.jclinepi.2016.03.002] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2015] [Revised: 03/01/2016] [Accepted: 03/05/2016] [Indexed: 01/08/2023]

Abstract

BACKGROUND AND OBJECTIVE

Prediction of medical outcomes may potentially benefit from using modern statistical modeling techniques. We aimed to externally validate modeling strategies for prediction of 6-month mortality of patients suffering from traumatic brain injury (TBI) with predictor sets of increasing complexity.

METHODS

We analyzed individual patient data from 15 different studies including 11,026 TBI patients. We consecutively considered a core set of predictors (age, motor score, and pupillary reactivity), an extended set with computed tomography scan characteristics, and a further extension with two laboratory measurements (glucose and hemoglobin). With each of these sets, we predicted 6-month mortality using default settings with five statistical modeling techniques: logistic regression (LR), classification and regression trees, random forests (RFs), support vector machines (SVM) and neural nets. For external validation, a model developed on one of the 15 data sets was applied to each of the 14 remaining sets. This process was repeated 15 times for a total of 630 validations. The area under the receiver operating characteristic curve (AUC) was used to assess the discriminative ability of the models.

RESULTS

For the most complex predictor set, the LR models performed best (median validated AUC value, 0.757), followed by RF and SVM models (median validated AUC value, 0.735 and 0.732, respectively). With each predictor set, the classification and regression trees models showed poor performance (median validated AUC value, <0.7). The variability in performance across the studies was smallest for the RF- and LR-based models (interquartile range for validated AUC values from 0.07 to 0.10).

CONCLUSION

In the area of predicting mortality from TBI, nonlinear and nonadditive effects are not pronounced enough to make modern prediction methods beneficial.
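
The cross-study validation scheme in the Methods of this abstract (develop on one of 15 data sets, validate on each of the other 14) can be enumerated directly. The sketch below counts the development/validation pairs and shows a rank-based AUC. Reading the total of 630 as 15 x 14 pairs times the three predictor sets is an assumption that happens to match the stated number; the function name `auc` and the toy scores are hypothetical.

```python
from itertools import permutations


def auc(scores, labels):
    """Area under the ROC curve as the share of correctly ranked pairs."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 if tied.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Every ordered (development study, validation study) pair of 15 studies.
study_pairs = list(permutations(range(15), 2))
n_predictor_sets = 3  # core, extended, and extended plus laboratory values
print(len(study_pairs), len(study_pairs) * n_predictor_sets)  # 210 630

# Hypothetical validation result for one pair.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```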

van der Ploeg T, Austin PC, Steyerberg EW. Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints. BMC Med Res Methodol 2014;14:137. [PMID: 25532820 PMCID: PMC4289553 DOI: 10.1186/1471-2288-14-137] [Citation(s) in RCA: 349] [Impact Index Per Article: 34.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2014] [Accepted: 12/19/2014] [Indexed: 12/27/2022] Open

Abstract

BACKGROUND

Modern modelling techniques may potentially provide more accurate predictions of binary outcomes than classical techniques. We aimed to study the predictive performance of different modelling techniques in relation to the effective sample size ("data hungriness").

METHODS

We performed simulation studies based on three clinical cohorts: 1282 patients with head and neck cancer (with 46.9% 5 year survival), 1731 patients with traumatic brain injury (22.3% 6 month mortality) and 3181 patients with minor head injury (7.6% with CT scan abnormalities). We compared three relatively modern modelling techniques: support vector machines (SVM), neural nets (NN), and random forests (RF) and two classical techniques: logistic regression (LR) and classification and regression trees (CART). We created three large artificial databases with 20 fold, 10 fold and 6 fold replication of subjects, where we generated dichotomous outcomes according to different underlying models. We applied each modelling technique to increasingly larger development parts (100 repetitions). The area under the ROC-curve (AUC) indicated the performance of each model in the development part and in an independent validation part. Data hungriness was defined by plateauing of AUC and small optimism (difference between the mean apparent AUC and the mean validated AUC <0.01).

RESULTS

We found that a stable AUC was reached by LR at approximately 20 to 50 events per variable, followed by CART, SVM, NN and RF models. Optimism decreased with increasing sample sizes and the same ranking of techniques. The RF, SVM and NN models showed instability and a high optimism even with >200 events per variable.

CONCLUSIONS

Modern modelling techniques such as SVM, NN and RF may need over 10 times as many events per variable to achieve a stable AUC and a small optimism than classical modelling techniques such as LR. This implies that such modern techniques should only be used in medical prediction problems if very large data sets are available.
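
The Methods of this abstract define data hungriness through events per variable and through optimism, the difference between the mean apparent AUC and the mean validated AUC, with a threshold of 0.01. A minimal sketch of those two quantities follows. The function names and all numbers are hypothetical, and the plateau criterion for the AUC is not modelled here.

```python
from statistics import mean


def events_per_variable(n_events, n_predictors):
    """Effective sample size expressed as events per candidate predictor."""
    if n_predictors <= 0:
        raise ValueError("need at least one predictor")
    return n_events / n_predictors


def optimism(apparent_aucs, validated_aucs):
    """Mean apparent AUC minus mean validated AUC across repetitions."""
    return mean(apparent_aucs) - mean(validated_aucs)


def has_small_optimism(apparent_aucs, validated_aucs, threshold=0.01):
    """True when optimism falls below the threshold named in the abstract."""
    return optimism(apparent_aucs, validated_aucs) < threshold


# Hypothetical repetitions for one technique at one development size.
apparent = [0.80, 0.82, 0.81]
validated = [0.78, 0.80, 0.79]
print(events_per_variable(n_events=100, n_predictors=5))
print(round(optimism(apparent, validated), 3))
print(has_small_optimism(apparent, validated))
```

In the study, this comparison was repeated for increasingly large development parts, so that the sample size at which optimism becomes small could be read off per technique.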
