Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Cited by Other Article(s)

Hanke M, Dijkstra L, Foraita R, Didelez V. Variable selection in linear regression models: Choosing the best subset is not always the best choice. Biom J 2024;66:e2200209. [PMID: 37643390 DOI: 10.1002/bimj.202200209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2022] [Revised: 06/19/2023] [Accepted: 06/22/2023] [Indexed: 08/31/2023]

Riley RD, Collins GS. Stability of clinical prediction models developed using statistical or machine learning methods. Biom J 2023;65:e2200302. [PMID: 37466257 PMCID: PMC10952221 DOI: 10.1002/bimj.202200302] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 11/02/2022] [Revised: 04/26/2023] [Accepted: 05/02/2023] [Indexed: 07/20/2023]

Shi Y, Du Z, Zhang J, Han F, Chen F, Wang D, Liu M, Zhang H, Dong C, Sui S. Construction and evaluation of hourly average indoor PM2.5 concentration prediction models based on multiple types of places. Front Public Health 2023;11:1213453. [PMID: 37637795 PMCID: PMC10447970 DOI: 10.3389/fpubh.2023.1213453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 07/28/2023] [Indexed: 08/29/2023]

Abstract

Background

People usually spend most of their time indoors, so indoor fine particulate matter (PM2.5) concentrations are crucial for refining individual PM2.5 exposure evaluation. The development of indoor PM2.5 concentration prediction models is essential for the health risk assessment of PM2.5 in epidemiological studies involving large populations.

Methods

In this study, based on the monitoring data of multiple types of places, the classical multiple linear regression (MLR) method and random forest regression (RFR) algorithm of machine learning were used to develop hourly average indoor PM2.5 concentration prediction models. Indoor PM2.5 concentration data, which included 11,712 records from five types of places, were obtained by on-site monitoring. Moreover, the potential predictor variable data were derived from outdoor monitoring stations and meteorological databases. A ten-fold cross-validation was conducted to examine the performance of all proposed models.
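As a rough illustration of the validation step described above, the following sketch computes a pooled ten-fold cross-validated R2 for a multiple linear regression. It is not the authors' code: the data are synthetic, the function names (fit_ols, predict, cv_r2) are hypothetical, and only the standard library is used. The random forest comparison is omitted.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def cv_r2(X, y, k=10, seed=0):
    """Pooled out-of-fold R2 from k-fold cross-validation."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(y)
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        beta = fit_ols([X[i] for i in train], [y[i] for i in train])
        # Predict only the records that were held out of this fit
        for i, yhat in zip(fold, predict(beta, [X[i] for i in fold])):
            preds[i] = yhat
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, preds))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

# Synthetic stand-in for "outdoor concentration" and "wind speed" predictors
rng = random.Random(1)
X = [[rng.uniform(0, 100), rng.uniform(0, 10)] for _ in range(200)]
y = [5 + 0.6 * a - 1.5 * b + rng.gauss(0, 5) for a, b in X]
r2 = cv_r2(X, y)
print(f"ten-fold cross-validated R2: {r2:.3f}")
```

Pooling the out-of-fold predictions before computing R2 is one common convention; averaging per-fold R2 values is another, and the abstract does not say which was used.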

Results

The final predictor variables incorporated in the MLR model were outdoor PM2.5 concentration, type of place, season, wind direction, surface wind speed, hour, precipitation, air pressure, and relative humidity. The ten-fold cross-validation results indicated that both models had good predictive performance, with determination coefficients (R2) of 72.20% for RFR and 60.35% for MLR. Generally, the RFR model had better predictive performance than the MLR model (an RFR model developed using the same predictor variables as the MLR model had R2 = 71.86%). In terms of predictors, the variable importance results for both types of models suggested that outdoor PM2.5 concentration, type of place, season, hour, wind direction, and surface wind speed were the most important predictor variables.

Conclusion

In this research, hourly average indoor PM2.5 concentration prediction models based on multiple types of places were developed for the first time. Both the MLR and RFR models based on easily accessible indicators displayed promising predictive performance, in which the machine learning domain RFR model outperformed the classical MLR model, and this result suggests the potential application of RFR algorithms for indoor air pollutant concentration prediction.

Comparison of variable selection procedures and investigation of the role of shrinkage in linear regression-protocol of a simulation study in low-dimensional data. PLoS One 2022;17:e0271240. [PMID: 36191290 PMCID: PMC9529280 DOI: 10.1371/journal.pone.0271240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2022] [Accepted: 06/24/2022] [Indexed: 11/06/2022]

Lee HJ, Nguyen AT, Ki SY, Lee JE, Do LN, Park MH, Lee JS, Kim HJ, Park I, Lim HS. Classification of MR-Detected Additional Lesions in Patients With Breast Cancer Using a Combination of Radiomics Analysis and Machine Learning. Front Oncol 2021;11:744460. [PMID: 34926256 PMCID: PMC8679659 DOI: 10.3389/fonc.2021.744460] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2021] [Accepted: 11/08/2021] [Indexed: 01/02/2023]

Abstract

Objective

This study was conducted in order to investigate the feasibility of using radiomics analysis (RA) with machine learning algorithms based on breast magnetic resonance (MR) images for discriminating malignant from benign MR-detected additional lesions in patients with primary breast cancer.

Materials and Methods

One hundred seventy-four MR-detected additional lesions (benign, n = 86; malignancy, n = 88) from 158 patients with ipsilateral primary breast cancer from a tertiary medical center were included in this retrospective study. The entire data were randomly split to training (80%) and independent test sets (20%). In addition, 25 patients (benign, n = 21; malignancy, n = 15) from another tertiary medical center were included for the external test. Radiomics features that were extracted from three regions-of-interest (ROIs; intratumor, peritumor, combined) using fat-saturated T1-weighted images obtained by subtracting pre- from postcontrast images (SUB) and T2-weighted image (T2) were utilized to train the support vector machine for the binary classification. A decision tree method was utilized to build a classifier model using clinical imaging interpretation (CII) features assessed by radiologists. Area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, and specificity were used to compare the diagnostic performance.

Results

The RA models trained using radiomics features from the intratumor-ROI showed comparable performance to the CII model (accuracy, AUROC: 73.3%, 69.6% for the SUB RA model; 70.0%, 75.1% for the T2 RA model; 73.3%, 72.0% for the CII model). The diagnostic performance increased when the radiomics and CII features were combined to build a fusion model. The fusion model that combines the CII features and radiomics features from multiparametric MRI data demonstrated the highest performance with an accuracy of 86.7% and an AUROC of 91.1%. The external test showed a similar pattern where the fusion models demonstrated higher levels of performance compared with the RA- or CII-only models. The accuracy and AUROC of the SUB+T2 RA+CII model in the external test were 80.6% and 91.4%, respectively.

Conclusion

Our study demonstrated the feasibility of using RA with machine learning approach based on multiparametric MRI for quantitatively characterizing MR-detected additional lesions. The fusion model demonstrated an improved diagnostic performance over the models trained with either RA or CII alone.

Al-Shatanawi TN, Sakka SA, Kheirallah KA, Al-Mistarehi AH, Al-Tamimi S, Alrabadi N, Alsulaiman J, Al Khader A, Abdallah F, Tawalbeh LI, Saleh T, Hijazi W, Alnsour AR, Younes NA. Self-Reported Obsession Toward COVID-19 Preventive Measures Among Undergraduate Medical Students During the Early Phase of Pandemic in Jordan. Front Public Health 2021;9:719668. [PMID: 34820347 PMCID: PMC8606560 DOI: 10.3389/fpubh.2021.719668] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/02/2021] [Accepted: 09/28/2021] [Indexed: 12/16/2022]

Abstract

Background: Coronavirus disease 2019 (COVID-19) pandemic and its associated precautionary measures have substantial impacts not only on the medical, economic, and social context but also on psychological health. This study aimed to assess the obsession toward COVID-19 preventive measures among undergraduate medical students during the early phase of the pandemic in Jordan.

Methods: Online questionnaires were distributed between March 16, 2020 and March 19, 2020. Socio-demographic characteristics were collected, and self-reported obsession toward COVID-19 preventive measures was assessed using a single question. COVID-19 knowledge, risk perception, and precautionary measures were evaluated using scales. Using the chi-square test, Student t-test, and one-way ANOVA, we assessed how obsession among students differed by socio-demographic characteristics and scale scores.

Results: A total of 1,404 participants (60% were female participants) completed the survey with a participation rate of 15.6%. Obsession with preventive measures was reported by 6.8%. Obsession was significantly more common among women (9.2%) than men (3.3%) and students who attended COVID-19 lectures (9.5%) than those who did not attend such lectures (5.8%) (p < 0.001 and p = 0.015, respectively). Obsessed participants reported significantly higher levels of COVID-19 knowledge (p = 0.012) and precautionary measures (p < 0.001). COVID-19 risk perception had a mild effect size difference but with no statistical significance (p = 0.075). There were no significant differences in the academic levels of participants (p = 0.791) and universities (p = 0.807) between students who were obsessed and those who were not.

Conclusions: Obsession is one of the significant but unspoken psychological effects of COVID-19 precautionary measures among undergraduate medical students. Medical schools should be equipped with means to handle pandemic psychological effects.

Gragnano A, Miglioretti M, Magon G, Pravettoni G. Work with cancer or stop working after diagnosis? Variables affecting the decision. Work 2021;70:177-185. [PMID: 34511522 DOI: 10.3233/wor-213563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]

Moriarty AS, Paton LW, Snell KIE, Riley RD, Buckman JEJ, Gilbody S, Chew-Graham CA, Ali S, Pilling S, Meader N, Phillips B, Coventry PA, Delgadillo J, Richards DA, Salisbury C, McMillan D. The development and validation of a prognostic model to PREDICT Relapse of depression in adult patients in primary care: protocol for the PREDICTR study. Diagn Progn Res 2021;5:12. [PMID: 34215317 PMCID: PMC8254312 DOI: 10.1186/s41512-021-00101-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/14/2021] [Accepted: 05/19/2021] [Indexed: 12/23/2022]

Abstract

BACKGROUND

Most patients who present with depression are treated in primary care by general practitioners (GPs). Relapse of depression is common (at least 50% of patients treated for depression will relapse after a single episode) and leads to considerable morbidity and decreased quality of life for patients. The majority of patients will relapse within 6 months, and those with a history of relapse are more likely to relapse in the future than those with no such history. GPs see a largely undifferentiated case-mix of patients, and once patients with depression reach remission, there is limited guidance to help GPs stratify patients according to risk of relapse. We aim to develop a prognostic model to predict an individual's risk of relapse within 6-8 months of entering remission. The long-term objective is to inform the clinical management of depression after the acute phase.

METHODS

We will develop a prognostic model using secondary analysis of individual participant data drawn from seven RCTs and one longitudinal cohort study in primary or community care settings. We will use logistic regression to predict the outcome of relapse of depression within 6-8 months. We plan to include the following established relapse predictors in the model: residual depressive symptoms, number of previous depressive episodes, co-morbid anxiety and severity of index episode. We will use a "full model" development approach, including all available predictors. Performance statistics (optimism-adjusted C-statistic, calibration-in-the-large, calibration slope) and calibration plots (with smoothed calibration curves) will be calculated. Generalisability of predictive performance will be assessed through internal-external cross-validation. Clinical utility will be explored through net benefit analysis.
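Two of the performance measures named above can be sketched in a few lines. The example below is illustrative only and is not taken from the protocol: it computes the C-statistic as the proportion of concordant event/non-event pairs (ties count one half) and the net benefit at a single risk threshold using the standard decision-curve formula TP/n - FP/n x pt/(1 - pt). The predicted risks and outcomes are made-up toy values, and the function names are hypothetical. Optimism adjustment and calibration are not shown.

```python
def c_statistic(probs, outcomes):
    """Proportion of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pairs = 0
    concordant = 0.0
    for p1, o1 in zip(probs, outcomes):
        if o1 != 1:
            continue
        for p0, o0 in zip(probs, outcomes):
            if o0 != 0:
                continue
            pairs += 1
            if p1 > p0:
                concordant += 1.0
            elif p1 == p0:
                concordant += 0.5
    return concordant / pairs

def net_benefit(probs, outcomes, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, o in zip(probs, outcomes) if p >= threshold and o == 1)
    fp = sum(1 for p, o in zip(probs, outcomes) if p >= threshold and o == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy predicted relapse risks and observed outcomes (1 = relapse)
probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
outcomes = [1, 1, 0, 1, 0, 0]

c = c_statistic(probs, outcomes)        # 8 of 9 pairs concordant
nb = net_benefit(probs, outcomes, 0.5)  # 2/6 - 1/6 * (0.5 / 0.5)
print(f"C-statistic: {c:.3f}, net benefit at 0.5: {nb:.3f}")
```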

DISCUSSION

We will derive a statistical model to predict relapse of depression in remitted depressed patients in primary care. Assuming the model has sufficient predictive performance, we outline the next steps including independent external validation and further assessment of clinical utility and impact.

STUDY REGISTRATION

ClinicalTrials.gov ID: NCT04666662.

Affiliation(s)

Andrew S Moriarty: Department of Health Sciences, University of York, York, England; Hull York Medical School, University of York, York, England
Lewis W Paton: Department of Health Sciences, University of York, York, England
Kym I E Snell: Centre for Prognosis Research, School of Medicine, Keele University, Keele, England
Richard D Riley: Centre for Prognosis Research, School of Medicine, Keele University, Keele, England
Joshua E J Buckman: Centre for Outcomes and Research Effectiveness, Research Department of Clinical, Educational and Health Psychology, University College London, London, England; iCope - Camden and Islington Psychological Therapies Services, Camden & Islington NHS Foundation Trust, London, England
Simon Gilbody: Department of Health Sciences, University of York, York, England; Hull York Medical School, University of York, York, England
Carolyn A Chew-Graham: School of Medicine, Keele University, Keele, England
Shehzad Ali: Department of Health Sciences, University of York, York, England; Department of Epidemiology and Biostatistics, Schulich School of Medicine & Dentistry, Western University, London, ON, Canada
Stephen Pilling: Centre for Outcomes and Research Effectiveness, Research Department of Clinical, Educational and Health Psychology, University College London, London, England; Camden & Islington NHS Foundation Trust, St Pancras Hospital, London, England
Nick Meader: Centre for Reviews and Dissemination, University of York, York, England
Bob Phillips: Centre for Reviews and Dissemination, University of York, York, England
Peter A Coventry: Department of Health Sciences, University of York, York, England
Jaime Delgadillo: Department of Psychology, University of Sheffield, Sheffield, England
David A Richards: Institute of Health Research, College of Medicine and Health, University of Exeter, Exeter, England; Department of Health and Caring Sciences, Western Norway University of Applied Sciences, Inndalsveien 28, 5063 Bergen, Norway
Chris Salisbury: Centre for Academic Primary Care, University of Bristol, Bristol, England
Dean McMillan: Department of Health Sciences, University of York, York, England; Hull York Medical School, University of York, York, England

Gravesteijn BY, Sewalt CA, Venema E, Nieboer D, Steyerberg EW. Missing Data in Prediction Research: A Five-Step Approach for Multiple Imputation, Illustrated in the CENTER-TBI Study. J Neurotrauma 2021;38:1842-1857. [PMID: 33470157 DOI: 10.1089/neu.2020.7218] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/21/2022]

Wallisch C, Dunkler D, Rauch G, de Bin R, Heinze G. Selection of variables for multivariable models: Opportunities and limitations in quantifying model stability by resampling. Stat Med 2020;40:369-381. [PMID: 33089538 PMCID: PMC7820988 DOI: 10.1002/sim.8779] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/19/2019] [Revised: 07/02/2020] [Accepted: 09/29/2020] [Indexed: 12/14/2022]

Spanish Influenza Score (SIS): Usefulness of machine learning in the development of an early mortality prediction score in severe influenza. Med Intensiva 2020;45:69-79. [PMID: 32798052 DOI: 10.1016/j.medin.2020.05.017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/16/2020] [Revised: 05/22/2020] [Accepted: 05/23/2020] [Indexed: 02/07/2023]

Tsarouchi MI, Vlachopoulos GF, Karahaliou AN, Costaridou LI. Diagnostic value of apparent diffusion coefficient lesion texture biomarkers in breast MRI. Health and Technology 2020. [DOI: 10.1007/s12553-020-00452-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022]

Sauerbrei W, Perperoglou A, Schmid M, Abrahamowicz M, Becher H, Binder H, Dunkler D, Harrell FE, Royston P, Heinze G. State of the art in selection of variables and functional forms in multivariable analysis-outstanding issues. Diagn Progn Res 2020;4:3. [PMID: 32266321 PMCID: PMC7114804 DOI: 10.1186/s41512-020-00074-3] [Citation(s) in RCA: 102] [Impact Index Per Article: 25.5] [Received: 11/06/2019] [Accepted: 03/18/2020] [Indexed: 12/18/2022]

Abstract

BACKGROUND

How to select variables and identify functional forms for continuous variables is a key concern when creating a multivariable model. Ad hoc 'traditional' approaches to variable selection have been in use for at least 50 years. Similarly, methods for determining functional forms for continuous variables were first suggested many years ago. More recently, many alternative approaches to address these two challenges have been proposed, but knowledge of their properties and meaningful comparisons between them are scarce. To define a state of the art and to provide evidence-supported guidance to researchers who have only a basic level of statistical knowledge, many outstanding issues in multivariable modelling remain. Our main aims are to identify and illustrate such gaps in the literature and present them at a moderate technical level to the wide community of practitioners, researchers and students of statistics.

METHODS

We briefly discuss general issues in building descriptive regression models, strategies for variable selection, different ways of choosing functional forms for continuous variables and methods for combining the selection of variables and functions. We discuss two examples, taken from the medical literature, to illustrate problems in the practice of modelling.

RESULTS

Our overview revealed that there is not yet enough evidence on which to base recommendations for the selection of variables and functional forms in multivariable analysis. Such evidence may come from comparisons between alternative methods. In particular, we highlight seven important topics that require further investigation and make suggestions for the direction of further research.

CONCLUSIONS

Selection of variables and of functional forms are important topics in multivariable analysis. To define a state of the art and to provide evidence-supported guidance to researchers who have only a basic level of statistical knowledge, further comparative research is required.

Hunkin H, King DL, Zajac IT. Perceived acceptability of wearable devices for the treatment of mental health problems. J Clin Psychol 2020;76:987-1003. [PMID: 32022908 DOI: 10.1002/jclp.22934] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/19/2022]

Varathan N, Wijekoon P. Optimal stochastic restricted logistic estimator. Stat Pap (Berl) 2019. [DOI: 10.1007/s00362-019-01121-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]

Rubo M, Gamer M. Visuo-tactile congruency influences the body schema during full body ownership illusion. Conscious Cogn 2019;73:102758. [PMID: 31176847 PMCID: PMC6694184 DOI: 10.1016/j.concog.2019.05.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/14/2019] [Revised: 05/06/2019] [Accepted: 05/22/2019] [Indexed: 12/21/2022]

Jenkins LC, Chang WJ, Buscemi V, Liston M, Toson B, Nicholas M, Graven-Nielsen T, Ridding M, Hodges PW, McAuley JH, Schabrun SM. Do sensorimotor cortex activity, an individual's capacity for neuroplasticity, and psychological features during an episode of acute low back pain predict outcome at 6 months: a protocol for an Australian, multisite prospective, longitudinal cohort study. BMJ Open 2019;9:e029027. [PMID: 31123007 PMCID: PMC6538004 DOI: 10.1136/bmjopen-2019-029027] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/09/2019] [Revised: 02/20/2019] [Accepted: 03/20/2019] [Indexed: 12/23/2022]

Algamal ZY. Shrinkage parameter selection via modified cross-validation approach for ridge regression model. Commun Stat-Simul C 2018. [DOI: 10.1080/03610918.2018.1508704] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 10/27/2022]

Thao LTP, Geskus R. A comparison of model selection methods for prediction in the presence of multiply imputed data. Biom J 2018;61:343-356. [PMID: 30353591 PMCID: PMC6492211 DOI: 10.1002/bimj.201700232] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 11/09/2017] [Revised: 09/24/2018] [Accepted: 09/24/2018] [Indexed: 12/01/2022]

Giungato P, Renna M, Rana R, Licen S, Barbieri P. Characterization of dried and freeze-dried sea fennel (Crithmum maritimum L.) samples with headspace gas-chromatography/mass spectrometry and evaluation of an electronic nose discrimination potential. Food Res Int 2018;115:65-72. [PMID: 30599983 DOI: 10.1016/j.foodres.2018.07.067] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/04/2018] [Revised: 07/10/2018] [Accepted: 07/31/2018] [Indexed: 02/06/2023]

Heinze G, Wallisch C, Dunkler D. Variable selection - A review and recommendations for the practicing statistician. Biom J 2018;60:431-449. [PMID: 29292533 PMCID: PMC5969114 DOI: 10.1002/bimj.201700067] [Citation(s) in RCA: 716] [Impact Index Per Article: 119.3] [Received: 04/18/2017] [Revised: 11/13/2017] [Accepted: 11/17/2017] [Indexed: 12/12/2022]

De Bin R, Sauerbrei W. Handling co-dependence issues in resampling-based variable selection procedures: a simulation study. J Stat Comput Sim 2017. [DOI: 10.1080/00949655.2017.1378654] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]

Evaluation of Industrial Roasting Degree of Coffee Beans by Using an Electronic Nose and a Stepwise Backward Selection of Predictors. Food Anal Method 2017. [DOI: 10.1007/s12161-017-0909-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 10/19/2022]

Göbl CS, Bozkurt L, Tura A, Pacini G, Kautzky-Willer A, Mittlböck M. Application of Penalized Regression Techniques in Modelling Insulin Sensitivity by Correlated Metabolic Parameters. PLoS One 2015;10:e0141524. [PMID: 26544569 PMCID: PMC4636325 DOI: 10.1371/journal.pone.0141524] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 09/25/2014] [Accepted: 10/09/2015] [Indexed: 12/20/2022]

Ding H, Dong W. Chaotic feature analysis and forecasting of Liujiang River runoff. Soft Comput 2015. [DOI: 10.1007/s00500-015-1661-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 12/01/2022]

Moons KGM, Altman DG, Reitsma JB, Ioannidis JPA, Macaskill P, Steyerberg EW, Vickers AJ, Ransohoff DF, Collins GS. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med 2015;162:W1-73. [PMID: 25560730 DOI: 10.7326/m14-0698] [Citation(s) in RCA: 2907] [Impact Index Per Article: 323.0] [Indexed: 02/06/2023]

Sauerbrei W, Buchholz A, Boulesteix AL, Binder H. On stability issues in deriving multivariable regression models. Biom J 2014;57:531-55. [DOI: 10.1002/bimj.201300222] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 10/07/2013] [Revised: 07/16/2014] [Accepted: 08/10/2014] [Indexed: 11/09/2022]

Musoro JZ, Zwinderman AH, Puhan MA, ter Riet G, Geskus RB. Validation of prediction models based on lasso regression with multiply imputed data. BMC Med Res Methodol 2014;14:116. [PMID: 25323009 PMCID: PMC4209042 DOI: 10.1186/1471-2288-14-116] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Received: 03/13/2014] [Accepted: 10/10/2014] [Indexed: 01/22/2023]

Abstract

Background

In prognostic studies, the lasso technique is attractive since it improves the quality of predictions by shrinking regression coefficients, compared to predictions based on a model fitted via unpenalized maximum likelihood. Since some coefficients are set to zero, parsimony is achieved as well. It is unclear whether the performance of a model fitted using the lasso still shows some optimism. Bootstrap methods have been advocated to quantify optimism and generalize model performance to new subjects. It is unclear how resampling should be performed in the presence of multiply imputed data.
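The shrinkage and parsimony mentioned above can be seen in the simplest special case. Assuming an orthonormal design and the objective (1/2)||y - Xb||^2 + lam * ||b||_1, the lasso solution is the soft-thresholded least-squares estimate: every coefficient moves toward zero by lam, and those with magnitude at most lam become exactly zero. The sketch below shows only that operator; the function name and the coefficient values are hypothetical, and this is not the fitting procedure used in the study.

```python
def soft_threshold(b, lam):
    """Lasso solution for one coefficient under an orthonormal design."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

# Hypothetical unpenalized coefficients and a penalty of 0.5
ols_coefs = [2.0, -0.3, 0.4, -1.25]
lasso_coefs = [soft_threshold(b, 0.5) for b in ols_coefs]
print(lasso_coefs)  # [1.5, 0.0, 0.0, -0.75]
```

The same constant is subtracted from large and small effects alike, which is one way to see how an over-strong penalty could produce the over-shrinkage reported in the Results.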

Method

The data were based on a cohort of Chronic Obstructive Pulmonary Disease patients. We constructed models to predict Chronic Respiratory Questionnaire dyspnea 6 months ahead. Optimism of the lasso model was investigated by comparing 4 approaches of handling multiply imputed data in the bootstrap procedure, using the study data and simulated data sets. In the first 3 approaches, data sets that had been completed via multiple imputation (MI) were resampled, while the fourth approach resampled the incomplete data set and then performed MI.
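For orientation, the basic bootstrap estimate of optimism that these approaches build on can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's procedure: it uses a one-predictor least-squares line instead of the lasso, R2 as the performance measure, complete synthetic data with no multiple imputation, and hypothetical function names. In each bootstrap step the model is refitted on the resample, and the optimism is the performance on the resample minus the performance of that same model on the original data.

```python
import random

def fit_line(x, y):
    """Intercept and slope of the least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def r2(model, x, y):
    """R2 of a fitted (intercept, slope) model evaluated on data (x, y)."""
    b0, b1 = model
    my = sum(y) / len(y)
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

def optimism_corrected_r2(x, y, n_boot=200, seed=0):
    rng = random.Random(seed)
    n = len(x)
    apparent = r2(fit_line(x, y), x, y)
    optimism = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        xb, yb = [x[i] for i in idx], [y[i] for i in idx]
        model = fit_line(xb, yb)
        # Performance on the resample minus performance on the original data
        optimism += r2(model, xb, yb) - r2(model, x, y)
    optimism /= n_boot
    return apparent, optimism, apparent - optimism

# Synthetic complete data; no imputation step is modelled here
rng = random.Random(2)
x = [rng.gauss(0, 1) for _ in range(40)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]
apparent, optimism, corrected = optimism_corrected_r2(x, y)
print(f"apparent {apparent:.3f}, optimism {optimism:.3f}, corrected {corrected:.3f}")
```

With multiply imputed data, the open question the study addresses is where the resampling sits relative to imputation: resampling already-completed data sets, or resampling the incomplete data and imputing inside each bootstrap step.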

Results

The discriminative model performance of the lasso was optimistic. There was suboptimal calibration due to over-shrinkage. The estimate of optimism was sensitive to the choice of handling imputed data in the bootstrap resampling procedure. Resampling the completed data sets underestimates optimism, especially if, within a bootstrap step, selected individuals differ over the imputed data sets. Incorporating the MI procedure in the validation yields estimates of optimism that are closer to the true value, albeit slightly too large.

Conclusion

Performance of prognostic models constructed using the lasso technique can be optimistic as well. Results of the internal validation are sensitive to how bootstrap resampling is performed.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2288-14-116) contains supplementary material, which is available to authorized users.
