501
|
Frostholm L, Fink P, Oernboel E, Christensen KS, Toft T, Olesen F, Weinman J. The uncertain consultation and patient satisfaction: the impact of patients' illness perceptions and a randomized controlled trial on the training of physicians' communication skills. Psychosom Med 2005; 67:897-905. [PMID: 16314594 DOI: 10.1097/01.psy.0000188403.94327.5b] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.7] [Indexed: 11/25/2022]
Abstract
OBJECTIVE To identify predictors of patient satisfaction among a range of patient and practitioner variables. In particular, to focus on patients' illness perceptions and the impact of a randomized controlled trial on the training of physicians in general communication skills and how to treat patients presenting with poorly defined illness. METHODS A randomized controlled follow-up study conducted in 28 general practices in Aarhus County, Denmark. Half of the physicians were randomized into an educational program on treatment of patients presenting with medically unexplained symptoms (somatization). One thousand seven hundred eighty-five general practice attenders presenting a new health problem completed questionnaires on illness perceptions, physical functioning, and mental distress before the consultation. After the consultation, a questionnaire including relational and communicative domains of patient satisfaction with the current consultation was completed. The physicians completed a questionnaire for each patient on diagnostics and prognostics. Predictors of patient satisfaction were determined by logistic regression. RESULTS A large number of patient and practitioner variables predicted satisfaction in univariate logistic regression models. Results from a multivariate logistic model showed that the illness perceptions "uncertainty" (patient not knowing what is wrong) and "emotional representations" (the complaint making the patient feel worried, depressed, helpless, afraid, hopeless) predicted dissatisfaction at OR (CI) = 1.8 (1.3-2.4), p < .001 and OR (CI) = 1.5 (1-2.3), p = .03 respectively. Trained physicians were associated with dissatisfaction at OR (CI) = 0.7 (0.5-1), p = .06 in the multivariate model. Furthermore, uncertain patients consulting a trained physician were less likely to be dissatisfied, OR (CI) = 0.6 (0.3-1), p = .04.
CONCLUSIONS A randomized controlled trial on the training of general practitioners' communication skills improved patient satisfaction. Illness perceptions predict satisfaction. In particular, patients feeling uncertain and negatively emotionally involved in their health problem were more inclined to being dissatisfied with the consultation.
Affiliation(s)
- Lisbeth Frostholm
- Research Clinic for Functional Disorders and Psychosomatics, Aarhus University Hospital, Aarhus N, Denmark
|
502
|
Hukkelhoven CWPM, Steyerberg EW, Habbema JDF, Farace E, Marmarou A, Murray GD, Marshall LF, Maas AIR. Predicting Outcome after Traumatic Brain Injury: Development and Validation of a Prognostic Score Based on Admission Characteristics. J Neurotrauma 2005; 22:1025-39. [PMID: 16238481 DOI: 10.1089/neu.2005.22.1025] [Citation(s) in RCA: 205] [Impact Index Per Article: 10.8] [Indexed: 11/13/2022]
Abstract
The early prediction of outcome after traumatic brain injury (TBI) is important for several purposes, but no prognostic models have yet been developed with proven generalizability across different settings. The objective of this study was to develop and validate prognostic models that use information available at admission to estimate 6-month outcome after severe or moderate TBI. To this end, this study evaluated mortality and unfavorable outcome, that is, death, and vegetative or severe disability on the Glasgow Outcome Scale (GOS), at 6 months post-injury. Prospectively collected data on 2269 patients from two multi-center clinical trials were used to develop prognostic models for each outcome with logistic regression analysis. We included seven predictive characteristics: age, motor score, pupillary reactivity, hypoxia, hypotension, computed tomography classification, and traumatic subarachnoid hemorrhage. The models were validated internally with bootstrapping techniques. External validity was determined in prospectively collected data from two relatively unselected surveys in Europe (n = 796) and in North America (n = 746). We evaluated the discriminative ability, that is, the ability to distinguish patients with different outcomes, with the area under the receiver operating characteristic curve (AUC). Further, we determined calibration, that is, agreement between predicted and observed outcome, with the Hosmer-Lemeshow goodness-of-fit test. The models discriminated well in the development population (AUC 0.78-0.80). External validity was even better (AUC 0.83-0.89). Calibration was less satisfactory, with poor external validity in the North American survey (p < 0.001). In particular, observed risks were higher than predicted for poor prognosis patients. A score chart was derived from the regression models to facilitate clinical application.
Relatively simple prognostic models using baseline characteristics can accurately predict 6-month outcome in patients with severe or moderate TBI. The high discriminative ability indicates the potential of this model for classifying patients according to prognostic risk.
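The abstract above describes discriminative ability as the ability to distinguish patients with different outcomes, summarized by the AUC. As a minimal sketch of that definition (not the authors' code), the AUC can be computed as the fraction of event/non-event pairs in which the patient with the event received the higher predicted risk, with ties counted as half. The function name and the four example patients are hypothetical.

```python
def auc(predicted_risks, outcomes):
    """Area under the ROC curve via pairwise comparison.

    predicted_risks: predicted probabilities (or any risk score).
    outcomes: 1 for patients with the event, 0 for patients without.
    """
    events = [r for r, y in zip(predicted_risks, outcomes) if y == 1]
    nonevents = [r for r, y in zip(predicted_risks, outcomes) if y == 0]
    # A pair is concordant when the event patient has the higher risk;
    # tied risks contribute half a concordant pair.
    concordant = sum(
        (e > n) + 0.5 * (e == n) for e in events for n in nonevents
    )
    return concordant / (len(events) * len(nonevents))


# Hypothetical data: two patients without the event, two with it.
risks = [0.10, 0.40, 0.35, 0.80]
died = [0, 0, 1, 1]
print(auc(risks, died))  # 3 of 4 pairs are concordant -> 0.75
```

The pairwise form is quadratic in sample size, which is fine for illustration; rank-based formulas give the same value more efficiently on large cohorts.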
Affiliation(s)
- Chantal W P M Hukkelhoven
- Center for Clinical Decision Sciences, Department of Public Health, Erasmus MC, Rotterdam, The Netherlands
|
503
|
Steyerberg EW, Homs MYV, Stokvis A, Essink-Bot ML, Siersema PD. Stent placement or brachytherapy for palliation of dysphagia from esophageal cancer: a prognostic model to guide treatment selection. Gastrointest Endosc 2005; 62:333-40. [PMID: 16111947 DOI: 10.1016/s0016-5107(05)01587-7] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.9] [Received: 11/29/2004] [Accepted: 03/25/2005] [Indexed: 02/08/2023]
Abstract
BACKGROUND Brachytherapy was found to be preferable to metal stent placement for the palliation of dysphagia because of inoperable esophageal cancer in the randomized SIREC trial. The benefit of brachytherapy, however, only occurred after a relatively long survival. The objective is to develop a model that distinguishes patients with a poor prognosis from those with a relatively good prognosis. METHODS Survival was analyzed with Cox regression analysis. Dysphagia-adjusted survival (alive with no or mild dysphagia) was studied with Kaplan-Meier analysis. Patient data is from the multicenter, randomized, controlled trial (SIREC, n = 209) and a consecutive series (n = 396). Patients received a stent or single-dose brachytherapy. RESULTS Significant prognostic factors for survival included tumor length, World Health Organization performance score, and the presence of metastases (multivariable p < 0.001). A simple score, which also included age and gender, could satisfactorily separate patients with a poor, intermediate, and relatively good prognosis within the SIREC trial. For the poor prognosis group, the difference in dysphagia-adjusted survival was 23 days in favor of stent placement compared with brachytherapy (77 vs. 54 days, p = 0.16). For the other prognostic groups, brachytherapy resulted in a better dysphagia-adjusted survival. CONCLUSIONS A simple prognostic score may help to identify patients with a poor prognosis in whom stent placement is at least equivalent to brachytherapy. If further validated, this score can provide an evidence-based tool for the selection of palliative treatment in esophageal cancer patients.
Affiliation(s)
- Ewout W Steyerberg
- Department of Public Health, Erasmus MC, University Medical Centre Rotterdam, The Netherlands
|
504
|
Suarthana E, Vergouwe Y, Nieuwenhuijsen M, Heederik D, Grobbee DE, Meijer E. Diagnostic model for sensitization in workers exposed to occupational high molecular weight allergens. Am J Ind Med 2005; 48:168-74. [PMID: 16094609 DOI: 10.1002/ajim.20199] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 01/02/2023]
Abstract
BACKGROUND Occupational allergy has great impact on workers exposed to high molecular weight (HMW) allergens. The present study aimed to develop and validate a generic diagnostic model for sensitization to HMW allergens, defined as positive IgE. METHODS The model was developed in pooled data from Dutch laboratory animal (LA) workers and bakers using logistic regression analysis. Validity was assessed internally by a bootstrapping procedure, and externally in British LA workers. RESULTS The model included working hours/week, work-related symptoms, total IgE, and IgE to common allergens. Significant interactions between the type of work and the predictors resulted in different scores for LA workers and bakers. Internal and external validation showed that the model was satisfactorily calibrated and discriminated between workers at high and low risk of being sensitized. CONCLUSIONS It is possible to develop a generic model for sensitization to occupational HMW allergens. However, the weighting of predictors differs across specific work environments.
Affiliation(s)
- Eva Suarthana
- Institute for Risk Assessment Sciences, Environmental and Occupational Health Group, Utrecht University, Utrecht, The Netherlands
|
505
|
Janssens ACJW, Deng Y, Borsboom GJJM, Eijkemans MJC, Habbema JDF, Steyerberg EW. A new logistic regression approach for the evaluation of diagnostic test results. Med Decis Making 2005; 25:168-77. [PMID: 15800301 DOI: 10.1177/0272989x05275154] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Indexed: 11/15/2022]
Abstract
The value of a dichotomous diagnostic test is often described in terms of sensitivity, specificity, and likelihood ratios (LRs). Although it is known that these test characteristics vary between subgroups of patients, they are generally interpreted, on average, without considering information on patient characteristics, such as clinical signs and symptoms, or on previous test results. This article presents a reformulation of the logistic regression model that allows the LRs of diagnostic test results to be calculated conditional on these covariates. The proposed method starts with estimating logistic regression models for the prior and posterior odds of disease. The regression model for the prior odds is based on patient characteristics, whereas the regression model for the posterior odds also includes the diagnostic test of interest. Following Bayes' theorem, the authors demonstrate that the regression model for the LR can be derived by taking the differences between the regression coefficients of the 2 models. In a clinical example, they demonstrate that the LRs of positive and negative test results and the sensitivity and specificity of the diagnostic test varied considerably between patients with different risk profiles, even when a constant odds ratio was assumed. The proposed logistic regression approach proves an efficient method to determine the performance of tests at the level of the individual patient risk profile and to examine the effect of patient characteristics on diagnostic test characteristics.
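The coefficient-difference idea in this abstract follows from the odds form of Bayes' theorem: posterior odds = prior odds × LR, so log LR = log posterior odds − log prior odds. The sketch below illustrates this with two logistic models sharing one covariate. All coefficient values, the covariate name `age`, and the function names are hypothetical and chosen only for illustration; they are not taken from the article.

```python
import math

# Hypothetical coefficients (assumptions, not estimates from the article).
# Prior model: log prior odds as a function of a patient characteristic.
PRIOR = {"intercept": -2.0, "age": 0.03}
# Posterior model: same characteristic plus the test result (1 = positive).
POSTERIOR = {"intercept": -3.0, "age": 0.04, "test": 2.5}


def linear_predictor(coefs, patient):
    """Intercept plus the sum of coefficient * covariate value."""
    return coefs["intercept"] + sum(
        beta * patient[name] for name, beta in coefs.items() if name != "intercept"
    )


def likelihood_ratio(patient):
    """LR of the observed test result, conditional on the covariates.

    The LR model's coefficients are the posterior-model coefficients
    minus the prior-model coefficients (a term absent from one model
    counts as zero).
    """
    names = set(PRIOR) | set(POSTERIOR)
    diff = {n: POSTERIOR.get(n, 0.0) - PRIOR.get(n, 0.0) for n in names}
    return math.exp(linear_predictor(diff, patient))


for age in (30, 60):
    for test in (1, 0):
        lr = likelihood_ratio({"age": age, "test": test})
        print(f"age={age} test={'pos' if test else 'neg'} LR={lr:.2f}")
```

Because the `age` coefficient differs between the two hypothetical models, the LR of the same test result changes with age, which is the kind of covariate dependence the abstract describes.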
Affiliation(s)
- A Cecile J W Janssens
- Center for Clinical Decision Sciences, Department of Public Health, Erasmus MC, Netherlands.
|
506
|
Abstract
Health beliefs have been shown to influence a myriad of medical treatment decisions. More recently, the impact of health beliefs on treatment decisions for mental illness has become a focus of study. This study examines the health beliefs and treatment behavior of veterans with posttraumatic stress disorder (PTSD). Using standard survey methodology, we assessed beliefs about the cause of PTSD, expected duration and controllability of symptoms, and life consequences of having PTSD. Treatment participation and medication compliance were assessed, as were common treatment correlates, such as patient-provider relationships, dosing frequency, side effect severity, number of prescribed medications, and use of drugs or alcohol to control PTSD symptoms. Explanatory models of PTSD, perceived controllability, and use of benzodiazepines were found to predict psychiatric medication use. Negative life consequences of PTSD were associated with participation in psychotherapy. Assessment of health beliefs may help providers to understand their patients' treatment behavior and to facilitate treatment engagement.
Affiliation(s)
- Michele Spoont
- Center for Chronic Disease Outcome Research, VA Medical Center, Minneapolis, MN 55417, USA
|
507
|
Schepers VP, Visser-Meily AM, Ketelaar M, Lindeman E. Prediction of Social Activity 1 Year Poststroke. Arch Phys Med Rehabil 2005; 86:1472-6. [PMID: 16003683 DOI: 10.1016/j.apmr.2004.11.039] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Indexed: 10/25/2022]
Abstract
OBJECTIVE To develop an easy-to-use prediction rule for social activity 1 year poststroke that can identify patients at risk for social inactivity. DESIGN Inception cohort. SETTING Rehabilitation center. PARTICIPANTS Patients with a first-ever supratentorial stroke were selected in 4 Dutch rehabilitation centers. Data of 250 patients were available for analysis. Potential prognostic factors measured at admission were sex, age, marital status, prestroke employment status, educational level, type of stroke, hemisphere, motor impairment, trunk control, communication, and activities of daily living (ADL) dependency. INTERVENTIONS Not applicable. MAIN OUTCOME MEASURE Social activity measured by the Frenchay Activities Index (FAI) at 1 year poststroke. RESULTS Multivariate backward linear regression analysis identified sex, age, marital status, motor impairment, communication, and ADL dependency as important predictors of the FAI score 1 year poststroke. An easy-to-use score chart was constructed that could identify patients at risk for social inactivity. The score chart proved to be well able to discriminate poor social functioning from moderate to good social functioning (area under the curve = .85). CONCLUSIONS Identifying patients at risk enables health care professionals to focus on the social activity of this patient subgroup at an early stage in the care process.
Affiliation(s)
- Vera P Schepers
- Center of Excellence for Rehabilitation Medicine Utrecht, Rehabilitation Center De Hoogstraat, Utrecht, The Netherlands.
|
508
|
Steyerberg EW, Eijkemans MJC, Boersma E, Habbema JDF. Equally valid models gave divergent predictions for mortality in acute myocardial infarction patients in a comparison of logistic [corrected] regression models. J Clin Epidemiol 2005; 58:383-90. [PMID: 15862724 DOI: 10.1016/j.jclinepi.2004.07.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 02/20/2004] [Revised: 06/29/2004] [Accepted: 07/12/2004] [Indexed: 10/25/2022]
Abstract
OBJECTIVE Models that predict mortality after acute myocardial infarction (AMI) contain different predictors and are based on different populations. We studied the agreement and validity of predictions for individual patients. STUDY DESIGN AND SETTING We compared predictions from five predictive logistic regression models for short-term mortality after AMI. Three models were developed previously, and two models were developed in the GUSTO-I data, where all five models were applied (n = 40,830; 7.0% 30-day mortality). Agreement was studied with weighted kappa statistics of categorized predictions. Validity was assessed by comparing observed frequencies with predictions (indicating calibration) and by the area under the receiver operating characteristic curve (AUC), indicating discriminative ability. RESULTS The predictions from the five models varied considerably for individual patients, with low agreement between most (kappa < 0.6). Risk predictions from the three previously developed models were on average too high, which could be corrected by re-calibration of the model intercept. The AUC ranged from 0.76 to 0.78 and increased to 0.78-0.79 with re-estimated regression coefficients that were optimal for the GUSTO-I patients. The two more detailed GUSTO-I based models performed better (AUC approximately 0.82). CONCLUSION Models with different predictors may have a similar validity while the agreement between predictions for individual patients is poor. The main concerns in the applicability of predictive models for AMI should relate to the selected predictors and average calibration.
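The abstract mentions correcting predictions that are on average too high by re-calibrating the model intercept. One common way to do this, sketched here under stated assumptions rather than as the authors' procedure, is to add a single constant to every patient's log-odds so that the mean predicted risk equals the observed event rate (this is the condition the maximum-likelihood intercept satisfies when the original linear predictor is kept as a fixed offset). The function names and the four predicted risks are hypothetical; the 7.0% target is the 30-day mortality reported in the abstract.

```python
import math


def expit(z):
    """Inverse logit: log-odds to probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Probability to log-odds."""
    return math.log(p / (1.0 - p))


def recalibrate_intercept(predictions, observed_rate, tol=1e-10):
    """Find the log-odds shift that makes the mean prediction match observed_rate.

    Uses bisection, which works because the mean predicted risk
    increases monotonically with the shift.
    """
    log_odds = [logit(p) for p in predictions]
    lo, hi = -20.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean_pred = sum(expit(lp + mid) for lp in log_odds) / len(log_odds)
        if mean_pred > observed_rate:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


# Hypothetical predictions that are too high on average (mean 18.75%).
original = [0.05, 0.10, 0.20, 0.40]
shift = recalibrate_intercept(original, observed_rate=0.07)
updated = [expit(logit(p) + shift) for p in original]
print(f"shift on log-odds scale: {shift:.3f}")
print(f"mean updated prediction: {sum(updated) / len(updated):.4f}")
```

The shift leaves the ordering of patients unchanged, so discrimination (AUC) is unaffected; only average calibration is corrected.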
Affiliation(s)
- Ewout W Steyerberg
- Department of Public Health, Center for Clinical Decision Sciences, Ee2093, Erasmus MC, University Medical Center Rotterdam, PO Box 1738, 3000 DR Rotterdam, The Netherlands.
|
509
|
|
510
|
Schurink CAM, Lucas PJF, Hoepelman IM, Bonten MJM. Computer-assisted decision support for the diagnosis and treatment of infectious diseases in intensive care units. THE LANCET. INFECTIOUS DISEASES 2005; 5:305-12. [PMID: 15854886 DOI: 10.1016/s1473-3099(05)70115-8] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.8] [Indexed: 11/24/2022]
Abstract
Diagnosing nosocomial infections in critically ill patients admitted to intensive care units (ICUs) is a challenge because signs and symptoms are usually non-specific for a particular infection. In addition, the choice of treatment, or the decision not to treat, can be difficult. Models and computer-based decision-support systems have been developed to assist ICU physicians in the management of infectious diseases. We discuss the historical development, possibilities, and limitations of various computer-based decision-support models for infectious diseases, with special emphasis on Bayesian approaches. Although Bayesian decision-support systems are potentially useful for medical decision making in infectious disease management, clinical experience with them is limited and prospective evaluation is needed to determine whether their use can improve the quality of patient care.
Affiliation(s)
- C A M Schurink
- Department of Medicine, Division of Acute Medicine and Infectious Diseases, University Medical Centre, Utrecht, Netherlands.
|
511
|
Vergouwe Y, Steyerberg EW, Eijkemans MJC, Habbema JDF. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol 2005; 58:475-83. [PMID: 15845334 DOI: 10.1016/j.jclinepi.2004.06.017] [Citation(s) in RCA: 431] [Impact Index Per Article: 22.7] [Received: 09/03/2003] [Revised: 05/26/2004] [Accepted: 06/21/2004] [Indexed: 11/24/2022]
Abstract
BACKGROUND AND OBJECTIVES The performance of a prediction model is usually worse in external validation data compared to the development data. We aimed to determine at which effective sample sizes (i.e., number of events) relevant differences in model performance can be detected with adequate power. METHODS We used a logistic regression model to predict the probability that residual masses of patients treated for metastatic testicular cancer contained only benign tissue. We performed standard power calculations and Monte Carlo simulations to estimate the numbers of events that are required to detect several types of model invalidity with 80% power at the 5% significance level. RESULTS A validation sample with 111 events was required to detect that a model predicted too high probabilities, when predictions were on average 1.5 times too high on the odds scale. A decrease in discriminative ability of the model, indicated by a decrease in the c-statistic from 0.83 to 0.73, required 81 to 106 events, depending on the specific scenario. CONCLUSION We suggest a minimum of 100 events and 100 nonevents for external validation samples. Specific hypotheses may, however, require substantially higher effective sample sizes to obtain adequate power.
Affiliation(s)
- Yvonne Vergouwe
- Department of Public Health, Erasmus MC, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands.
|
512
|
Litaker D, Flocke SA, Frolkis JP, Stange KC. Physicians' attitudes and preventive care delivery: insights from the DOPC study. Prev Med 2005; 40:556-63. [PMID: 15749138 DOI: 10.1016/j.ypmed.2004.07.015] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.7] [Indexed: 11/25/2022]
Abstract
BACKGROUND Interventions that modify physician attitudes to enhance preventive service delivery are common, yet other factors may be relatively more important in determining whether these services are provided. We assessed associations between physicians' attitudes and delivery of preventive care, compared with factors related to the patient, visit, or practice. METHODS One hundred twenty-eight primary care physicians rated the importance of five preventive services and their effectiveness at delivering them. We assessed whether their patients had received cervical smears, prostate-specific antigen (PSA) testing, smoking cessation advice, recommendation to use aspirin to prevent myocardial infarction, or weight-maintenance counseling, when appropriate. Multilevel models assessed associations between physician attitudinal characteristics and a patient's likelihood of being up to date for each service. RESULTS Importance of PSA screening and tobacco cessation counseling were weakly associated with patients' receipt of preventive care; no association between attitudes and other services was observed. Factors such as having a visit for well care and use of prevention flowcharts were associated with delivery of preventive services to a greater extent. CONCLUSIONS Physicians' attitudes toward prevention are necessary, but not sufficient in ensuring the delivery of preventive services. Future interventions should address visit- and practice-specific factors more closely associated with preventive care.
Affiliation(s)
- David Litaker
- Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Case Western Reserve University, Cleveland, OH 44106, USA.
|
513
|
Hukkelhoven CWPM, Steyerberg EW, Habbema JDF, Maas AIR. Admission of patients with severe and moderate traumatic brain injury to specialized ICU facilities: a search for triage criteria. Intensive Care Med 2005; 31:799-806. [PMID: 15834705 DOI: 10.1007/s00134-005-2628-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 11/24/2004] [Accepted: 03/15/2005] [Indexed: 12/01/2022]
Abstract
OBJECTIVE To investigate whether triage for direct admission of patients with traumatic brain injury to a trauma center is facilitated by predicting the risk of potentially removable lesions or raised intracranial pressure (ICP). DESIGN AND SETTING Cohort study in a level I university trauma center. PATIENTS AND PARTICIPANTS A prospective cohort of primarily (n = 200) and secondarily (n = 75) referred patients with moderate or severe traumatic brain injury. MEASUREMENTS AND RESULTS Predictive characteristics for the risk of surgically removable lesions and the risk of raised ICP (repeatedly ≥ 20 mmHg) were identified and included in prognostic models. These models were validated internally with bootstrapping techniques and externally on a historic sample (n = 205) regarding discriminative ability (AUC). Among the cohort patients, 67% had raised ICP and 54% had surgically removable lesions. Both outcomes occurred more frequently in patients secondarily referred, but the incidence in patients primarily referred was also high (62% and 33% respectively). No strong predictors of raised ICP were identified. Age and pupillary reactivity were significant predictors of surgically removable lesions. The models discriminated reasonably for surgically removable lesions (AUC = 0.78 at development and AUC = 0.67 at external validation) but not for raised ICP (AUC = 0.59 at development and AUC = 0.50 at external validation). CONCLUSIONS It is difficult to accurately identify patients in need of specialized intensive care using baseline characteristics. The high incidence of both outcomes in patients primarily referred supports direct admission of more and particularly older patients with severe or moderate brain trauma to level I trauma centers.
Affiliation(s)
- Chantal W P M Hukkelhoven
- Department of Public Health, Center for Clinical Decision Science, Erasmus MC, Rotterdam, The Netherlands
|
514
|
Jannink MJ, Ijzerman MJ, Groothuis-Oudshoorn K, Stewart RE, Groothoff JW, Lankhorst GJ. Use of orthopedic shoes in patients with degenerative disorders of the foot. Arch Phys Med Rehabil 2005; 86:687-92. [PMID: 15827918 DOI: 10.1016/j.apmr.2004.06.069] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]
Abstract
OBJECTIVES To study the actual use of orthopedic shoes by patients with degenerative foot disorders and to identify factors associated with use and nonuse, based on the parameters of the International Organization for Standardization definition of usability: effectiveness, efficiency, satisfaction, and context of use. DESIGN Multicenter, prospective cohort study. SETTING Outpatient clinics of 7 rehabilitation centers in the Netherlands. PARTICIPANTS One hundred consecutive patients with degenerative foot disorders. INTERVENTIONS Not applicable. MAIN OUTCOME MEASURES Usability was assessed by means of the Questionnaire for Usability Evaluation of orthopedic shoes. RESULTS Seventy of 93 patients with degenerative foot disorders wore their orthopedic shoes for more than 3 days a week after 3 months of follow-up. Factors significantly associated with the actual use of orthopedic shoes were (1) increase in stance duration (effectiveness odds ratio [OR] = 2.14; 95% confidence interval [CI], 1.19-3.85), (2) decrease in skin abnormalities (effectiveness OR = 1.35; 95% CI, 1.02-1.8), (3) problems experienced with putting on and taking off orthopedic shoes (efficiency OR = .46; 95% CI, .26-.82), and (4) cosmetic appearance of orthopedic shoes (satisfaction OR = 1.54; 95% CI, 1.1-2.15). The overall fit of the multiple logistic regression model (R²) was 56.3%. CONCLUSIONS By adding efficiency and satisfaction factors and not focusing only on the effectiveness factors, the amount of explained variance increases, and it becomes possible to evaluate and design products for people with special needs more comprehensively.
|
515
|
Moran JL, Solomon PJ, Warn DE. Methodology in meta–analysis: a study from Critical Care meta–analytic practice. HEALTH SERVICES AND OUTCOMES RESEARCH METHODOLOGY 2004. [DOI: 10.1007/s10742-006-6829-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
|
516
|
Shahian DM, Blackstone EH, Edwards FH, Grover FL, Grunkemeier GL, Naftel DC, Nashef SAM, Nugent WC, Peterson ED. Cardiac Surgery Risk Models: A Position Article. Ann Thorac Surg 2004; 78:1868-77. [PMID: 15511504 DOI: 10.1016/j.athoracsur.2004.05.054] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.1] [Indexed: 10/26/2022]
Abstract
Differences in medical outcomes may result from disease severity, treatment effectiveness, or chance. Because most outcome studies are observational rather than randomized, risk adjustment is necessary to account for case mix. This has usually been accomplished through the use of standard logistic regression models, although Bayesian models, hierarchical linear models, and machine-learning techniques such as neural networks have also been used. Many factors are essential to ensuring the accuracy and usefulness of such models, including selection of an appropriate clinical database, inclusion of critical core variables, precise definitions for predictor variables and endpoints, proper model development, validation, and audit. Risk models may be used to assess the impact of specific predictors on outcome, to aid in patient counseling and treatment selection, to profile provider quality, and to serve as the basis of continuous quality improvement activities.
|
517
|
Trampuz A, Hanssen AD, Osmon DR, Mandrekar J, Steckelberg JM, Patel R. Synovial fluid leukocyte count and differential for the diagnosis of prosthetic knee infection. Am J Med 2004; 117:556-62. [PMID: 15465503 DOI: 10.1016/j.amjmed.2004.06.022] [Citation(s) in RCA: 354] [Impact Index Per Article: 17.7] [Received: 02/06/2004] [Revised: 06/10/2004] [Accepted: 06/10/2004] [Indexed: 12/17/2022]
Abstract
PURPOSE Criteria for the interpretation of synovial fluid are well established for native joint disorders but lacking for the evaluation of prosthetic joint failure. Our aim was to define cutoff values for synovial fluid leukocyte count and neutrophil percentage for differentiating aseptic failure and prosthetic joint infection. METHODS We performed a prospective study of 133 patients in whom synovial fluid specimens were collected before total knee arthroplasty revision between January 1998 and December 2003. Patients with underlying inflammatory joint disease were excluded. RESULTS Aseptic failure was diagnosed in 99 patients and prosthetic joint infection was diagnosed in 34 patients. The synovial fluid leukocyte count was significantly higher in patients with prosthetic joint infection (median, 18.9 × 10³/µL; range, 0.3 to 178 × 10³/µL) than in those with aseptic failure (median, 0.3 × 10³/µL; range, 0.1 to 16 × 10³/µL; P < 0.0001); the neutrophil percentage was also significantly higher in patients with prosthetic joint infection (median [range], 92% [55% to 100%] vs. 7% [0% to 79%], P < 0.0001). A leukocyte count of > 1.7 × 10³/µL had a sensitivity of 94% and a specificity of 88% for diagnosing prosthetic joint infection; a differential of > 65% neutrophils had a sensitivity of 97% and a specificity of 98%. Staphylococcus aureus was the only pathogen associated with leukocyte counts > 100 × 10³/µL. CONCLUSION A synovial fluid leukocyte differential of > 65% neutrophils (or a leukocyte count of > 1.7 × 10³/µL) is a sensitive and specific test for the diagnosis of prosthetic knee infection in patients without underlying inflammatory joint disease.
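To make the reported sensitivity and specificity concrete, the sketch below computes them from a 2×2 table and derives the corresponding likelihood ratios. The abstract gives only the group sizes (34 infected, 99 aseptic) and the percentages; the cell counts used here (32 true positives, 12 false positives for the leukocyte-count cutoff) are assumptions chosen to be consistent with the reported 94% and 88%, not figures from the article. The function name is also hypothetical.

```python
def diagnostic_accuracy(tp, fn, tn, fp):
    """Sensitivity, specificity, and likelihood ratios from a 2x2 table.

    tp/fn: diseased patients testing positive/negative.
    tn/fp: non-diseased patients testing negative/positive.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # How much a positive result multiplies the odds of disease.
        "lr_positive": sensitivity / (1.0 - specificity),
        # How much a negative result multiplies the odds of disease.
        "lr_negative": (1.0 - sensitivity) / specificity,
    }


# Assumed cell counts (hypothetical): 34 infected, 99 aseptic failures.
result = diagnostic_accuracy(tp=32, fn=2, tn=87, fp=12)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The likelihood ratios assume at least one false positive and one true negative; a table with a zero in either cell would need separate handling.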
Affiliation(s)
- Andrej Trampuz
- Division of Infectious Diseases, Department of Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota 55905, USA
518
Marras TK, Morris A, Gonzalez LC, Daley CL. Mortality Prediction in Pulmonary Mycobacterium kansasii Infection and Human Immunodeficiency Virus. Am J Respir Crit Care Med 2004; 170:793-8. [PMID: 15215152 DOI: 10.1164/rccm.200402-162oc] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Indexed: 11/16/2022]
Abstract
In the setting of human immunodeficiency virus (HIV) infection, the clinical implications of American Thoracic Society (ATS) diagnostic criteria and the significance of a single positive respiratory culture for Mycobacterium kansasii are unknown. We retrospectively studied HIV-infected patients with pulmonary M. kansasii isolated between 1989 and 2002 at one institution. Of 127 patients, 33% fulfilled ATS disease criteria. Twenty-nine percent received at least three active drugs for at least 3 months, and 53% died. In survival analysis, a lower CD4 count (hazard ratio [HR], 1.6; 95% confidence interval [CI], 1.1-2.3) and positive smear microscopy (HR, 2.8; 95% CI, 1.3-6.1) were associated with mortality, whereas antiretroviral therapy (HR, 0.3; 95% CI, 0.1-0.8) and M. kansasii treatment (HR, 0.4; 95% CI, 0.2-0.9) were associated with survival. ATS criteria did not predict mortality (HR, 0.9; 95% CI, 0.4-1.9). Fifteen patients (12%) apparently had indolent infection, not requiring immediate therapy. They had fewer positive cultures and lower rates of positive smear microscopy and ATS-defined disease. In HIV-infected patients with pulmonary M. kansasii infection, predictors of survival include higher CD4 counts, antiretroviral therapy, negative smear microscopy, and adequate treatment for M. kansasii infection, but not ATS diagnostic criteria. Withholding treatment in HIV-infected patients with respiratory M. kansasii isolates should only be considered with negative smear microscopy, few positive cultures, and mild immunosuppression.
Affiliation(s)
- Theodore K Marras
- Department of Medicine (Respiratory), University of Toronto, Ontario, Canada.
519
Marras TK, Chan CK, Lipton JH, Messner HA, Szalai JP, Laupacis A. Long-term pulmonary function abnormalities and survival after allogeneic marrow transplantation. Bone Marrow Transplant 2004; 33:509-17. [PMID: 14716347 DOI: 10.1038/sj.bmt.1704377] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.2] [Indexed: 11/09/2022]
Abstract
We studied long-term pulmonary function testing (PFT) in a retrospective cohort of 6-month survivors of allogeneic marrow transplant (BMT) between 1980 and 1997. Of 593 patients, 73, 71 and 65% had adequate data to assess for obstruction, restriction and diffusion impairments respectively. Over 5 years, mean declines in 1-s forced expiratory volume/forced vital capacity (FEV1/FVC), total lung capacity (TLC) and diffusion were 4, 7 and 17%, respectively. TLC and diffusion tended to subsequently increase. In all, 6, 12 and 35% of patients met criteria for obstruction, restriction and impaired diffusion, respectively. Obstruction was less common in recent transplants (5 vs 15%, P=0.004), while restriction and diffusion impairment rates remained stable. There was significantly greater mortality with obstruction (HR 2.0 (1.04-3.95)), and a nonstatistically significant higher mortality rate with restriction (HR 1.6 (0.95-2.75)), but not with impaired diffusion (HR=0.99 (0.65-1.50)). cGVHD (OR 16.7 (2.2-129.8)) and busulfan (OR 2.9 (1.01-8.24)) were associated with obstruction. Marrow from nonsibling or mismatched donors (OR 4.9 (2.2-10.7)) was associated with restriction. In summary, after BMT, decreased diffusion capacity is common and benign; obstruction has decreased in frequency, is rare without cGVHD, and is associated with mortality; nonsibling and mismatched donor are risk factors for restriction.
Affiliation(s)
- T K Marras
- Joint Division of Respirology, Department of Medicine, University Health Network and Mount Sinai Hospital, University of Toronto, Toronto, Ontario, Canada.
520
Hunt SC, Richardson RD, Engel CC, Atkins DC, McFall M. Gulf War Veterans’ Illnesses: A Pilot Study of the Relationship of Illness Beliefs to Symptom Severity and Functional Health Status. J Occup Environ Med 2004; 46:818-27. [PMID: 15300134 DOI: 10.1097/01.jom.0000135529.88068.04] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/26/2022]
Abstract
This investigation describes the illness beliefs of veterans regarding their Gulf War-related health concerns and investigates the relationship of these illness beliefs to physical and mental health functioning. Gulf War veterans (N = 583) presenting for evaluation at a Veterans Affairs and Department of Defense facility completed self-report measures of symptom-related beliefs, psychosocial distress, and functional status. Hierarchical multiple regression analyses were performed to determine the extent to which symptom-related beliefs impacted symptom reporting and functional status independent of demographic factors and psychiatric illness. Several beliefs predicted physical symptom reporting and functional impairment in physical health and mental health domains after controlling for demographic variables and psychiatric illness. Gulf War veterans' illness beliefs may impact clinical outcomes. Discussing illness beliefs and providing accurate information are important components of medical care for Gulf War veterans.
Affiliation(s)
- Stephen C Hunt
- Veterans Affairs Puget Sound Health Care System, Seattle, Washington 98108, USA.
521
Teixeira PJ, Going SB, Houtkooper LB, Cussler EC, Metcalfe LL, Blew RM, Sardinha LB, Lohman TG. Pretreatment predictors of attrition and successful weight management in women. Int J Obes (Lond) 2004; 28:1124-33. [PMID: 15263921 DOI: 10.1038/sj.ijo.0802727] [Citation(s) in RCA: 273] [Impact Index Per Article: 13.7] [Indexed: 11/09/2022]
Abstract
OBJECTIVE This study analyzed baseline behavioral and psychosocial differences between successful and nonsuccessful participants in a behavioral weight management program. Success was defined by commonly used health-related criteria (5% weight loss). Noncompletion was also used as a marker of a failed attempt at weight control. SUBJECTS A total of 158 healthy overweight and obese women (age, 48.0 ± 4.5 y; BMI, 31.0 ± 3.8 kg/m²; body fat, 44.5 ± 5.3%). INTERVENTION Subjects participated in a 16-week lifestyle weight loss program consisting of group-based behavior therapy to improve diet and increase physical activity, and were followed for 1 y after treatment. METHODS At baseline, all women completed a comprehensive behavioral and psychosocial battery assessing dieting/weight history, dietary intake and eating behaviors, exercise, self-efficacy, outcome evaluations, body image, and other variables considered relevant for weight management. Participants who maintained a weight loss of 5% or more at 16 months (or 10% or more of initial fat mass) were classified as successful. Nonsuccessful participants were those who dropped out and completers who had not lost weight at follow-up. RESULTS Of all participants, 30% (n=47) did not complete initial treatment and/or missed follow-up assessments (noncompleters). Noncompletion was independently associated with more previous weight loss attempts, poorer quality of life, more stringent weight outcome evaluations, and lower reported carbohydrate intake at baseline. In logistic regression, completion status was predicted correctly in 84% of all cases (χ²=45.5, P<0.001), using baseline information only. Additional predictors of attrition were initial weight, exercise minutes, fiber intake, binge eating, psychological health, and body image. A large variation in weight loss/maintenance results was observed (range: 37.2 kg for 16-month weight change). Independent baseline predictors of success at 16 months were more moderate weight outcome evaluations, lower level of previous dieting, higher exercise self-efficacy, and smaller waist-to-hip ratio. Success status at follow-up was predicted correctly in 74% of all starting cases (χ²=33.6, P<0.001). CONCLUSION Psychosocial and behavioral variables (eg, dieting history, dietary intake, outcome evaluations, exercise self-efficacy, and quality of life) may be useful as pretreatment predictors of success level and/or attrition in previously overweight and mildly obese women who volunteer for behavioral weight control programs. These factors can be used in developing readiness profiles for weight management, a potentially important tool to address the issue of low success/completion rates in the current management of obesity.
Affiliation(s)
- P J Teixeira
- Department of Exercise and Health, Faculty of Human Movement, Technical University of Lisbon, Lisbon, Portugal.
522
Rietveld E, De Jonge HCC, Polder JJ, Vergouwe Y, Veeze HJ, Moll HA, Steyerberg EW. Anticipated costs of hospitalization for respiratory syncytial virus infection in young children at risk. Pediatr Infect Dis J 2004; 23:523-9. [PMID: 15194833 DOI: 10.1097/01.inf.0000129690.35341.8d] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 11/26/2022]
Abstract
BACKGROUND Reliable estimates of hospitalization costs for severe respiratory syncytial virus (RSV) infection are necessary to perform economic analyses of preventive strategies of severe RSV disease. We aimed to develop a model that predicts anticipated mean RSV hospitalization costs of groups of young children at risk for hospitalization, but not yet hospitalized, based on readily available child characteristics. METHODS We determined real direct medical costs of RSV hospitalization from a societal perspective, using a bottom-up strategy, in 3458 infants and young children hospitalized for severe RSV disease during the RSV seasons 1996-1997 to 1999-2000 in the Southwest of the Netherlands. We used a linear regression model to predict anticipated mean RSV hospitalization costs of groups of children at risk, based on 4 child characteristics [age, gestational age, birth weight and bronchopulmonary dysplasia (BPD)], expressed in EC Euros as of the year 2000. FINDINGS The mean RSV hospitalization costs of all patients were 3110 Euros. RSV hospitalization costs were higher for patients with lower gestational age (5555 Euros; gestational age ≤28 weeks), lower birth weight (3895 Euros; birth weight ≤2500 g), BPD (5785 Euros; with BPD) and young age (4730 Euros; first month of life). The linear regression model had an adjusted R² of 0.08. This indicates a low explanatory ability for hospitalization costs of individual children. However, the model could accurately estimate the anticipated mean hospitalization costs of groups of children with the same characteristics. INTERPRETATION RSV hospitalization costs were substantial, especially for specific high-risk groups. Anticipated mean hospitalization costs of groups of children at risk for RSV hospitalization, but not yet hospitalized, could well be estimated with 4 child characteristics (age, gestational age, birth weight and BPD). These estimated costs can be used for economic analyses of preventive strategies for severe RSV disease.
Affiliation(s)
- Edwin Rietveld
- Department of Pediatrics, Erasmus MC-Sophia, Rotterdam, The Netherlands
523
Twardella D, Popanda O, Helmbold I, Ebbeler R, Benner A, von Fournier D, Haase W, Sautter-Bihl ML, Wenz F, Schmezer P, Chang-Claude J. Personal characteristics, therapy modalities and individual DNA repair capacity as predictive factors of acute skin toxicity in an unselected cohort of breast cancer patients receiving radiotherapy. Radiother Oncol 2004; 69:145-53. [PMID: 14643951 DOI: 10.1016/s0167-8140(03)00166-x] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.8] [Indexed: 10/27/2022]
Abstract
BACKGROUND AND PURPOSE Intrinsic and extrinsic factors can affect the occurrence of side effects of radiotherapy. The influence of therapy modalities, personal characteristics and individual DNA repair capacity on the risk of acute skin toxicity was thus evaluated. MATERIALS AND METHODS In a prospective study of 478 female breast cancer patients receiving adjuvant radiotherapy of the breast after breast-conserving surgery, acute skin toxicity was documented systematically using a modified version of the common toxicity criteria. Prognostic personal and treatment characteristics were identified for the entire cohort. Individual DNA repair capacity was determined in a subgroup of 113 patients with alkaline comet assay using phytohemagglutinin stimulated lymphocytes. Using proportional hazards analysis to account for cumulative biologically effective radiation dose, the hazard for the development of acute skin reactions (moist desquamation) associated with DNA repair capacity was modeled. RESULTS Of the 478 participants, 84 presented with acute reactions by the end of treatment. Higher body mass index was significantly associated with an increased risk for acute reactions (hazard ratio=1.09 per 1 kg/m²), adjusted for treating hospital and photon beam quality. The comet assay parameters examined, including background DNA damage in non-irradiated cells, DNA damage induced by 5 Gy, and DNA repair capacity, were not significantly associated with risk of acute skin toxicity. CONCLUSIONS Higher BMI is predictive of acute skin toxicity; however, individual repair parameters as determined by the alkaline comet assay are not informative enough. More comprehensive analyses including late effects of radiotherapy and repair kinetics optimized for different radiation-induced DNA lesions are warranted.
Affiliation(s)
- Dorothee Twardella
- German Cancer Research Center, Division of Clinical Epidemiology, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
524
van Dijk MR, Steyerberg EW, Stenning SP, Dusseldorp E, Habbema JDF. Survival of patients with nonseminomatous germ cell cancer: a review of the IGCC classification by Cox regression and recursive partitioning. Br J Cancer 2004; 90:1176-83. [PMID: 15026798 PMCID: PMC2409665 DOI: 10.1038/sj.bjc.6601665] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
The International Germ Cell Consensus (IGCC) classification identifies good, intermediate and poor prognosis groups among patients with metastatic nonseminomatous germ cell tumours (NSGCT). It uses the risk factors primary site, presence of nonpulmonary visceral metastases and tumour markers alpha-fetoprotein (AFP), human chorionic gonadotrophin (HCG) and lactic dehydrogenase (LDH). The IGCC classification is easy to use and remember, but lacks flexibility. We aimed to examine the extent of any loss in discrimination within the IGCC classification in comparison with alternative modelling by formal weighing of the risk factors. We analysed survival of 3048 NSGCT patients with Cox regression and recursive partitioning for alternative classifications. Good, intermediate and poor prognosis groups were based on predicted 5-year survival. Classifications were further refined by subgrouping within the poor prognosis group. Performance was measured primarily by a bootstrap corrected c-statistic to indicate discriminative ability for future patients. The weights of the risk factors in the alternative classifications differed slightly from the implicit weights in the IGCC classification. Discriminative ability, however, did not increase clearly (IGCC classification, c=0.732; Cox classification, c=0.730; Recursive partitioning classification, c=0.709). Three subgroups could be identified within the poor prognosis groups, resulting in classifications with five prognostic groups and slightly better discriminative ability (c=0.740). In conclusion, the IGCC classification in three prognostic groups is largely supported by Cox regression and recursive partitioning. Cox regression was the most promising tool to define a more refined classification. British Journal of Cancer (2004) 90, 1176-1183. doi:10.1038/sj.bjc.6601665 www.bjcancer.com Published online 24 February 2004
Affiliation(s)
- M R van Dijk
- Department of Public Health, Erasmus MC - University Medical Center Rotterdam, PO Box 1738, 3000 DR Rotterdam, The Netherlands.
525
Wang D, Zhang W, Bakhai A. Comparison of Bayesian model averaging and stepwise methods for model selection in logistic regression. Stat Med 2004; 23:3451-67. [PMID: 15505893 DOI: 10.1002/sim.1930] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.2] [Indexed: 11/07/2022]
Abstract
Logistic regression is the standard method for assessing predictors of diseases. In logistic regression analyses, a stepwise strategy is often adopted to choose a subset of variables. Inference about the predictors is then made based on the chosen model constructed of only those variables retained in that model. This method subsequently ignores both the variables not selected by the procedure, and the uncertainty due to the variable selection procedure. This limitation may be addressed by adopting a Bayesian model averaging approach, which selects a number of all possible such models, and uses the posterior probabilities of these models to perform all inferences and predictions. This study compares the Bayesian model averaging approach with the stepwise procedures for selection of predictor variables in logistic regression using simulated data sets and the Framingham Heart Study data. The results show that in most cases Bayesian model averaging selects the correct model and out-performs stepwise approaches at predicting an event of interest.
Affiliation(s)
- Duolao Wang
- Department of Epidemiology and Population Health, Medical Statistics Unit, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK.
526
Abstract
PURPOSE Neuropathic foot ulcers are a serious complication of diabetes. The purpose of this study was to develop a clinically useful prognostic model for identifying ulcers that are not likely to heal. METHODS Using an administrative and medical records database from a large wound care system, we designed a cohort study of patients with diabetic neuropathic foot ulcer. Clinicians followed a standard algorithm of good wound care, wound débridement, and wound offloading. The outcome was a healed wound by week 20 of care. For patients with more than one wound, we investigated the wound labeled as the primary wound. We evaluated several prognostic models of varying mathematical complexity. RESULTS We studied 27630 patients with a diabetic neuropathic foot ulcer, of whom 12983 (47%) healed by week 20 of care. The simplest model counted 1 point each if the wound was older than 2 months, larger than 2 cm², or had a grade ≥3 (on a 6-point scale). The likelihood that a wound would not heal was 0.35 for a count of 0, 0.47 for a count of 1, 0.66 for a count of 2, and 0.81 for a count of 3 in the validation data set. CONCLUSION A simple prognostic model can be developed using prognostic factors that are already part of the wound care examination. Applications of this model could include determining who will do well with standard care and as an aid in the design of clinical trials.
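The point-count model described in this abstract is simple enough to sketch directly. The Python below is a minimal illustration, assuming wound age is given in months and area in cm²; the function and variable names are hypothetical, and the risk table simply restates the validation-set figures quoted above. It is not the authors' implementation.

```python
# Likelihood of NOT healing by week 20, by point count (validation data set,
# as quoted in the abstract).
NON_HEALING_RISK = {0: 0.35, 1: 0.47, 2: 0.66, 3: 0.81}


def ulcer_points(wound_age_months, wound_area_cm2, wound_grade):
    """Count 1 point each for: age > 2 months, area > 2 cm^2, grade >= 3."""
    return (
        int(wound_age_months > 2)
        + int(wound_area_cm2 > 2)
        + int(wound_grade >= 3)
    )


def non_healing_risk(wound_age_months, wound_area_cm2, wound_grade):
    """Look up the reported non-healing likelihood for a wound's point count."""
    return NON_HEALING_RISK[ulcer_points(wound_age_months, wound_area_cm2, wound_grade)]


# Example: a 3-month-old, 1.5 cm^2, grade 3 wound scores 2 points.
example_points = ulcer_points(3, 1.5, 3)
example_risk = non_healing_risk(3, 1.5, 3)
```

A lookup table is used rather than a fitted formula because the abstract reports only the four observed likelihoods, not model coefficients.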
Affiliation(s)
- David J Margolis
- Department of Dermatology, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA.
527
Bleeker SE, Moll HA, Steyerberg EW, Donders ART, Derksen-Lubsen G, Grobbee DE, Moons KGM. External validation is necessary in prediction research:. J Clin Epidemiol 2003; 56:826-32. [PMID: 14505766 DOI: 10.1016/s0895-4356(03)00207-5] [Citation(s) in RCA: 481] [Impact Index Per Article: 22.9] [Indexed: 11/23/2022]
Abstract
BACKGROUND AND OBJECTIVES Prediction models tend to perform better on data on which the model was constructed than on new data. This difference in performance is an indication of the optimism in the apparent performance in the derivation set. For internal model validation, bootstrapping methods are recommended to provide bias-corrected estimates of model performance. Results are often accepted without sufficient regard to the importance of external validation. This report illustrates the limitations of internal validation to determine generalizability of a diagnostic prediction model to future settings. METHODS A prediction model for the presence of serious bacterial infections in children with fever without source was derived and validated internally using bootstrap resampling techniques. Subsequently, the model was validated externally. RESULTS In the derivation set (n=376), nine predictors were identified. The apparent area under the receiver operating characteristic curve (95% confidence interval) of the model was 0.83 (0.78-0.87) and 0.76 (0.67-0.85) after bootstrap correction. In the validation set (n=179) the performance was 0.57 (0.47-0.67). CONCLUSION For relatively small data sets, internal validation of prediction models by bootstrap techniques may not be sufficient and indicative for the model's performance in future patients. External validation is essential before implementing prediction models in clinical practice.
Affiliation(s)
- S E Bleeker
- Erasmus Medical Center/Sophia Children's Hospital Department of Pediatrics, Room Sp 1545 Dr Molewaterplein 60, 3015 GJ Rotterdam, The Netherlands.
528
Blackstone EH, Cosgrove DM, Jamieson WRE, Birkmeyer NJ, Lemmer JH, Miller DC, Butchart EG, Rizzoli G, Yacoub M, Chai A. Prosthesis size and long-term survival after aortic valve replacement. J Thorac Cardiovasc Surg 2003; 126:783-96. [PMID: 14502155 DOI: 10.1016/s0022-5223(03)00591-9] [Citation(s) in RCA: 161] [Impact Index Per Article: 7.7] [Indexed: 11/18/2022]
Abstract
OBJECTIVE This study was undertaken to quantify the relationship between prosthesis size adjusted for patient size (prosthesis-patient size) and long-term survival after aortic valve replacement. METHODS Data from nine representative sources on 13,258 aortic valve replacements provided 69,780 patient-years of follow-up (mean 5.3 ± 4.7 years), with reliable survival estimates to 15 years. Prostheses included 5757 stented porcine xenografts, 3198 stented bovine pericardial xenografts, 3583 mechanical valves, and 720 allografts. Manufacturers' labeled prosthesis size was 19 mm or smaller in 1109 patients. Expressions of prosthesis-patient size assessed were indexed internal prosthesis orifice area (in centimeters squared per square meter of body surface area) and standardized internal prosthesis orifice size (Z, the number of SDs from mean normal native aortic valve size). Multivariable hazard domain analysis with balancing score and risk factor adjustment quantified the association of prosthesis-patient size with survival. RESULTS Prosthesis-patient size down to at least 1.1 cm²/m² or -3 Z did not adversely affect intermediate- or long-term survival (P > .2). However, 30-day mortality increased 1% to 2% when indexed orifice area fell below 1.2 cm²/m² (P = .002) or standardized orifice size fell below -2.5 Z (P = .0003). The increased early risk affected fewer than 1% of patients receiving bioprostheses but about 25% of those receiving mechanical devices. CONCLUSIONS Aortic prosthesis-patient size down to 1.1 cm²/m² or -3 Z did not reduce intermediate- or long-term survival after aortic valve replacement. However, patient-prosthesis size under 1.2 cm²/m² or -2.5 Z was associated with a 1% to 2% increase in 30-day mortality. Prosthesis-patient sizes this small or smaller were rarely implanted in patients receiving bioprostheses.
Affiliation(s)
- Eugene H Blackstone
- Department of Thoracic and Cardiovascular Surgery, The Cleveland Clinic Foundation, 9500 Euclid Avenue, Desk F25, Cleveland, OH 44195, USA.
529
Terrin N, Schmid CH, Griffith JL, D'Agostino RB, Selker HP. External validity of predictive models: a comparison of logistic regression, classification trees, and neural networks. J Clin Epidemiol 2003; 56:721-9. [PMID: 12954463 DOI: 10.1016/s0895-4356(03)00120-3] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.6] [Indexed: 10/27/2022]
Abstract
BACKGROUND AND OBJECTIVE The utility of predictive models depends on their external validity, that is, their ability to maintain accuracy when applied to patients and settings different from those on which the models were developed. We report a simulation study that compared the external validity of standard logistic regression (LR1), logistic regression with piecewise-linear and quadratic terms (LR2), classification trees, and neural networks (NNETs). METHODS We developed predictive models on data simulated from a specified population and on data from perturbed forms of the population not representative of the original distribution. All models were tested on new data generated from the population. RESULTS The performance of LR2 was superior to that of the other model types when the models were developed on data sampled from the population (mean receiver operating characteristic [ROC] areas 0.769, 0.741, 0.724, and 0.682, for LR2, LR1, NNETs, and trees, respectively) and when they were developed on nonrepresentative data (mean ROC areas 0.734, 0.713, 0.703, and 0.667). However, when the models developed using nonrepresentative data were compared with models developed from data sampled from the population, LR2 had the greatest loss in performance. CONCLUSION Our results highlight the necessity of external validation to test the transportability of predictive models.
Affiliation(s)
- Norma Terrin
- Division of Clinical Care Research, Department of Medicine, Tufts-New England Medical Center, and Tufts University School of Medicine, 750 Washington Street, Boston, MA 02111, USA.
530
Svensson LG, Blackstone EH, Cosgrove DM. Surgical options in young adults with aortic valve disease. Curr Probl Cardiol 2003; 28:417-80. [PMID: 14647130 DOI: 10.1016/j.cpcardiol.2003.08.002] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.8] [Indexed: 11/29/2022]
Affiliation(s)
- Lars G Svensson
- Department of Thoracic and Cardiovascular Surgery, The Cleveland Clinic Foundation, Ohio 44195, USA.
531
Internal and external validation of predictive models: a simulation study of bias and precision in small samples. J Clin Epidemiol 2003; 56:441-7. [PMID: 12812818 DOI: 10.1016/s0895-4356(03)00047-7] [Citation(s) in RCA: 382] [Impact Index Per Article: 18.2] [Indexed: 11/21/2022]
Abstract
We performed a simulation study to investigate the accuracy of bootstrap estimates of optimism (internal validation) and the precision of performance estimates in independent validation samples (external validation). We combined two data sets containing children presenting with fever without source (n=376+179=555; 120 bacterial infections). Random samples were drawn from this combined data set for the development (n=376) and validation (n=179) of logistic regression models. The models included statistically significant predictors for infection selected from a set of 57 candidate predictors. Model development, including the selection of predictors, and validation were repeated in a bootstrapping procedure. The resulting expected optimism estimate in the receiver operating characteristic (ROC) area was compared with the observed optimism according to independent validation samples. The average apparent ROC area was 0.74, which was expected (based on bootstrapping) to decrease by 0.07 to 0.67, whereas the observed decrease in the validation samples was 0.09 to 0.65. Omitting the selection of predictors from the bootstrap procedure led to a severe underestimation of the optimism (decrease 0.006). The standard error of the observed ROC area in the independent validation samples was large (0.05). We recommend bootstrapping for internal validation because it gives reasonably valid estimates of the expected optimism in predictive performance provided that any selection of predictors is taken into account. For external validation, substantial sample sizes should be used for sufficient power to detect clinically important changes in performance as compared with the internally validated estimate.
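The bootstrap estimate of optimism described in this abstract can be sketched in a few lines. The Python below is a toy illustration under stated assumptions, not the study's method: the "model" is simply the single candidate predictor with the highest apparent ROC area (standing in for logistic regression with predictor selection), and all function names are hypothetical. What it does preserve is the point the abstract stresses: the selection step is repeated inside every bootstrap sample.

```python
import random


def roc_area(scores, labels):
    """ROC area as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_predictor(X, y):
    """Toy model development: choose the column with the highest apparent ROC area."""
    return max(range(len(X[0])), key=lambda j: roc_area([row[j] for row in X], y))


def bootstrap_optimism(X, y, n_boot=100, seed=0):
    """Average of (ROC area in bootstrap sample) - (ROC area in original sample),
    with predictor selection redone in each bootstrap sample."""
    rng = random.Random(seed)
    n = len(y)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # ROC area is undefined without both outcome classes
        j = select_predictor(Xb, yb)
        diffs.append(
            roc_area([row[j] for row in Xb], yb) - roc_area([row[j] for row in X], y)
        )
    return sum(diffs) / len(diffs)


def optimism_corrected_roc_area(X, y, n_boot=100, seed=0):
    """Apparent ROC area of the selected predictor minus the estimated optimism."""
    j = select_predictor(X, y)
    apparent = roc_area([row[j] for row in X], y)
    return apparent - bootstrap_optimism(X, y, n_boot, seed)
```

Moving `select_predictor` outside the loop (selecting once on the original data) would correspond to the omission the abstract warns about, which led to a severe underestimation of optimism in the study.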
532
Ambler G, Brady AR, Royston P. Simplifying a prognostic model: a simulation study based on clinical data. Stat Med 2002; 21:3803-22. [PMID: 12483768 DOI: 10.1002/sim.1422] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.4] [Indexed: 11/08/2022]
Abstract
Prognostic models are designed to predict a clinical outcome in individuals or groups of individuals with a particular disease or condition. To avoid bias many researchers advocate the use of full models developed by prespecifying predictors. Variable selection is not employed and the resulting models may be large and complicated. In practice more parsimonious models that retain most of the prognostic information may be preferred. We investigate the effect on various performance measures, including mean square error and prognostic classification, of three methods for estimating full models (including penalized estimation and Tibshirani's lasso) and consider two methods (backwards elimination and a new proposal called stepdown) for simplifying full models. Simulation studies based on two medical data sets suggest that simplified models can be found that perform nearly as well as, or sometimes even better than, full models. Optimizing the Akaike information criterion appears to be appropriate for choosing the degree of simplification.
Affiliation(s)
- Gareth Ambler
- Department of Statistical Science, University College, 1-19 Torrington Place, London WC1E 7HB, UK.
|
533
|
Tabaei BP, Herman WH. A multivariate logistic regression equation to screen for diabetes: development and validation. Diabetes Care 2002; 25:1999-2003. [PMID: 12401746 DOI: 10.2337/diacare.25.11.1999] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.0] [Indexed: 02/03/2023]
Abstract
OBJECTIVE To develop and validate an empirical equation to screen for diabetes. RESEARCH DESIGN AND METHODS A predictive equation was developed using multiple logistic regression analysis and data collected from 1,032 Egyptian subjects with no history of diabetes. The equation incorporated age, sex, BMI, postprandial time (self-reported number of hours since last food or drink other than water), and random capillary plasma glucose as independent covariates for prediction of undiagnosed diabetes. These covariates were based on a fasting plasma glucose level ≥126 mg/dl and/or a plasma glucose level 2 h after a 75-g oral glucose load ≥200 mg/dl. The equation was validated using data collected from an independent sample of 1,065 American subjects. Its performance was also compared with that of recommended and proposed static plasma glucose cut points for diabetes screening. RESULTS The predictive equation was calculated with the following logistic regression parameters: P = 1/(1 + e^(-x)), where x = -10.0382 + [0.0331 (age in years) + 0.0308 (random plasma glucose in mg/dl) + 0.2500 (postprandial time assessed as 0 to ≥8 h) + 0.5620 (if female) + 0.0346 (BMI)]. The cut point for the prediction of previously undiagnosed diabetes was defined as a probability value ≥0.20. The equation's sensitivity was 65%, specificity 96%, and positive predictive value (PPV) 67%. When applied to a new sample, the equation's sensitivity was 62%, specificity 96%, and PPV 63%. CONCLUSIONS This multivariate logistic equation improves on currently recommended methods of screening for undiagnosed diabetes and can be easily implemented in an inexpensive handheld programmable calculator to predict previously undiagnosed diabetes.
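The coefficients reported in this abstract can be turned into a small calculator. The Python sketch below is illustrative only. It assumes the standard logistic form P = 1/(1 + e^(-x)) and treats "postprandial time assessed as 0 to ≥8 h" as a value capped at 8, which is an assumption because the abstract does not spell out the coding. The function names `diabetes_probability` and `screens_positive` are hypothetical.

```python
import math

def diabetes_probability(age_years, glucose_mg_dl, postprandial_h, female, bmi):
    """Predicted probability of undiagnosed diabetes from the abstract's coefficients."""
    # Assumption: postprandial time is clamped to the 0..8 h range named in the abstract.
    pp = min(max(postprandial_h, 0), 8)
    x = (-10.0382
         + 0.0331 * age_years
         + 0.0308 * glucose_mg_dl      # random capillary plasma glucose, mg/dl
         + 0.2500 * pp
         + 0.5620 * (1 if female else 0)
         + 0.0346 * bmi)
    # Standard logistic transform of the linear predictor.
    return 1.0 / (1.0 + math.exp(-x))

def screens_positive(*args, cut_point=0.20):
    """Apply the cut point from the abstract (probability >= 0.20)."""
    return diabetes_probability(*args) >= cut_point

# Hypothetical example subject: 55-year-old woman, glucose 140 mg/dl,
# 2 h since last meal, BMI 30.
p = diabetes_probability(55, 140, 2, True, 30)
print(f"predicted probability: {p:.3f}, screen positive: {screens_positive(55, 140, 2, True, 30)}")
```

The probability rises with each covariate because all five coefficients are positive, so raising the glucose value alone can move the same subject across the 0.20 cut point.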
Affiliation(s)
- Bahman P Tabaei
- Department of Internal Medicine, University of Michigan Health System, Ann Arbor, Michigan 48109, USA
|
534
|
Wählby U, Jonsson EN, Karlsson MO. Comparison of stepwise covariate model building strategies in population pharmacokinetic-pharmacodynamic analysis. AAPS PharmSci 2002; 4:E27. [PMID: 12645999 PMCID: PMC2751316 DOI: 10.1208/ps040427] [Citation(s) in RCA: 156] [Impact Index Per Article: 7.1] [Indexed: 12/29/2022]
Abstract
The aim of this study was to compare 2 stepwise covariate model-building strategies, frequently used in the analysis of pharmacokinetic-pharmacodynamic (PK-PD) data using nonlinear mixed-effects models, with respect to included covariates and predictive performance. In addition, the effects of stepwise regression on the estimated covariate coefficients were assessed. Using simulated and real PK data, covariate models were built applying (1) stepwise generalized additive models (GAM) for identifying potential covariates, followed by backward elimination in the computer program NONMEM, and (2) stepwise forward inclusion and backward elimination in NONMEM. Different versions of these procedures were tried (eg, treating different study occasions as separate individuals in the GAM, or fixing a part of the parameters when the NONMEM procedure was used). The final covariate models were compared, including their ability to predict a separate data set or their performance in cross-validation. The bias in the estimated coefficients (selection bias) was assessed. The model-building procedures performed similarly in the data sets explored. No major differences in the resulting covariate models were seen, and the predictive performances overlapped. Therefore, the choice of model-building procedure in these examples could be based on other aspects such as analyst- and computer-time efficiency. There was a tendency to selection bias in the estimates, although this was small relative to the overall variability in the estimates. The predictive performances of the stepwise models were also reasonably good. Thus, selection bias seems to be a minor problem in this typical PK covariate analysis.
Affiliation(s)
- Ulrika Wählby
- Division of Pharmacokinetics and Drug Therapy, Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24 Uppsala, Sweden.
|
535
|
Margolis DJ, Bilker W, Boston R, Localio R, Berlin JA. Statistical characteristics of area under the receiver operating characteristic curve for a simple prognostic model using traditional and bootstrapped approaches. J Clin Epidemiol 2002; 55:518-24. [PMID: 12007556 DOI: 10.1016/s0895-4356(01)00512-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.2] [Indexed: 10/16/2022]
Abstract
Prognostic models are increasingly common in the biomedical literature. These models are frequently evaluated with respect to their ability to discriminate between those with and without an outcome. The area under the receiver-operating curve (AROC) is often used to assess discrimination. In this study, we introduce a bootstrap method, and, using Monte Carlo simulation, we compare three different bootstrap approaches with four commonly used methods in their ability to accurately estimate 95% confidence intervals (CIs) around the AROC for a simple prognostic model. We also evaluated the power of a bootstrap method and the commonly used trapezoid rule to compare different prognostic models. We show that several good methods exist for calculating 95% CIs of AROC, but the maximum likelihood estimation method should not be used with small sample sizes. We further show that for our simple prognostic model a bootstrap z-statistic approach is preferred over the trapezoidal method when comparing the AROCs of two related models.
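As a rough illustration of what a bootstrap confidence interval around the AROC involves, the Python sketch below computes a percentile bootstrap interval. This is one generic variant and is not claimed to be any of the three bootstrap approaches the authors compared. The function names and the simulated data are hypothetical.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one subject in each outcome class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample subjects with replacement, recompute the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [labels[i] for i in idx]
        if len(set(yb)) < 2:          # skip resamples that lack one outcome class
            continue
        stats.append(auc([scores[i] for i in idx], yb))
    stats.sort()
    lo_i = int(n_boot * alpha / 2)    # e.g. index 25 of 1000 for a 95% interval
    hi_i = n_boot - 1 - lo_i          # symmetric upper index
    return auc(scores, labels), stats[lo_i], stats[hi_i]

# Simulated example: the score is the outcome plus Gaussian noise.
rng = random.Random(1)
labels = [rng.randint(0, 1) for _ in range(60)]
scores = [y + rng.gauss(0, 1) for y in labels]
point, lower, upper = bootstrap_auc_ci(scores, labels)
print(f"AUC {point:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The percentile interval is the simplest choice; other bootstrap intervals (for example a z-statistic built from the bootstrap standard error) reuse the same resampling loop and differ only in how the resampled AUCs are summarized.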
Affiliation(s)
- David J Margolis
- Department of Dermatology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, 423 Guardian Drive, Philadelphia, PA 19004, USA
|
536
|
Sherer M, Sander AM, Nick TG, High WM, Malec JF, Rosenthal M. Early cognitive status and productivity outcome after traumatic brain injury: findings from the TBI model systems. Arch Phys Med Rehabil 2002; 83:183-92. [PMID: 11833021 DOI: 10.1053/apmr.2002.28802] [Citation(s) in RCA: 122] [Impact Index Per Article: 5.5] [Indexed: 11/11/2022]
Abstract
OBJECTIVE To evaluate the contribution of early cognitive assessment to the prediction of productivity outcome after traumatic brain injury (TBI) adjusted for severity of injury, demographic factors, and preinjury employment status. DESIGN Inception cohort. SETTING Six inpatient brain injury rehabilitation programs. PARTICIPANTS A total of 388 adults with TBI whose posttraumatic amnesia (PTA) resolved before discharge from inpatient rehabilitation. INTERVENTIONS Administered neuropsychologic tests during inpatient stay on emergence from PTA. Follow-up interview and evaluation. Predictor measures also determined. MAIN OUTCOME MEASURE Productivity status at follow-up 12 months postinjury. RESULTS Multiple logistic regression analysis revealed that preinjury productivity status, duration of PTA, education level, and early cognitive status each made significant, independent contributions to the prediction of productivity status at follow-up. When adjusted for all other predictors, persons scoring at the 75th percentile on early cognitive status (less impaired) had 1.61 times greater odds (95% confidence interval [CI], 1.07-2.41) of being productive at follow-up than those scoring at the 25th percentile (more impaired). Without adjustment, persons scoring at the 75th percentile had 2.46 times greater odds (95% CI, 1.77-3.43) of being productive at follow-up. CONCLUSIONS Findings support the utility of early cognitive assessment by using neuropsychologic tests. In addition to other benefits, early cognitive assessment makes an independent contribution to prediction of late outcome. Findings support the clinical practice of performing initial neuropsychologic evaluations after resolution of PTA.
Affiliation(s)
- Mark Sherer
- Methodist Rehabilitation Center, Jackson, MS 39216, USA.
|
537
|
Steyerberg EW, Vergouwe Y, Keizer HJ, Habbema JD. Residual mass histology in testicular cancer: development and validation of a clinical prediction rule. Stat Med 2001; 20:3847-59. [PMID: 11782038 DOI: 10.1002/sim.915] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
After chemotherapy for metastatic non-seminomatous testicular cancer, surgical resection is a generally accepted treatment to remove remnants of the initial metastases, since residual tumour may still be present (mature teratoma or viable cancer cells). In this paper, we review the development and external validation of a logistic regression model to predict the absence of residual tumour. Three sources of information were used. A quantitative review identified six relevant predictors from 19 published studies (996 resections). Second, a development data set included individual data of 544 patients from six centres. This data set was used to assess the predictive relationships of five continuous predictors, which resulted in dichotomization for two, and a log, square root, and linear transformation for three other predictors. The multiple logistic regression coefficients were reduced with a shrinkage factor (0.95) to improve calibration, based on a bootstrapping procedure. Third, a validation data set included 172 more recently treated patients. The model showed adequate calibration and good discrimination in the development and in the validation sample (areas under the ROC curve 0.83 and 0.82). This study illustrates that a careful modelling strategy may result in an adequate predictive model. Further study of model validity may stimulate application in clinical practice.
Affiliation(s)
- E W Steyerberg
- Center for Clinical Decision Sciences, Department of Public Health, Erasmus Medical Center Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands.
|
538
|
Ambalavanan N, Carlo WA. Comparison of the prediction of extremely low birth weight neonatal mortality by regression analysis and by neural networks. Early Hum Dev 2001; 65:123-37. [PMID: 11641033 DOI: 10.1016/s0378-3782(01)00228-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 11/23/2022]
Abstract
AIMS To compare the prediction of mortality in individual extremely low birth weight (ELBW) neonates by regression analysis and by artificial neural networks. STUDY DESIGN A database of 23 variables on 810 ELBW neonates admitted to a tertiary care center was divided into training, validation, and test sets. Logistic regression and neural network models were developed on the training set, validated, and outcome (mortality) predicted on the test set. Stepwise regression identified significant variables in the full set. Regression models and neural networks were then tested using data sets with only the identified significant variables, and then with variables excluded one at a time. RESULTS The area under the curve (AUC) of receiver operating characteristic (ROC) curves for neural networks and regression was similar (AUC 0.87 ± 0.03; p = 0.31). Birthweight or gestational age and the 5-min Apgar score contributed most to AUC. CONCLUSIONS Both neural networks and regression analysis predicted mortality with reasonable accuracy. For both models, analyzing selected variables was superior to full data set analysis. We speculate neural networks may not be superior to regression when no clear non-linear relationships exist.
Affiliation(s)
- N Ambalavanan
- Division of Neonatology, Department of Pediatrics, University of Alabama at Birmingham, 525 New Hillman Bldg., 619 South 19th Street, Birmingham, AL 35233-7335, USA.
|
540
|
Kumar R, McKinney WP, Raj G, Heudebert GR, Heller HJ, Koetting M, McIntire DD. Adverse cardiac events after surgery: assessing risk in a veteran population. J Gen Intern Med 2001; 16:507-18. [PMID: 11556926 PMCID: PMC1495256 DOI: 10.1046/j.1525-1497.2001.016008507.x] [Citation(s) in RCA: 88] [Impact Index Per Article: 3.8] [Indexed: 11/20/2022]
Abstract
OBJECTIVE To establish rates of and risk factors for cardiac complications after noncardiac surgery in veterans. DESIGN Prospective cohort study. SETTING A large urban veterans affairs hospital. PARTICIPANTS One thousand patients with known or suspected cardiac problems undergoing 1,121 noncardiac procedures. MEASUREMENTS Patients were assessed preoperatively for important clinical variables. Postoperative evaluation was done by an assessor blinded to preoperative status with a daily physical examination, electrocardiogram, and creatine kinase with MB fraction until postoperative day 6, day of discharge, death, or reoperation (whichever occurred earliest). Serial electrocardiograms, enzymes, and chest radiographs were obtained as indicated. Severe cardiac complications included cardiac death, cardiac arrest, myocardial infarction, ventricular tachycardia and fibrillation, and pulmonary edema. Serious cardiac complications included the above, heart failure, and unstable angina. MAIN RESULTS Severe and serious complications were seen in 24% and 32% of aortic, 8.3% and 10% of carotid, 11.8% and 14.7% of peripheral vascular, 9.0% and 13.1% of intraabdominal/intrathoracic, 2.9% and 3.3% of intermediate-risk (head and neck and major orthopedic procedures), and 0.27% and 1.1% of low-risk procedures, respectively. The five associated patient-specific risk factors identified by logistic regression are: myocardial infarction < 6 months (odds ratio [OR], 4.5; 95% confidence interval [CI], 1.9 to 12.9), emergency surgery (OR, 2.6; 95% CI, 1.2 to 5.6), myocardial infarction > 6 months (OR, 2.2; 95% CI, 1.4 to 3.5), heart failure ever (OR, 1.9; 95% CI, 1.2 to 3.0), and rhythm other than sinus (OR, 1.7; 95% CI, 0.9 to 3.2). Inclusion of the planned operative procedure significantly improves the predictive ability of our risk model. CONCLUSIONS Five patient-specific risk factors are associated with high risk for cardiac complications in the perioperative period of noncardiac surgery in veterans. Inclusion of the operative procedure significantly improves the predictive ability of the risk model. Overall cardiac complication rates (pretest probabilities) are established for these patients. A simple nomogram is presented for calculation of post-test probabilities by incorporating the operative procedure.
Affiliation(s)
- R Kumar
- Section of General Internal Medicine, Department of Internal Medicine, Veterans Affairs Medical Center, U.T. Southwestern Medical School, Dallas, TX, USA.
|
541
|
Steyerberg EW, Harrell FE, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 2001; 54:774-81. [PMID: 11470385 DOI: 10.1016/s0895-4356(01)00341-9] [Citation(s) in RCA: 1751] [Impact Index Per Article: 76.1] [Indexed: 11/18/2022]
Abstract
The performance of a predictive model is overestimated when simply determined on the sample of subjects that was used to construct the model. Several internal validation methods are available that aim to provide a more accurate estimate of model performance in new subjects. We evaluated several variants of split-sample, cross-validation and bootstrapping methods with a logistic regression model that included eight predictors for 30-day mortality after an acute myocardial infarction. Random samples with a size between n = 572 and n = 9165 were drawn from a large data set (GUSTO-I; n = 40,830; 2851 deaths) to reflect modeling in data sets with between 5 and 80 events per variable. Independent performance was determined on the remaining subjects. Performance measures included discriminative ability, calibration and overall accuracy. We found that split-sample analyses gave overly pessimistic estimates of performance, with large variability. Cross-validation on 10% of the sample had low bias and low variability, but was not suitable for all performance measures. Internal validity could best be estimated with bootstrapping, which provided stable estimates with low bias. We conclude that split-sample validation is inefficient, and recommend bootstrapping for estimation of internal validity of a predictive logistic regression model.
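The bootstrap approach to internal validation recommended here is commonly implemented as an optimism correction: refit the model in each bootstrap sample, measure how much better it performs on that sample than on the original data, and subtract the average difference from the apparent performance. The Python sketch below illustrates the idea with a deliberately simple stand-in for the modelling procedure (selecting the single best predictor by AUC) rather than the eight-predictor logistic regression model of the study. All function names and the simulated data are hypothetical.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one subject in each outcome class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit(X, y):
    """Toy modelling procedure: pick the column (and sign) with the best apparent AUC."""
    best = None
    for j in range(len(X[0])):
        a = auc([row[j] for row in X], y)
        sign = 1 if a >= 0.5 else -1      # flip predictors that point the wrong way
        a = max(a, 1 - a)
        if best is None or a > best[0]:
            best = (a, j, sign)
    return best[1], best[2]

def predict(model, X):
    j, sign = model
    return [sign * row[j] for row in X]

def optimism_corrected_auc(X, y, n_boot=200, seed=0):
    rng = random.Random(seed)
    n = len(y)
    apparent = auc(predict(fit(X, y), X), y)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:              # skip resamples that lack one outcome class
            continue
        model = fit(Xb, yb)               # the whole procedure is repeated per resample
        diffs.append(auc(predict(model, Xb), yb) - auc(predict(model, X), y))
    optimism = sum(diffs) / len(diffs)
    return apparent, optimism, apparent - optimism

# Simulated example: 10 pure-noise predictors, so any apparent discrimination is optimism.
rng = random.Random(1)
y = [rng.randint(0, 1) for _ in range(80)]
X = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(80)]
apparent, optimism, corrected = optimism_corrected_auc(X, y)
print(f"apparent {apparent:.2f}, optimism {optimism:.2f}, corrected {corrected:.2f}")
```

Calling `fit` inside the loop is the important design choice: any data-driven step, such as predictor selection, has to be repeated in every bootstrap sample, otherwise the optimism estimate only reflects coefficient estimation and will be too small.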
Affiliation(s)
- E W Steyerberg
- Center for Clinical Decision Sciences, Ee 2091, Department of Public Health, Erasmus University, P.O. Box 1738, 3000 DR, Rotterdam, The Netherlands.
|
542
|
Murphy M, Wang D. Do previous birth interval and mother's education influence infant survival? A Bayesian model averaging analysis of Chinese data. Population Studies 2001. [DOI: 10.1080/00324720127679] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 10/27/2022]
|