1
Harris LK, Troelsen A, Terluin B, Gromov K, Ingelsrud LH. Minimal important change thresholds change over time after knee and hip arthroplasty. J Clin Epidemiol 2024; 169:111316. [PMID: 38458544 DOI: 10.1016/j.jclinepi.2024.111316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 02/27/2024] [Accepted: 02/29/2024] [Indexed: 03/10/2024]
Abstract
OBJECTIVES The minimal important change (MIC) reflects what patients, on average, consider the smallest improvement in a score that is important to them. MIC thresholds may vary across patient populations, interventions used, posttreatment time points and derivation methods. We determine and compare MIC thresholds for the Oxford Knee Score and Oxford Hip Score (OKS/OHS) at 3 months postoperatively to 12- and 24-month thresholds in patients undergoing knee or hip arthroplasty. STUDY DESIGN AND SETTING This cohort study used data from patients undergoing total knee arthroplasty (TKA), unicompartmental knee arthroplasty (UKA), or total hip arthroplasty (THA) at a public hospital between February 2016 and February 2023. At 3, 12, and 24 months postoperatively, patients responded to the OKS/OHS and a 7-point anchor question determining experienced changes in knee or hip pain and functional limitations. We used the adjusted predictive modeling method that accounts for the proportion improved and the reliability of the anchor question to determine MIC thresholds and their mean differences between time points. RESULTS Complete data were obtained from 695/957 (73%), 1179/1703 (69%), and 1080/1607 (67%) patients undergoing TKA, 474/610 (78%), 438/603 (73%), and 355/507 (70%) patients undergoing UKA, and 965/1315 (73%), 978/1409 (69%), and 1059/1536 (69%) patients undergoing THA at 3, 12, and 24 months, respectively. The median age ranged from 68 to 70 years and 55% to 60% were females. The proportions improved ranged between 83% and 95%. The OKS/OHS MIC thresholds were 0.1, 4.2, and 5.1 for TKA, 1.8, 5.6, and 3.4 for UKA, and 1.3, 6.1, and 6.0 for THA at 3, 12, and 24 months postoperatively, respectively. The reliability ranged between 0.64 and 0.82, and the MIC values increased between three and 12 months but not between 12 and 24 months. CONCLUSION Any absence of deterioration in pain and function is considered important at 3 months after knee or hip arthroplasty. 
Increasing thresholds over time suggest patients raise their standards for what constitutes a minimal important improvement over the first postoperative year. Besides improving our understanding of patients' views on postoperative outcomes, these clinical thresholds may aid in interpreting registry-based treatment outcome evaluations.
Affiliation(s)
- Lasse K Harris
  - Department of Orthopaedic Surgery, Copenhagen University Hospital Hvidovre, Copenhagen, Denmark; Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Anders Troelsen
  - Department of Orthopaedic Surgery, Copenhagen University Hospital Hvidovre, Copenhagen, Denmark; Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Berend Terluin
  - Department of General Practice, Amsterdam UMC Location, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Kirill Gromov
  - Department of Orthopaedic Surgery, Copenhagen University Hospital Hvidovre, Copenhagen, Denmark; Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Lina H Ingelsrud
  - Department of Orthopaedic Surgery, Copenhagen University Hospital Hvidovre, Copenhagen, Denmark
2
Harrison CJ, Plessen CY, Liegl G, Rodrigues JN, Sabah SA, Beard DJ, Fischer F. Item response theory assumptions were adequately met by the Oxford hip and knee scores. J Clin Epidemiol 2023; 158:166-176. [PMID: 37105320 DOI: 10.1016/j.jclinepi.2023.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 04/12/2023] [Accepted: 04/19/2023] [Indexed: 04/29/2023]
Abstract
OBJECTIVES To develop item response theory (IRT) models for the Oxford hip and knee scores which convert patient responses into continuous scores with quantifiable precision and provide these as web applications for efficient score conversion. STUDY DESIGN AND SETTING Data from the National Health Service patient-reported outcome measures program were used to test the assumptions of IRT (unidimensionality, monotonicity, local independence, and measurement invariance) before fitting models to preoperative response patterns obtained from patients undergoing primary elective hip or knee arthroplasty. The hip and knee datasets contained 321,147 and 355,249 patients, respectively. RESULTS Scree plots, Kaiser criterion analyses, and confirmatory factor analyses confirmed unidimensionality and Mokken analysis confirmed monotonicity of both scales. In each scale, all item pairs shared a residual correlation of ≤ 0.20. At the test level, both scales showed measurement invariance by age and gender. Both scales provide precise measurement in preoperative settings but demonstrate poorer precision and ceiling effects in postoperative settings. CONCLUSION We provide IRT parameters and web applications that can convert Oxford Hip Score or Oxford Knee Score response sets into continuous measurements and quantify individual measurement error. These can be used in sensitivity analyses or to administer truncated and individualized computerized adaptive tests.
Affiliation(s)
- Conrad J Harrison
  - Surgical Intervention Trials Unit, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Constantin Yves Plessen
  - Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
- Gregor Liegl
  - Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
- Jeremy N Rodrigues
  - Clinical Trials Unit, University of Warwick, Coventry, UK; Department of Plastic Surgery, Stoke Mandeville Hospital, Buckinghamshire Hospitals NHS Trust, Aylesbury, UK
- Shiraz A Sabah
  - Surgical Intervention Trials Unit, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- David J Beard
  - Surgical Intervention Trials Unit, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Felix Fischer
  - Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
3
Morales-Murillo CP, García-Grau P, McWilliam RA, Grau Sevilla MD. Rasch Analysis of Authentic Evaluation of Young Children's Functioning in Classroom Routines. Front Psychol 2021; 12:615489. [PMID: 33854460 PMCID: PMC8039286 DOI: 10.3389/fpsyg.2021.615489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2020] [Accepted: 02/26/2021] [Indexed: 11/24/2022] [Open access]
Abstract
This study evaluated the functioning of children in early childhood education classroom routines, using the 3M Functioning in Preschool Routines Scale. A total of 366 children aged 36 to 70 months and 22 teachers from six early childhood education centers in Spain participated in the study. The authors used the Rasch model to determine the item fit and the difficulty of the items in relation to children's ability levels in this age range. The Rasch Differential Item Functioning (DIF) analysis by child age groups showed that the item difficulty differed according to the children's age and according to their levels of competence. The results of this study supported the reliability and validity of the 3M scale for assessing children's functioning in preschool classroom routines. A few items, however, were identified as needing to be reworded and more difficult items needed to be added to increase the scale difficulty level to match the performance of children with higher ability levels. The authors introduced the new and reworded items based on the results of this study and the corresponding ICF codes per item. Moreover, the authors indicate how to use the ICF Performance Qualifiers in relation to the 3M scale response categories for developing a functioning profile for the child.
Affiliation(s)
- Pau García-Grau
  - Campus Capacitas-Catholic University of Valencia San Vicente Mártir (UCV), Valencia, Spain; Catholic University of Valencia San Vicente Mártir, Valencia, Spain
- R A McWilliam
  - University of Alabama, Tuscaloosa, AL, United States
4
Hendriks AAJ, Smith SC, Chrysanthaki T, Cano SJ, Black N. DEMQOL and DEMQOL-Proxy: a Rasch analysis. Health Qual Life Outcomes 2017; 15:164. [PMID: 28830525 PMCID: PMC5567633 DOI: 10.1186/s12955-017-0733-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/13/2017] [Accepted: 08/02/2017] [Indexed: 11/30/2022] [Open access]
Abstract
Background DEMQOL and DEMQOL-Proxy are widely used patient reported outcome measures (PROMs) of health related quality of life in people with dementia (PWD). Growing interest in routine use of PROMs in health care calls for more robust instruments that are potentially fit for reliable and valid comparisons at the micro-level (patients) and meso-level (clinics, hospitals, care homes). Methods We used modern psychometric methods (based on the Rasch model) to re-evaluate DEMQOL (1428 PWDs) and DEMQOL-Proxy (1022 carers) to ensure they are fit for purpose. We evaluated scale to sample targeting, ordering of item thresholds, item fit to the model, and differential item functioning (sex, age, relationship), local independence, unidimensionality and reliability on the full set of items and a smaller item set. Results For both DEMQOL and DEMQOL-Proxy the smaller item set performed better than the original item set. We developed revised scores using the items from the smaller set. Conclusions We have improved the scoring of DEMQOL and DEMQOL-Proxy using the Rasch measurement model. Future work should focus on the problems identified with content and response options.
Affiliation(s)
- A A Jolijn Hendriks
  - Department of Health Services Research and Policy, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, London, WC1H 9SH, UK
- Sarah C Smith
  - Department of Health Services Research and Policy, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, London, WC1H 9SH, UK
- Theopisti Chrysanthaki
  - Department of Health Services Research and Policy, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, London, WC1H 9SH, UK; School of Health Sciences, Faculty of Health and Medical Sciences, University of Surrey, Guildford, Surrey, GU2 7XH, UK
- Stefan J Cano
  - Modus Outcomes, Spirella Building, Letchworth Garden City, SG6 4ET, UK
- Nick Black
  - Department of Health Services Research and Policy, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, London, WC1H 9SH, UK
5
Vélez CM, Lugo-Agudelo LH, Hernández-Herrera GN, García-García HI. Colombian Rasch validation of KIDSCREEN-27 quality of life questionnaire. Health Qual Life Outcomes 2016; 14:67. [PMID: 27141836 PMCID: PMC4855364 DOI: 10.1186/s12955-016-0472-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 05/29/2015] [Accepted: 04/23/2016] [Indexed: 08/23/2023] [Open access]
Abstract
BACKGROUND The family of KIDSCREEN instruments is the only one with trans-cultural adaptation and validation in Colombia. These validations have been performed using the classical test theory approach, which has shown satisfactory psychometric properties. The aim of this study was to evaluate the psychometric properties of the KIDSCREEN-27 child and parent-proxy versions through Rasch analysis. METHODS The participants were two different samples: 321 children with a mean age of 12.3 years (SD 2.6), 41% aged 8 to 11 years and 59% aged 12 to 18 years; and 1150 parent proxies with a mean age of 45.5 years (SD 18.9). Psychometric properties were assessed using the partial credit model in the Rasch approach. Unidimensionality, person and item fit, response format, and differential item functioning (DIF) were assessed. RESULTS The Infit MNSQ ranged between 0.71 and 1.76 in the child self-reported version and between 0.69 and 1.31 in the parent-proxy version. With scores gathered on 5-option Likert response forms, person separation was 2.08 for the child self-reported version and 2.40 for the parent-proxy version; reliability was 0.81 and 0.85, respectively. Item reliability was 0.99 in both versions, with separations of 11.92 for the child self-reported version and 10.83 for the parent-proxy version. There was no DIF by sex or age, but DIF was present by socioeconomic status. CONCLUSION Items and individuals fit the Rasch model well. Item separation was adequate, and person separation improved when the response form was recoded to four options. The presence of DIF by socioeconomic status implies a bias in the scale's measurement of HRQoL in Colombian children.
Affiliation(s)
- Claudia-Marcela Vélez
  - Group of Clinical Epidemiology (GRAEPIC), School of Medicine, University of Antioquia, Medellín, Colombia; Department of Clinical Epidemiology and Biostatistics, McMaster University, 1280, Main, Hamilton, Ontario, Canada
- Luz-Helena Lugo-Agudelo
  - Academic Group of Clinical Epidemiology (GRAEPIC) and Group Health Rehabilitation, School of Medicine, University of Antioquia, Carrera 51 D # 62-29, Medellín, Colombia
- Héctor-Iván García-García
  - Academic Group of Clinical Epidemiology (GRAEPIC) and Group Health Rehabilitation, School of Medicine, University of Antioquia, Carrera 51 D # 62-29, Medellín, Colombia
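The separation and reliability figures quoted in the KIDSCREEN-27 abstract are linked by a standard Rasch relation, R = G² / (1 + G²), where G is the separation index. The minimal Python sketch below checks the reported pairs against that relation. The function names are illustrative and not taken from the paper, and the abstract does not say which software produced the figures.

```python
import math


def reliability_from_separation(g: float) -> float:
    """Reliability implied by a separation index G: R = G^2 / (1 + G^2)."""
    return g ** 2 / (1.0 + g ** 2)


def separation_from_reliability(r: float) -> float:
    """Inverse relation: G = sqrt(R / (1 - R)), defined for 0 <= R < 1."""
    if not 0.0 <= r < 1.0:
        raise ValueError("reliability must lie in [0, 1)")
    return math.sqrt(r / (1.0 - r))


# Separation indices quoted in the abstract.
reported_separation = {
    "person, child self-report": 2.08,
    "person, parent-proxy": 2.40,
    "item, child self-report": 11.92,
    "item, parent-proxy": 10.83,
}

for label, g in reported_separation.items():
    print(f"{label}: G = {g:.2f} -> R = {reliability_from_separation(g):.2f}")
```

Rounded to two decimals, the computed values match the reliabilities in the abstract: 0.81 and 0.85 for persons, and 0.99 for items in both versions.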
6
Lo BCY, Zhao Y, Kwok AWY, Chan W, Chan CKY. Evaluation of the Psychometric Properties of the Asian Adolescent Depression Scale and Construction of a Short Form: An Item Response Theory Analysis. Assessment 2015; 24:660-676. [PMID: 26603116 DOI: 10.1177/1073191115614393] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]
Abstract
The present study applied item response theory to examine the psychometric properties of the Asian Adolescent Depression Scale and to construct a short form among 1,084 teenagers recruited from secondary schools in Hong Kong. Findings suggested that some items of the full form reflected higher levels of severity and were more discriminating than others, and the Asian Adolescent Depression Scale was useful in measuring a broad range of depressive severity in community youths. Differential item functioning emerged in several items where females reported higher depressive severity than males. In the short form construction, preliminary validation suggested that, relative to the 20-item full form, our derived short form offered significantly greater diagnostic performance and stronger discriminatory ability in differentiating depressed and nondepressed groups, and simultaneously maintained adequate measurement precision with a reduced response burden in assessing depression in the Asian adolescents. Cultural variance in depressive symptomatology and clinical implications are discussed.
Affiliation(s)
- Yue Zhao
  - The University of Hong Kong, Hong Kong SAR
- Wai Chan
  - The Chinese University of Hong Kong, Hong Kong SAR
7
Lee M, Zhu W, Ackley-Holbrook E, Brower DG, McMurray B. Calibration and validation of the Physical Activity Barrier Scale for persons who are blind or visually impaired. Disabil Health J 2014; 7:309-17. [DOI: 10.1016/j.dhjo.2014.02.004] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 04/21/2013] [Revised: 12/28/2013] [Accepted: 02/11/2014] [Indexed: 11/16/2022]
8
Anatchkova MD, Barysauskas CM, Kinney RL, Kiefe CI, Ash AS, Lombardini L, Allison JJ. Psychometric evaluation of the Care Transition Measure in TRACE-CORE: do we need a better measure? J Am Heart Assoc 2014; 3:e001053. [PMID: 24901109 PMCID: PMC4309102 DOI: 10.1161/jaha.114.001053] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 11/16/2022]
Abstract
Background The quality of transitional care is associated with important health outcomes such as rehospitalization and costs. The widely used Care Transitions Measure (CTM‐15) was developed with a classic test theory approach; its short version (CTM‐3) was included in the CAHPS Hospital Survey. We conducted a psychometric evaluation of both measures and explored whether item response theory (IRT) could produce a more precise measure. Methods and Results As part of the Transitions, Risks, and Actions in Coronary Events Center for Outcomes Research and Education, 1545 participants were interviewed during an acute coronary syndrome hospitalization, providing information on general health status (Short Form‐36), CTM‐15, health utilization, and care process questions at 1 month postdischarge. We used classic and IRT analyses and compared the measurement precision of CTM‐15–, CTM‐3–, and CTM‐IRT–based score using relative validity. Participants were 79% non‐Hispanic white and 67% male, with an average age of 62 years. The CTM‐15 had good internal consistency (Cronbach's α=0.95) but demonstrated acquiescence bias (8.7% participants responded “Strongly agree” and 19% responded “Agree” to all items) and limited score variability. These problems were more pronounced for the CTM‐3. The CTM‐15 differentiated between patient groups defined by self‐reported health status, health care utilization, and care transition process indicators. Differences between groups were small (2 to 3 points). There was no gain in measurement precision from IRT scoring. The CTM‐3 was not significantly lower for patients reporting rehospitalization or emergency department visits. Conclusion We identified psychometric challenges of the CTM, which may limit its value in research and practice. These results are in line with emerging evidence of gaps in the validity of the measure.
Affiliation(s)
- Milena D Anatchkova
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA (M.D.A., R.L.K., C.I.K., A.S.A., L.L., J.J.A.)
- Constance M Barysauskas
  - Department of Biostatistics and Computational Biology, Dana Farber Cancer Institute, Harvard Medical School, Boston, MA (C.M.B.)
- Rebecca L Kinney
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA (M.D.A., R.L.K., C.I.K., A.S.A., L.L., J.J.A.)
- Catarina I Kiefe
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA (M.D.A., R.L.K., C.I.K., A.S.A., L.L., J.J.A.)
- Arlene S Ash
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA (M.D.A., R.L.K., C.I.K., A.S.A., L.L., J.J.A.)
- Lisa Lombardini
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA (M.D.A., R.L.K., C.I.K., A.S.A., L.L., J.J.A.)
- Jeroan J Allison
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA (M.D.A., R.L.K., C.I.K., A.S.A., L.L., J.J.A.)
9
Deng N, Allison JJ, Fang HJ, Ash AS, Ware JE. Using the bootstrap to establish statistical significance for relative validity comparisons among patient-reported outcome measures. Health Qual Life Outcomes 2013; 11:89. [PMID: 23721463 PMCID: PMC3681626 DOI: 10.1186/1477-7525-11-89] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 02/21/2013] [Accepted: 05/27/2013] [Indexed: 12/26/2022] [Open access]
Abstract
Background Relative validity (RV), a ratio of ANOVA F-statistics, is often used to compare the validity of patient-reported outcome (PRO) measures. We used the bootstrap to establish the statistical significance of the RV and to identify key factors affecting its significance. Methods Based on responses from 453 chronic kidney disease (CKD) patients to 16 CKD-specific and generic PRO measures, RVs were computed to determine how well each measure discriminated across clinically-defined groups of patients compared to the most discriminating (reference) measure. Statistical significance of RV was quantified by the 95% bootstrap confidence interval. Simulations examined the effects of sample size, denominator F-statistic, correlation between comparator and reference measures, and number of bootstrap replicates. Results The statistical significance of the RV increased as the magnitude of denominator F-statistic increased or as the correlation between comparator and reference measures increased. A denominator F-statistic of 57 conveyed sufficient power (80%) to detect an RV of 0.6 for two measures correlated at r = 0.7. Larger denominator F-statistics or higher correlations provided greater power. Larger sample size with a fixed denominator F-statistic or more bootstrap replicates (beyond 500) had minimal impact. Conclusions The bootstrap is valuable for establishing the statistical significance of RV estimates. A reasonably large denominator F-statistic (F > 57) is required for adequate power when using the RV to compare the validity of measures with small or moderate correlations (r < 0.7). Substantially greater power can be achieved when comparing measures of a very high correlation (r > 0.9).
Affiliation(s)
- Nina Deng
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA 01655, USA
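The abstract by Deng et al. defines relative validity (RV) as a ratio of ANOVA F-statistics and assesses its significance with a 95% bootstrap confidence interval. The sketch below illustrates that idea in plain Python. It assumes a one-way ANOVA and a percentile interval with patients resampled jointly across both measures; the abstract does not specify these details, and all function names are hypothetical.

```python
import random
from statistics import mean


def anova_f(scores, groups):
    """One-way ANOVA F-statistic for scores partitioned by group label."""
    by_group = {}
    for score, label in zip(scores, groups):
        by_group.setdefault(label, []).append(score)
    n, k = len(scores), len(by_group)
    grand_mean = mean(scores)
    group_means = {label: mean(values) for label, values in by_group.items()}
    ss_between = sum(
        len(values) * (group_means[label] - grand_mean) ** 2
        for label, values in by_group.items()
    )
    ss_within = sum(
        (x - group_means[label]) ** 2
        for label, values in by_group.items()
        for x in values
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_validity(comparator, reference, groups):
    """RV = F(comparator) / F(reference), computed over the same groups."""
    return anova_f(comparator, groups) / anova_f(reference, groups)


def bootstrap_rv_ci(comparator, reference, groups, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for RV (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(groups)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(
                relative_validity(
                    [comparator[i] for i in idx],
                    [reference[i] for i in idx],
                    [groups[i] for i in idx],
                )
            )
        except ZeroDivisionError:
            continue  # degenerate resample (for example, only one group drawn)
    if not estimates:
        raise ValueError("no usable bootstrap resamples")
    estimates.sort()
    m = len(estimates)
    lower = estimates[int((alpha / 2) * m)]
    upper = estimates[min(m - 1, int((1 - alpha / 2) * m))]
    return lower, upper
```

Under this sketch, an interval that excludes 1 would suggest the comparator discriminates between the groups differently from the reference measure.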
10
Khan A, Chien CW, Brauer SG. Rasch-based scoring offered more precision in differentiating patient groups in measuring upper limb function. J Clin Epidemiol 2013; 66:681-7. [PMID: 23523550 DOI: 10.1016/j.jclinepi.2012.12.014] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 08/22/2012] [Revised: 12/10/2012] [Accepted: 12/17/2012] [Indexed: 11/19/2022]
Abstract
OBJECTIVE To compare the discriminatory ability of Rasch-based and summative scoring in the context of assessing upper limb function of patients with stroke. STUDY DESIGN AND SETTING Data were from a cohort study of 497 adults with stroke undergoing physiotherapy. Upper limb function was assessed at admission and discharge using the upper limb subscale of the Motor Assessment Scale (UL-MAS). Rasch analysis was used to transform raw UL-MAS scores into interval measures. A relative precision (RP) index was used to differentiate patients by discharge destination. RESULTS The analysis confirmed the unidimensional structure of UL-MAS at both admission and discharge and demonstrated the adequate fit of the items. The RP index favored the Rasch-based scoring over the summative scoring in differentiating between the two patient groups, with significant gains in precision at admission (15%) and discharge (11%). When examining patients in the upper or lower quartile of UL-MAS, the gains in precision were statistically significant in favor of the Rasch-based scoring, with 20% precision at admission and 19% precision at discharge. CONCLUSION Rasch-based scoring was more precise in differentiating patient groups by discharge destination than the summative scoring used to measure upper limb function, especially at the extreme range of the scale.
Affiliation(s)
- Asaduzzaman Khan
  - School of Health and Rehabilitation Sciences, The University of Queensland, Brisbane, Queensland 4072, Australia
11
Erhart M, Hagquist C, Auquier P, Rajmil L, Power M, Ravens-Sieberer U. A comparison of Rasch item-fit and Cronbach's alpha item reduction analysis for the development of a Quality of Life scale for children and adolescents. Child Care Health Dev 2010; 36:473-84. [PMID: 19702637 DOI: 10.1111/j.1365-2214.2009.00998.x] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 12/01/2022]
Abstract
OBJECTIVE This study compares item reduction analysis based on classical test theory (maximizing Cronbach's alpha - approach A), with analysis based on the Rasch Partial Credit Model item-fit (approach B), as applied to children and adolescents' health-related quality of life (HRQoL) items. The reliability and structural, cross-cultural and known-group validity of the measures were examined. METHODS Within the European KIDSCREEN project, 3019 children and adolescents (8-18 years) from seven European countries answered 19 HRQoL items of the Physical Well-being dimension of a preliminary KIDSCREEN instrument. The Cronbach's alpha and corrected item total correlation (approach A) were compared with infit mean squares and the Q-index item-fit derived according to a partial credit model (approach B). Cross-cultural differential item functioning (DIF ordinal logistic regression approach), structural validity (confirmatory factor analysis and residual correlation) and relative validity (RV) for socio-demographic and health-related factors were calculated for approaches (A) and (B). RESULTS Approach (A) led to the retention of 13 items, compared with 11 items with approach (B). The item overlap was 69% for (A) and 78% for (B). The correlation coefficient of the summated ratings was 0.93. The Cronbach's alpha was similar for both versions [0.86 (A); 0.85 (B)]. Both approaches selected some items that are not strictly unidimensional and items displaying DIF. RV ratios favoured (A) with regard to socio-demographic aspects. Approach (B) was superior in RV with regard to health-related aspects. CONCLUSION Both types of item reduction analysis should be accompanied by additional analyses. Neither of the two approaches was universally superior with regard to cultural, structural and known-group validity. However, the results support the usability of the Rasch method for developing new HRQoL measures for children and adolescents.
Affiliation(s)
- M Erhart
  - Department of Psychosomatics in Children and Adolescents, University Hospital Hamburg-Eppendorf, Germany
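Approach (A) in the abstract by Erhart et al. selects items by maximizing Cronbach's alpha. As a reminder of what that statistic measures, the sketch below computes alpha from a respondents-by-items score table using the usual formula, alpha = k/(k-1) × (1 - Σ item variances / variance of total scores). The function name and data layout are assumptions for illustration and do not reproduce the study's item-selection procedure.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed score per respondent.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

Alpha-based item reduction would recompute this value with each candidate item dropped in turn, which is one reason the abstract recommends accompanying it with additional analyses such as DIF and dimensionality checks.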
12
Gothwal VK, Wright TA, Lamoureux EL, Pesudovs K. Measuring outcomes of cataract surgery using the Visual Function Index-14. J Cataract Refract Surg 2010; 36:1181-8. [DOI: 10.1016/j.jcrs.2010.01.029] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 11/23/2009] [Revised: 01/02/2010] [Accepted: 01/21/2010] [Indexed: 10/19/2022]
13
Abstract
In designing a study protocol relating to hip fracture treatment and outcomes, it is important to select appropriate outcome instruments. Before beginning the process of instrument selection, investigators must gain a comprehensive understanding of the condition of interest and have a thorough knowledge of the expected benefits and harms of the proposed intervention. Adequate evidence of an intervention's effectiveness includes indication of impact on the patient's health. We provide a brief discussion about different ways that health and health measurement have been defined, including the International Classification of Function, Disability and Health (ICF), health-related quality of life (HRQOL), and cost-to-benefit analyses. We outline important properties (reliability, validity, sensitivity to change, and responsiveness) that a measurement instrument must demonstrate before being considered an acceptable means to measure outcome. Potential outcome measures relevant to patients with hip fracture are summarized, and important points to consider in the selection of outcome measures for a hypothetical research question in a hip fracture population are discussed.
14
Liang WM, Chang CH, Yeh YC, Shy HY, Chen HW, Lin MR. Psychometric evaluation of the WHOQOL-BREF in community-dwelling older people in Taiwan using Rasch analysis. Qual Life Res 2009; 18:605-18. [DOI: 10.1007/s11136-009-9471-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 11/08/2007] [Accepted: 03/10/2009] [Indexed: 11/30/2022]
15
Smith HJ, Richardson JB, Tennant A. Modification and validation of the Lysholm Knee Scale to assess articular cartilage damage. Osteoarthritis Cartilage 2009; 17:53-8. [PMID: 18556222 DOI: 10.1016/j.joca.2008.05.002] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Received: 02/06/2008] [Accepted: 05/04/2008] [Indexed: 02/02/2023]
Abstract
OBJECTIVE The Lysholm Knee Scale is an 8-item questionnaire originally designed as an outcome measure for ligament reconstruction but is commonly used as a measure for knee chondral damage. This study tests the scale's internal construct validity using the Rasch model, a measurement model which sets strict standards for the quality of measurement derived from the scale. The study also investigates the level of agreement between scores from patients and physiotherapists; and reviews the present weighting system. DESIGN One hundred and fifty-seven patients with knee chondral damage awaiting surgery completed the Lysholm as part of a multicentre clinical trial based in 16 UK and two Norwegian hospitals. The patients were assessed by a physiotherapist who independently completed the Lysholm on the same day. RESULTS Fit to the Rasch model was achieved [mean item fit -0.26, standard deviation (SD) 1.01] after removal of one item (Swelling). With no differential item functioning (DIF) by rater, the intraclass correlation coefficient was 0.9 [95% confidence interval (CI): 0.86-0.93] and a Bland-Altman plot showed no consistent difference in rating. CONCLUSIONS The Lysholm Knee Scale satisfies Rasch model expectations after removal of the swelling item. Generally there is a high degree of agreement between the patient and professional ratings. By removing the swelling item and using unweighted scores, a modified version of the Lysholm Knee Scale is recommended as an outcome measure for knee chondral damage.
Affiliation(s)
- H J Smith
- Arthritis Research Centre, RJAH Orthopaedic Hospital NHS Trust, Gobowen, Oswestry, Shropshire, UK.
|
16
|
Interpretation and precision of the Observer Scar Assessment Scale improved by a revised scoring. J Clin Epidemiol 2008; 61:1289-1295. [DOI: 10.1016/j.jclinepi.2008.04.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 01/03/2008] [Revised: 03/18/2008] [Accepted: 04/04/2008] [Indexed: 11/19/2022]
|
17
|
Suglia SF, Ryan L, Wright RJ. Creation of a community violence exposure scale: accounting for what, who, where, and how often. J Trauma Stress 2008; 21:479-86. [PMID: 18956446 PMCID: PMC2630468 DOI: 10.1002/jts.20362] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Indexed: 11/07/2022]
Abstract
Previous research has used the Rasch model, a method for obtaining a continuous scale from dichotomous survey items measuring a single latent construct, to create a scale of community violence exposure. The authors build upon previous work and describe the application of a Rasch model using the continuation ratio model to create an exposure to community violence (ETV) scale including event circumstance information previously shown to modify the impact of experienced events. They compare the Rasch ETV scale to a simpler sum ETV score, and estimate the effect of ETV on child posttraumatic stress symptoms. Incorporating detailed event circumstance information that is grounded in traumatic stress theory may reduce measurement error in the assessment of children's community violence exposure.
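For orientation, the basic dichotomous Rasch model mentioned in this abstract can be sketched as follows. This is the textbook form only, not the continuation ratio extension the authors describe; the function and parameter names are illustrative assumptions.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of endorsing a dichotomous item under the Rasch model.

    theta is the person's position on the latent scale and difficulty is
    the item's location on the same scale (both in logits).
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# A person located exactly at the item's difficulty endorses it half the time.
p_equal = rasch_probability(0.0, 0.0)
# A person one logit above the item's difficulty is more likely to endorse it.
p_above = rasch_probability(1.0, 0.0)
```

Because the probability depends only on the difference between person and item locations, items and persons are placed on one continuous scale.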
Affiliation(s)
- Shakira Franco Suglia
- Department of Environmental Health, and Department of Epidemiology, Harvard School of Public Health, Boston, MA 02215, USA.
- Louise Ryan
- Department of Biostatistics, Harvard School of Public Health, Boston, MA
- Rosalind J. Wright
- Department of Environmental Health, and Channing Laboratories, Harvard School of Public Health, Boston, MA
|
18
|
Stock R, Mahoney ER, Reece D, Cesario L. Developing a senior healthcare practice using the chronic care model: effect on physical function and health-related quality of life. J Am Geriatr Soc 2008; 56:1342-8. [PMID: 18503521 DOI: 10.1111/j.1532-5415.2008.01763.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
An ambulatory senior health clinic was developed using the chronic care model (CCM), with emphasis on an interdisciplinary team approach. To determine the effect of this care model approach in a nonprofit healthcare system, an observational, longitudinal panel study of community-dwelling Medicare beneficiaries was performed to examine the effect on physical function and health-related quality of life (HRQL). Participants in the study were recruited from a community sample of 6,864 eligible Medicare beneficiaries. Informed consent and baseline data were obtained from 1,709 individuals (recruitment response rate=25%) and complete data across 30 months from 1,307 (completion response rate=76%). Participants receiving care in the CCM-based senior healthcare practice (n=318) were compared with patients of primary care physicians supported by care managers (n=598) and a group without care managers (n=391). Self-reported data were collected over the telephone to measure physical function and HRQL at baseline and 6, 18, and 30 months. A multiple group mixture growth model was used to analyze physical function and HRQL across the 30 months. Physical function and HRQL mean scores decreased across time in all participants and were moderately correlated at each wave (correlation coefficient=0.74-0.79). Two latent growth classes were identified. In class 1, physical function decreased, and HRQL remained stable across time. In class 2, physical function and HRQL decreased in parallel. Ninety-seven percent of intervention group patients were in class 1, and 99% of patients in comparison groups 1 and 2 were in class 2. Despite physical function decline, patients in a senior health clinic care model maintained HRQL over time, whereas patients receiving traditional care had physical function and HRQL decline. An interdisciplinary team CCM approach appears to have a positive effect on HRQL in this population.
Affiliation(s)
- Ronald Stock
- Geriatrics and Care Coordination, PeaceHealth Oregon Region, Eugene, Oregon 97401, USA.
|
19
|
Abstract
PURPOSE Understanding patients' experiences of their interactions with health services is an important step in building quality from within. The purpose of this article is to look at the possibilities for involving service users in the development of the National Health Service in England through the structure of integrated care pathways (ICPs). DESIGN/METHODOLOGY/APPROACH A systematic literature review was undertaken to identify how patient experiences have been attained and used in three clinical areas: cataract care, hip replacement and knee arthroscopy. The information was weighted according to methodological criteria and synthesized according to the typical stages of each pathway. Key issues were summarised thematically across each pathway. FINDINGS The findings relate to the use of patient views and experiences within organisational structures, service development, methodological research, education and training. The article identifies important issues of practical significance for involving service users in the planning and development of patient focused ICPs: such as the diversity of patients, perspectives of continuity, information and patient support and the need for methodological research. RESEARCH LIMITATIONS/IMPLICATIONS The review is limited in that the literature across all three pathways tends to report findings of small studies undertaken in one clinical service or setting and most studies are not randomised or controlled. ORIGINALITY/VALUE The literature identified by the review contains important messages for both NHS policy and future research to involve service users in the planned expansion and plurality of NHS care.
|
20
|
Garbuz DS, Xu M, Sayre EC. Patients' outcome after total hip arthroplasty: a comparison between the Western Ontario and McMaster Universities index and the Oxford 12-item hip score. J Arthroplasty 2006; 21:998-1004. [PMID: 17027542 DOI: 10.1016/j.arth.2006.01.014] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.6] [Received: 05/07/2005] [Accepted: 01/25/2006] [Indexed: 02/01/2023] Open
Abstract
This prospective cohort study included 402 patients who had primary total hip arthroplasty. The Western Ontario and McMaster Universities Index (WOMAC) and the Oxford 12-item Hip Score (OHS) were used to assess patients preoperatively and at 1 year postoperation. The OHS has a higher responsiveness than the WOMAC in the global scale and in the pain subscale. However, the WOMAC has better responsiveness in its function scale. The point estimate of relative precision of measuring postoperative quality of life shows that the OHS has a tendency toward a better performance than the WOMAC; however, this finding is not statistically significant. The OHS also demonstrates similar floor and ceiling effect patterns as does the WOMAC. We recommend that the choice should depend on which scale researchers are using to power a study.
Affiliation(s)
- Donald S Garbuz
- Department of Orthopedics Academic Office, University of British Columbia, Vancouver, British Columbia, Canada
|
21
|
Hart DL, Mioduski JE, Werneke MW, Stratford PW. Simulated computerized adaptive test for patients with lumbar spine impairments was efficient and produced valid measures of function. J Clin Epidemiol 2006; 59:947-56. [PMID: 16895818 DOI: 10.1016/j.jclinepi.2005.10.017] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.6] [Received: 04/11/2005] [Revised: 10/12/2005] [Accepted: 10/16/2005] [Indexed: 10/24/2022]
Abstract
OBJECTIVE To equate physical functioning (PF) items with Back Pain Functional Scale (BPFS) items, develop a computerized adaptive test (CAT) designed to assess lumbar spine functional status (LFS) in people with lumbar spine impairments, and compare discriminant validity of LFS measures (theta(IRT)) generated using all items analyzed with a rating scale Item Response Theory model (RSM) and measures generated using the simulated CAT (theta(CAT)). METHODS We performed a secondary analysis of retrospective intake rehabilitation data. RESULTS Unidimensionality and local independence of 25 BPFS and PF items were supported. Differential item functioning was negligible for levels of symptom acuity, gender, age, and surgical history. The RSM fit the data well. A lumbar spine specific CAT was developed that was 72% more efficient than using all 25 items to estimate LFS measures. theta(IRT) and theta(CAT) measures did not discriminate patients by symptom acuity, age, or gender, but discriminated patients by surgical history in similar clinically logical ways. theta(CAT) measures were as precise as theta(IRT) measures. CONCLUSION A body part specific simulated CAT developed from an LFS item bank was efficient and produced precise measures of LFS without eroding discriminant validity.
Affiliation(s)
- Dennis L Hart
- Focus On Therapeutic Outcomes, Inc., 551 Yopps Cove Road, White Stone, VA 22578, USA.
|
22
|
Garbuz DS, Xu M, Duncan CP, Masri BA, Sobolev B. Delays worsen quality of life outcome of primary total hip arthroplasty. Clin Orthop Relat Res 2006; 447:79-84. [PMID: 16505716 DOI: 10.1097/01.blo.0000203477.19421.ed] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.1] [Indexed: 01/31/2023]
Abstract
Although there are indications of health status deterioration for patients while waiting for elective total hip arthroplasties, controversy exists regarding the effect of waiting on postoperative outcomes. We hypothesized that longer waiting times are detrimental to achieving the full benefit of surgery. We prospectively examined 201 patients with osteoarthritis who were on the waiting list for primary total hip arthroplasties. The Western Ontario and McMaster Universities Osteoarthritis Index questionnaire was used to assess patients at surgical consultation (preoperative) and 1 year postoperative. The study included regression models to determine the expected outcome for an individual's preoperative score. Logistic regression models were used to assess the relationship between waiting time and the probability of a better than expected outcome. We found that the odds of achieving a better than expected postoperative functional outcome decreased by 8% for each month on the waiting list. Expedited access resulted in a larger proportion of patients with better than expected function 12 months after surgery.
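To make the reported effect size concrete, the sketch below converts the 8% per-month decrease in odds into a cumulative odds multiplier. It assumes that waiting time entered the logistic model as a single linear term, so that the monthly odds ratio of 0.92 compounds multiplicatively; the abstract does not state this, and the function name is hypothetical.

```python
def odds_multiplier(months, monthly_odds_ratio=0.92):
    """Factor by which the odds of a better-than-expected outcome change
    after waiting the given number of months, assuming a constant
    per-month odds ratio (0.92 corresponds to an 8% decrease per month).
    """
    return monthly_odds_ratio ** months

# Under this assumption, six months of waiting scales the odds by 0.92 ** 6.
six_month_factor = odds_multiplier(6)
```

Note that this multiplier applies to odds, not to probabilities; the change in probability depends on the baseline odds, which the abstract does not report.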
Affiliation(s)
- Donald S Garbuz
- Department of Orthopaedics, University of British Columbia, Vancouver, British Columbia, Canada.
|
23
|
Conaghan PG, Tennant A, Peterfy CG, Woodworth T, Stevens R, Guermazi A, Genant H, Felson DT, Hunter D. Examining a whole-organ magnetic resonance imaging scoring system for osteoarthritis of the knee using Rasch analysis. Osteoarthritis Cartilage 2006; 14 Suppl A:A116-21. [PMID: 16678453 DOI: 10.1016/j.joca.2006.03.011] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 09/24/2004] [Accepted: 03/11/2006] [Indexed: 02/02/2023]
Abstract
OBJECTIVE The ability to reliably quantify all the structural abnormalities in osteoarthritis (OA) of the knee is a long-standing goal of OA research. On December 5 and 6, 2002, Outcome Measures in Rheumatology Clinical Trials and Osteoarthritis Research Society, International held a Workshop for Consensus on Osteoarthritis Imaging in Bethesda, MD, with the aim of providing a state-of-the-art review of imaging outcome measures for OA of the knee. As part of the Workshop, data from previous clinical trials and epidemiological studies of OA were analysed with respect to the metrological properties of the measurement methods used. The following report outlines the results of analyses aimed at evaluating the internal construct validity of a whole-organ, ordinal (semi-quantitative) magnetic resonance imaging score (WORMS) using Rasch analysis. The fit of data to the Rasch model offers a measure of the validity of summing different items into a subscale score and the degree to which this score behaves as a unidimensional, interval level measurement tool. METHODS The Rasch model was applied in two OA studies. The first was a clinical cohort comprising OA knee subjects entering a clinical trial; study entry criteria included patients with at least moderate pain, radiographic osteophytes and a minimum of 1.5mm tibiofemoral joint-space width. The second cohort was from the Boston Osteoarthritis Knee Study, an observational cohort of subjects with symptomatic knee OA with pain on most days and a definite osteophyte in either the tibiofemoral or patellofemoral joints. Baseline WORMS scores from both studies were used for the Rasch analysis, performed with RUMM 2020 software. RESULTS There was a substantial proportion of subjects in both study populations with zero scores in several of the subscales of WORMS. 
Few of the subscales met the requirements of the Rasch measurement model when summated across all sites, and summations of some postulated compartmentally based sites also failed to fit the Rasch model. The existing scoring categories also required rescoring at many sites. CONCLUSION There remain important issues in constructing outcome measurements that summate different features across multiple anatomical sites. The whole-organ scoring system evaluated here is no exception. Resolving these issues will improve the ability of imaging studies to assess complex pathological structural change.
Affiliation(s)
- P G Conaghan
- Academic Unit of Musculoskeletal Disease and Rehabilitation, University of Leeds, Leeds, UK.
|
24
|
Wei S, Su-Juan W, Yuan-Gui L, Hong Y, Xiu-Juan X, Xiao-Mei S. Reliability and validity of the GMFM-66 in 0- to 3-year-old children with cerebral palsy. Am J Phys Med Rehabil 2006; 85:141-7. [PMID: 16428905 DOI: 10.1097/01.phm.0000197585.68302.25] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.8] [Indexed: 11/25/2022]
Abstract
OBJECTIVE To examine the reliability and validity of a 66-item version of the Gross Motor Function Measure (GMFM-66) to assess the gross motor functions of children <3 yrs old with cerebral palsy. DESIGN 298 valid samples were obtained from 171 children with cerebral palsy (male 126, female 45, mean age 19 mos, age range 3-36 mos) measured with GMFM-88. Then a 73-item version of GMFM (GMFM-73) special for these children was obtained by Rasch analysis. GMFM-66 score and GMFM-73 scores of each sample were obtained. The reliability and validity of GMFM-66 were evaluated by analyzing the correlation between the scores and between the change scores of these two GMFM versions. The relative precision of GMFM-73 vs. GMFM-66 was also analyzed. RESULTS The test-retest reliability and interscorer reliability of GMFM-66 were both high. The ICC scores were 0.9666 and 0.9782, respectively. Significant correlations were found between the scores (r = 0.9848, P < 0.001) and between the change scores (r = 0.8700, P < 0.001) of these two versions of GMFM. A 14% less gain in relative precision was achieved when using GMFM-73 vs. GMFM-66. CONCLUSION Results indicated that the GMFM-66 had good reliability and validity in assessing the gross motor functions of children <3 yrs old with cerebral palsy. The GMFM-73 derived in the present study did not function significantly better for young children than GMFM-66.
Affiliation(s)
- Shi Wei
- Rehabilitation Center, Children's Hospital, Fudan University, 183 FengLin Road, Shanghai 200-032, China
|
25
|
Hart DL, Cook KF, Mioduski JE, Teal CR, Crane PK. Simulated computerized adaptive test for patients with shoulder impairments was efficient and produced valid measures of function. J Clin Epidemiol 2005; 59:290-8. [PMID: 16488360 DOI: 10.1016/j.jclinepi.2005.08.006] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.4] [Received: 02/14/2005] [Revised: 07/21/2005] [Accepted: 08/08/2005] [Indexed: 11/28/2022]
Abstract
BACKGROUND AND OBJECTIVE To test unidimensionality and local independence of a set of shoulder functional status (SFS) items, develop a computerized adaptive test (CAT) of the items using a rating scale item response theory model (RSM), and compare discriminant validity of measures generated using all items (theta(IRT)) and measures generated using the simulated CAT (theta(CAT)). STUDY DESIGN AND SETTING We performed a secondary analysis of data collected prospectively during rehabilitation of 400 patients with shoulder impairments who completed 60 SFS items. RESULTS Factor analytic techniques supported that the 42 SFS items formed a unidimensional scale and were locally independent. Except for five items, which were deleted, the RSM fit the data well. The remaining 37 SFS items were used to generate the CAT. On average, 6 items were needed to estimate precise measures of function using the SFS CAT, compared with all 37 SFS items. The theta(IRT) and theta(CAT) measures were highly correlated (r = .96) and resulted in similar classifications of patients. CONCLUSION The simulated SFS CAT was efficient and produced precise, clinically relevant measures of functional status with good discriminating ability.
Affiliation(s)
- Dennis L Hart
- Focus On Therapeutic Outcomes, Inc., 551 Yopps Cove Road, White Stone, VA 22578, USA.
|
26
|
Petersen MA, Groenvold M, Aaronson N, Brenne E, Fayers P, Nielsen JD, Sprangers M, Bjorner JB. Scoring based on item response theory did not alter the measurement ability of EORTC QLQ-C30 scales. J Clin Epidemiol 2005; 58:902-8. [PMID: 16085193 DOI: 10.1016/j.jclinepi.2005.02.008] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 06/24/2003] [Revised: 06/09/2004] [Accepted: 02/14/2005] [Indexed: 01/22/2023]
Abstract
BACKGROUND AND OBJECTIVES Most health-related quality-of-life questionnaires include multi-item scales. Scale scores are usually estimated as simple sums of the item scores. However, scoring procedures utilizing more information from the items might improve measurement abilities, and thereby reduce the needed sample sizes. We investigated whether item response theory (IRT)-based scoring improved the measurement abilities of the EORTC QLQ-C30 physical functioning, emotional functioning, and fatigue scales. METHODS Using a database of 13,010 subjects we estimated the relative validities of IRT scoring compared to sum scoring of the scales. RESULTS The mean relative validities were 1.04 (physical), 1.03 (emotional), and 0.97 (fatigue). None of these were significantly larger than 1. Thus, no gain in measurement abilities using IRT scoring was found for these scales. Possible explanations include that the items in the scales are not constructed for IRT scoring and that the scales are relatively short. CONCLUSION IRT scoring of the three longest EORTC QLQ-C30 scales did not improve measurement abilities compared to the traditional sum scoring of the scales.
Affiliation(s)
- Morten Aa Petersen
- The research unit, Department of Palliative Medicine, Bispebjerg Hospital, Bispebjerg bakke 23, 2400 Copenhagen, Denmark.
|
27
|
Xu M, Garbuz DS, Kuramoto L, Sobolev B. Classifying health-related quality of life outcomes of total hip arthroplasty. BMC Musculoskelet Disord 2005; 6:48. [PMID: 16144550 PMCID: PMC1242235 DOI: 10.1186/1471-2474-6-48] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 02/15/2005] [Accepted: 09/06/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Primary total hip arthroplasty (THA) is an effective treatment for hip osteoarthritis, as assessed by any distribution-based measure of responsiveness. Yet group-level evaluation has provided very little evidence that contributes to our understanding of the large variation in treatment outcome. The objective is to develop criteria that classify individual health-related quality of life (HRQOL) outcomes after primary THA, adjusted for preoperative scores. METHODS We prospectively measured the disease-specific HRQOL of 147 patients on the date of consultation and 12 months postoperatively using the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC). Regression models were used to determine the "expected" outcome for a given individual baseline score. The ceiling effect of the WOMAC measurement was addressed by implementing a left-censoring method. RESULTS The classification criteria were chosen to be the lower boundary of the 95% confidence interval (CI) of the estimated median from the regression. The robustness of the classification criteria was demonstrated using Monte Carlo simulation. CONCLUSION The classification criteria are robust and can be applied in general orthopaedic research when the sample size is reasonably large (over 500).
Affiliation(s)
- Min Xu
- Arthritis Research Centre of Canada, Vancouver, BC, Canada
- Donald S Garbuz
- Department of Orthopaedics, University of British Columbia, Vancouver, BC, Canada
- Lisa Kuramoto
- Centre for Clinical Epidemiology & Evaluation, Vancouver, BC, Canada
- Boris Sobolev
- Centre for Clinical Epidemiology & Evaluation, Vancouver, BC, Canada
|
28
|
Hart DL, Mioduski JE, Stratford PW. Simulated computerized adaptive tests for measuring functional status were efficient with good discriminant validity in patients with hip, knee, or foot/ankle impairments. J Clin Epidemiol 2005; 58:629-38. [PMID: 15878477 DOI: 10.1016/j.jclinepi.2004.12.004] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.3] [Received: 03/05/2004] [Revised: 11/29/2004] [Accepted: 12/07/2004] [Indexed: 01/06/2023]
Abstract
BACKGROUND AND OBJECTIVE To develop computerized adaptive tests (CATs) designed to assess lower extremity functional status (FS) in people with lower extremity impairments using items from the Lower Extremity Functional Scale and compare discriminant validity of FS measures generated using all items analyzed with a rating scale Item Response Theory model (theta(IRT)) and measures generated using the simulated CATs (theta(CAT)). METHODS Secondary analysis of retrospective intake rehabilitation data. RESULTS Unidimensionality of items was strong, and local independence of items was adequate. Differential item functioning (DIF) affected item calibration related to body part, that is, hip, knee, or foot/ankle, but DIF did not affect item calibration for symptom acuity, gender, age, or surgical history. Therefore, patients were separated into three body part specific groups. The rating scale model fit all three data sets well. Three body part specific CATs were developed: each was 70% more efficient than using all LEFS items to estimate FS measures. theta(IRT) and theta(CAT) measures discriminated patients by symptom acuity, age, and surgical history in similar ways. theta(CAT) measures were as precise as theta(IRT) measures. CONCLUSION Body part-specific simulated CATs were efficient and produced precise measures of FS with good discriminant validity.
Affiliation(s)
- Dennis L Hart
- Focus On Therapeutic Outcomes, Inc., White Stone, VA 22578-2403, USA.
|
29
|
Alcalá MJ, Casellas F, Fontanet G, Prieto L, Malagelada JR. Shortened questionnaire on quality of life for inflammatory bowel disease. Inflamm Bowel Dis 2004; 10:383-91. [PMID: 15475746 DOI: 10.1097/00054725-200407000-00009] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.7] [Indexed: 12/09/2022]
Abstract
Questionnaires for measuring quality of life in patients with inflammatory bowel disease usually include a large number of items and are time-consuming for both administration and interpretation. Our aim was to develop and validate a short quality-of-life questionnaire with the most representative items from the Spanish version of the 36-item Inflammatory Bowel Disease Questionnaire (IBDQ-36) using Rasch analysis. The responses to 311 IBDQ-36 questionnaires from 167 patients with ulcerative colitis (UC) and 144 with Crohn's disease (CD) were analyzed. IBDQ-36 was shortened with successive Rasch analyses until all the remaining items showed acceptable separation and goodness-of-fit properties. Validation of the short questionnaire was studied in a new group of 125 patients by determining its validity and reliability. A 9-item short questionnaire was obtained (IBDQ-9). Its correlation with IBDQ-36 was excellent (r = 0.91). Correlation between IBDQ-9 and clinical indices of activity was statistically significant in UC (r = 0.70) and CD (r = 0.70). IBDQ-9 score discriminates adequately between patients in clinical remission or relapse (P < 0.01). Sensitivity to change was determined in 14 patients who improved clinically, showing significant IBDQ-9 changes between both determinations (P < 0.01), with an effect size of -2.67 in UC and -5.29 in CD. IBDQ-9 was also homogeneous, with a Cronbach's alpha of 0.95 in UC and 0.91 in CD. In 35 clinically stable patients, test-retest reliability was good, with a statistically significant correlation between both questionnaires (r = 0.76 in UC and 0.86 in CD, P < 0.01) and an intraclass correlation coefficient of 0.82 in UC and 0.84 in CD. In conclusion, a short and valid questionnaire to measure quality of life in patients with inflammatory bowel disease was obtained using a new measurement model. Its use should facilitate comprehension of the impact of inflammatory bowel disease.
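The internal consistency statistic reported in this abstract, Cronbach's alpha, can be sketched with the standard formula below. The function name and the toy responses are hypothetical and are not taken from the IBDQ-9 data.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals),
    where k is the number of items and variances are sample variances.
    """
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: three respondents, three items that agree perfectly,
# which is the limiting case where alpha equals 1.
toy_responses = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy_responses)
```

Values near 1, such as the 0.95 and 0.91 reported above, indicate that the items vary together closely.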
Affiliation(s)
- M J Alcalá
- Hospital Universitari Vall d'Hebron, Barcelona, Spain
|