201
Abstract
OBJECTIVE: To review critically the features of measures of generic health-related quality of life (HRQOL) for disability outcomes research. DATA SOURCES: A search of electronic databases, summary reviews, books, and government documents was performed. Comments and experiences from participants of a conference on outcomes research were also incorporated. STUDY SELECTION: English-language literature from scientists from a broad range of disciplines and research settings, including medicine, nursing, social science, public health, and health services research and practice. DATA EXTRACTION: A critical review of measures that have been or might be used to measure disability outcomes. DATA SYNTHESIS: Commonly used generic measures of HRQOL can be applied to disability outcomes research with some caveats. Three common tools are the Medical Outcomes Study Short-Form Health Survey (SF-36), Sickness Impact Profile (SIP), and Quality of Well-Being (QWB) scale. The SF-36 and SIP have been used with some success in research with people with disability. The QWB scale has been used less frequently. CONCLUSION: Most studies using generic HRQOL tools are of groups with specific impairments rather than heterogeneous groups of people with disability. None of the tools appears to measure HRQOL without some potential biases (eg, inappropriate wording) for people with disability, but more specific testing of these problems is needed. Also needed are studies to determine whether these tools can measure meaningful longitudinal changes.
Affiliation(s)
- E M Andresen
- Department of Community Health, Saint Louis University School of Public Health, MO 63108, USA.
202
Rogers WH, Wittink H, Wagner A, Cynn D, Carr DB. Assessing Individual Outcomes during Outpatient Multidisciplinary Chronic Pain Treatment by Means of an Augmented SF-36. Pain Medicine 2000; 1:44-54. [PMID: 15101963 DOI: 10.1046/j.1526-4637.2000.99102.x]
Abstract
OBJECTIVE: To meet the growing demand for objective outcomes measurement during treatment of chronic pain, we developed an instrument to track outcomes of individual patients. METHOD: In a 2-phase study, existing and novel outcomes instruments were applied in an interdisciplinary pain management program. In the initial phase, 408 patients were administered the Short Form 36-item questionnaire, and during phase 2, 437 patients (87 of whom were followed) were given an expanded (191-item) questionnaire. RESULTS: When applied to individual patients, the Short Form 36-item questionnaire lacked measurement reliability for assessment of treatment outcomes and sensitivity to upper extremity or facial pathology, and failed to separate limitations of work versus everyday activity. A novel group of scales derived from responses to 61 questions, including the Short Form 36-item questionnaire, proved sufficiently reliable for routine follow-up of individual chronic pain patients. CONCLUSIONS: This new Treatment Outcomes in Pain Survey allows assessment of individual patient outcomes, and aggregate or individual clinician performance, during interdisciplinary treatment of chronic pain.
203
Abstract
This article reports on the development and validation of the Italian SF-36 Health Survey using data from seven studies in which an Italian version of the SF-36 was administered to more than 7000 subjects between 1991 and 1995. Empirical findings from a wide array of studies and diseases indicate that the performance of the questionnaire improved as the Italian translation was revised and that it met the standards suggested by the literature in terms of feasibility, psychometric tests, and interpretability. This generally satisfactory picture strengthens the idea that the Italian SF-36 is as valid and reliable as the original instrument and is applicable across age, gender, and disease. Empirical evidence from a cross-sectional survey carried out to norm the final version in a representative sample of 2031 individuals confirms the questionnaire's characteristics in terms of hypothesized constructs and psychometric behavior and gives a better picture of its external validity (i.e., robustness and generalizability) when administered in settings that are very close to the real world.
Affiliation(s)
- G Apolone
- Dipartimento di Oncologia, Istituto di Ricerche Farmacologiche Mario Negri, Milan, Italy
204
Razavi D, Gandek B. Testing Dutch and French translations of the SF-36 Health Survey among Belgian angina patients. J Clin Epidemiol 1998; 51:975-81. [PMID: 9817115 DOI: 10.1016/s0895-4356(98)00089-4]
Abstract
The psychometric properties of the Belgian Dutch and French translations of the SF-36 Health Survey were evaluated in a sample of 4448 Belgian patients with angina enrolled in a 6-month treatment study. Missing data were rare (<2%), and tests of both item internal consistency and item discriminant validity were satisfactory in both languages. Cronbach's alpha coefficient ranged from 0.81 to 0.91 (Dutch) and 0.82 to 0.92 (French). SF-36 scales discriminated between groups of patients differing in age and in the number of weekly angina attacks. Change over 6 months in the number of weekly angina attacks and physician assessment of change in physical condition were both significantly related to changes in SF-36 scale scores. On average, scale scores for French-speaking patients were lower than for Dutch-speaking patients, most notably for Vitality and Mental Health. The average change in SF-36 scale scores over 6 months, in relation to change in clinical criteria, was similar in both language groups. The psychometric properties of the Belgian Dutch and French translations should be tested further in Belgium to determine whether the generally favorable results reported here can be replicated in other samples.
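The reliability figures reported in this abstract are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient can be computed from item-level responses, the following uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the scale total). The function name `cronbach_alpha` and the toy responses are illustrative assumptions and are not taken from the study.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for one multi-item scale.

    rows: list of respondents, each a list of k numeric item scores
    (hypothetical layout; complete data assumed, no missing items).
    """
    k = len(rows[0])
    # Variance of each item across respondents (one column per item).
    item_variances = [pvariance(column) for column in zip(*rows)]
    # Variance of the summated scale score across respondents.
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: three respondents answering a two-item scale.
responses = [[1, 2], [2, 3], [3, 5]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Population variance is used for both the items and the total; using sample variance throughout would give the same ratio and therefore the same alpha.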
Affiliation(s)
- D Razavi
- Institut Jules Bordet, Brussels, Belgium
205
Keller SD, Ware JE, Bentler PM, Aaronson NK, Alonso J, Apolone G, Bjorner JB, Brazier J, Bullinger M, Kaasa S, Leplège A, Sullivan M, Gandek B. Use of structural equation modeling to test the construct validity of the SF-36 Health Survey in ten countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol 1998; 51:1179-88. [PMID: 9817136 DOI: 10.1016/s0895-4356(98)00110-3]
Abstract
A crucial prerequisite to the use of the SF-36 Health Survey in multinational studies is the reproduction of the conceptual model underlying its scoring and interpretation. Structural equation modeling (SEM) was used to test these aspects of the construct validity of the SF-36 in ten IQOLA countries: Denmark, France, Germany, Italy, the Netherlands, Norway, Spain, Sweden, the United Kingdom, and the United States. Data came from general population surveys fielded to gather normative data. Measurement and structural models developed in the United States were cross-validated in random halves of the sample in each country. SEM analyses supported the eight first-order factor model of health that underlies the scoring of SF-36 scales and two second-order factors that are the basis for summary physical and mental health measures. A single third-order factor was also observed in support of the hypothesis that all responses to the SF-36 are generated by a single, underlying construct: health. In addition, a third second-order factor, interpreted as general well-being, was shown to improve the fit of the model. This model (including eight first-order factors, three second-order factors, and one third-order factor) was cross-validated using a holdout sample within the United States and in each of the nine other countries. These results confirm the hypothesized relationships between SF-36 items and scales and justify their scoring in each country using standard algorithms. Results also suggest that SF-36 scales and summary physical and mental health measures will have similar interpretations across countries. The practical implications of a third second-order SF-36 factor (general well-being) warrant further study.
Affiliation(s)
- S D Keller
- Health Assessment Lab at the Health Institute, New England Medical Center, Boston, Massachusetts, USA
206
Raczek AE, Ware JE, Bjorner JB, Gandek B, Haley SM, Aaronson NK, Apolone G, Bech P, Brazier JE, Bullinger M, Sullivan M. Comparison of Rasch and summated rating scales constructed from SF-36 physical functioning items in seven countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol 1998; 51:1203-14. [PMID: 9817138 DOI: 10.1016/s0895-4356(98)00112-7]
Abstract
Rasch models for polytomous items were used to assess the scaling assumptions and compare item response patterns in the 10-item SF-36 physical functioning scale (PF-10) for general population respondents in Denmark, Germany, Italy, the Netherlands, Sweden, the United Kingdom, and the United States. The Rasch model of physical functioning developed in the United States was compared to models for other countries, and each country was compared to a multinational composite. Strong scale congruence across the seven countries was demonstrated; items that varied between countries and from the composite may reflect unique cultural response patterns or differences in translation. Scoring algorithms based on the Rasch model for each country were superior to the current Likert scoring in tests of relative validity (RV) in discriminating among age groups in all countries. In relation to the Likert PF-10 scoring (RV = 1.00), scores estimated using the Rasch rating scale model achieve a median RV of 1.31 (range: 1.01-1.59), while the Rasch partial credit model attained a median RV of 1.44 (range: 1.01-2.23). Rasch models hold good potential for improving health status measures, estimating individual scores when responses to scale items are missing, and equating scores across countries.
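The abstract reports relative validity (RV) with Likert scoring as the reference (RV = 1.00) but does not spell out the computation. The sketch below assumes the commonly used definition, in which RV is the ratio of one-way ANOVA F-statistics for a candidate scoring and a reference scoring across the same known groups (here, age groups). The function names and the toy scores are hypothetical.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F-statistic for a list of groups of scores."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_validity(candidate_groups, reference_groups):
    """RV of a candidate scoring against a reference scoring (assumed definition)."""
    return f_statistic(candidate_groups) / f_statistic(reference_groups)


# Toy scores for two age groups under two scoring methods (made-up numbers).
reference_scoring = [[1, 2, 3], [2, 3, 4]]
candidate_scoring = [[1, 2, 3], [3, 4, 5]]
print(relative_validity(candidate_scoring, reference_scoring))
```

Under this definition, an RV above 1.00 means the candidate scoring separates the groups more sharply than the reference scoring does.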
Affiliation(s)
- A E Raczek
- School of Education, Boston College, Chestnut Hill, Massachusetts, USA
207
Bullinger M, Alonso J, Apolone G, Leplège A, Sullivan M, Wood-Dauphinee S, Gandek B, Wagner A, Aaronson N, Bech P, Fukuhara S, Kaasa S, Ware JE. Translating health status questionnaires and evaluating their quality: the IQOLA Project approach. International Quality of Life Assessment. J Clin Epidemiol 1998; 51:913-23. [PMID: 9817108 DOI: 10.1016/s0895-4356(98)00082-1]
Abstract
This article describes the methods adopted by the International Quality of Life Assessment (IQOLA) project to translate the SF-36 Health Survey. Translation methods included the production of forward and backward translations, use of difficulty and quality ratings, pilot testing, and cross-cultural comparison of the translation work. Experience to date suggests that the SF-36 can be adapted for use in other countries with relatively minor changes to the content of the form, providing support for the use of these translations in multinational clinical trials and other studies. The most difficult items to translate were physical functioning items, which used examples of activities and distances that are not common outside of the United States; items that used colloquial expressions such as pep or blue; and the social functioning items. Quality ratings were uniformly high across countries. While the IQOLA approach to translation and validation was developed for use with the SF-36, it is applicable to other translation efforts.
Affiliation(s)
- M Bullinger
- Abteilung Für Medizinische Psychologie, Universitätskrankenhaus Eppendorf, Hamburg, Germany
208
Ware JE, Gandek B, Kosinski M, Aaronson NK, Apolone G, Brazier J, Bullinger M, Kaasa S, Leplège A, Prieto L, Sullivan M, Thunedborg K. The equivalence of SF-36 summary health scores estimated using standard and country-specific algorithms in 10 countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol 1998; 51:1167-70. [PMID: 9817134 DOI: 10.1016/s0895-4356(98)00108-5]
Abstract
Data from general population surveys (n = 1771 to 9151) in nine European countries (Denmark, France, Germany, Italy, the Netherlands, Norway, Spain, Sweden, and the United Kingdom) were analyzed to test the algorithms used to score physical and mental component summary measures (PCS-36/MCS-36) based on the SF-36 Health Survey. Scoring coefficients for principal components were estimated independently in each country using identical methods of factor extraction and orthogonal rotation. PCS-36 and MCS-36 scores were also estimated using standard (U.S.-derived) scoring algorithms, and results were compared. Product-moment correlations between scores estimated from standard and country-specific scoring coefficients were very high (0.98 to 1.00) for both physical and mental health components in all countries. As hypothesized for orthogonal components, correlations between physical and mental components within each country were very low (0.00 to 0.12) for both estimation methods. Mean scores for PCS-36 differed by as much as 3.0 points across countries using standard scoring, and mean scores for MCS-36 differed across countries by as much as 6.4 points. In view of the high degree of equivalence observed within each country, using standard and country-specific algorithms, we recommend use of standard scoring algorithms for purposes of multinational studies involving these 10 countries.
Affiliation(s)
- J E Ware
- Health Assessment Lab at the Health Institute, New England Medical Center, Boston, Massachusetts 02111, USA
209
Gandek B, Ware JE, Aaronson NK, Apolone G, Bjorner JB, Brazier JE, Bullinger M, Kaasa S, Leplège A, Prieto L, Sullivan M. Cross-validation of item selection and scoring for the SF-12 Health Survey in nine countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol 1998; 51:1171-8. [PMID: 9817135 DOI: 10.1016/s0895-4356(98)00109-7]
Abstract
Data from general population surveys (n = 1483 to 9151) in nine European countries (Denmark, France, Germany, Italy, the Netherlands, Norway, Spain, Sweden, and the United Kingdom) were analyzed to cross-validate the selection of questionnaire items for the SF-12 Health Survey and scoring algorithms for 12-item physical and mental component summary measures. In each country, multiple regression methods were used to select 12 SF-36 items that best reproduced the physical and mental health summary scores for the SF-36 Health Survey. Summary scores then were estimated with 12 items in three ways: using standard (U.S.-derived) SF-12 items and scoring algorithms; standard items and country-specific scoring; and country-specific sets of 12 items and scoring. Replication of the 36-item summary measures by the 12-item summary measures was then evaluated through comparison of mean scores and the strength of product-moment correlations. Product-moment correlations between SF-36 summary measures and SF-12 summary measures (standard and country-specific) were very high, ranging from 0.94 to 0.96 and from 0.94 to 0.97 for the physical and mental summary measures, respectively. Mean 36-item summary measures and comparable 12-item summary measures were within 0.0 to 1.5 points (median = 0.5 points) in each country and were comparable across age groups. Because of the high degree of correspondence between summary physical and mental health measures estimated using the SF-12 and SF-36, it appears that the SF-12 will prove to be a practical alternative to the SF-36 in these countries, for purposes of large group comparisons in which the focus is on overall physical and mental health outcomes.
Affiliation(s)
- B Gandek
- Health Assessment Lab at the Health Institute, New England Medical Center, Boston, Massachusetts 02111, USA
210
Gandek B, Ware JE, Aaronson NK, Alonso J, Apolone G, Bjorner J, Brazier J, Bullinger M, Fukuhara S, Kaasa S, Leplège A, Sullivan M. Tests of data quality, scaling assumptions, and reliability of the SF-36 in eleven countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol 1998; 51:1149-58. [PMID: 9817132 DOI: 10.1016/s0895-4356(98)00106-1]
Abstract
Data from general population samples in 11 countries (n = 1483 to 9151) were used to assess data quality and test the assumptions underlying the construction and scoring of multi-item scales from the SF-36 Health Survey. Across all countries, the rate of item-level missing data generally was low, although slightly higher for items printed in the grid format. In each country, item means generally were clustered as hypothesized within scales. Correlations between items and hypothesized scales were greater than 0.40 with one exception, supporting item internal consistency. Items generally correlated significantly higher with their own scale than with competing scales, supporting item discriminant validity. Scales could be constructed for 93-100% of respondents. Internal consistency reliability of the eight SF-36 scales was above 0.70 for all scales, with two exceptions. Floor effects were low for all except the two role functioning scales; ceiling effects were high for both role functioning scales and also were noteworthy for the Physical Functioning, Bodily Pain, and Social Functioning scales in some countries. These results support the construction and scoring of the SF-36 translations in these 11 countries using the method of summated ratings.
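The item internal consistency criterion in this abstract (item-scale correlations greater than 0.40) can be illustrated with a short sketch. It assumes that each item is correlated with the sum of the other items in its scale, a correction for overlap that the abstract does not state explicitly. The function names, the data layout, and the toy responses are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_rest_correlations(rows):
    """Correlate each item with the sum of the remaining items in its scale.

    rows: list of respondents, each a list of item scores for ONE scale.
    """
    k = len(rows[0])
    correlations = []
    for j in range(k):
        item = [row[j] for row in rows]
        rest = [sum(row) - row[j] for row in rows]  # scale total minus the item
        correlations.append(pearson(item, rest))
    return correlations


def meets_criterion(rows, threshold=0.40):
    """True if every item-rest correlation exceeds the threshold."""
    return all(r > threshold for r in item_rest_correlations(rows))


# Toy data: four respondents, three items (made-up numbers).
responses = [[1, 2, 2], [2, 2, 3], [3, 4, 3], [4, 5, 5]]
print(meets_criterion(responses))
```

Item discriminant validity, also mentioned in the abstract, would compare each of these correlations against the same item's correlations with the other scales; that comparison is not shown here.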
Affiliation(s)
- B Gandek
- Health Assessment Lab at the Health Institute, New England Medical Center, Boston, Massachusetts 02111, USA
211
Ware JE, Gandek B. Overview of the SF-36 Health Survey and the International Quality of Life Assessment (IQOLA) Project. J Clin Epidemiol 1998; 51:903-12. [PMID: 9817107 DOI: 10.1016/s0895-4356(98)00081-x]
Abstract
This article presents information about the development and evaluation of the SF-36 Health Survey, a 36-item generic measure of health status. It summarizes studies of reliability and validity and provides administrative and interpretation guidelines for the SF-36. A brief history of the International Quality of Life Assessment (IQOLA) Project is also included.
Affiliation(s)
- J E Ware
- Health Assessment Lab at the Health Institute, New England Medical Center, Boston, Massachusetts 02111, USA
212
Ware JE, Kosinski M, Gandek B, Aaronson NK, Apolone G, Bech P, Brazier J, Bullinger M, Kaasa S, Leplège A, Prieto L, Sullivan M. The factor structure of the SF-36 Health Survey in 10 countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol 1998; 51:1159-65. [PMID: 9817133 DOI: 10.1016/s0895-4356(98)00107-3]
Abstract
Studies of the factor structure of the SF-36 Health Survey are an important step in its construct validation. Its structure is also the psychometric basis for scoring physical and mental health summary scales, which are proving useful in simplifying and interpreting statistical analyses. To test the generalizability of the SF-36 factor structure, product-moment correlations among the eight SF-36 Health Survey scales were estimated for representative samples of general populations in each of 10 countries. Matrices were independently factor analyzed using identical methods to test for hypothesized physical and mental health components, and results were compared with those published for the United States. Following simple orthogonal rotation, the two principal components were easily interpreted as dimensions of physical and mental health in all countries. These components accounted for 76% to 85% of the reliable variance in scale scores across nine European countries, in comparison with 82% in the United States. Similar patterns of correlations between the eight scales and the components were observed across all countries and across age and gender subgroups within each country. Correlations with the physical component were highest (0.64 to 0.86) for the Physical Functioning, Role Physical, and Bodily Pain scales, whereas the Mental Health, Role Emotional, and Social Functioning scales correlated highest (0.62 to 0.91) with the mental component. Secondary correlations for both clusters of scales were much lower. Scales measuring General Health and Vitality correlated moderately with both physical and mental health components. These results support the construct validity of the SF-36 translations and the scoring of physical and mental health components in all countries studied.
Affiliation(s)
- J E Ware
- Health Assessment Lab at the Health Institute, New England Medical Center, Boston, Massachusetts 02111, USA