1
|
Terwee CB, Roorda LD. Country-specific reference values for PROMIS® pain, physical function and participation measures compared to US reference values. Ann Med 2023; 55:1-11. [PMID: 36426680 PMCID: PMC9704075 DOI: 10.1080/07853890.2022.2149849] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 11/27/2022]
Abstract
INTRODUCTION Patient-Reported Outcomes Measurement Information System (PROMIS®) is commonly used across medical conditions. To facilitate interpretation of scores across countries, we calculated Dutch reference values for PROMIS Physical Function (PROMIS-PF), Pain Interference (PROMIS-PI), Pain Behavior (PROMIS-PB), Ability to Participate in Social Roles and Activities (PROMIS-APSRA), and Satisfaction with Social Roles and Activities (PROMIS-SSRA), as compared to US reference values. PATIENTS AND METHODS A panel completed full PROMIS-PF (n=1310), PROMIS-PI and PROMIS-PB (n=1052), and PROMIS-APSRA and PROMIS-SSRA (n=1002) item banks and reported their level of health per domain (no, mild, moderate, severe limitations). T-scores were calculated by sample and subgroups (age, gender, self-reported level of domain). Distribution-based and anchor-based thresholds for mild, moderate, and severe scores were determined. RESULTS Mean T-scores were close to the US mean of 50 for PROMIS-PF (49.8) and PROMIS-APSRA (50.6), lower for PROMIS-SSRA (47.5) and higher for PROMIS-PI (54.9) and PROMIS-PB (52.0). Distribution-based thresholds for mild, moderate, and severe scores were comparable to US recommended cut-off values (except for PROMIS-PI) but participants reported limitations 'earlier' than suggested thresholds. CONCLUSION Dutch reference values were close to US reference values for some PROMIS domains but not all. 
We recommend country-specific reference values to facilitate worldwide PROMIS use. KEY MESSAGES: PROMIS offers universally applicable IRT-based efficient and patient-friendly measures to assess commonly relevant patient-reported outcomes across medical conditions. To support the use of PROMIS in daily clinical practice and research across the world, country-specific general population reference values should be obtained. More research is necessary to obtain reliable and valid cut-off values for what constitutes mild, moderate and severe scores from the patients' perspective.
Affiliation(s)
- Caroline B Terwee
- Department of Epidemiology and Data Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Leo D Roorda
- Amsterdam Rehabilitation Research Center | Reade, Amsterdam, The Netherlands
|
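The abstract above reads Dutch mean T-scores against the US mean of 50. A minimal sketch of that comparison follows, assuming the standard T-score metric (mean 50, SD 10 in the reference population); the SD value and the function names are assumptions for illustration and are not taken from the paper.

```python
US_MEAN = 50.0  # US reference mean, as stated in the abstract
T_SD = 10.0     # assumption: standard T-score metric with SD 10


def theta_to_t(theta: float) -> float:
    """Convert a latent trait estimate (z-score metric) to a T-score."""
    return US_MEAN + T_SD * theta


def sd_units_from_us_mean(t_score: float) -> float:
    """Express a T-score as a distance from the US mean in SD units."""
    return (t_score - US_MEAN) / T_SD


# Dutch mean T-scores reported in the abstract
dutch_means = {
    "PROMIS-PF": 49.8,
    "PROMIS-APSRA": 50.6,
    "PROMIS-SSRA": 47.5,
    "PROMIS-PI": 54.9,
    "PROMIS-PB": 52.0,
}

for domain, mean_t in dutch_means.items():
    print(f"{domain}: {sd_units_from_us_mean(mean_t):+.2f} SD from the US mean")
```

Under these assumptions, the PROMIS-PI mean of 54.9 sits roughly half a standard deviation above the US mean, which is the kind of gap that motivates country-specific reference values.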
2
|
Clinical Relevance and Advantages of Intradermal Test Results in 371 Patients with Allergic Rhinitis, Asthma and/or Otitis Media with Effusion. Cells 2021; 10:cells10113224. [PMID: 34831446 PMCID: PMC8619930 DOI: 10.3390/cells10113224] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/18/2021] [Revised: 11/14/2021] [Accepted: 11/16/2021] [Indexed: 01/02/2023]
Abstract
Background: We evaluated the value of positive intradermal dilution testing (IDT) after negative skin prick tests (SPT) by retrospectively determining allergy immunotherapy (AIT) outcomes. Methods: This private practice, cohort study compared the relative value of SPT vs. IDT in 371 adults and children with suspected manifestations of allergy: chronic allergic rhinitis (AR), asthma and/or chronic otitis media with effusion (OME). The primary outcome measure was symptom resolution following immunotherapy, as determined by symptom severity questionnaires completed by patients before and after AIT. Results: Positive IDT identified 193 (52%) patients who would not otherwise have been diagnosed. IDT detected 3.7-fold more allergens per patient than SPT (8.56 vs. 2.3; p < 0.01). Patients positive only on IDT responded to AIT equally well as those identifiable by SPT, independent of allergen sensitivity (67% by SPT vs. 62% by IDT; p = 0.69, not significantly different). Conclusion: Intradermal titration can identify patients who will benefit from allergy immunotherapy more accurately than SPT. Outcomes analysis in 371 patients shows that IDT doubled their chance of successful treatment with no greater risk of therapeutic failure. Positive IDT, following negative SPT, is clinically relevant and offers superior sensitivity over SPT for detecting allergens clinically relevant to diagnosis of AIT-responsive atopic disease.
|
3
|
Hurst DS, Gordon BR, McDaniel AB, Poe DS. Intradermal Testing Doubles Identification of Allergy among 110 Immunotherapy-Responsive Patients with Eustachian Tube Dysfunction. Diagnostics (Basel) 2021; 11:763. [PMID: 33923133 PMCID: PMC8146738 DOI: 10.3390/diagnostics11050763] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/24/2021] [Revised: 04/13/2021] [Accepted: 04/20/2021] [Indexed: 11/17/2022]
Abstract
The purpose of this study was to determine whether the sensitivity advantage of intradermal dilutional testing (IDT) is clinically relevant in patients with obstructive Eustachian tube dysfunction (ETD) or otitis media with effusion (OME). This retrospective, private-practice cohort study compared the sensitivity of skin prick tests (SPT) vs. IDT in 110 adults and children with suspected allergy and OME. The primary outcome measure was symptom resolution from allergy immunotherapy (AIT). IDT identified 57% more patients as being allergic, and 8.6 times more reactive allergens, than would have been diagnosed using only SPT. Patients diagnosed by IDT had the same degree of symptom improvement from immunotherapy, independent of allergen sensitivity (66% by SPT vs. 63% by IDT; p = 0.69, not different). Low-sensitivity allergy tests, which may fail to identify atopy in over two thirds of children aged 3 to 15 and in 60% of patients with ETD, may explain why many physicians do not consider allergy as a treatable etiology for their patients' OME/ETD. IDT offers superior sensitivity over SPT for detecting allergens clinically relevant to treating OME/ETD. These data strongly support increased utilization of intradermal testing and invite additional clinical outcome studies.
Affiliation(s)
- David S. Hurst
- Department of Otolaryngology, Tufts University, Boston, MA 02111, USA
- Bruce R. Gordon
- Department of Laryngology & Otology, Harvard University, Boston, MA 02114, USA
- Alan B. McDaniel
- Department of Otolaryngology, University of Louisville, Louisville, KY 40202, USA
|
4
|
Sugaya N, Arai M, Goto F. Is the Headache in Patients with Vestibular Migraine Attenuated by Vestibular Rehabilitation? Front Neurol 2017; 8:124. [PMID: 28421034 PMCID: PMC5377541 DOI: 10.3389/fneur.2017.00124] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 01/30/2017] [Accepted: 03/15/2017] [Indexed: 01/03/2023]
Abstract
Background Vestibular rehabilitation is the most effective treatment for dizziness due to vestibular dysfunction. Given the biological relationship between vestibular symptoms and headache, headache in patients with vestibular migraine (VM) could be improved by vestibular rehabilitation that leads to the improvement of dizziness. This study aimed to compare the effects of vestibular rehabilitation on headache, on other outcomes relating to dizziness, and on psychological factors in patients with VM, patients with dizziness and tension-type headache, and patients without headache. Methods Our participants included 251 patients with dizziness comprising 28 patients with VM, 79 patients with tension-type headache, and 144 patients without headache. Participants were hospitalized for 5 days and taught to conduct a vestibular rehabilitation program. They were assessed using the Dizziness Handicap Inventory (DHI), Headache Impact Test (HIT-6), Hospital Anxiety and Depression Scale (HADS), and Somatosensory Catastrophizing Scale (SSCS) and underwent center of gravity fluctuation measurement as an objective dizziness severity index before, 1 month after, and 4 months after their hospitalization. Results The VM and tension-type headache groups demonstrated a significant improvement in the HIT-6 score, together with improvement of the DHI, HADS, SSCS, and part of the objective dizziness index, which was also shown in patients without headache following vestibular rehabilitation. The change in HIT-6 during rehabilitation in the VM group was positively correlated with changes in the DHI and anxiety in the HADS. Changes in the HIT-6 in the tension-type headache group were positively correlated with changes in anxiety and SSCS. Conclusion Vestibular rehabilitation contributed to improvement of headache both in patients with VM and patients with dizziness and tension-type headache, in addition to improvement of dizziness and psychological factors.
Improvement in dizziness following vestibular rehabilitation could be associated with the improvement of headache more prominently in VM compared with comorbid tension-type headache.
Affiliation(s)
- Nagisa Sugaya
- Unit of Public Health and Preventive Medicine, School of Medicine, Yokohama City University, Yokohama, Japan
- Miki Arai
- Department of Otolaryngology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
- Fumiyuki Goto
- Department of Otolaryngology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
|
5
|
Abstract
The evaluation of the outcomes of total knee arthroplasty requires measurement tools that are valid, reliable, and responsive to change. However, the accuracy of any outcome measurement is determined by the validity and reliability of the instrument used. To ensure this accuracy, it is imperative that each instrument used in orthopaedics is free of biases leading to inaccurate estimates of treatment effects. WHERE ARE WE NOW?: Many patient-derived outcome instruments have been developed and tested through the application of the standard assessments that form the basis of classical test theory: validity, reliability, and responsiveness. These assessments determine if the instrument reliably measures what it is intended to measure, and if it captures differences among groups of patients or changes over time. WHERE DO WE NEED TO GO?: Thorough evaluation of the outcome instruments used in orthopaedics is a critical prerequisite for the continued improvement of effective patient care. Additional steps of psychometric testing that are sometimes overlooked include testing for differential item functioning (DIF) and the effects of the mode of administration of the outcome instrument. The use of suitable approaches to test for these potential sources of bias would facilitate the development of more robust outcome assessment in research and clinical practice. HOW DO WE GET THERE?: Testing for DIF, including the effects of mode of administration, may be performed using several analytical approaches. This will allow optimal application of each outcome instrument with respect to patient characteristics, time and mode of the administration, and modification, as necessary.
|
6
|
Wu DR. [Modern testing theory and its application in the field of health measurement]. Zhong Xi Yi Jie He Xue Bao = Journal of Chinese Integrative Medicine 2012; 10:271-278. [PMID: 22409916 DOI: 10.3736/jcim20120305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
This paper briefly introduces item response theory (IRT) as a typical representation of modern testing theory (MTT), and systematically reviews the processes and contents of the application of IRT in the area of health measurement, including, for example, item bank development, scale revision and computerized adaptive testing. The author presents the potential benefits and the notable problems during health measuring by IRT. Then, the author asserts the need for thorough assessment of feasibility when using the IRT in patient-reported outcome research. Further research based on IRT and computerized adaptive testing in health measurement will be carried out in the field of medical care including traditional Chinese medicine and integrative medicine.
Affiliation(s)
- Da-rong Wu
- The Second Affiliated Hospital (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou University of Chinese Medicine, Guangdong Province, China.
|
7
|
Wan C, Fang J, Jiang R, Shen J, Jiang D, Tu X, Messing S, Tang W. Development and validation of a quality of life instrument for patients with drug dependence: comparisons with SF-36 and WHOQOL-100. Int J Nurs Stud 2011; 48:1080-95. [PMID: 21397228 DOI: 10.1016/j.ijnurstu.2011.02.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 09/21/2010] [Revised: 01/30/2011] [Accepted: 02/06/2011] [Indexed: 10/18/2022]
Abstract
AIM Our goal was to develop a self-administered quality of life scale for patients with drug addiction/dependence (QOL-DA) and compare it with the SF-36 and the WHOQOL-100. METHODS Employing the theory and methodology of rating scale construction, a self-administered quality of life instrument for individuals with drug dependence (QOL-DA) was developed and evaluated using responses from 212 drug-dependent subjects at the Kunming Municipal Mandatory Detoxification and Rehabilitation Center in China. Quality of life was measured using the SF-36, WHOQOL-100 and QOL-DA three times during the detoxification. RESULTS Test-retest reliability coefficients in the domains of physical function, psychological function, social function and toxicity were 0.82, 0.64, 0.78, and 0.76, respectively. Cronbach's coefficient α for the 4 domains was 0.87, 0.89, 0.93 and 0.86, respectively. Correlations and factor analysis showed good construct validity. Criterion-related and convergent validity was confirmed by using the SF-36 and the WHOQOL-100 simultaneously. The instrument detected the change in QOL after two weeks of detoxification, with a standardized response mean higher than those of the SF-36 and WHOQOL-100. CONCLUSION The instrument developed has good validity, reliability and better responsiveness than instruments currently used, and can be employed effectively to measure the quality of life of individuals with drug dependence.
Affiliation(s)
- Chonghua Wan
- School of Humanities and Management, Guangdong Medical College, Dongguan, China.
|
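The abstract above reports Cronbach's coefficient α per domain. As a rough illustration of what that statistic computes, here is a minimal sketch using the textbook formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name and data layout are hypothetical, and this is not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha for a scale.

    `items` holds one inner list per item; position j in every inner
    list belongs to the same respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total scale score per respondent
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

With perfectly consistent items the ratio of summed item variances to total variance is 1/k, so α reaches 1; with uncorrelated items the two variances match and α drops to 0.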
8
|
Anatchkova MD, Saris-Baglama RN, Kosinski M, Bjorner JB. Development and preliminary testing of a computerized adaptive assessment of chronic pain. The Journal of Pain 2009; 10:932-43. [PMID: 19595636 PMCID: PMC2763618 DOI: 10.1016/j.jpain.2009.03.007] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 10/07/2008] [Revised: 01/10/2009] [Accepted: 03/06/2009] [Indexed: 11/17/2022]
Abstract
The aim of this article is to report the development and preliminary testing of a prototype computerized adaptive test of chronic pain (CHRONIC PAIN-CAT) conducted in 2 stages: (1) evaluation of various item selection and stopping rules through real data-simulated administrations of CHRONIC PAIN-CAT; (2) a feasibility study of the actual prototype CHRONIC PAIN-CAT assessment system conducted in a pilot sample. Item calibrations developed from a US general population sample (N = 782) were used to program a pain severity and impact item bank (k = 45), and real data simulations were conducted to determine a CAT stopping rule. The CHRONIC PAIN-CAT was programmed on a tablet PC using QualityMetric's Dynamic Health Assessment (DYNHA) software and administered to a clinical sample of pain sufferers (n = 100). The CAT was completed in significantly less time than the static (full item bank) assessment (P < .001). On average, 5.6 items were dynamically administered by CAT to achieve a precise score. Scores estimated from the 2 assessments were highly correlated (r = .89), and both assessments discriminated across pain severity levels (P < .001, RV = .95). Patients' evaluations of the CHRONIC PAIN-CAT were favorable. PERSPECTIVE This report demonstrates that the CHRONIC PAIN-CAT is feasible for administration in a clinic. The application has the potential to improve pain assessment and help clinicians manage chronic pain.
|
9
|
Velozo CA, Wang Y, Lehman L, Wang JH. Utilizing Rasch measurement models to develop a computer adaptive self-report of walking, climbing, and running. Disabil Rehabil 2009; 30:458-67. [DOI: 10.1080/09638280701617317] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 10/22/2022]
|
10
|
Vidotto G, Carone M, Jones PW, Salini S, Bertolotti G. Maugeri Respiratory Failure questionnaire reduced form: A method for improving the questionnaire using the Rasch model. Disabil Rehabil 2009; 29:991-8. [PMID: 17612984 DOI: 10.1080/09638280600926678] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 10/23/2022]
Abstract
PURPOSE The Maugeri Respiratory Failure questionnaire (MRF-28) is the first instrument specifically developed for use with chronic respiratory failure (CRF) patients. The 28 items were selected using classical test theory. The purpose of the current analysis was to further refine the questionnaire using item response theory, specifically Rasch model analysis. METHODS Three hundred and seventeen CRF patients (mean age 66.7 yr; 219 male, 98 female) completed the MRF-28 health status measure. Data were collected through the self-report questionnaire and analyzed using 1-parameter logistic models by means of RUMM software. RESULTS The 28-item questionnaire has good psychometric properties in terms of discriminant power, with a Person Separation Index of 0.896. However, the item-trait interaction was not good, as shown by the total-item chi-square (χ²(112), p < 0.001). Removing two items that did not fit the Rasch model well produced a minor improvement in the Person Separation Index to 0.899, and the item-trait interaction improved (χ²(104), p = NS). In the preliminary analysis we identified 21 patients who were outliers; when they were excluded, the distribution of the residuals was normal according to the Kolmogorov-Smirnov statistic, and factor analysis of the item residuals showed that the components had similar eigenvalues and no strong correlation with items. These results suggest that the MRF-26 is a unidimensional measure of health-related quality of life impairment for chronic respiratory failure patients. CONCLUSIONS A combination of classical psychometric tests and Rasch analysis produced an instrument of moderate size that covers a wide range of effects of CRF and has interval scaling properties.
Affiliation(s)
- G Vidotto
- Department of General Psychology, University of Padua, Padua, Italy.
|
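The abstract above analyzes the questionnaire with 1-parameter logistic (Rasch) models. A minimal sketch of the dichotomous Rasch response function follows, assuming items scored 0/1; the function names and the item difficulties are hypothetical, and this is not a reproduction of the RUMM software's estimation routines.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """P(endorsing an item) under the dichotomous Rasch model.

    theta      : person location on the latent trait (logits)
    difficulty : item location on the same scale (logits)
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def expected_raw_score(theta: float, difficulties: list[float]) -> float:
    """Expected number of endorsed items for a person at `theta`."""
    return sum(rasch_probability(theta, b) for b in difficulties)


# Hypothetical item difficulties, for illustration only
example_difficulties = [-1.5, -0.5, 0.0, 0.5, 1.5]
print(expected_raw_score(0.0, example_difficulties))
```

Because the model has a single item parameter, the probability depends only on the difference between person and item location, which is the property behind the interval-scaling claim in the conclusions.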
11
|
Sijtsma K, Emons WHM, Bouwmeester S, Nyklícek I, Roorda LD. Nonparametric IRT analysis of Quality-of-Life Scales and its application to the World Health Organization Quality-of-Life Scale (WHOQOL-Bref). Qual Life Res 2008; 17:275-90. [PMID: 18246447 PMCID: PMC2238782 DOI: 10.1007/s11136-007-9281-6] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.7] [Received: 03/08/2007] [Accepted: 11/06/2007] [Indexed: 11/24/2022]
Abstract
Background This study investigates the usefulness of the nonparametric monotone homogeneity model for evaluating and constructing Health-Related Quality-of-Life Scales consisting of polytomous items, and compares it to the often-used parametric graded response model. Methods The nonparametric monotone homogeneity model is a general model of which all known parametric models for polytomous items are special cases. Merits, drawbacks, and possibilities of nonparametric and parametric models and available software are discussed. Particular attention is given to the monotone homogeneity model (also known as the Mokken model), and the often-used parametric graded response model. Results Data from the WHOQOL-Bref were analyzed using both the monotone homogeneity model and the graded response model. The monotone homogeneity model analysis yielded unidimensional scales for each content domain. Scalability coefficients further showed that some items have limited scalability with respect to the other items in the same scale. The parametric IRT analyses led to the rejection of some of the items. Conclusions The nonparametric monotone homogeneity model is highly suited for data analysis in a health-related quality-of-life context, and the parametric graded response model may add interesting features to measurement provided the model fits the data well.
Affiliation(s)
- Klaas Sijtsma
- Department of Methodology and Statistics FSW, Tilburg University, PO Box 90153, Tilburg 5000 LE, The Netherlands.
|
12
|
Improvements in short-form measures of health status: Introduction to a series. J Clin Epidemiol 2008; 61:1-5. [DOI: 10.1016/j.jclinepi.2007.08.008] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.4] [Received: 10/16/2006] [Revised: 08/02/2007] [Accepted: 08/09/2007] [Indexed: 01/22/2023]
|
13
|
Becker J, Schwartz C, Saris-Baglama RN, Kosinski M, Bjorner JB. Using Item Response Theory (IRT) for Developing and Evaluating the Pain Impact Questionnaire (PIQ-6™). Pain Medicine 2007. [DOI: 10.1111/j.1526-4637.2007.00377.x] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 01/22/2023]
|
14
|
Wan C, Tang X, Tu XM, Feng C, Messing S, Meng Q, Zhang X. Psychometric properties of the simplified Chinese version of the EORTC QLQ-BR53 for measuring quality of life for breast cancer patients. Breast Cancer Res Treat 2007; 105:187-93. [PMID: 17221159 DOI: 10.1007/s10549-006-9443-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 10/20/2006] [Accepted: 10/24/2006] [Indexed: 11/28/2022]
Abstract
A Simplified Chinese version of the EORTC QLQ-BR53 was evaluated using responses from 233 patients with breast cancer in China by assessing the construct and criterion-related validity, internal consistency and test-retest reliability, and responsiveness as measured by score changes of the scales. Internal consistency reliability measured by Cronbach's coefficient alpha is greater than 0.75 for most multi-item scales except cognitive functioning (0.41) and breast symptoms (0.71). Test-retest reliability coefficients for all domains are greater than 0.80 with the exception of physical functioning (0.65), social functioning (0.75), appetite loss (0.75), diarrhea (0.72), and body image (0.72). Correlation and factor analysis among domains and items showed good construct validity for both QLQ-C30 and QLQ-BR23. Score changes over time were observed in most domains except emotional functioning, global health status/QOL, dyspnoea, constipation, diarrhea, financial difficulties, sexual functioning, sexual enjoyment, and breast symptoms. Therefore, the Simplified Chinese version of QLQ-BR53 shows reasonable validity, reliability, and responsiveness and can be used to measure QOL for Chinese patients with breast cancer.
Affiliation(s)
- Chonghua Wan
- Faculty of Public Health, Kunming Medical College, Kunming 650031, Yunnan, China.
|
15
|
Sébille V, Hardouin JB, Mesbah M. Sequential analysis of latent variables using mixed-effect latent variable models: Impact of non-informative and informative missing data. Stat Med 2007; 26:4889-904. [PMID: 17576119 DOI: 10.1002/sim.2959] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022]
Abstract
Sequential methods allowing for early stopping of clinical trials are widely used in various therapeutic areas. These methods allow for the analysis of different types of endpoints (quantitative, qualitative, time to event) and often provide, on average, substantial reductions in sample size as compared with single-stage designs while maintaining pre-specified type I and II errors. Sequential methods are also used when analysing particular endpoints that cannot be directly measured, such as depression, quality of life, or cognitive functioning, which are often measured through questionnaires. These types of endpoints are usually referred to as latent variables and should be analysed with latent variable models. In addition, in most clinical trials studying such latent variables, incomplete data are not uncommon and the missing data process might also be non-ignorable. We investigated the impact of informative or non-informative missing data on the statistical properties of the double triangular test (DTT), combined with the mixed-effects Rasch model (MRM) for dichotomous responses or the traditional method based on patients' observed scores (S) on the questionnaire. The achieved type I errors for the DTT were usually close to the target value of 0.05 for both methods, but increased slightly for the MRM when informative missing data were present. The DTT was very close to the nominal power of 0.95 when the MRM was used, but substantially underpowered with the S method (reduction of about 23 per cent), irrespective of whether informative missing data were present or not. Moreover, the DTT using the MRM allowed for reaching a conclusion (under H(0) or H(1)) with fewer patients than the S method, the average sample number for the latter increasing substantially when the proportion of missing data increased.
Incorporating MRM in sequential analysis of latent variables might provide a more powerful method than the traditional S method, even in the presence of non-informative or informative missing data.
Affiliation(s)
- Véronique Sébille
- Laboratoire de Biostatistique, Faculté de Pharmacie, Université de Nantes, 1 rue Gaston Veil, 44035 Nantes Cedex 1, France.
|
16
|
Lipscomb J, Snyder CF, Gotay CC. Cancer outcomes measurement. Qual Life Res 2006; 16:143-64. [PMID: 17091365 DOI: 10.1007/s11136-006-9116-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 03/17/2006] [Accepted: 08/16/2006] [Indexed: 10/23/2022]
Abstract
BACKGROUND In 2001, the U.S. National Cancer Institute established the Cancer Outcomes Measurement Working Group (COMWG) to evaluate and advance the state of the science in patient-reported outcome (PRO) measurement, with a focus on health-related quality of life (HRQOL). To guide its work, the COMWG adopted the revised Medical Outcomes Trust (MOT) attributes and review criteria for evaluating health status and quality-of-life instruments. OBJECTIVE With the MOT attributes providing the organizing principle, this paper summarizes and draws inferences from key COMWG findings about the methodological soundness of HRQOL assessment in cancer and steps required to move the field forward. RESULTS AND CONCLUSIONS Across a range of cancer research applications, especially clinical trials, a variety of generic, general cancer, and cancer site-specific measures of HRQOL have demonstrated adequate reliability, validity, responsiveness, feasibility, and cultural and language adaptation. Methodological challenges remain in the interpretability of HRQOL measures, though substantial progress has been made in defining a "minimum important difference" in scale scores. Much work remains in forging a stronger link between the conceptual model and measurement model in HRQOL instrumentation. Progress along all MOT attributes will likely accelerate with the growing application of modern psychometrics, particularly item response theory modeling, which provides the underpinnings for item banking and computer-adaptive assessment of HRQOL. Future research should emphasize prospectively designed studies to evaluate PRO measures within the MOT framework and in-depth investigations of the role of PRO measures in cancer decision making at all levels.
Affiliation(s)
- Joseph Lipscomb
- Department of Health Policy and Management, Rollins School of Public Health, Emory University, Rm 642, 1518 Clifton Road, NE, Atlanta, GA 30322, USA.
|
17
|
Eadie TL, Yorkston KM, Klasner ER, Dudgeon BJ, Deitz JC, Baylor CR, Miller RM, Amtmann D. Measuring communicative participation: a review of self-report instruments in speech-language pathology. American Journal of Speech-Language Pathology 2006; 15:307-20. [PMID: 17102143 PMCID: PMC2649949 DOI: 10.1044/1058-0360(2006/030)] [Citation(s) in RCA: 144] [Impact Index Per Article: 8.0] [Indexed: 05/12/2023]
Abstract
PURPOSE To assess the adequacy of self-report instruments in speech-language pathology for measuring a construct called communicative participation. METHOD Six instruments were evaluated relative to (a) the construct measured, (b) the relevance of individual items to communicative participation, and (c) their psychometric properties. RESULTS No instrument exclusively measured communicative participation. Twenty-six percent (n = 34) of all items (N = 132) across the reviewed instruments were consistent with communicative participation. The majority (76%) of the 34 items were associated with general communication, while the remaining 24% of the items were associated with communication at work, during leisure, or for establishing relationships. Instruments varied relative to psychometric properties. CONCLUSIONS No existing self-report instruments in speech-language pathology were found to be solely dedicated to measuring communicative participation. Developing an instrument for measuring communicative participation is essential for meeting the requirements of our scope of practice.
Affiliation(s)
- Tanya L Eadie
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA 98105, USA.
|
18
|
Vidotto G, Bertolotti G, Carone M, Arpinelli F, Bellia V, Jones PW, Donner CF. A new questionnaire specifically designed for patients affected by chronic obstructive pulmonary disease. Respir Med 2006; 100:862-70. [PMID: 16221547 DOI: 10.1016/j.rmed.2005.08.024] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 05/03/2005] [Revised: 08/23/2005] [Accepted: 08/24/2005] [Indexed: 11/17/2022]
Abstract
The aim of this study was to develop a specific and valid questionnaire for Italian COPD patients living in the north or the south of Italy, which are two culturally distinct areas. The project consisted of three steps: (1) initial item set generation to identify items relevant to both genders, all ages and both regions; (2) item reduction including tests of regional specificity; (3) tests of internal validity using item-response theory with Rasch one-parameter modelling. Ninety-six COPD patients (mean age 69 yr; 78 male) completed the original set of 124 items of the Italian Health Status Questionnaire (IHSQ). Item reduction was carried out using an established standardised approach employing classical psychometric test theory. The internal construct validity of the 47 items that survived this process was tested using Rasch analysis to determine whether they constituted a unidimensional construct, "impaired health due to COPD". This showed that the questionnaire had very good psychometric properties, with an excellent Person Separation Index of 0.95 and no evidence of bias due to item-trait interaction (χ² = 127.1, df = 104, P = n.s.). The combination of classical test theory and modern item-response methodology has produced a questionnaire with excellent measurement properties suitable for COPD patients whether from the north or south of Italy.
Affiliation(s)
- G Vidotto
- Department of General Psychology, University of Padua, Via Venezia 8, 35131 Padova, Italy.
|
19
|
Rajagopalan K, Abetz L, Mertzanis P, Espindle D, Begley C, Chalmers R, Caffery B, Snyder C, Nelson JD, Simpson T, Edrington T. Comparing the discriminative validity of two generic and one disease-specific health-related quality of life measures in a sample of patients with dry eye. Value Health 2005; 8:168-174. [PMID: 15804325 DOI: 10.1111/j.1524-4733.2005.03074.x] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.2] [Indexed: 05/24/2023]
Abstract
OBJECTIVE The purpose of this study was to compare the discriminative properties of two generic health-related quality of life (QoL) instruments (SF-36 and EQ-5D) and a newly developed disease-specific patient-reported outcomes instrument (Impact of Dry Eye on Everyday Life (IDEEL)) to distinguish between different levels of dry eye severity. METHODS Assessment of 210 people: 130 with non-Sjögren's Keratoconjunctivitis Sicca (non-SS KCS), 32 with Sjögren's Syndrome (SS) and 48 controls; comparison of SF-36, EQ-5D, and IDEEL age-adjusted data by dry eye severity levels. Severity was assessed based on diagnosis (non-SS KCS, SS, control), patient-report (none, very mild, mild, moderate, severe, extremely severe) and clinician-report (none, mild, moderate, severe). RESULTS Discriminative validity results were consistent for all instruments. Significant differences between severity levels were found with most SF-36 scales (P < 0.05), all EQ-5D scales (P < 0.05), and all IDEEL scales (P < 0.0001), except for Treatment Satisfaction. IDEEL scales consistently outperformed the generic QoL measures regardless of the severity criterion used. Most SF-36 scales outperformed the EQ-5D QoL scale, but the EQ-5D visual analog scale outperformed the SF-36 scales, except for General Health Perceptions. CONCLUSIONS The disease-specific IDEEL scales are better able to discriminate between severity levels than the majority of the generic QoL scales. Preliminary evidence demonstrates that the IDEEL will be sensitive to QoL changes over time, although further testing in controlled longitudinal studies is needed.
|
20
|
Abstract
New sumatriptan users in a California health plan were surveyed on the impact of migraine using a newly developed migraine impact measure, the Headache Impact Test-6. Productivity and satisfaction with migraine therapy also were assessed. After sumatriptan was initiated, participants reported significantly fewer workdays missed, fewer days worked with headache, and greater productivity while headache symptoms were present. In addition, almost 50% fewer members used narcotics/opioids, while the frequency, duration, and severity of migraines decreased. Initiation of sumatriptan therapy is associated with improvements in absenteeism and presenteeism, clinical outcomes (Headache Impact Test-6), and satisfaction. The benefits of triptan therapy extend to the employer, who sees a decrease in lost productivity, fewer emergency room visits, and less narcotic use in employees with migraine. Managed care organizations that provide this pharmacotherapy may foster greater satisfaction among members and employer customers.
Affiliation(s)
- Mary Brun Weaver
- GlaxoSmithKline, Research Triangle Park, North Carolina 27709-3398, USA.
|
21
|
Bjorner JB, Kosinski M, Ware JE. The feasibility of applying item response theory to measures of migraine impact: a re-analysis of three clinical studies. Qual Life Res 2004; 12:887-902. [PMID: 14651410 DOI: 10.1023/a:1026175112538] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.0] [Indexed: 11/12/2022]
Abstract
BACKGROUND Item response theory (IRT) is a powerful framework for analyzing multi-item scales and is central to the implementation of computerized adaptive testing. OBJECTIVES To explain the use of IRT to examine measurement properties and to apply IRT to a questionnaire for measuring migraine impact, the Migraine Specific Questionnaire (MSQ). METHODS Data from three clinical studies that employed the MSQ-version 1 were analyzed by confirmatory factor analysis for categorical data and by IRT modeling. RESULTS Confirmatory factor analyses showed very high correlations between the factors hypothesized by the original test constructions. Further, high item loadings on one common factor suggest that migraine impact may be adequately assessed by only one score. IRT analyses of the MSQ were feasible and provided several suggestions as to how to improve the items and in particular the response choices. Out of 15 items, 13 showed adequate fit to the IRT model. In general, IRT scores were strongly associated with the scores proposed by the original test developers and with the total item sum score. Analysis of response consistency showed that more than 90% of the patients answered consistently according to a unidimensional IRT model. For the remaining patients, scores on the dimension of emotional function were less strongly related to the overall IRT scores, which mainly reflected role limitations. Such response patterns can be detected easily using response consistency indices. Analysis of test precision across score levels revealed that the MSQ was most precise at one standard deviation worse than the mean impact level for migraine patients who are not in treatment. Thus, gains in test precision can be achieved by developing items aimed at less severe levels of migraine impact. CONCLUSIONS IRT proved useful for analyzing the MSQ. The approach warrants further testing in a more comprehensive item pool for headache impact that would enable computerized adaptive testing.
|
22
|
Bjorner JB, Kosinski M, Ware JE. Calibration of an item pool for assessing the burden of headaches: an application of item response theory to the headache impact test (HIT). Qual Life Res 2004; 12:913-33. [PMID: 14651412 DOI: 10.1023/a:1026163113446] [Citation(s) in RCA: 159] [Impact Index Per Article: 8.0] [Indexed: 01/19/2023]
Abstract
BACKGROUND Measurement of headache impact is important in clinical trials, case detection, and the clinical monitoring of patients. Computerized adaptive testing (CAT) of headache impact has potential advantages over traditional fixed-length tests in terms of precision, relevance, real-time quality control and flexibility. OBJECTIVE To develop an item pool that can be used for a computerized adaptive test of headache impact. METHODS We analyzed responses to four well-known tests of headache impact from a population-based sample of recent headache sufferers (n = 1016). We used confirmatory factor analysis for categorical data and analyses based on item response theory (IRT). RESULTS In factor analyses, we found very high correlations between the factors hypothesized by the original test constructers, both within and between the original questionnaires. These results suggest that a single score of headache impact is sufficient. We established a pool of 47 items which fitted the generalized partial credit IRT model. By simulating a computerized adaptive health test we showed that an adaptive test of only five items had a very high concordance with the score based on all items and that different worst-case item selection scenarios did not lead to bias. CONCLUSION We have established a headache impact item pool that can be used in CAT of headache impact.
|