1
|
Nurturing Untapped Integration Expertise of MS4 Assessment Writers. MEDICAL SCIENCE EDUCATOR 2024; 34:315-318. [PMID: 38686140 PMCID: PMC11055828 DOI: 10.1007/s40670-024-01974-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/03/2024] [Indexed: 05/02/2024]
Abstract
Creating original, integrated multiple-choice questions (MCQs) is time-consuming and onerous for basic science and clinical faculty. We demonstrate that medical students can act as co-experts in overcoming the assessment challenges that faculty face. We recruited, trained, and motivated medical students to write 10,000 high-quality MCQs for use in the foundational courses of medical education. These students were ideal because they possessed integrated knowledge (basic sciences and clinical experience). We taught them to write high-quality MCQs using a writing template, with continuous monitoring and support from an item bank curator. The students themselves also benefitted personally and pedagogically from the experience.
2
|
Identification of application and interpretation errors that can occur in pairwise meta-analyses in systematic reviews of interventions: a systematic review. J Clin Epidemiol 2024; 170:111331. [PMID: 38552725 DOI: 10.1016/j.jclinepi.2024.111331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 02/27/2024] [Accepted: 03/18/2024] [Indexed: 05/13/2024]
Abstract
OBJECTIVES To generate a bank of items describing application and interpretation errors that can arise in pairwise meta-analyses in systematic reviews of interventions. STUDY DESIGN AND SETTING MEDLINE, Embase, and Scopus were searched to identify studies describing types of errors in meta-analyses. Descriptions of errors and supporting quotes were extracted by multiple authors. Errors were reviewed at team meetings to determine if they should be excluded, reworded, or combined with other errors, and were categorized into broad categories of errors and subcategories within. RESULTS Fifty articles met our inclusion criteria, leading to the identification of 139 errors. We identified 25 errors covering data extraction/manipulation, 74 covering statistical analyses, and 40 covering interpretation. Many of the statistical analysis errors related to the meta-analysis model (eg, using a two-stage strategy to determine whether to select a fixed or random-effects model) and statistical heterogeneity (eg, not undertaking an assessment for statistical heterogeneity). CONCLUSION We generated a comprehensive bank of possible errors that can arise in the application and interpretation of meta-analyses in systematic reviews of interventions. This item bank of errors provides the foundation for developing a checklist to help peer reviewers detect statistical errors.
3
|
Measurement of visual functioning following first and second eye cataract surgery using Vision-Related Activity Limitation Item Bank. Graefes Arch Clin Exp Ophthalmol 2024; 262:857-864. [PMID: 37725146 DOI: 10.1007/s00417-023-06235-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 08/10/2023] [Accepted: 09/07/2023] [Indexed: 09/21/2023]
Abstract
PURPOSE This study aims to compare visual functioning (VF) after first or second eye cataract surgery using the vision-related activity limitation (VRAL) item bank. METHODS This prospective, interventional study included 787 patients (mean age, 58.2 years) with cataract undergoing cataract surgery (first eye surgery with or without ocular comorbidity, second eye surgery with or without ocular comorbidity) at a tertiary eye care center in South India, who were administered the item bank preoperatively and at 6 weeks postoperatively to assess change in VF. Rasch analysis was used to estimate VF at both time points, and responsiveness to cataract surgery was calculated as effect size (ES), which was interpreted as small (≤ 0.2), moderate (0.3-0.7), and large (≥ 0.8). RESULTS Mean best-corrected logMAR VA in the surgical eye improved significantly postoperatively compared to preoperative VA (0.20 ± 0.40 vs. 1.19 ± 0.96; P < 0.0001) across all groups. Patients reported significant and large improvements in VF postoperatively across all groups: largest ES for first eye surgery without comorbidity (1.87 [95% CI, 1.61, 2.13]) and smallest for second eye without ocular comorbidity (1.55 [95% CI, 1.22, 1.88]). Compared to patients undergoing second eye surgery, first eye surgery patients reported significantly lower VF preoperatively (-0.72 ± 2.39 vs. 0.17 ± 2.34 logits; P < 0.0001), and a larger change in VF postoperatively (3.71 ± 2.33 logits vs. 4.27 ± 2.83; P = 0.004). CONCLUSIONS Cataract surgery resulted in large and significant improvements in VF, regardless of ocular comorbidity and first or second eye surgery. The VRAL item bank is a useful tool to measure responsiveness to cataract surgery.
4
|
Content development for a new item-bank for measuring multifocal contact lens performance. J Patient Rep Outcomes 2024; 8:16. [PMID: 38329635 PMCID: PMC10853121 DOI: 10.1186/s41687-024-00689-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 01/18/2024] [Indexed: 02/09/2024]
Abstract
BACKGROUND Presbyopia is an age-related condition that causes a decreased ability to focus on nearby objects. Multifocal contact lenses are commonly used to address this issue. However, there seems to be notable dissatisfaction among multifocal contact lens wearers. The absence of a reliable instrument to measure the patient's perspective, despite the widespread use of this method, highlights the need for further research in this area. OBJECTIVE The objective of this study is to develop an item-bank integrating all domains necessary to assess the patient's perspective on multifocal contact lens performance, offering a comprehensive measure. The item-bank will ensure a high level of content validity, be self-administered, and will initially be available in Spanish. The aim of this tool is to serve as a valuable resource for research and optometric clinics, facilitating the follow-up of patients with presbyopia who wear multifocal contact lenses or those who are newly starting to use them. METHODOLOGY Development of the MCL-PRO item bank followed a systematic, step-wise inductive approach to gather information, in line with the recommendations outlined in the COSMIN guidelines and similar studies. The process involved the following steps: (1) literature review and identification of relevant existing items, (2) social media review, (3) semi-structured focus groups, (4) qualitative analysis, (5) refining and revising the items, and (6) generating the content of the item bank. RESULTS A total of 575 items were included in the item-bank, grouped under 8 domains that were found to be important for the presbyopic population: visual symptoms (213), activity limitation (111), ocular symptoms (135), convenience (36), emotional well-being (33), general symptoms (16), cognitive issues (21) and economic issues (10). CONCLUSION The item-bank was developed following standardised methodology and covers all aspects of MCL performance evaluation from the patient's perspective.
5
|
Development of standard computerised adaptive test (CAT) settings for the EORTC CAT Core. Qual Life Res 2024:10.1007/s11136-023-03576-x. [PMID: 38231438 DOI: 10.1007/s11136-023-03576-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/01/2023] [Indexed: 01/18/2024]
Abstract
AIMS Computerised adaptive test (CAT) provides individualised patient reported outcome measurement while retaining direct comparability of scores across patients and studies. Optimal CAT measurement requires an appropriate CAT-setting, the set of criteria defining the CAT including start item, item selection criterion, and stop criterion. The European Organisation for Research and Treatment of Cancer (EORTC) CAT Core allows for assessing the 14 functional and symptom domains covered by the EORTC QLQ-C30 questionnaire. The aim was to present a general approach for selecting CAT-settings and to use this to develop a portfolio of standard settings for the EORTC CAT Core optimised for different purposes and populations. METHODS Using simulations, the measurement properties of CATs of different length and precision were evaluated and compared allowing for identifying the most suitable settings. All CATs were initiated with the most informative QLQ-C30 item. For each domain two fixed-length and two fixed-precision standard CATs were selected focusing on efficiency (brief version) and precision (long), respectively. RESULTS The brief fixed-length CATs included 3-5 items each while the long versions included 5-8 items. The fixed-precision CATs aimed for reliability of 0.65-0.95 (brief versions) and 0.85-0.98 (long versions), respectively. Median sample size savings using the CATs compared to the QLQ-C30 scales ranged 20%-31%, although savings varied considerably across the domains. CONCLUSION The EORTC CAT Core standard settings simplify selection of relevant and appropriate CATs. The CATs prioritise either brevity and efficiency or precision, but all provide increased measurement precision and hence, reduced sample size requirements compared to the QLQ-C30 scales. The CATs may be used as they are or modified to accommodate specific requirements.
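The three ingredients of a CAT-setting named in the abstract above (start item, item selection criterion, stop criterion) can be illustrated with a minimal sketch. It is not the EORTC implementation: it assumes dichotomous items under a two-parameter logistic model (the EORTC CAT Core items are polytomous), a standard normal prior, maximum-information item selection, and a stop rule that combines a standard-error target with a maximum length. The function names, the `bank` layout, and the `answer` callback are all hypothetical.

```python
import math

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (assumed model, for illustration only)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

def run_cat(bank, answer, start_index, max_items, se_target):
    """Run one adaptive test.

    bank        -- list of (a, b) item parameters (hypothetical layout)
    answer      -- callable: item index -> 0 or 1 (the respondent)
    start_index -- index of the fixed start item
    max_items   -- fixed-length part of the stop criterion
    se_target   -- fixed-precision part of the stop criterion
    """
    grid = [-4.0 + 0.1 * i for i in range(81)]            # ability grid
    posterior = [math.exp(-0.5 * g * g) for g in grid]    # standard normal prior
    administered = []
    next_item = start_index
    while True:
        a, b = bank[next_item]
        x = answer(next_item)
        administered.append(next_item)
        # Bayesian update of the posterior over the grid
        posterior = [
            w * (p_endorse(g, a, b) if x else 1.0 - p_endorse(g, a, b))
            for g, w in zip(grid, posterior)
        ]
        total = sum(posterior)
        theta = sum(g * w for g, w in zip(grid, posterior)) / total   # EAP estimate
        var = sum((g - theta) ** 2 * w for g, w in zip(grid, posterior)) / total
        se = math.sqrt(var)
        # Stop criterion: precision reached, length reached, or bank exhausted
        if se <= se_target or len(administered) >= max_items \
                or len(administered) == len(bank):
            return theta, se, administered
        # Item selection criterion: most informative remaining item
        remaining = [i for i in range(len(bank)) if i not in administered]
        next_item = max(remaining, key=lambda i: item_information(theta, *bank[i]))
```

Setting `se_target` to 0 makes the sketch behave as a fixed-length CAT, while a large `max_items` with a positive `se_target` approximates a fixed-precision CAT.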
6
|
The first steps in the development of a cancer-specific patient-reported experience measure item bank (PREM-item bank): towards dynamic evaluation of experiences. Support Care Cancer 2024; 32:100. [PMID: 38214761 PMCID: PMC10786971 DOI: 10.1007/s00520-023-08266-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 12/17/2023] [Indexed: 01/13/2024]
Abstract
OBJECTIVE Since the implementation of value-based healthcare, there has been a growing emphasis on utilizing patient-reported experience measures (PREMs) to enhance the quality of care. However, the current PREMs are primarily generic and static, whereas healthcare is constantly evolving and encompasses a wide variety of aspects that impact care quality. Continuously improving care requires a dynamic PREM. The aim of this study was to propose an item bank for the establishment of a dynamic and care-specific patient-reported evaluation. METHODS In co-creation with patients, a mixed methods study was conducted involving: (1) an explorative review of the literature, (2) a focus group analysis with (ex-)patients, (3) qualitative analyses to formulate themes, and (4) a quantitative selection of items by patients and experts through prioritization. RESULTS Eight existing PREMs were evaluated. After removing duplicates, 141 items were identified. Through qualitative analyses of the focus group in which the patient journey was discussed, eight themes were formulated: "Organization of healthcare," "Competence of healthcare professionals," "Communication," "Information & services," "Patient empowerment," "Continuity & informal care," "Environment," and "Technology." Seven patients and eleven professionals were asked to prioritize what they considered the most important items. From this, an item bank with 76 items was proposed. CONCLUSION In collaboration with patients and healthcare professionals, we have proposed a PREM-item bank to evaluate the experiences of patients receiving cancer care in an outpatient clinic. This item bank is the first step to dynamically assess the quality of cancer care provided in an outpatient setting.
7
|
Psychometric properties of the Chinese version of the PROMIS-Cancer-Anxiety item bank assessed using a graded response model. Asia Pac J Oncol Nurs 2023; 10:100312. [PMID: 38106438 PMCID: PMC10724486 DOI: 10.1016/j.apjon.2023.100312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 09/24/2023] [Indexed: 12/19/2023]
Abstract
Objective This study aimed to examine the psychometric properties of the Chinese version of the Patient-Reported Outcome Measurement Information System (PROMIS)-Cancer-Anxiety item bank using a graded response model in a sample of patients with cancer. Methods A cross-sectional study was conducted and the Chinese version of the PROMIS-Cancer-Anxiety item bank was used to measure anxiety in patients with cancer. The unidimensional structure of the item bank was evaluated using principal component analysis. Residual correlations and the graphs of item mean scores conditional on the rest scores were examined to evaluate the local independence and monotonicity of the items, respectively. Item characteristics were described using item parameter estimates and item information. Operating characteristic curves (OCCs) and test information curve (TIC) were also plotted. Measurement invariance across age, gender, and education level was assessed to identify possible differential item functioning (DIF). Results A total of 1075 patients with cancer were enrolled. Under the assumptions of unidimensionality, local independence, and monotonicity, the discrimination parameters a ranged from 2.30 to 5.47, and the threshold parameters b ranged from b1 = -2.87 to b4 = 3.21 with proper intervals. Completely overlapped category curves were not observed among the OCCs of any items. Item information and TIC showed that the item bank had a wide measurement range. The DIFs for age, gender, and education level for all items were not remarkable. Conclusions The results supported using the Chinese version of the PROMIS-Cancer-Anxiety item bank to measure anxiety and develop a computerized adaptive testing (CAT) system for anxiety in patients with cancer.
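As a rough illustration of how the discrimination parameter a and the threshold parameters b1 to b4 in the abstract above define response behaviour, the sketch below computes category probabilities under Samejima's graded response model for a five-category item. The parameter values in the usage note are hypothetical, chosen only to fall within the ranges reported in the abstract; they are not estimates for any actual item in the bank.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    theta      -- latent trait level (e.g. anxiety)
    a          -- item discrimination
    thresholds -- ordered threshold parameters b1 < b2 < ... < bK
    Returns K + 1 probabilities, one per response category.
    """
    # Cumulative probabilities P(X >= k), bracketed by 1 (k = 0) and 0 (k = K + 1)
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-a * (theta - b))))
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]
```

For example, `grm_category_probs(0.0, 2.30, [-1.0, 0.0, 1.0, 2.0])` returns five probabilities that sum to 1; raising `theta` shifts probability mass toward the higher categories.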
8
|
Development and psychometric evaluation of item banks for memory and attention - supplements to the EORTC CAT Core instrument. Health Qual Life Outcomes 2023; 21:124. [PMID: 37968682 PMCID: PMC10647100 DOI: 10.1186/s12955-023-02199-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 10/11/2023] [Indexed: 11/17/2023]
Abstract
BACKGROUND Cancer patients may experience a decrease in cognitive functioning before, during and after cancer treatment. So far, the Quality of Life Group of the European Organisation for Research and Treatment of Cancer (EORTC QLG) developed an item bank to assess self-reported memory and attention within a single, cognitive functioning scale (CF) using computerized adaptive testing (EORTC CAT Core CF item bank). However, the distinction between different cognitive functions might be important to assess the patients' functional status appropriately and to determine treatment impact. To allow for such assessment, the aim of this study was to develop and psychometrically evaluate separate item banks for memory and attention based on the EORTC CAT Core CF item bank. METHODS In a multistep process including an expert-based content analysis, we assigned 44 items from the EORTC CAT Core CF item bank to the memory or attention domain. Then, we conducted psychometric analyses based on a sample used within the development of the EORTC CAT Core CF item bank. The sample consisted of 1030 cancer patients from Denmark, France, Poland, and the United Kingdom. We evaluated measurement properties of the newly developed item banks using confirmatory factor analysis (CFA) and item response theory model calibration. RESULTS Item assignment resulted in 31 memory and 13 attention items. Conducted CFAs suggested good fit to a 1-factor model for each domain and no violations of monotonicity or indications of differential item functioning. Evaluation of CATs for both memory and attention confirmed well-functioning item banks with increased power/reduced sample size requirements (for CATs ≥ 4 items and up to 40% reduction in sample size requirements in comparison to non-CAT format). CONCLUSION Two well-functioning and psychometrically robust item banks for memory and attention were formed from the existing EORTC CAT Core CF item bank. 
These findings could support further research on self-reported cognitive functioning in cancer patients in clinical trials as well as in real-world evidence studies. A more precise assessment of attention and memory deficits in cancer patients will strengthen the evidence on the effects of cancer treatment for different cancer entities, and therefore contribute to shared and informed clinical decision-making.
9
|
Development of a diverse set of standard short forms based on the EORTC CAT Core item banks. Qual Life Res 2023:10.1007/s11136-023-03373-6. [PMID: 36853573 DOI: 10.1007/s11136-023-03373-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Accepted: 02/08/2023] [Indexed: 03/01/2023]
Abstract
PURPOSE The European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group has developed item banks covering the 14 domains of the EORTC QLQ-C30 quality of life questionnaire. These allow for dynamic assessment and for forming population/study specific static short forms. To simplify selection of relevant short forms, we here present a portfolio of standard short forms with measurement properties optimized for different populations. METHODS For each domain, a brief and a long version were constructed for each of three populations having mild, moderate, and severe symptoms, respectively. The most informative items were prioritised while also taking content into consideration. All short forms included at least one QLQ-C30 item. The measurement precision/power of the short forms was compared to the corresponding QLQ-C30 scales using simulations. RESULTS In total, 84 short forms were constructed. The brief versions included 3-5 items each, the long versions 5-9 items. Estimated sample size savings using the suggested short forms while maintaining the same power as with the QLQ-C30 ranged 3-50% across domains with median savings of 19% (brief versions) and 28% (long versions), respectively. CONCLUSION The suggested short forms allow for simple selection of items particularly relevant for patients with mild, moderate, or severe symptoms, respectively. They facilitate the use of smaller samples without loss of power compared to the QLQ-C30 scales. The suggested short forms may be used as they are or adapted to the specific aims of individual studies/settings.
10
|
The development of a glaucoma-specific health-related quality of life item bank supporting a novel computerized adaptive testing system in Asia. J Patient Rep Outcomes 2022; 6:107. [PMID: 36219349 PMCID: PMC9554106 DOI: 10.1186/s41687-022-00513-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Accepted: 09/16/2022] [Indexed: 11/07/2022]
Abstract
Background A glaucoma-specific health-related quality of life (HRQoL) item bank (IB) and computerized adaptive testing (CAT) system relevant to Asian populations is not currently available. We aimed to develop content for an IB focusing on HRQoL domains important to Asian people with glaucoma; and to compare the content coverage of our new instrument with established glaucoma-specific instruments.
Methods In this qualitative study of glaucoma patients recruited from the Singapore National Eye Centre (November 2018-November 2019), items/domains were generated from: (1) glaucoma-specific questionnaires; (2) published articles; (3) focus groups/semi-structured interviews with glaucoma patients (n = 27); and (4) feedback from glaucoma experts. Data were analyzed using the constant comparative method. Items were systematically refined to a concise set, and pre-tested using cognitive interviews with 27 additional glaucoma patients.
Results Of the 54 patients (mean ± standard deviation [SD] age 66.9 ± 9.8; 53.7% male), 67 (62.0%), 30 (27.8%), and 11 (10.2%) eyes had primary open angle glaucoma, angle closure glaucoma, and no glaucoma respectively. Eighteen (33.3%), 11 (20.4%), 8 (14.8%), 12 (22.2%), and 5 (9.3%) patients had no, mild, moderate, severe, or advanced/end-stage glaucoma (better eye), respectively. Initially, 311 items within nine HRQoL domains were identified: Visual Symptoms, Ocular Comfort Symptoms, Activity Limitation, Driving, Lighting, Mobility, Psychosocial, Glaucoma management, and Work; however, Driving and Visual Symptoms were subsequently removed during the refinement process. During cognitive interviews, 12, 23 and 10 items were added, dropped and modified, respectively.
Conclusion Following a rigorous process, we developed a 221-item, 7-domain Asian glaucoma-specific IB. Once operationalised using CAT, this new instrument will enable precise, rapid, and comprehensive assessment of the HRQoL impact of glaucoma and associated treatment efficacy.
Supplementary Information The online version contains supplementary material available at 10.1186/s41687-022-00513-3.
11
|
Delirium Item Bank: Utilization to Evaluate and Create Delirium Instruments. Dement Geriatr Cogn Disord 2022; 51:110-119. [PMID: 35533663 PMCID: PMC9518700 DOI: 10.1159/000522522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Accepted: 02/07/2022] [Indexed: 11/19/2022]
Abstract
INTRODUCTION The large number of heterogeneous instruments in active use for identification of delirium prevents direct comparison of studies and the ability to combine results. In a recent systematic review we performed, we recommended four commonly used and well-validated instruments and subsequently harmonized them using advanced psychometric methods to develop an item bank, the Delirium Item Bank (DEL-IB). The goal of the present study was to find optimal cut-points on four existing instruments and to demonstrate use of the DEL-IB to create new instruments. METHODS We used a secondary analysis and simulation study based on data from three previous studies of hospitalized older adults (age 65+ years) in the USA, Ireland, and Belgium. The combined dataset included 600 participants, contributing 1,623 delirium assessments, and an overall incidence of delirium of about 22%. The measurements included the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition diagnostic criteria for delirium, Confusion Assessment Method (long form and short form), Delirium Observation Screening Scale, Delirium Rating Scale-Revised-98 (total and severity scores), and Memorial Delirium Assessment Scale (MDAS). RESULTS We identified different cut-points for each existing instrument to optimize sensitivity or specificity, and compared instrument performance at each cut-point to the author-defined cut-point. For instance, the cut-point on the MDAS that maximizes both sensitivity and specificity was at a sum score of 6 yielding 89% sensitivity and 79% specificity. We then created four new example instruments (two short forms and two long forms) and evaluated their performance characteristics. In the first example short form instrument, the cut-point that maximizes sensitivity and specificity was at a sum score of 3 yielding 90% sensitivity, 81% specificity, 30% positive predictive value, and 99% negative predictive value. 
DISCUSSION/CONCLUSION We used the DEL-IB to better understand the psychometric performance of widely used delirium identification instruments and scorings, and also demonstrated its use to create new instruments. Ultimately, we hope that the DEL-IB might be used to create optimized delirium identification instruments and to spur the development of a unified approach to identify delirium.
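The performance figures quoted in the abstract above (sensitivity, specificity, positive and negative predictive value at a given cut-point) all derive from a two-by-two table of instrument result against reference diagnosis. The sketch below shows that calculation for one cut-point. It assumes that a sum score at or above the cut-point counts as a positive screen, which the abstract does not state explicitly, and the function name and input layout are hypothetical.

```python
def diagnostic_metrics(scores, reference, cut_point):
    """Sensitivity, specificity, PPV and NPV of a sum score at one cut-point.

    scores    -- instrument sum scores, one per assessment
    reference -- reference-standard diagnoses (1 = delirium, 0 = no delirium)
    cut_point -- scores >= cut_point are treated as positive (assumption)
    """
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, reference):
        positive = score >= cut_point
        if positive and truth:
            tp += 1
        elif positive and not truth:
            fp += 1
        elif not positive and truth:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

Calling the function over a range of cut-points and comparing the resulting sensitivity and specificity is one simple way to explore the trade-off that the study describes. Predictive values also depend on how common delirium is among the assessments, so they do not transfer directly between settings.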
12
|
Identifying the content for an item bank and computerized adaptive testing system to measure the impact of age-related macular degeneration on health-related quality of life. Qual Life Res 2021; 31:1237-1246. [PMID: 34562188 DOI: 10.1007/s11136-021-02989-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/29/2021] [Indexed: 11/27/2022]
Abstract
PURPOSE We are developing an age-related macular degeneration (AMD) health-related quality of life (HRQoL) item bank, applicable to Western and Asian populations. We report primarily on content generation and refinement, but also compare the HRQoL issues reported in our study with Western studies and current AMD-HRQoL questionnaires. METHODS In this cross-sectional, qualitative study of AMD patients attending the Singapore National Eye Centre (May-December 2019), items/domains were generated from: (1) AMD-specific questionnaires; (2) published articles; (3) focus groups/semi-structured interviews with AMD patients (n = 27); and (4) written feedback from retinal experts. Following thematic analysis, items were systematically refined to a minimally representative set and pre-tested using cognitive interviews with 16 AMD patients. RESULTS Of the 27 patients (mean ± standard deviation age 67.9 ± 7.0; 59.2% male), 18 (66.7%), two (7.4%), and seven (25.9%) had no, early-intermediate, and late/advanced AMD (better eye), respectively. Whilst some HRQoL issues, e.g. activity limitation, mobility, lighting, and concerns were similarly reported by Western patients and covered by other questionnaires, others like anxiety about intravitreal injections, work tasks, and financial dependency were novel. Overall, 462 items within seven independent HRQoL domains were identified: Activity limitation, Lighting, Mobility, Emotional, Concerns, AMD management, and Work. Following item refinement, items were reduced to 219, with 31 items undergoing amendment. CONCLUSION Our 7-domain, 219-item AMD-specific HRQoL instrument will undergo psychometric testing and calibration for computerized adaptive testing. The future instrument will enable users to precisely, rapidly, and comprehensively quantify the HRQoL impact of AMD and associated treatments, with item coverage relevant across several populations.
13
|
Swedish translation and cross-cultural adaptation of eight pediatric item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS) ®. J Patient Rep Outcomes 2021; 5:80. [PMID: 34487250 PMCID: PMC8421493 DOI: 10.1186/s41687-021-00353-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/01/2021] [Accepted: 08/24/2021] [Indexed: 12/11/2022]
Abstract
BACKGROUND This study is part of the Swedish initiative for the establishment of standardized, modern patient-reported measures for national use in Swedish healthcare. The goal was to translate and culturally adapt eight pediatric Patient-Reported Outcomes Measurement Information System (PROMIS®) item banks (anger, anxiety, depressive symptoms, family relationships, fatigue, pain interference, peer relationships and physical activity) into Swedish. METHODS Authorization to translate all currently available pediatric PROMIS item banks (autumn, 2016) into Swedish was obtained from the PROMIS Health Organization. The translation followed the Functional Assessment of Chronic Illness Therapy translation recommendations with one major modification, which was the use of a bilingual multi-professional review workshop. The following steps were applied: translation, reconciliation, a two-day multi professional reviewer workshop, back translation, and cognitive debriefing with eleven children (8-17 years) before final review. The bilingual multi-professional review workshop provided a simultaneous, in-depth assessment from different professionals. The group consisted of questionnaire design experts, researchers experienced in using patient-reported measures in healthcare, linguists, and pediatric healthcare professionals. RESULTS All item banks had translation issues that needed to be resolved. Twenty-four items (20.7%) needed resolution at the final review stage after cognitive debriefing. The issues with translations included 1. Lack of matching definitions with items across languages (6 items); 2. Problems related to language, vocabulary, and cultural differences (6 items); and 3. Difficulties in adaptation to age-appropriate language (12 items). CONCLUSIONS The translated and adapted versions of the eight Swedish pediatric PROMIS item banks are linguistically acceptable. The next stage will be cross-cultural validation studies in Sweden. 
Despite the fact that there are cultural differences between Sweden and the United States, our translation processes have successfully managed to address all issues. Expert review groups from already-established networks and processes regarding pediatric healthcare throughout the country will facilitate the future implementation of pediatric PROMIS item banks in Sweden.
14
|
Optimizing measurement of vision-related quality of life: a computerized adaptive test for the impact of vision impairment questionnaire (IVI-CAT). Qual Life Res 2019; 29:765-774. [PMID: 31707693 DOI: 10.1007/s11136-019-02354-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Accepted: 10/29/2019] [Indexed: 10/25/2022]
Abstract
PURPOSE To compare the results from a simulated computerized adaptive test (CAT) for the 28-item Impact of Vision Impairment (IVI) questionnaire and the original paper-pencil version in terms of efficiency (main outcome), defined as percentage item reduction. METHODS Using paper-pencil IVI data from 832 participants across the spectrum of vision impairment, item calibrations of the 28-item IVI instrument and its associated 20-item vision-specific functioning (VSF) and 8-item emotional well-being (EWB) subscales were generated with Rasch analysis. Based on these calibrations, CAT simulations were conducted on 1000 cases, with 'high' and 'moderate' precision stopping rules (standard error of measurement [SEM] 0.387 and 0.521, respectively). We examined the average number of items needed to satisfy the stopping rules and the corresponding percentage item reduction, level of agreement between person measures estimated from the full IVI item bank and from the CAT simulations, and item exposure rates (IER). RESULTS For the overall IVI-CAT, 5 or 9.7 items were required, on average, to obtain moderate or high precision estimates of vision-related quality of life, corresponding to 82.1 and 65.4% item reductions compared to the paper-pencil IVI. Agreement was high between the person measures generated from the full IVI item bank and the IVI-CAT for both the high precision simulation (mean bias, -0.004 logits; 95% LOA -0.594 to 0.587) and moderate precision simulation (mean bias, 0.014 logits; 95% LOA -0.828 to 0.855). The IER for the IVI-CAT in the moderate precision simulation was skewed, with six EWB items used > 40% of the time. CONCLUSION Compared to the paper-pencil IVI instrument, the IVI-CATs required fewer items without loss of measurement precision, making them potentially attractive outcome instruments for implementation into clinical trials, healthcare, and research. Final versions of the IVI-CATs are available.
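The efficiency outcome in the abstract above, percentage item reduction, can be reproduced from the reported figures with a one-line calculation. The sketch below assumes the reduction is measured against the full 28-item instrument; the function name is hypothetical.

```python
def item_reduction_percent(full_length, mean_items_used):
    """Percentage of items saved by the CAT relative to the full instrument."""
    return 100.0 * (full_length - mean_items_used) / full_length

# Figures reported for the overall IVI-CAT against the 28-item paper-pencil IVI
moderate = item_reduction_percent(28, 5)    # moderate precision stopping rule
high = item_reduction_percent(28, 9.7)      # high precision stopping rule
print(f"{moderate:.1f}% and {high:.1f}%")   # prints "82.1% and 65.4%"
```

Both values agree with the 82.1% and 65.4% item reductions stated in the abstract.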
|
15
|
A qualitative study investigating the meaning of participation to improve the measurement of this construct. Qual Life Res 2019; 28:2233-2246. [PMID: 30993605 PMCID: PMC6620252 DOI: 10.1007/s11136-019-02179-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Accepted: 03/04/2019] [Indexed: 12/16/2022]
Abstract
PURPOSE The purpose of this study was to improve the measurement of participation. Research questions were as follows: (1) What constitutes participation according to adults? (2) Do they mention participation subdomains that are not covered in the Patient-Reported Outcomes Measurement Information System (PROMIS) item bank "Ability to Participate in Social Roles and Activities"? METHODS Semi-structured interviews were conducted with 46 adults from the general population. Interviews were thematically analysed using the International Classification of Functioning, Disability and Health (ICF) as the conceptual framework. Thereafter, assigned codes were compared to the PROMIS item bank. RESULTS Participants mentioned a variety of participation subdomains that were meaningful to them, such as socializing and employment. All subdomains could be classified into the ICF. The following subdomains were not covered by the PROMIS item bank: acquisition of necessities, education life, economic life, community life, and religion and spirituality. The item bank also lacked a distinction between remunerative (i.e. paid) employment, non-remunerative (i.e. unpaid) employment, and domestic life. Several ICF sub-codes were not mentioned, such as ceremonies. CONCLUSIONS Participants described many participation subdomains as meaningful. As several of these subdomains are not covered in the PROMIS item bank, it may benefit from extension with new (patient-)reported subdomains of participation.
|
16
|
Towards standardized patient reported physical function outcome reporting: linking ten commonly used questionnaires to a common metric. Qual Life Res 2018; 28:187-197. [PMID: 30317425 PMCID: PMC6339672 DOI: 10.1007/s11136-018-2007-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Accepted: 09/17/2018] [Indexed: 02/01/2023]
Abstract
OBJECTIVES Outcomes obtained using different physical function patient reported outcome measures (PROMs) are difficult to compare. To facilitate standardization of physical function outcome measurement and reporting, we developed an item response theory (IRT) based standardized physical function score metric for ten commonly used physical function PROMs. METHODS Data from a total of 16,386 respondents from representative cohorts of patients with rheumatic diseases as well as the Dutch general population were used to map the items of ten commonly used physical function PROMs on a continuous latent physical function variable. The resulting IRT based common metric was cross-validated in an independent dataset of 243 patients with gout, osteoarthritis or polymyalgia in which four of the linked PROMs were administered. RESULTS Our analyses supported that all 97 items of the ten included PROMs relate to a single underlying physical function variable and that responses to each item could be described by the generalized partial credit IRT model. In the cross-validation analyses we found congruent mean scores for four different PROMs when the IRT based scoring procedures were used. CONCLUSIONS We showed that the standardized physical function score metric developed in this study can be used to facilitate standardized reporting of physical function outcomes for ten commonly used physical function PROMs.
|
17
|
Developing an item bank to measure the coping strategies of people with hereditary retinal diseases. Graefes Arch Clin Exp Ophthalmol 2018; 256:1291-1298. [PMID: 29730797 DOI: 10.1007/s00417-018-3998-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/14/2017] [Revised: 04/10/2018] [Accepted: 04/20/2018] [Indexed: 01/09/2023] Open
Abstract
PURPOSE Our understanding of the coping strategies used by people with visual impairment to manage stress related to visual loss is limited. This study aims to develop a sophisticated coping instrument in the form of an item bank implemented via computerised adaptive testing (CAT) for hereditary retinal diseases. METHODS Items on coping were extracted from qualitative interviews with patients and supplemented by items from a literature review. A systematic multi-stage process of item refinement was carried out, followed by expert panel discussion and cognitive interviews. The final coping item bank had 30 items. Rasch analysis was used to assess the psychometric properties. A CAT simulation was carried out to estimate the average number of items required to gain precise measurement of hereditary retinal disease-related coping. RESULTS One hundred eighty-nine participants answered the coping item bank (median age = 58 years). The coping scale demonstrated good precision and targeting. The standardised residual loadings for items revealed six items grouped together. Removal of the six items reduced the precision of the main coping scale and worsened the variance explained by the measure. Therefore, the six items were retained within the main scale. Our CAT simulation indicated that, on average, fewer than 10 items are required to gain a precise measurement of coping. CONCLUSIONS This is the first study to develop a psychometrically robust coping instrument for hereditary retinal diseases. CAT simulation indicated that, on average, only four and nine items were required to gain measurement at moderate and high precision, respectively.
|
18
|
Some recommendations for developing multidimensional computerized adaptive tests for patient-reported outcomes. Qual Life Res 2018; 27:1055-1063. [PMID: 29476312 PMCID: PMC5874279 DOI: 10.1007/s11136-018-1821-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Accepted: 02/21/2018] [Indexed: 10/31/2022]
Abstract
PURPOSE Multidimensional item response theory and computerized adaptive testing (CAT) are increasingly used in mental health, quality of life (QoL), and patient-reported outcome measurement. Although multidimensional assessment techniques hold promise, they are more challenging in their application than unidimensional ones. The authors comment on minimal standards when developing multidimensional CATs. METHODS Prompted by pioneering papers published in QLR, the authors reflect on existing guidance and discussions from different psychometric communities, including guidelines developed for unidimensional CATs in the PROMIS project. RESULTS The commentary focuses on two key topics: (1) the design, evaluation, and calibration of multidimensional item banks and (2) how to study the efficiency and precision of a multidimensional item bank. The authors suggest that the development of a carefully designed and calibrated item bank encompasses a construction phase and a psychometric phase. With respect to efficiency and precision, item banks should be large enough to provide adequate precision over the full range of the latent constructs. Therefore, CAT performance should be studied as a function of the latent constructs and with reference to relevant benchmarks. Solutions are also suggested for simulation studies using real data, which often result in overly optimistic evaluations of an item bank's efficiency and precision. DISCUSSION Multidimensional CAT applications are promising but complex statistical assessment tools which necessitate detailed theoretical frameworks and methodological scrutiny when testing their appropriateness for practical applications. The authors advise researchers to evaluate item banks with a broad set of methods, describe their choices in detail, and substantiate their approach for validation.
|
19
|
Development of an item bank to measure factual disease and treatment related knowledge of rheumatoid arthritis patients in the treat to target era. PATIENT EDUCATION AND COUNSELING 2018; 101:67-73. [PMID: 28811047 DOI: 10.1016/j.pec.2017.07.019] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 12/23/2016] [Revised: 04/21/2017] [Accepted: 07/17/2017] [Indexed: 06/07/2023]
Abstract
OBJECTIVE To develop a Disease and treatment associated Knowledge in RA item bank (DataK-RA) based on item response theory. METHODS Initial items were developed from a systematic review. Rheumatology professionals identified relevant content through a RAND modified Delphi scoring procedure and consensus meeting. RA patients provided additional content through a focus group. Patients and professionals rated readability, feasibility and comprehensiveness of resulting items. Cross-sectional data were collected to evaluate psychometric properties of the items. RESULTS Data from 473 patients were used for item reduction and calibration. Twenty items were discarded based on corrected item-total point biserial correlation <0.30. Confirmatory factor analysis with weighted least squares estimation on the polychoric correlation matrix suggested good fit for a unidimensional model for the remaining 42 items (CFI=0.97, TLI=0.97, RMSEA=0.02, WRMR=0.97), supporting the proposed scoring procedure. Scores were highly reliable and normally distributed with minimal ceiling (1.8%) and no floor effects. 75% of tested hypotheses about the association of DataK-RA scores with related constructs were supported, indicating good construct validity. CONCLUSION DataK-RA is a psychometrically sound item bank. PRACTICE IMPLICATIONS DataK-RA provides health professionals and researchers with a tool to identify and target patients' information needs or to assess effects of educational efforts.
|
20
|
Measuring everyday functional competence using the Rasch assessment of everyday activity limitations (REAL) item bank. Qual Life Res 2017; 26:2949-2959. [PMID: 28638966 PMCID: PMC5655561 DOI: 10.1007/s11136-017-1627-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 06/15/2017] [Indexed: 11/23/2022]
Abstract
OBJECTIVE Traditional patient-reported physical function instruments often poorly differentiate patients with mild-to-moderate disability. We describe the development and psychometric evaluation of a generic item bank for measuring everyday activity limitations in outpatient populations. STUDY DESIGN AND SETTING Seventy-two items generated from patient interviews and mapped to the International Classification of Functioning, Disability and Health (ICF) domestic life chapter were administered to 1128 adults representative of the Dutch population. The partial credit model was fitted to the item responses and evaluated with respect to its assumptions, model fit, and differential item functioning (DIF). Measurement performance of a computerized adaptive testing (CAT) algorithm was compared with the SF-36 physical functioning scale (PF-10). RESULTS A final bank of 41 items was developed. All items demonstrated acceptable fit to the partial credit model and measurement invariance across age, sex, and educational level. Five- and ten-item CAT simulations were shown to have high measurement precision, which exceeded that of the SF-36 physical functioning scale across the physical function continuum. Floor effects were absent for a 10-item empirical CAT simulation, and ceiling effects were low (13.5%) compared with SF-36 physical functioning (38.1%). CAT also discriminated better than SF-36 physical functioning between age groups, number of chronic conditions, and respondents with or without rheumatic conditions. CONCLUSION The Rasch assessment of everyday activity limitations (REAL) item bank will hopefully prove a useful instrument for assessing everyday activity limitations. T-scores obtained using derived measures can be used to benchmark physical function outcomes against the general Dutch adult population.
|
21
|
Psychometric evaluation of an item bank for computerized adaptive testing of the EORTC QLQ-C30 cognitive functioning dimension in cancer patients. Qual Life Res 2017; 26:2919-2929. [PMID: 28707048 PMCID: PMC5655578 DOI: 10.1007/s11136-017-1648-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Accepted: 07/08/2017] [Indexed: 11/29/2022]
Abstract
BACKGROUND The European Organisation of Research and Treatment of Cancer (EORTC) Quality of Life Group is developing computerized adaptive testing (CAT) versions of all EORTC Quality of Life Questionnaire (QLQ-C30) scales with the aim of enhancing measurement precision. Here we present the results of the field-testing and psychometric evaluation of the item bank for cognitive functioning (CF). METHODS In previous phases (I-III), 44 candidate items were developed measuring CF in cancer patients. In phase IV, these items were psychometrically evaluated in a large sample of international cancer patients. This evaluation included an assessment of dimensionality, fit to the item response theory (IRT) model, differential item functioning (DIF), and measurement properties. RESULTS A total of 1030 cancer patients completed the 44 candidate items on CF. Of these, 34 items could be included in a unidimensional IRT model, showing an acceptable fit. Although several items showed DIF, these had a negligible impact on CF estimation. Measurement precision of the item bank was much higher than that of the two original QLQ-C30 CF items alone, across the whole continuum. Moreover, CAT measurement may on average reduce study sample sizes by about 35-40% compared to the original QLQ-C30 CF scale, without loss of power. CONCLUSION A CF item bank for CAT measurement consisting of 34 items was established, applicable to various cancer patients across countries. This CAT measurement system will facilitate precise and efficient assessment of HRQOL of cancer patients, without loss of comparability of results.
|
22
|
The COPD-SIB: a newly developed disease-specific item bank to measure health-related quality of life in patients with chronic obstructive pulmonary disease. Health Qual Life Outcomes 2016; 14:97. [PMID: 27349641 PMCID: PMC4924274 DOI: 10.1186/s12955-016-0500-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 02/05/2016] [Accepted: 06/20/2016] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND Health-related quality of life (HRQoL) is widely used as an outcome measure in the evaluation of treatment interventions in patients with chronic obstructive pulmonary disease (COPD). In order to address challenges associated with existing fixed-length measures (e.g., too long to be used routinely, too short to ensure both content validity and reliability), a COPD-specific item bank (COPD-SIB) was developed. METHODS Items were selected based on literature review and interviews with Dutch COPD patients, with a strong focus on both content validity and item comprehension. The psychometric quality of the item bank was evaluated using Mokken Scale Analysis and parametric Item Response Theory, using data of 666 COPD patients. RESULTS The final item bank contains 46 items that form a strong scale, tapping into eight important themes that were identified based on literature review and patient interviews: Coping with disease/symptoms, adaptability; Autonomy; Anxiety about the course/end-state of the disease, hopelessness; Positive psychological functioning; Situations triggering or enhancing breathing problems; Symptoms; Activity; Impact. CONCLUSIONS The 46-item COPD-SIB has good psychometric properties and content validity. Items are available in Dutch and English. The COPD-SIB can be used as a stand-alone instrument, or to inform computerised adaptive testing.
|
23
|
PROMIS Fatigue Item Bank had Clinical Validity across Diverse Chronic Conditions. J Clin Epidemiol 2016; 73:128-34. [PMID: 26939927 PMCID: PMC4902759 DOI: 10.1016/j.jclinepi.2015.08.037] [Citation(s) in RCA: 153] [Impact Index Per Article: 19.1] [Received: 05/14/2014] [Revised: 08/19/2015] [Accepted: 08/21/2015] [Indexed: 01/05/2023]
Abstract
OBJECTIVE To evaluate the comparability and responsiveness of the Patient-Reported Outcomes Measurement Information System (PROMIS) fatigue item bank across six chronic conditions. STUDY DESIGN AND SETTING Individuals (n = 1,430) with chronic obstructive pulmonary disease (n = 125), chronic heart failure (n = 60), chronic back pain (n = 218), major depressive disorder (n = 196), rheumatoid arthritis (n = 521), and cancer (n = 310) completed assessments from the PROMIS fatigue item bank at baseline and a clinically relevant follow-up. The cancer and arthritis samples were followed in observational studies; the other four groups were enrolled immediately before a planned clinical intervention. All participants completed global ratings of change at follow-up. Linear mixed-effects models and standardized response means were estimated to examine clinical validity and responsiveness to change. RESULTS All patient groups reported more fatigue than the general population (range = 0.2-1.29 standard deviations worse). The four clinical groups with pretreatment baseline data experienced significant improvement in fatigue at follow-up (effect size range = 0.25-0.91). Individuals reporting better overall health usually experienced larger fatigue changes than those reporting worse overall health. CONCLUSION The results support the PROMIS fatigue measures' responsiveness to change in six different chronic conditions. In addition, these results support the ability of the PROMIS fatigue measures to compare differences in fatigue across a range of chronic conditions, thereby enabling comparative effectiveness research.
|
24
|
Validity of PROMIS physical function measured in diverse clinical samples. J Clin Epidemiol 2016; 73:112-8. [PMID: 26970039 PMCID: PMC4968197 DOI: 10.1016/j.jclinepi.2015.08.039] [Citation(s) in RCA: 156] [Impact Index Per Article: 19.5] [Received: 05/14/2014] [Revised: 08/19/2015] [Accepted: 08/21/2015] [Indexed: 11/17/2022]
Abstract
OBJECTIVES To evaluate the validity of the Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function measures using longitudinal data collected in six chronic health conditions. STUDY DESIGN AND SETTING Individuals with rheumatoid arthritis (RA), major depressive disorder (MDD), back pain, chronic obstructive pulmonary disease (COPD), chronic heart failure (CHF), and cancer completed the PROMIS Physical Function computerized adaptive test or fixed-length short form at baseline and at the end of clinically relevant follow-up intervals. Anchor items were also administered to assess change in physical function and general health. Linear mixed-effects models and standardized response means were estimated at baseline and follow-up. RESULTS A total of 1,415 individuals participated (COPD n = 121; CHF n = 57; back pain n = 218; MDD n = 196; RA n = 521; cancer n = 302). The PROMIS Physical Function scores improved significantly for treatment of CHF and back pain patients but not for patients with MDD or COPD. Most of the patient subsamples that reported improvement or worsening on the anchors showed a corresponding positive or negative change in PROMIS Physical Function. CONCLUSION This study provides evidence that the PROMIS Physical Function measures are sensitive to change in intervention studies where physical function is expected to change and able to distinguish among different clinical samples. The results inform the estimation of meaningful change, enabling comparative effectiveness research.
|
25
|
Clinical validity of PROMIS Depression, Anxiety, and Anger across diverse clinical samples. J Clin Epidemiol 2016; 73:119-27. [PMID: 26931289 DOI: 10.1016/j.jclinepi.2015.08.036] [Citation(s) in RCA: 290] [Impact Index Per Article: 36.3] [Received: 05/14/2014] [Revised: 08/25/2015] [Accepted: 08/31/2015] [Indexed: 10/22/2022]
Abstract
OBJECTIVES The purpose of this study was to evaluate the responsiveness to change of the PROMIS negative affect measures (depression, anxiety, and anger) using longitudinal data collected in six chronic health conditions. STUDY DESIGN AND SETTING Individuals with major depressive disorder (MDD), back pain, chronic obstructive pulmonary disease (COPD), chronic heart failure (CHF), and cancer completed PROMIS negative affect instruments as computerized adaptive test or as fixed-length short form at baseline and a clinically relevant follow-up interval. Participants also completed global ratings of health. Linear mixed effects models and standardized response means (SRM) were estimated at baseline and follow-up. RESULTS A total of 903 individuals participated (back pain, n = 218; cancer, n = 304; CHF, n = 60; COPD, n = 125; MDD, n = 196). All three negative affect instruments improved significantly for treatments of depression and pain. Depression improved for CHF patients (anxiety and anger not administered), whereas anxiety improved significantly in COPD groups (stable and exacerbation). Response to treatment was not assessed in cancer. Subgroups of patients reporting better or worse health showed a corresponding positive or negative average SRM for negative affect across samples. CONCLUSION This study provides evidence that the PROMIS negative affect scores are sensitive to change in intervention studies in which negative affect is expected to change. These results inform the estimation of meaningful change and enable comparative effectiveness research.
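The standardized response mean (SRM) used in this abstract is conventionally computed as the mean change score divided by the standard deviation of the change scores. The Python sketch below illustrates that formula only; the function name and the scores are hypothetical, and nothing here is taken from the study's data or analysis code.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean(change) / SD(change), with change = follow-up - baseline."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    if len(changes) < 2:
        raise ValueError("at least two paired observations are required")
    spread = stdev(changes)  # sample standard deviation of the change scores
    if spread == 0:
        raise ValueError("SRM is undefined when all changes are identical")
    return mean(changes) / spread


# Hypothetical scores for four respondents (illustration only, not study data).
baseline_scores = [60, 55, 62, 58]
follow_up_scores = [54, 53, 55, 57]
srm = standardized_response_mean(baseline_scores, follow_up_scores)
print(f"SRM = {srm:.2f}")
```

When higher scores mean more negative affect, a negative SRM indicates improvement; the sign convention depends on how the change score is defined.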
|
26
|
Using Patient Health Questionnaire-9 item parameters of a common metric resulted in similar depression scores compared to independent item response theory model reestimation. J Clin Epidemiol 2015; 71:25-34. [PMID: 26475569 DOI: 10.1016/j.jclinepi.2015.10.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 02/18/2015] [Revised: 09/29/2015] [Accepted: 10/06/2015] [Indexed: 11/23/2022]
Abstract
OBJECTIVES To investigate the validity of a common depression metric in independent samples. STUDY DESIGN AND SETTING We applied a common metrics approach based on item-response theory for measuring depression to four German-speaking samples that completed the Patient Health Questionnaire (PHQ-9). We compared the PHQ item parameters reported for this common metric to reestimated item parameters that were derived from fitting a generalized partial credit model solely to the PHQ-9 items. We calibrated the new model on the same scale as the common metric using two approaches (estimation with shifted prior and Stocking-Lord linking). By fitting a mixed-effects model and using Bland-Altman plots, we investigated the agreement between latent depression scores resulting from the different estimation models. RESULTS We found different item parameters across samples and estimation methods. Although differences in latent depression scores between different estimation methods were statistically significant, these were clinically irrelevant. CONCLUSION Our findings provide evidence that it is possible to estimate latent depression scores by using the item parameters from a common metric instead of reestimating and linking a model. The use of common metric parameters is simple (for example, via a Web application, http://www.common-metrics.org) and offers a long-term perspective to improve the comparability of patient-reported outcome measures.
|
27
|
Developing an item bank and short forms that assess the impact of asthma on quality of life. Respir Med 2013; 108:252-63. [PMID: 24411842 DOI: 10.1016/j.rmed.2013.12.008] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/31/2013] [Revised: 12/06/2013] [Accepted: 12/16/2013] [Indexed: 01/24/2023]
Abstract
The present work describes the process of developing an item bank and short forms that measure the impact of asthma on quality of life (QoL) that avoids confounding QoL with asthma symptomatology and functional impairment. Using a diverse national sample of adults with asthma (N = 2032) we conducted exploratory and confirmatory factor analyses, and item response theory and differential item functioning analyses to develop a 65-item unidimensional item bank and separate short form assessments. A psychometric evaluation of the RAND Impact of Asthma on QoL item bank (RAND-IAQL) suggests that though the concept of asthma impact on QoL is multi-faceted, it may be measured as a single underlying construct. The performance of the bank was then evaluated with a real-data simulated computer adaptive test. From the RAND-IAQL item bank we then developed two short forms consisting of 4 and 12 items (reliability = 0.86 and 0.93, respectively). A real-data simulated computer adaptive test suggests that as few as 4-5 items from the bank are needed to obtain highly precise scores. Preliminary validity results indicate that the RAND-IAQL measures distinguish between levels of asthma control. To measure the impact of asthma on QoL, users of these items may choose from two highly reliable short forms, computer adaptive test administration, or content-specific subsets of items from the bank tailored to their specific needs.
|
28
|
Subjective well-being measures for children were developed within the PROMIS project: presentation of first results. J Clin Epidemiol 2013; 67:207-18. [PMID: 24295987 DOI: 10.1016/j.jclinepi.2013.08.018] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Received: 12/23/2011] [Revised: 07/25/2013] [Accepted: 08/13/2013] [Indexed: 11/22/2022]
Abstract
OBJECTIVES The aims of this Patient Reported Outcome Measurement Information System (PROMIS) study were to (1) conceptualize children's subjective well-being (SWB) and (2) produce item pools with excellent content validity for calibration and use in computerized adaptive tests (CATs). STUDY DESIGN AND SETTING Children's SWB was defined through semistructured interviews with experts, children (aged 8-17 years), parents, and a systematic literature review to identify item concepts comprehensively covering the full spectrum of SWB. Item concepts were transformed into item expressions and evaluated for comprehensibility using cognitive interviews, reading level analysis, and translatability review. RESULTS Children's SWB comprises affective (positive affect) and global evaluation components (life satisfaction). Input from experts, children, parents, and the literature indicated that the eudaimonic dimension of SWB (that is, a sense of meaning and purpose) could be evaluated. Item pools for life satisfaction (56 items), positive affect (53 items), and meaning and purpose (55 items) were produced. Small differences in comprehensibility of some items were observed between children and adolescents. CONCLUSION The SWB measures for children are the first to assess both the hedonic and eudaimonic aspects of SWB. Both children and youth seem to understand the concepts of a meaningful life, optimism, and goal orientation.
|