For: Zheng Y, Chang CH, Chang HH. Content-balancing strategy in bifactor computerized adaptive patient-reported outcome measurement. Qual Life Res 2013;22:491-9. [PMID: 22538634 DOI: 10.1007/s11136-012-0179-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Accepted: 04/04/2012] [Indexed: 10/28/2022]

Cited by Other Article(s)

Giordano A, Testa S, Bassi M, Cilia S, Bertolotto A, Quartuccio ME, Pietrolongo E, Falautano M, Grobberio M, Niccolai C, Allegri B, Viterbo RG, Confalonieri P, Giovannetti AM, Cocco E, Grasso MG, Lugaresi A, Ferriani E, Nocentini U, Zaffaroni M, De Livera A, Jelinek G, Solari A, Rosato R. Applying multidimensional computerized adaptive testing to the MSQOL-54: a simulation study. Health Qual Life Outcomes 2023;21:61. [PMID: 37357308 DOI: 10.1186/s12955-023-02152-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Accepted: 06/15/2023] [Indexed: 06/27/2023] Open

Abstract

BACKGROUND

The Multiple Sclerosis Quality of Life-54 (MSQOL-54) is one of the most commonly used MS-specific health-related quality of life (HRQOL) measures. It is a multidimensional inventory that combines the generic SF-36 core items with 18 MS-targeted items. An adaptive short version that provides immediate item scoring may improve the instrument's usability and validity. However, multidimensional computerized adaptive testing (MCAT) has not previously been applied to MSQOL-54 items. We thus aimed to apply MCAT to the MSQOL-54 and assess its performance.

METHODS

Responses from a large international sample of 3669 MS patients were assessed. We calibrated 52 (of the 54) items using a bifactor graded response model (10 group factors and one general HRQOL factor). Eight simulations were then run with different termination criteria: standard errors (SE) for the general factor and the group factors set to different values, and the change in factor estimates from one item to the next set at < 0.01 for both the general and the group factors. Performance of the MCAT was assessed by the number of administered items, the root mean square difference (RMSD), and the correlation.

RESULTS

Eight items were removed due to local dependency. The simulation with SE set to 0.32 (general factor), and no SE thresholds (group factors) provided satisfactory performance: the median number of administered items was 24, RMSD was 0.32, and correlation was 0.94.

CONCLUSIONS

Compared to the full-length MSQOL-54, the simulated MCAT required fewer items without losing precision for the general HRQOL factor. Further work is needed to add/integrate/revise MSQOL-54 items in order to make the calibration and MCAT performance efficient also on group factors, so that the MCAT version may be used in clinical practice and research.
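The two agreement metrics named in the abstract above, RMSD and correlation, can be illustrated with a minimal sketch. This is not the authors' code: the function names `rmsd` and `pearson` and the example scores are hypothetical, and the sketch assumes that each respondent has one full-length score and one adaptive score for the same factor.

```python
import math


def rmsd(full_scores, cat_scores):
    """Root mean square difference between paired score lists."""
    if len(full_scores) != len(cat_scores) or not full_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    squared = [(f - c) ** 2 for f, c in zip(full_scores, cat_scores)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation between paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical factor scores for five respondents (not data from the study).
full_length = [-1.2, -0.4, 0.1, 0.8, 1.5]
adaptive = [-1.0, -0.6, 0.3, 0.7, 1.2]

print("RMSD:", round(rmsd(full_length, adaptive), 3))
print("Correlation:", round(pearson(full_length, adaptive), 3))
```

A low RMSD together with a high correlation would indicate that the shortened adaptive administration recovers the full-length scores closely.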

Affiliation(s)

Andrea Giordano Unit of Neuroepidemiology, Fondazione IRCCS Istituto Neurologico Carlo Besta, Via Celoria 11, Milan, 20133, Italy; Department of Psychology, University of Turin, Turin, Italy
Silvia Testa Department of Human and Social Sciences, University of Aosta Valley, Aosta, Italy
Marta Bassi Department of Biomedical and Clinical Sciences, Università di Milano, Milan, Italy
Sabina Cilia Department of Territorial Activities, Azienda Sanitaria Provinciale, Health District, Catania, Italy
Antonio Bertolotto Neurology Unit & Regional Referral Multiple Sclerosis Centre (CReSM), University Hospital San Luigi Gonzaga, Orbassano, Italy
Maria Esmeralda Quartuccio Department of Neuroscience, San Camillo-Forlanini Hospital, Rome, Italy
Erika Pietrolongo Department of Neurosciences, Imaging and Clinical Sciences, University G. d'Annunzio, Chieti, Italy
Monica Falautano Psychological Service - Neurological and Neurological Rehabilitation Units, IRCCS San Raffaele, Milan, Italy
Monica Grobberio Laboratory of Clinical Neuropsychology, Psychology Unit, ASST Lariana, Como, Italy
Claudia Niccolai IRCCS Don Gnocchi Foundation, Florence, Italy
Beatrice Allegri Multiple Sclerosis Center, Neurology Unit, Hospital of Vaio, Fidenza, Italy
Rosa Gemma Viterbo Azienda Sanitaria Locale, ASL-BA, Bari, Italy
Paolo Confalonieri Multiple Sclerosis Center, Unit of Neuroimmunology and Neuromuscular Diseases, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
Ambra Mara Giovannetti Unit of Neuroepidemiology, Fondazione IRCCS Istituto Neurologico Carlo Besta, Via Celoria 11, Milan, 20133, Italy; Multiple Sclerosis Center, Unit of Neuroimmunology and Neuromuscular Diseases, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
Eleonora Cocco Department of Medical Science and Public Health, University of Cagliari, Cagliari, Italy; Multiple Sclerosis Center, ASL Cagliari, ATS Sardegna, Cagliari, Italy
Maria Grazia Grasso Multiple Sclerosis Unit, IRCCS S. Lucia Foundation, Rome, Italy
Alessandra Lugaresi Dipartimento di Scienze Biomediche e Neuromotorie, Università di Bologna, Bologna, Italy; IRCCS Istituto delle Scienze Neurologiche di Bologna, Bologna, Italy
Elisa Ferriani UOC Psicologia Ospedaliera, AUSL di Bologna, Bologna, Italy
Ugo Nocentini Department of Clinical Sciences and Translational Medicine, University of Rome "Tor Vergata", Rome, Italy; Behavioral Neuropsychology Laboratory, IRCCS S. Lucia Foundation, Rome, Italy
Mauro Zaffaroni Neurologia ad indirizzo Neuroimmunologico - Centro Sclerosi Multipla, Ospedale di Gallarate - ASST della Valle Olona, Gallarate, Italy
Alysha De Livera Mathematics and Statistics, La Trobe University, Melbourne, Australia; Neuroepidemiology Unit, Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Australia
George Jelinek Neuroepidemiology Unit, Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Australia
Alessandra Solari Unit of Neuroepidemiology, Fondazione IRCCS Istituto Neurologico Carlo Besta, Via Celoria 11, Milan, 20133, Italy
Rosalba Rosato Department of Psychology, University of Turin, Turin, Italy

Schurr T, Loth F, Lidington E, Piccinin C, Arraras JI, Groenvold M, Holzner B, van Leeuwen M, Petersen MA, Schmidt H, Young T, Giesinger JM. Patient-reported outcome measures for physical function in cancer patients: content comparison of the EORTC CAT Core, EORTC QLQ-C30, SF-36, FACT-G, and PROMIS measures using the International Classification of Functioning, Disability and Health. BMC Med Res Methodol 2023;23:21. [PMID: 36681808 PMCID: PMC9862545 DOI: 10.1186/s12874-022-01826-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 12/20/2022] [Indexed: 01/22/2023] Open

Abstract

BACKGROUND

Patient-reported physical function (PF) is a key endpoint in cancer clinical trials. Using complex statistical methods, common metrics have been developed to compare scores from different patient-reported outcome (PRO) measures, but such methods do not account for possible differences in questionnaire content. Therefore, the aim of our study was a content comparison of frequently used PRO measures for PF in cancer patients.

METHODS

Relying on the framework of the International Classification of Functioning, Disability and Health (ICF) we categorized the item content of the physical domains of the following measures: EORTC CAT Core, EORTC QLQ-C30, SF-36, PROMIS Cancer Item Bank for Physical Function, PROMIS Short Form for Physical Function 20a, and the FACT-G. Item content was linked to ICF categories by two independent reviewers.

RESULTS

The 118 items investigated were assigned to 3 components ('d - Activities and Participation', 'b - Body Functions', and 'e - Environmental Factors') and 11 first-level ICF categories. All PF items of the EORTC measures but one were assigned to the first-level ICF categories 'd4 - Mobility' and 'd5 - Self-care', both within the component 'd - Activities and Participation'. The SF-36 additionally included item content related to 'd9 - Community, social and civic life', and the PROMIS Short Form for Physical Function 20a also included content related to 'd6 - Domestic life'. The PROMIS Cancer Item Bank (v1.1) covered, in addition, two first-level categories within the component 'b - Body Functions'. The FACT-G Physical Well-being scale was found to be the most diverse scale, with item content partly not covered by the ICF framework.

DISCUSSION

Our results provide information about conceptual differences between common PRO measures for the assessment of PF in cancer patients. Our results complement quantitative information on psychometric characteristics of these measures and provide a better understanding of the possibilities of establishing common metrics.

Affiliation(s)

T Schurr Department of Psychiatry, Psychotherapy, Psychosomatics, and Medical Psychology, University Hospital of Psychiatry I, Innsbruck Medical University, Anichstraße 35, A-6020 Innsbruck, Austria
F Loth Professorship for Psychological Diagnostics and Intervention Psychology, Faculty of Philosophy and Education, Catholic University of Eichstätt-Ingolstadt, Ostenstraße 25, 85072 Eichstätt, Germany
E Lidington Cancer Behavioural Science Unit, King’s College London, Guy’s Hospital, St Thomas Street, London, SE1 9RT UK
C Piccinin Quality of Life Department, EORTC, Avenue E. Mounier, 83/11, 1200 Brussels, Belgium
JI Arraras Medical Oncology Department, Hospital Universitario de Navarra, C/Irunlarrea 3, S31008 Pamplona, Spain
M Groenvold Palliative Care Research Unit, Department of Geriatrics and Palliative Medicine GP, Bispebjerg & Frederiksberg Hospital, University of Copenhagen, Copenhagen, Denmark
B Holzner Department of Psychiatry, Psychotherapy, Psychosomatics, and Medical Psychology, University Hospital of Psychiatry II, Innsbruck Medical University, Anichstraße 35, A-6020 Innsbruck, Austria
M van Leeuwen Division of Psychosocial Research & Epidemiology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
MA Petersen Palliative Care Research Unit, Department of Geriatrics and Palliative Medicine GP, Bispebjerg & Frederiksberg Hospital, University of Copenhagen, Copenhagen, Denmark
H Schmidt University Clinic and Outpatient Clinic for Radiotherapy and Institute of Health and Nursing Science, Medical Faculty of Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
T Young Lynda Jackson Macmillan Centre, Mount Vernon Cancer Centre, Rickmansworth Rd, Northwood, HA6 2RN, UK
JM Giesinger Department of Psychiatry, Psychotherapy, Psychosomatics, and Medical Psychology, University Hospital of Psychiatry II, Innsbruck Medical University, Anichstraße 35, A-6020 Innsbruck, Austria

Key considerations to reduce or address respondent burden in patient-reported outcome (PRO) data collection. Nat Commun 2022;13:6026. [PMID: 36224187 PMCID: PMC9556436 DOI: 10.1038/s41467-022-33826-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 05/24/2022] [Accepted: 10/05/2022] [Indexed: 11/30/2022] Open

The patient-reported outcomes measurement information systems (PROMIS®) physical function and its derivative measures in adults: a systematic review of content validity. Qual Life Res 2022;31:3317-3330. [PMID: 35622294 DOI: 10.1007/s11136-022-03151-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/25/2022] [Indexed: 10/18/2022]

Wang C, Weiss DJ, Su S, Suen KY, Basford J, Cheville AL. Multidimensional Computerized Adaptive Testing: A Potential Path Toward the Efficient and Precise Assessment of Applied Cognition, Daily Activity, and Mobility for Hospitalized Patients. Arch Phys Med Rehabil 2022;103:S3-S14. [PMID: 35090886 PMCID: PMC9064883 DOI: 10.1016/j.apmr.2022.01.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/08/2021] [Revised: 12/20/2021] [Accepted: 01/12/2022] [Indexed: 11/17/2022]

Abstract

OBJECTIVE

To develop and evaluate an efficient and precise variable-length functional assessment of applied cognition, daily activity, and mobility to inform mobility preservation and rehabilitation service delivery among hospitalized patients.

DESIGN

A multidimensional item bank tapping into these dimensions was developed, with all items calibrated using a multidimensional graded response model. Items were adaptively selected from the item bank to maximize the test information, and the test ended when a joint stopping rule was satisfied. A simulation study based on the completed instrument, the Functional Assessment in Acute Care Multidimensional Computerized Adaptive Test (FAMCAT), compared its measurement precision and efficiency with those of conventional unidimensional computerized adaptive testing. Precision was measured by the bias and root mean squared error between the estimated and true (i.e., simulated) θ values, whereas efficiency was measured by average test length. Data were collected by an interviewer reading questions from a tablet computer and entering patients' responses.

SETTING

A large Midwestern hospital.

PARTICIPANTS

A total of 4143 patients hospitalized with medical diagnosis and/or surgical complications, with 2060 in the calibration sample and 2083 in the validation cohort.

INTERVENTION

Not applicable.

RESULTS

Among the 2083 patients in the validation sample, FAMCAT administration required an average of 6 (SD=3.11) minutes. Ninety-six percent had their tests terminated by the standard error rule after responding to an average of 22.05 (SD=7.98) items, whereas 15 were terminated by the change in θ rule, with an average test length of 45.27 (SD=11.49). The remaining 76 responded until reaching the maximum test length of 60 items.

CONCLUSIONS

The FAMCAT has the potential to satisfy the need for structured, frequent, and precise assessment of functional domains among hospitalized patients with medical diagnosis and/or surgical complications. The results are promising and may be informative for others who wish to develop similar instruments when concurrent assessment of correlated domains is required.
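The abstract above describes three ways a test could end: a standard error rule, a change in θ rule, and a maximum length of 60 items. The following is a minimal sketch of how such a joint stopping rule might be checked after each item. It is not the FAMCAT implementation: the function name `stopping_reason`, the order in which the rules are checked, and the requirement that every dimension satisfy a rule are assumptions, and the thresholds are left as required parameters because the abstract does not state them.

```python
from typing import Optional, Sequence


def stopping_reason(
    standard_errors: Sequence[float],
    theta_changes: Sequence[float],
    n_administered: int,
    *,
    se_threshold: float,
    change_threshold: float,
    max_items: int = 60,  # maximum test length reported in the abstract
) -> Optional[str]:
    """Return the rule that ends the test, or None if testing continues.

    standard_errors: current SE of each dimension's theta estimate.
    theta_changes: change in each theta estimate since the previous item.
    """
    # Assumption: the SE rule requires every dimension to be precise enough.
    if all(se <= se_threshold for se in standard_errors):
        return "standard_error"
    # Assumption: the change rule requires every estimate to have stabilized.
    if theta_changes and all(abs(d) < change_threshold for d in theta_changes):
        return "theta_change"
    if n_administered >= max_items:
        return "max_length"
    return None


# Hypothetical state after 12 items on a three-dimensional test.
print(stopping_reason([0.30, 0.31, 0.45], [0.05, 0.02, 0.08], 12,
                      se_threshold=0.32, change_threshold=0.01))
```

Returning the name of the rule, rather than a bare boolean, makes it possible to tabulate how many tests ended under each rule, as the RESULTS paragraph does.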

Zheng Y, Cheon H, Katz CM. Using Machine Learning Methods to Develop a Short Tree-Based Adaptive Classification Test: Case Study With a High-Dimensional Item Pool and Imbalanced Data. Appl Psychol Meas 2020;44:499-514. [PMID: 34565931 PMCID: PMC7495791 DOI: 10.1177/0146621620931198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]

Liegl G, Rose M, Knebel F, Stengel A, Buttgereit F, Obbarius A, Fischer HF, Nolte S. Using subdomain-specific item sets affected PROMIS physical function scores differently in cardiology and rheumatology patients. J Clin Epidemiol 2020;127:151-160. [PMID: 32781113 DOI: 10.1016/j.jclinepi.2020.08.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/19/2020] [Revised: 07/22/2020] [Accepted: 08/05/2020] [Indexed: 12/21/2022]

Affiliation(s)

Gregor Liegl Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany.
Matthias Rose Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
Fabian Knebel Clinic for Cardiology and Angiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
Andreas Stengel Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany; Clinic for Rheumatology and Clinical Immunology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany; Department of Psychosomatic Medicine and Psychotherapy, Medical University Hospital Tübingen, Tübingen, Germany
Frank Buttgereit Clinic for Rheumatology and Clinical Immunology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
Alexander Obbarius Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
H Felix Fischer Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
Sandra Nolte Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Berlin, Germany; Population Health Strategic Research Centre, School of Health and Social Development, Deakin University, Burwood, Australia

Mao X, Zhang J, Xin T. Application of Dimension Reduction to CAT Item Selection Under the Bifactor Model. Appl Psychol Meas 2019;43:419-434. [PMID: 31452552 PMCID: PMC6696870 DOI: 10.1177/0146621618813086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]

Geerards D, Klassen AF, Hoogbergen MM, van der Hulst RRWJ, van den Berg L, Pusic AL, Gibbons CJ. Streamlining the Assessment of Patient-Reported Outcomes in Weight Loss and Body Contouring Patients: Applying Computerized Adaptive Testing to the BODY-Q. Plast Reconstr Surg 2019;143:946e-955e. [PMID: 31033817 DOI: 10.1097/prs.0000000000005587] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/20/2022]

Abstract

BACKGROUND

The BODY-Q is a widely used patient-reported outcome measure of surgical outcomes in weight loss and body contouring patients. Reducing the length of the BODY-Q assessment could overcome implementation barriers in busy clinics. A shorter BODY-Q could be achieved by using computerized adaptive testing, a method to shorten and tailor assessments while maintaining reliability and accuracy. In this study, the authors apply computerized adaptive testing to the BODY-Q and assess computerized adaptive testing performance in terms of item reduction and accuracy.

METHODS

Parameters describing the psychometric properties of 138 BODY-Q items (i.e., questions) were derived from the original validation sample (n = 734). The 138 items are arranged into 18 scales reflecting Appearance, Quality of Life, and Experience of Care domains. The authors simulated 1000 administrations of the computerized adaptive testing until a stopping rule, reflecting assessment accuracy of standard error less than 0.55, was met. The authors describe the reduction of assessment length in terms of the mean and range of items administered. The authors assessed accuracy by determining correlation between full test and computerized adaptive testing scores.

RESULTS

The authors ran 54 simulations. Mean item reduction was 36.9 percent (51 items; range, 48 to 138 items). Highest item reduction was achieved for the Experience of Care domain (56.2 percent, 22.5 items). Correlation between full test scores and the BODY-Q computerized adaptive test scores averaged 0.99.

CONCLUSIONS

Substantial item reduction is possible by using BODY-Q computerized adaptive testing. Reduced assessment length using BODY-Q computerized adaptive testing could reduce patient burden while preserving the accuracy of clinical patient-reported outcomes for patients undergoing weight loss and body contouring operations.

Affiliation(s)

Daan Geerards, Anne F Klassen, Maarten M Hoogbergen, René R W J van der Hulst, Lisa van den Berg, Andrea L Pusic, and Chris J Gibbons: Patient-Reported Outcomes, Value & Experience Center, Department of Surgery, Brigham and Women's Hospital; Department of Surgery, Harvard Medical School; Department of Pediatrics, McMaster University; Department of Plastic and Reconstructive Surgery, Catharina Hospital; and Department of Plastic and Reconstructive Surgery, Maastricht University Medical Center

Smits N, van der Ark LA, Conijn JM. Measurement versus prediction in the construction of patient-reported outcome questionnaires: can we have our cake and eat it? Qual Life Res 2018;27:1673-1682. [PMID: 29098607 PMCID: PMC5997739 DOI: 10.1007/s11136-017-1720-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Accepted: 10/12/2017] [Indexed: 02/07/2023]

Abstract

BACKGROUND

Two important goals when using questionnaires are (a) measurement: the questionnaire is constructed to assign numerical values that accurately represent the test taker's attribute, and (b) prediction: the questionnaire is constructed to give an accurate forecast of an external criterion. Construction methods aimed at measurement prescribe that items should be reliable. In practice, this leads to questionnaires with high inter-item correlations. By contrast, construction methods aimed at prediction typically prescribe that items have a high correlation with the criterion and low inter-item correlations. The latter approach has often been said to produce a paradox concerning the relation between reliability and validity [1-3], because it is often assumed that good measurement is a prerequisite of good prediction.

OBJECTIVE

To answer four questions: (1) Why are measurement-based methods suboptimal for questionnaires that are used for prediction? (2) How should one construct a questionnaire that is used for prediction? (3) Do questionnaire-construction methods that optimize measurement and prediction lead to the selection of different items in the questionnaire? (4) Is it possible to construct a questionnaire that can be used for both measurement and prediction?

ILLUSTRATIVE EXAMPLE

An empirical data set consisting of scores of 242 respondents on questionnaire items measuring mental health is used to select items by means of two methods: a method that optimizes the predictive value of the scale (i.e., forecast a clinical diagnosis), and a method that optimizes the reliability of the scale. We show that for the two scales different sets of items are selected and that a scale constructed to meet the one goal does not show optimal performance with reference to the other goal.

DISCUSSION

The answers are as follows: (1) Because measurement-based methods tend to maximize inter-item correlations, which reduces predictive validity. (2) By selecting items that correlate highly with the criterion and weakly with the remaining items. (3) Yes, these methods may lead to different item selections. (4) For a single questionnaire: yes, but it is problematic because reliability cannot be estimated accurately. For a test battery: yes, but it is very costly. Implications for the construction of patient-reported outcome questionnaires are discussed.
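Answer (2) above, choosing items that correlate highly with the criterion and weakly with the other items, can be illustrated with a simple greedy heuristic. This is only a sketch and not the selection method used in the paper: the function `select_for_prediction`, the `penalty` weight, and the toy data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either list is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def select_for_prediction(items, criterion, k, penalty=1.0):
    """Greedily pick k item names, trading validity against redundancy.

    items: dict mapping item name to a list of respondent scores.
    criterion: list of criterion values, one per respondent.
    penalty: hypothetical weight on the mean absolute correlation
             with the items already selected.
    """
    selected = []
    remaining = set(items)

    def score(name):
        validity = abs(pearson(items[name], criterion))
        if not selected:
            return validity
        redundancy = sum(
            abs(pearson(items[name], items[s])) for s in selected
        ) / len(selected)
        return validity - penalty * redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: items "a" and "b" are nearly redundant; "c" adds distinct information.
criterion = [1, 2, 3, 4, 5, 6]
items = {
    "a": [1, 2, 3, 4, 5, 7],
    "b": [1, 2, 3, 4, 5, 8],
    "c": [2, 1, 4, 3, 6, 5],
}
print(select_for_prediction(items, criterion, k=2))
print(select_for_prediction(items, criterion, k=2, penalty=0.0))
```

With the penalty switched off, the heuristic ranks items by criterion correlation alone, which shows how ignoring inter-item redundancy changes the selection.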

Smits N, Paap MCS, Böhnke JR. Some recommendations for developing multidimensional computerized adaptive tests for patient-reported outcomes. Qual Life Res 2018;27:1055-1063. [PMID: 29476312 PMCID: PMC5874279 DOI: 10.1007/s11136-018-1821-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Accepted: 02/21/2018] [Indexed: 10/31/2022]

Abstract

PURPOSE

Multidimensional item response theory and computerized adaptive testing (CAT) are increasingly used in mental health, quality of life (QoL), and patient-reported outcome measurement. Although multidimensional assessment techniques hold promise, they are more challenging to apply than unidimensional ones. The authors comment on minimal standards for developing multidimensional CATs.

METHODS

Prompted by pioneering papers published in QLR, the authors reflect on existing guidance and discussions from different psychometric communities, including guidelines developed for unidimensional CATs in the PROMIS project.

RESULTS

The commentary focuses on two key topics: (1) the design, evaluation, and calibration of multidimensional item banks and (2) how to study the efficiency and precision of a multidimensional item bank. The authors suggest that the development of a carefully designed and calibrated item bank encompasses a construction phase and a psychometric phase. With respect to efficiency and precision, item banks should be large enough to provide adequate precision over the full range of the latent constructs. CAT performance should therefore be studied as a function of the latent constructs and with reference to relevant benchmarks. Solutions are also suggested for simulation studies using real data, which often result in overly optimistic evaluations of an item bank's efficiency and precision.

DISCUSSION

Multidimensional CAT applications are promising but complex statistical assessment tools which necessitate detailed theoretical frameworks and methodological scrutiny when testing their appropriateness for practical applications. The authors advise researchers to evaluate item banks with a broad set of methods, describe their choices in detail, and substantiate their approach for validation.

Michel P, Baumstarck K, Ghattas B, Pelletier J, Loundou A, Boucekine M, Auquier P, Boyer L. A Multidimensional Computerized Adaptive Short-Form Quality of Life Questionnaire Developed and Validated for Multiple Sclerosis: The MusiQoL-MCAT. Medicine (Baltimore) 2016;95:e3068. [PMID: 27057832 PMCID: PMC4998748 DOI: 10.1097/md.0000000000003068] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 01/22/2023] Open

Item exposure control for multidimensional computer adaptive testing under maximum likelihood and expected a posteriori estimation. Behav Res Methods 2015;48:1443-1453. [PMID: 26487053 DOI: 10.3758/s13428-015-0659-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]

Chang HH. Psychometrics behind Computerized Adaptive Testing. Psychometrika 2015;80:1-20. [PMID: 24499939 DOI: 10.1007/s11336-014-9401-5] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 10/27/2013] [Indexed: 05/27/2023]

Michel P, Auquier P, Baumstarck K, Pelletier J, Loundou A, Ghattas B, Boyer L. Development of a cross-cultural item bank for measuring quality of life related to mental health in multiple sclerosis patients. Qual Life Res 2015;24:2261-71. [PMID: 25712324 DOI: 10.1007/s11136-015-0948-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Accepted: 02/17/2015] [Indexed: 11/24/2022]

Abstract

OBJECTIVE

Quality of life (QoL) measurements are considered important outcome measures both for research on multiple sclerosis (MS) and in clinical practice. Computerized adaptive testing (CAT) can improve the precision of measurements made using QoL instruments while reducing the burden of testing on patients. Moreover, a cross-cultural approach is also necessary to guarantee the wide applicability of CAT. The aim of this preliminary study was to develop a calibrated item bank that is available in multiple languages and measures QoL related to mental health by combining one generic (SF-36) and one disease-specific questionnaire (MusiQoL).

METHODS

Patients with MS were enrolled in this international, multicenter, cross-sectional study. The psychometric properties of the item bank were based on classical test and item response theories and approaches, including the evaluation of unidimensionality, item response theory model fitting, and analyses of differential item functioning (DIF). Convergent and discriminant validities of the item bank were examined according to socio-demographic, clinical, and QoL features.

RESULTS

A total of 1992 patients with MS from 15 countries were enrolled to calibrate the 22-item bank developed in this study. The strict monotonicity of the Cronbach's alpha curve, the high eigenvalue ratio estimator (5.50), and the adequate CFA model fit (RMSEA = 0.07 and CFI = 0.95) indicated that a strong assumption of unidimensionality was warranted. The infit mean square statistic ranged from 0.76 to 1.27, indicating satisfactory item fit. DIF analyses revealed no item biases across geographical areas, confirming the cross-cultural equivalence of the item bank. External validity testing revealed that the item bank scores correlated significantly with QoL scores and also showed discriminant validity for socio-demographic and clinical characteristics.

CONCLUSION

This work demonstrated satisfactory psychometric characteristics for a QoL item bank for MS in multiple languages. This work may offer a common measure for the assessment of QoL in different cultural contexts and for international studies conducted on MS.
