1
Giordano A, Testa S, Bassi M, Cilia S, Bertolotto A, Quartuccio ME, Pietrolongo E, Falautano M, Grobberio M, Niccolai C, Allegri B, Viterbo RG, Confalonieri P, Giovannetti AM, Cocco E, Grasso MG, Lugaresi A, Ferriani E, Nocentini U, Zaffaroni M, De Livera A, Jelinek G, Solari A, Rosato R. Applying multidimensional computerized adaptive testing to the MSQOL-54: a simulation study. Health Qual Life Outcomes 2023; 21:61. [PMID: 37357308 DOI: 10.1186/s12955-023-02152-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Accepted: 06/15/2023] [Indexed: 06/27/2023]
Abstract
BACKGROUND The Multiple Sclerosis Quality of Life-54 (MSQOL-54) is one of the most commonly used MS-specific health-related quality of life (HRQOL) measures. It is a multidimensional, MS-specific HRQOL inventory, which includes the generic SF-36 core items, supplemented with 18 MS-targeted items. Availability of an adaptive short version providing immediate item scoring may improve instrument usability and validity. However, multidimensional computerized adaptive testing (MCAT) has not been previously applied to MSQOL-54 items. We thus aimed to apply MCAT to the MSQOL-54 and assess its performance.
METHODS Responses from a large international sample of 3669 MS patients were assessed. We calibrated 52 (of the 54) items using a bifactor graded response model (10 group factors and one general HRQOL factor). Then, eight simulations were run with different termination criteria: standard errors (SE) for the general factor and group factors set to different values, and change in factor estimates from one item to the next set at < 0.01 for both the general and the group factors. Performance of the MCAT was assessed by the number of administered items, root mean square difference (RMSD), and correlation.
RESULTS Eight items were removed due to local dependency. The simulation with SE set to 0.32 (general factor) and no SE thresholds (group factors) provided satisfactory performance: the median number of administered items was 24, RMSD was 0.32, and correlation was 0.94.
CONCLUSIONS Compared to the full-length MSQOL-54, the simulated MCAT required fewer items without losing precision for the general HRQOL factor. Further work is needed to add/integrate/revise MSQOL-54 items in order to make the calibration and MCAT performance efficient also on group factors, so that the MCAT version may be used in clinical practice and research.
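The two agreement metrics named in the abstract, RMSD and correlation, can be illustrated with a minimal sketch. It assumes that both are computed between factor scores from the full item set and scores from the adaptive run, which the abstract does not spell out. The function names and the five score pairs are hypothetical and are not data from the study.

```python
import math

def rmsd(full, cat):
    """Root mean square difference between paired score estimates."""
    if len(full) != len(cat) or not full:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((f - c) ** 2 for f, c in zip(full, cat)) / len(full))

def pearson_r(x, y):
    """Pearson correlation, written out to avoid extra dependencies."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical general-factor estimates for five simulated respondents.
theta_full = [-1.2, -0.4, 0.0, 0.7, 1.5]  # all items administered
theta_cat = [-1.0, -0.5, 0.2, 0.6, 1.3]   # adaptive subset only

print(round(rmsd(theta_full, theta_cat), 3))
print(round(pearson_r(theta_full, theta_cat), 3))
```

A low RMSD together with a high correlation indicates that the shortened test recovers nearly the same scores as the full instrument.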
Affiliation(s)
- Andrea Giordano
  - Unit of Neuroepidemiology, Fondazione IRCCS Istituto Neurologico Carlo Besta, Via Celoria 11, Milan, 20133, Italy
  - Department of Psychology, University of Turin, Turin, Italy
- Silvia Testa
  - Department of Human and Social Sciences, University of Aosta Valley, Aosta, Italy
- Marta Bassi
  - Department of Biomedical and Clinical Sciences, Università di Milano, Milan, Italy
- Sabina Cilia
  - Department of Territorial Activities, Azienda Sanitaria Provinciale, Health District, Catania, Italy
- Antonio Bertolotto
  - Neurology Unit & Regional Referral Multiple Sclerosis Centre (CReSM), University Hospital San Luigi Gonzaga, Orbassano, Italy
- Erika Pietrolongo
  - Department of Neurosciences, Imaging and Clinical Sciences, University G. d'Annunzio, Chieti, Italy
- Monica Falautano
  - Psychological Service - Neurological and Neurological Rehabilitation Units, IRCCS San Raffaele, Milan, Italy
- Monica Grobberio
  - Laboratory of Clinical Neuropsychology, Psychology Unit, ASST Lariana, Como, Italy
- Beatrice Allegri
  - Multiple Sclerosis Center, Neurology Unit, Hospital of Vaio, Fidenza, Italy
- Paolo Confalonieri
  - Multiple Sclerosis Center, Unit of Neuroimmunology and Neuromuscular Diseases, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
- Ambra Mara Giovannetti
  - Unit of Neuroepidemiology, Fondazione IRCCS Istituto Neurologico Carlo Besta, Via Celoria 11, Milan, 20133, Italy
  - Multiple Sclerosis Center, Unit of Neuroimmunology and Neuromuscular Diseases, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
- Eleonora Cocco
  - Department of Medical Science and Public Health, University of Cagliari, Cagliari, Italy
  - Multiple Sclerosis Center, ASL Cagliari, ATS Sardegna, Cagliari, Italy
- Alessandra Lugaresi
  - Dipartimento di Scienze Biomediche e Neuromotorie, Università di Bologna, Bologna, Italy
  - IRCCS Istituto delle Scienze Neurologiche di Bologna, Bologna, Italy
- Elisa Ferriani
  - UOC Psicologia Ospedaliera, AUSL di Bologna, Bologna, Italy
- Ugo Nocentini
  - Department of Clinical Sciences and Translational Medicine, University of Rome "Tor Vergata", Rome, Italy
  - Behavioral Neuropsychology Laboratory, IRCCS S. Lucia Foundation, Rome, Italy
- Mauro Zaffaroni
  - Neurologia ad indirizzo Neuroimmunologico - Centro Sclerosi Multipla, Ospedale di Gallarate - ASST della Valle Olona, Gallarate, Italy
- Alysha De Livera
  - Mathematics and Statistics, La Trobe University, Melbourne, Australia
  - Neuroepidemiology Unit, Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Australia
- George Jelinek
  - Neuroepidemiology Unit, Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Australia
- Alessandra Solari
  - Unit of Neuroepidemiology, Fondazione IRCCS Istituto Neurologico Carlo Besta, Via Celoria 11, Milan, 20133, Italy.
- Rosalba Rosato
  - Department of Psychology, University of Turin, Turin, Italy
2
Development and Calibration of the PREMIUM Item Bank for Measuring Respect and Dignity for Patients with Severe Mental Illness. J Clin Med 2022; 11:jcm11061644. [PMID: 35329970 PMCID: PMC8954414 DOI: 10.3390/jcm11061644] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/07/2021] [Revised: 03/11/2022] [Accepted: 03/14/2022] [Indexed: 11/17/2022]
Abstract
Most patient-reported experience measures (PREMs) are paper-based, leading to a high burden for patients and care providers. The aim of this study was to (1) calibrate an item bank to measure patients’ experience of respect and dignity for adult patients with serious mental illnesses and (2) develop computerized adaptive testing (CAT) to improve the use of this PREM in routine practice. Patients with schizophrenia, bipolar disorder, and major depressive disorder were enrolled in this multicenter and cross-sectional study. Psychometric analyses were based on classical test and item response theories and included evaluations of unidimensionality, local independence, and monotonicity; calibration and evaluation of model fit; analyses of differential item functioning (DIF); testing of external validity; and finally, CAT development. A total of 458 patients participated in the study. Of the 24 items, 2 highly inter-correlated items were deleted. Factor analysis showed that the remaining items met the unidimensional assumption (RMSEA = 0.054, CFI = 0.988, TLI = 0.986). DIF analyses revealed no biases by sex, age, care setting, or diagnosis. External validity testing has generally supported our assumptions. CAT showed satisfactory accuracy and precision. This work provides a more accurate and flexible measure of patients’ experience of respect and dignity than that obtained from standard questionnaires.
3
The Validity of the SQoL-18 in Patients with Bipolar and Depressive Disorders: A Psychometric Study from the PREMIUM Project. J Clin Med 2022; 11:jcm11030743. [PMID: 35160196 PMCID: PMC8836740 DOI: 10.3390/jcm11030743] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/24/2021] [Revised: 01/27/2022] [Accepted: 01/28/2022] [Indexed: 12/14/2022]
Abstract
The S-QoL 18 is a self-administered questionnaire that assesses quality of life (QoL) among individuals with schizophrenia. This study aims to validate the S-QoL 18 in bipolar and depressive disorders for a more widespread use in psychiatric settings. This study was conducted in a non-selected sample of individuals with bipolar and depressive disorders in the day hospital of a regional psychiatric academic hospital. Two hundred and seventy-two stable outpatients with bipolar (n = 73) and recurrent and persistent depressive (n = 199) disorders were recruited over a 12-month period. The S-QoL 18 was tested for construct validity, reliability, and external validity. The eight-factor structure of the S-QoL 18 was confirmed by confirmatory factor analysis (RMSEA = 0.075 (0.064–0.086), CFI = 0.972, TLI = 0.961). Internal consistency and reliability were satisfactory. External validity was confirmed via correlations between S-QoL 18 dimension scores, symptomatology, and functioning. The percentage of missing data for the eight dimensions did not exceed 5%. INFIT statistics ranged from 0.7 to 1.2, ensuring that all items of the scale measured the same QoL concept. In conclusion, the S-QoL 18 appears to be a valid and reliable instrument for measuring QoL in patients with bipolar and depressive disorders. The S-QoL 18 may be used by healthcare professionals in clinical settings to accurately assess QoL in individuals with bipolar and depressive disorders, as well as in schizophrenia.
4
Wang C, Weiss DJ, Su S, Suen KY, Basford J, Cheville AL. Multidimensional Computerized Adaptive Testing: A Potential Path Toward the Efficient and Precise Assessment of Applied Cognition, Daily Activity, and Mobility for Hospitalized Patients. Arch Phys Med Rehabil 2022; 103:S3-S14. [PMID: 35090886 PMCID: PMC9064883 DOI: 10.1016/j.apmr.2022.01.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/08/2021] [Revised: 12/20/2021] [Accepted: 01/12/2022] [Indexed: 11/17/2022]
Abstract
OBJECTIVE To develop and evaluate an efficient and precise variable-length functional assessment of applied cognition, daily activity, and mobility to inform mobility preservation and rehabilitation service delivery among hospitalized patients.
DESIGN A multidimensional item bank tapping into these dimensions was developed, with all items calibrated using a multidimensional graded response model. The items were adaptively selected from the item banks to maximize the test information, and the test ended when a joint stopping rule was satisfied. A simulation study was conducted based on the completed instrument, the Functional Assessment in Acute Care Multidimensional Computerized Adaptive Test (FAMCAT), to compare its measurement precision and efficiency capabilities relative to conventional unidimensional computerized adaptive testing. Precision was measured by the bias and root mean squared error between the estimated and true (ie, simulated) θ estimates, whereas efficiency was measured by average test length. Data were collected by an interviewer reading questions from a tablet computer and entering patients' responses.
SETTING A large Midwestern hospital.
PARTICIPANTS A total of 4143 patients hospitalized with medical diagnosis and/or surgical complications, with 2060 in the calibration sample and 2083 in the validation cohort.
INTERVENTION Not applicable.
RESULTS Among the 2083 patients in the validation sample, FAMCAT administration required an average of 6 (SD=3.11) minutes. Ninety-six percent had their tests terminated by the standard error rule after responding to an average of 22.05 (SD=7.98) items, whereas 15 were terminated by the change in θ rule, with an average test length of 45.27 (SD=11.49). The remaining 76 responded until reaching the maximum test length of 60 items.
CONCLUSIONS The FAMCAT has the potential to satisfy the need for structured, frequent, and precise assessment of functional domains among hospitalized patients with medical diagnosis and/or surgical complications. The results are promising and may be informative for others who wish to develop similar instruments when concurrent assessment of correlated domains is required.
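The joint stopping rule described in this abstract combines three conditions: a standard error rule, a change in θ rule, and a maximum test length. A minimal sketch of how such a rule could be checked after each item follows. Only the 60-item maximum comes from the abstract. The function name, the order in which the rules are checked, the requirement that every dimension meet a threshold, and the default values `se_max=0.3` and `min_change=0.01` are illustrative assumptions, not the FAMCAT specification.

```python
def stop_reason(ses, theta_changes, n_items,
                se_max=0.3, min_change=0.01, max_items=60):
    """Return the rule that ends the test, or None to keep administering items.

    ses            current standard error for each dimension
    theta_changes  change in each theta estimate since the previous item
    n_items        number of items administered so far
    se_max and min_change are hypothetical thresholds; max_items=60 is
    the maximum test length reported in the abstract.
    """
    if all(se <= se_max for se in ses):
        return "standard_error"
    if theta_changes and all(abs(d) < min_change for d in theta_changes):
        return "theta_change"
    if n_items >= max_items:
        return "max_length"
    return None

# Three dimensions: applied cognition, daily activity, mobility.
print(stop_reason([0.25, 0.28, 0.29], [0.05, 0.02, 0.03], n_items=22))
print(stop_reason([0.40, 0.28, 0.29], [0.001, 0.002, 0.0], n_items=45))
print(stop_reason([0.40, 0.50, 0.60], [0.10, 0.10, 0.10], n_items=60))
```

Checking the precision rule first means a test is credited to the standard error rule whenever the precision target has been met, even if another condition holds at the same time.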
Affiliation(s)
- Chun Wang
  - College of Education, University of Washington, Seattle, WA.
- David J Weiss
  - Department of Psychology, University of Minnesota, Minneapolis, MN
- Shiyang Su
  - Department of Psychology, University of Central Florida, Orlando, FL
- King Yiu Suen
  - Department of Psychology, University of Minnesota, Minneapolis, MN
- Jeffrey Basford
  - Department of Physical Medicine and Rehabilitation, Mayo Clinic, Rochester, MN
- Andrea L Cheville
  - Department of Physical Medicine and Rehabilitation, Mayo Clinic, Rochester, MN
5
Applying Computerized Adaptive Testing to the FACE-Q Skin Cancer Module: Individualizing Patient-Reported Outcome Measures in Facial Surgery. Plast Reconstr Surg 2021; 148:863-869. [PMID: 34415858 DOI: 10.1097/prs.0000000000008326] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/25/2022]
Abstract
BACKGROUND Skin cancer is among the most frequently occurring malignancies worldwide, which creates a great need for an effective patient-reported outcome measure. Providing shorter questionnaires reduces patient burden and increases patients' willingness to complete forms. The authors set out to use computerized adaptive testing to reduce the number of items needed to predict results for scales of the FACE-Q Skin Cancer Module, a validated patient-reported outcome measure that measures health-related quality of life and patient satisfaction in facial surgery.
METHODS Computerized adaptive testing generates tailored questionnaires for patients in real time based on their responses to previous questions. The authors used an open-source computerized adaptive testing simulation software to run item responses for the five scales from the FACE-Q Skin Cancer Module (i.e., scar appraisal, satisfaction with facial appearance, appearance-related psychosocial distress, cancer worry, and satisfaction with information about appearance). Each simulation continued to administer items until prespecified levels of precision were met, estimated by standard error. Mean and maximum item reductions between the original fixed-length short forms and the simulated versions were evaluated.
RESULTS The number of questions that patients needed to answer to complete the FACE-Q Skin Oncology Module was reduced from 41 items in the original form to a mean of 23 ± 0.55 items (range, 15 to 29) using the computerized adaptive testing version. Simulated computerized adaptive testing scores maintained a high correlation (0.98 to 0.99) with the score from the fixed-length short forms.
CONCLUSIONS Applying computerized adaptive testing to the FACE-Q Skin Cancer Module can reduce the length of assessment by more than 50 percent, with virtually no loss in precision. It is likely to play a critical role in the implementation in clinical practice.
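The item counts reported in this abstract (41 items in the fixed form; a mean of 23 and a range of 15 to 29 in the adaptive version) can be turned into percentage reductions with simple arithmetic. The sketch below does only that; the helper name `reduction_pct` is hypothetical, and the abstract does not state which of these figures its "more than 50 percent" refers to.

```python
def reduction_pct(original, administered):
    """Percentage of items saved relative to the fixed-length form."""
    return 100.0 * (original - administered) / original

FULL_LENGTH = 41  # items in the original fixed-length forms (from the abstract)

# Mean, shortest, and longest simulated test lengths reported in the abstract.
for label, n_items in [("mean", 23), ("shortest", 15), ("longest", 29)]:
    saved = reduction_pct(FULL_LENGTH, n_items)
    print(f"{label}: {n_items} items, {saved:.1f}% fewer than the full form")
```

With these inputs the reductions are about 43.9% at the mean length, 63.4% for the shortest test, and 29.3% for the longest.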
6
Finkelstein FO, Cimini M, Finkelstein SH, Kliger AS. Computerized adaptive technology for the assessment of HRQOL of PD and CKD patients. Perit Dial Int 2020; 41:509-512. [PMID: 33016231 DOI: 10.1177/0896860820959961] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
This study was designed as a pilot study to see whether electronic patient-reported outcome measures using computer adaptive technology (CAT) could be successfully implemented in clinics caring for chronic kidney disease (CKD) and peritoneal dialysis (PD) patients. The results demonstrate the feasibility of using CAT on an iPad to assess the symptom burden and health-related quality of life of both PD and CKD patients.
Affiliation(s)
- Alan S Kliger
  - Yale University, New Haven, CT, USA
  - Metabolism Associates, New Haven, CT, USA
7
Granziol U, Brancaccio A, Pizziconi G, Spangaro M, Gentili F, Bosia M, Gregori E, Luperini C, Pavan C, Santarelli V, Cavallaro R, Cremonese C, Favaro A, Rossi A, Vidotto G, Spoto A. On the Implementation of Computerized Adaptive Observations for Psychological Assessment. Assessment 2020; 29:225-241. [PMID: 33016093 DOI: 10.1177/1073191120960215] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
The use of observational tools in psychological assessment has decreased in recent years, mainly due to its personnel and time costs, and researchers have not explored methodological innovations like adaptive algorithms in observational assessment. In the present study, we introduce the behavior-driven observation procedure to develop, test, and implement observational adaptive instruments. In Study 1, we use a preexisting observational checklist to evaluate nonverbal behaviors related to psychotic symptoms and to specify the adaptive algorithm's model. We fit the model to observational data collected from 114 participants. The results support the model's goodness of fit. In Study 2, we use the estimated model parameters to calibrate the adaptive procedure and test the algorithm for accuracy and efficiency in adaptively reconstructing 58 nonadaptively collected response patterns. The results show the algorithm's good accuracy and efficiency, with a 40% average reduction in the number of administered items. In Study 3, we used real raters to test the adaptive checklist built with behavior-driven observation. The results indicate adequate intrarater agreement and good consistency of the observed response patterns. In conclusion, the results support the possibility of using behavior-driven observation to create accurate and affordable (in terms of resources) observational assessment tools.
Affiliation(s)
- Marco Spangaro
  - Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
  - School of Medicine, Vita-Salute San Raffaele University
- Marta Bosia
  - Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
  - School of Medicine, Vita-Salute San Raffaele University
- Chiara Pavan
  - Padova University Hospital, University of Padova, Italy
  - Neuroscience Department, University of Padova, Italy
- Roberto Cavallaro
  - Department of Clinical Neurosciences, IRCCS San Raffaele Scientific Institute, Milan, Italy
  - School of Medicine, Vita-Salute San Raffaele University
- Angela Favaro
  - Padova University Hospital, University of Padova, Italy
  - Neuroscience Department, University of Padova, Italy
- Giulio Vidotto
  - Department of General Psychology, University of Padova, Italy
- Andrea Spoto
  - Department of General Psychology, University of Padova, Italy
8
Thomas ML. Advances in applications of item response theory to clinical assessment. Psychol Assess 2019; 31:1442-1455. [PMID: 30869966 DOI: 10.1037/pas0000597] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 12/11/2022]
Abstract
Item response theory (IRT) is moving to the forefront of methodologies used to develop, evaluate, and score clinical measures. Funding agencies and test developers are routinely supporting IRT work, and the theory has become closely tied to technological advances within the field. As a result, familiarity with IRT has grown increasingly relevant to mental health research and practice. But to what end? This article reviews advances in applications of IRT to clinical measurement in an effort to identify tangible improvements that can be attributed to the methodology. Although IRT shares similarities with classical test theory and factor analysis, the approach has certain practical benefits, but also limitations, when applied to measurement challenges. Major opportunities include the use of computerized adaptive tests to prevent conditional measurement error, multidimensional models to prevent misinterpretation of scores, and analyses of differential item functioning to prevent bias. Whereas these methods and technologies were once only discussed as future possibilities, they are now accessible because of recent support of IRT-focused clinical research. Despite this, much work still remains in widely disseminating methods and technologies from IRT into mental health research and practice. Clinicians have been reluctant to fully embrace the approach, especially in terms of prospective test development and adaptive item administration. Widespread use of IRT technologies will require continued cooperation among psychometricians, clinicians, and other stakeholders. There are also many opportunities to expand the methodology, especially with respect to integrating modern measurement theory with models from personality and cognitive psychology as well as neuroscience. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
9
Fernandes S, Fond G, Zendjidjian X, Michel P, Baumstarck K, Lancon C, Berna F, Schurhoff F, Aouizerate B, Henry C, Etain B, Samalin L, Leboyer M, Llorca PM, Coldefy M, Auquier P, Boyer L. The Patient-Reported Experience Measure for Improving qUality of care in Mental health (PREMIUM) project in France: study protocol for the development and implementation strategy. Patient Prefer Adherence 2019; 13:165-177. [PMID: 30718945 PMCID: PMC6345324 DOI: 10.2147/ppa.s172100] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/25/2022]
Abstract
BACKGROUND Measuring the quality and performance of health care is a major challenge in improving the efficiency of a health system. Patient experience is one important measure of the quality of health care, and the use of patient-reported experience measures (PREMs) is recommended. The aims of this project are 1) to develop item banks of PREMs that assess the quality of health care for adult patients with psychiatric disorders (schizophrenia, bipolar disorder, and depression) and to validate computerized adaptive testing (CAT) to support the routine use of PREMs; and 2) to analyze the implementation and acceptability of the CAT among patients, professionals, and health authorities.
METHODS This multicenter and cross-sectional study is based on a mixed method approach, integrating qualitative and quantitative methodologies in two main phases: 1) item bank and CAT development based on a standardized procedure, including conceptual work and definition of the domain mapping, item selection, calibration of the item bank and CAT simulations to elaborate the administration algorithm, and CAT validation; and 2) a qualitative study exploring the implementation and acceptability of the CAT among patients, professionals, and health authorities.
DISCUSSION The development of a set of PREMs on quality of care in mental health that overcomes the limitations of previous works (ie, allowing national comparisons regardless of the characteristics of patients and care and based on modern testing using item banks and CAT) could help health care professionals and health system policymakers to identify strategies to improve the quality and efficiency of mental health care.
TRIAL REGISTRATION NCT02491866.
Affiliation(s)
- Sara Fernandes
  - Aix-Marseille University, School of Medicine, CEReSS - Health Service Research and Quality of Life Center - EA 3279 Research Unit, Marseille, France
- Guillaume Fond
  - Aix-Marseille University, School of Medicine, CEReSS - Health Service Research and Quality of Life Center - EA 3279 Research Unit, Marseille, France
- Xavier Zendjidjian
  - Aix-Marseille University, School of Medicine, CEReSS - Health Service Research and Quality of Life Center - EA 3279 Research Unit, Marseille, France
- Pierre Michel
  - Aix-Marseille University, School of Medicine, CEReSS - Health Service Research and Quality of Life Center - EA 3279 Research Unit, Marseille, France
- Karine Baumstarck
  - Aix-Marseille University, School of Medicine, CEReSS - Health Service Research and Quality of Life Center - EA 3279 Research Unit, Marseille, France
- Christophe Lancon
  - Aix-Marseille University, School of Medicine, CEReSS - Health Service Research and Quality of Life Center - EA 3279 Research Unit, Marseille, France
- Magali Coldefy
  - Institute for Research and Information in Health Economics (IRDES), Paris, France
- Pascal Auquier
  - Aix-Marseille University, School of Medicine, CEReSS - Health Service Research and Quality of Life Center - EA 3279 Research Unit, Marseille, France
- Laurent Boyer
  - Aix-Marseille University, School of Medicine, CEReSS - Health Service Research and Quality of Life Center - EA 3279 Research Unit, Marseille, France
10
Smits N, Paap MCS, Böhnke JR. Some recommendations for developing multidimensional computerized adaptive tests for patient-reported outcomes. Qual Life Res 2018; 27:1055-1063. [PMID: 29476312 PMCID: PMC5874279 DOI: 10.1007/s11136-018-1821-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Accepted: 02/21/2018] [Indexed: 10/31/2022]
Abstract
PURPOSE Multidimensional item response theory and computerized adaptive testing (CAT) are increasingly used in mental health, quality of life (QoL), and patient-reported outcome measurement. Although multidimensional assessment techniques hold promise, they are more challenging in their application than unidimensional ones. The authors comment on minimal standards when developing multidimensional CATs.
METHODS Prompted by pioneering papers published in QLR, the authors reflect on existing guidance and discussions from different psychometric communities, including guidelines developed for unidimensional CATs in the PROMIS project.
RESULTS The commentary focuses on two key topics: (1) the design, evaluation, and calibration of multidimensional item banks and (2) how to study the efficiency and precision of a multidimensional item bank. The authors suggest that the development of a carefully designed and calibrated item bank encompasses a construction phase and a psychometric phase. With respect to efficiency and precision, item banks should be large enough to provide adequate precision over the full range of the latent constructs. Therefore, CAT performance should be studied as a function of the latent constructs and with reference to relevant benchmarks. Solutions are also suggested for simulation studies using real data, which often result in overly optimistic evaluations of an item bank's efficiency and precision.
DISCUSSION Multidimensional CAT applications are promising but complex statistical assessment tools which necessitate detailed theoretical frameworks and methodological scrutiny when testing their appropriateness for practical applications. The authors advise researchers to evaluate item banks with a broad set of methods, describe their choices in detail, and substantiate their approach for validation.
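The recommendation to study CAT performance as a function of the latent construct can be illustrated with a small sketch that reports root mean squared error within bins of the true θ instead of a single overall figure. This is one possible reading of the recommendation, not a procedure taken from the commentary. The function name, the bin width, and the seven simulated respondents are hypothetical.

```python
import math
from collections import defaultdict

def rmse_by_bin(true_theta, est_theta, width=1.0):
    """Group respondents by true theta (bins of `width`) and return RMSE per bin.

    Keys of the returned dict are the lower edges of the bins, in ascending order.
    """
    sq_err = defaultdict(list)
    for t, e in zip(true_theta, est_theta):
        lower = math.floor(t / width) * width
        sq_err[lower].append((e - t) ** 2)
    return {lo: math.sqrt(sum(v) / len(v)) for lo, v in sorted(sq_err.items())}

# Hypothetical simulated respondents: estimates are worse at the extremes,
# as might happen when an item bank has few items targeting those levels.
true_theta = [-2.5, -2.1, -0.4, 0.3, 0.6, 2.2, 2.8]
est_theta = [-1.9, -1.6, -0.5, 0.2, 0.7, 1.7, 2.0]

for lower, value in rmse_by_bin(true_theta, est_theta).items():
    print(f"[{lower:+.1f}, {lower + 1.0:+.1f}): RMSE = {value:.2f}")
```

An overall RMSE would average over these bins and could hide poor precision at the extremes of the scale, which is the situation the conditional view is meant to expose.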
Affiliation(s)
- Niels Smits
  - Research Institute of Child Development and Education, University of Amsterdam, Nieuwe Achtergracht 127, 1018 WS, Amsterdam, The Netherlands.
- Muirne C S Paap
  - Department of Special Needs, Education, and Youth Care, Faculty of Behavioural and Social Sciences, University of Groningen, Groningen, The Netherlands
- Jan R Böhnke
  - Dundee Centre for Health and Related Research, School of Nursing and Health Sciences, University of Dundee, Dundee, UK