1
Hannachi T, Yakimova S, Somat A. A Follow up on the Continuum Theory of Eco-Anxiety: Analysis of the Climate Change Anxiety Scale Using Item Response Theory among French Speaking Population. Int J Environ Res Public Health 2024; 21:1158. [PMID: 39338041; PMCID: PMC11431234; DOI: 10.3390/ijerph21091158]
Abstract
The mental health impact of the environmental crisis, particularly eco-anxiety, is a growing research topic whose measurement still lacks consensus. This study uses item response theory (IRT) to gain a deeper understanding of the constructs measured by existing questionnaires. We applied the graded response model, via the mirt package in R, to open-access data from the short French version of the Climate Change Anxiety Questionnaire, which measures cognitive-emotional impairment and functional impairment. The models tested in this study are the one-, two-, and three-factor models and the bifactor model. After model selection, the psychometric properties of the selected model were examined. Our results suggest that the unidimensional model is the most appropriate for measuring eco-anxiety. The item difficulty parameters extracted from the IRT analysis enabled us to discuss the severity levels of the items comprising this tool. The Climate Change Anxiety Questionnaire appears more appropriate for measuring moderate to severe eco-anxiety. Avenues for improving this questionnaire, and the measurement of eco-anxiety in general, are then discussed.
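A minimal sketch of the model comparison described, using the mirt package; the data frame ccas and the item-to-specific-factor assignment below are hypothetical stand-ins for the open-access CCAS data:

```r
# A minimal sketch, assuming 'ccas' is a data frame of ordinal CCAS items,
# with the first half assigned to cognitive-emotional impairment and the
# second half to functional impairment (assignment illustrative).
library(mirt)

m1 <- mirt(ccas, 1, itemtype = "graded")   # unidimensional GRM
m2 <- mirt(ccas, 2, itemtype = "graded")   # exploratory two-factor GRM
spec <- rep(1:2, each = ncol(ccas) / 2)    # items -> specific factors
mb <- bfactor(ccas, spec)                  # bifactor GRM

anova(m1, m2)                              # AIC/BIC and likelihood ratio
coef(m1, IRTpars = TRUE, simplify = TRUE)  # item difficulty/discrimination
```

The coef() call with IRTpars = TRUE returns the difficulty parameters that the abstract uses to discuss item severity levels.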
Affiliation(s)
- Taha Hannachi
- Laboratory of Psychology, Cognition, Behaviour, Communication (LP3C), Department of Psychology, Faculty of Human Science, Rennes 2 University, 35000 Rennes, France
- Sonya Yakimova
- Laboratory of Psychology, Cognition, Behaviour, Communication (LP3C), Department of Psychology, Faculty of Human Science, Rennes 2 University, 35000 Rennes, France
- Alain Somat
- Laboratory of Psychology, Cognition, Behaviour, Communication (LP3C), Department of Psychology, Faculty of Human Science, Rennes 2 University, 35000 Rennes, France
2
Beisemann M, Forthmann B, Doebler P. Understanding Ability and Reliability Differences Measured with Count Items: The Distributional Regression Test Model and the Count Latent Regression Model. Multivariate Behav Res 2024; 59:502-522. [PMID: 38348679; DOI: 10.1080/00273171.2023.2288577]
Abstract
In psychology and education, tests (e.g., reading tests) and self-reports (e.g., clinical questionnaires) generate counts, but the corresponding item response theory (IRT) methods are underdeveloped compared with those for binary data. Recent advances include the Two-Parameter Conway-Maxwell-Poisson Model (2PCMPM), which generalizes Rasch's Poisson Counts Model with item-specific difficulty, discrimination, and dispersion parameters. Explaining differences in model parameters informs item construction and selection but has received little attention. We introduce two 2PCMPM-based explanatory count IRT models: the Distributional Regression Test Model for item covariates, and the Count Latent Regression Model for (categorical) person covariates. Estimation methods are provided, and satisfactory statistical properties are observed in simulations. Two examples illustrate how the models help in understanding tests and their underlying constructs.
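The Conway-Maxwell-Poisson (CMP) distribution at the core of these models can be illustrated in a few lines of R. This is a generic textbook sketch of the rate-parameterized CMP probability mass function, not code from the paper (which uses a mean parameterization and dedicated estimation routines):

```r
# Truncated-series CMP pmf: nu = 1 recovers the Poisson; nu < 1 gives
# overdispersion and nu > 1 underdispersion relative to the Poisson.
dcmp <- function(x, lambda, nu, max_j = 200) {
  j <- 0:max_j
  Z <- sum(exp(j * log(lambda) - nu * lfactorial(j)))  # normalizing constant
  exp(x * log(lambda) - nu * lfactorial(x)) / Z
}

round(dcmp(0:5, lambda = 2, nu = 1), 4)    # matches dpois(0:5, 2)
round(dcmp(0:5, lambda = 2, nu = 0.5), 4)  # heavier tail: overdispersed
```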
3
Zimmer F, Draxler C, Debelak R. Power Analysis for the Wald, LR, Score, and Gradient Tests in a Marginal Maximum Likelihood Framework: Applications in IRT. Psychometrika 2023; 88:1249-1298. [PMID: 36029390; PMCID: PMC10656348; DOI: 10.1007/s11336-022-09883-5]
Abstract
The Wald, likelihood ratio, score, and the recently proposed gradient statistics can be used to assess a broad range of hypotheses in item response theory models, for instance, to check overall model fit or to detect differential item functioning. We introduce new methods for power analysis and sample size planning that can be applied when marginal maximum likelihood estimation is used. This allows application to a variety of IRT models commonly used in practice, e.g., in large-scale educational assessments. An analytical method uses the asymptotic distributions of the statistics under alternative hypotheses. We also provide a sampling-based approach for applications where the analytical approach is computationally infeasible, which can be the case with 20 or more items, since the computational load increases exponentially with the number of items. We performed extensive simulation studies in three practically relevant settings: testing a Rasch model against a 2PL model, testing for differential item functioning, and testing a partial credit model against a generalized partial credit model. The observed distributions of the test statistics and the power of the tests agreed well with the predictions of the proposed methods in sufficiently large samples. We provide an openly accessible R package that implements the methods for user-supplied hypotheses.
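The sampling-based idea can be sketched generically with mirt; the snippet below is an illustrative reimplementation of the first simulation setting (Rasch vs. 2PL), not the authors' package, and all parameter values are arbitrary:

```r
# Hedged sketch: simulate from a 2PL alternative, fit Rasch and 2PL,
# and record likelihood-ratio p-values to estimate power empirically.
library(mirt)

set.seed(1)
n_items <- 10; n_persons <- 500; n_reps <- 100
a <- matrix(rlnorm(n_items, 0, 0.3))  # unequal slopes: 2PL is true
d <- matrix(rnorm(n_items))

pvals <- replicate(n_reps, {
  dat   <- simdata(a, d, n_persons, itemtype = "dich")
  rasch <- mirt(dat, 1, itemtype = "Rasch", verbose = FALSE)
  twopl <- mirt(dat, 1, itemtype = "2PL",   verbose = FALSE)
  anova(rasch, twopl)$p[2]            # LR test of Rasch against 2PL
})
mean(pvals < .05)                     # empirical power estimate
```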
Affiliation(s)
- Clemens Draxler
- The Health and Life Sciences University, Hall in Tirol, Austria
4
Paek I, Lin Z, Chalmers RP. Investigating Confidence Intervals of Item Parameters When Some Item Parameters Take Priors in the 2PL and 3PL Models. Educ Psychol Meas 2023; 83:375-400. [PMID: 36866071; PMCID: PMC9972130; DOI: 10.1177/00131644221096431]
Abstract
To reduce the chance of Heywood cases or nonconvergence when estimating the 2PL or 3PL model with marginal maximum likelihood via expectation-maximization (MML-EM), priors can be placed on the item slope parameter in the 2PL model or on the pseudo-guessing parameter in the 3PL model, yielding marginal maximum a posteriori (MMAP) estimates and posterior standard errors (PSEs). Confidence intervals (CIs) for these parameters, and for parameters that took no priors, were investigated under popular prior distributions, different error covariance estimation methods, test lengths, and sample sizes. A seemingly paradoxical result emerged: when priors were used, the error covariance estimation methods known to perform better in the literature (the Louis and Oakes methods in this study) did not yield the best CI performance, whereas the cross-product method, which tends to overestimate standard errors, exhibited better CI performance. Other important findings for CI performance are also discussed.
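A sketch of this estimation setup in mirt, assuming 'resp' holds dichotomous responses to 20 items; the prior below (a normal prior on the logit of the pseudo-guessing parameter) and its hyperparameters are illustrative, not the article's settings:

```r
# Hedged sketch: 3PL estimation with a prior on g, comparing two of the
# error covariance methods discussed (Oakes vs. cross-product).
library(mirt)

spec <- mirt.model('
  F = 1-20
  PRIOR = (1-20, g, norm, -1.1, 2)
')

fit_oakes <- mirt(resp, spec, itemtype = "3PL", SE = TRUE, SE.type = "Oakes")
fit_xprod <- mirt(resp, spec, itemtype = "3PL", SE = TRUE, SE.type = "crossprod")

coef(fit_oakes, printSE = TRUE)  # MMAP estimates with posterior SEs
```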
Affiliation(s)
- Insu Paek
- Florida State University, Tallahassee, FL, USA
5
Beisemann M. A flexible approach to modelling over-, under- and equidispersed count data in IRT: The Two-Parameter Conway-Maxwell-Poisson Model. Br J Math Stat Psychol 2022; 75:411-443. [PMID: 35678959; DOI: 10.1111/bmsp.12273]
Abstract
Several psychometric tests and self-reports generate count data (e.g., divergent thinking tasks). The most prominent count data item response theory model, the Rasch Poisson Counts Model (RPCM), is limited in applicability by two restrictive assumptions: equal item discriminations and equidispersion (conditional mean equal to conditional variance). Violations of these assumptions lead to impaired reliability and standard error estimates. Previous work generalized the RPCM but maintained some limitations. The two-parameter Poisson counts model allows for varying discriminations but retains the equidispersion assumption. The Conway-Maxwell-Poisson Counts Model allows for modelling over- and underdispersion (conditional mean less than and greater than conditional variance, respectively) but still assumes constant discriminations. The present work introduces the Two-Parameter Conway-Maxwell-Poisson (2PCMP) model which generalizes these three models to allow for varying discriminations and dispersions within one model, helping to better accommodate data from count data tests and self-reports. A marginal maximum likelihood method based on the EM algorithm is derived. An implementation of the 2PCMP model in R and C++ is provided. Two simulation studies examine the model's statistical properties and compare the 2PCMP model to established models. Data from divergent thinking tasks are reanalysed with the 2PCMP model to illustrate the model's flexibility and ability to test assumptions of special cases.
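For orientation, the 2PCMP measurement model can be written in a simplified rate parameterization (the article itself works with a mean parameterization, so the exact form below is an assumption):

```latex
% Sketch of the 2PCMP model: \alpha_i, \delta_i, \nu_i are item
% discrimination, location, and dispersion (notation assumed here).
P(X_{ij} = x \mid \theta_j)
  = \frac{\lambda_{ij}^{\,x}}{(x!)^{\nu_i}\, Z(\lambda_{ij}, \nu_i)},
\qquad
\log \lambda_{ij} = \alpha_i \theta_j + \delta_i,
\qquad
Z(\lambda, \nu) = \sum_{k=0}^{\infty} \frac{\lambda^{k}}{(k!)^{\nu}} .
```

Setting all dispersions to one recovers a two-parameter Poisson counts model, and constraining the discriminations to be equal recovers a CMP counterpart of the RPCM, which is how the 2PCMP model nests its special cases.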
6
Abstract
In item response theory, uncertainty associated with estimated item parameters can lead to greater uncertainty in subsequent analyses, such as estimating trait scores for individual examinees. Most existing methods to characterize or correct for item parameter uncertainty implicitly assume that the latent trait continuum is fixed across the posterior distribution of item parameters. However, the latent trait continuum can also be understood as an artifact of the fitted model, such that the location of this continuum is determined with error. In other words, item parameter estimation error implies uncertainty about the location of the metric. This article uses Ramsay's (1996) geometry of the latent trait metric to develop a quantitative measure of metric stability, that is, the sampling variability of the latent trait continuum implied by errors in item parameter estimation. Through a series of illustrations, it is clarified how metric stability is related to other item response model evaluation outcomes (e.g., test information, model fit), and how metric stability can be useful in identifying well-determined regions of the latent trait continuum, making sample size recommendations, and selecting a model. Overall, the proposed measure of metric stability provides meaningful and highly interpretable information to aid in item response model evaluation.
7
Williams ZJ, Gotham KO. Improving the measurement of alexithymia in autistic adults: a psychometric investigation of the 20-item Toronto Alexithymia Scale and generation of a general alexithymia factor score using item response theory. Mol Autism 2021; 12:56. [PMID: 34376227; PMCID: PMC8353782; DOI: 10.1186/s13229-021-00463-5]
Abstract
BACKGROUND: Alexithymia, a personality trait characterized by difficulties interpreting emotional states, is commonly elevated in autistic adults, and a growing body of literature suggests that this trait underlies several cognitive and emotional differences previously attributed to autism. Although questionnaires such as the 20-item Toronto Alexithymia Scale (TAS-20) are frequently used to measure alexithymia in the autistic population, few studies have investigated the psychometric properties of these questionnaires in autistic adults, including whether differential item functioning (I-DIF) exists between autistic and general population adults.
METHODS: This study is a revised version of a previous article that was retracted due to copyright concerns (Williams and Gotham in Mol Autism 12:1-40). We conducted an in-depth psychometric analysis of the TAS-20 in a large sample of 743 cognitively able autistic adults recruited from the Simons Foundation SPARK participant pool and 721 general population controls enrolled in a large international psychological study. The factor structure of the TAS-20 was examined using confirmatory factor analysis, and item response theory was used to generate a subset of the items that were strong indicators of a "general alexithymia" factor. Correlations between alexithymia and other clinical outcomes were used to assess the nomological validity of the new alexithymia score in the SPARK sample.
RESULTS: The TAS-20 did not exhibit adequate model fit in either the autistic or general population samples. Empirically driven item reduction was undertaken, resulting in an 8-item general alexithymia factor score (GAFS-8, with "TAS" no longer referenced due to copyright) with sound psychometric properties and practically ignorable I-DIF between diagnostic groups. Correlational analyses indicated that GAFS-8 scores, as derived from the TAS-20, meaningfully predict autistic trait levels, repetitive behaviors, and depression symptoms, even after controlling for trait neuroticism. The GAFS-8 also presented no meaningful decrement in nomological validity over the full TAS-20 in autistic participants.
LIMITATIONS: Limitations of the current study include a sample of autistic adults that was majority female, later diagnosed, and well educated; clinical and control groups drawn from different studies with variable measures; only 16 of the TAS-20 items being administered to the non-autistic sample; and an inability to test several other important psychometric characteristics of the GAFS-8, including sensitivity to change and I-DIF across multiple administrations.
CONCLUSIONS: These results indicate the potential of the GAFS-8 to robustly measure alexithymia in both autistic and non-autistic adults. A free online score calculator has been created to facilitate the use of norm-referenced GAFS-8 latent trait scores in research applications (available at https://asdmeasures.shinyapps.io/alexithymia).
Affiliation(s)
- Zachary J. Williams
- Medical Scientist Training Program, Vanderbilt University School of Medicine, 1215 21st Avenue South, Medical Center East, Room 8310, Nashville, TN 37232, USA
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Frist Center for Autism and Innovation, Vanderbilt University, Nashville, TN, USA
8
Williams ZJ, Gotham KO. Assessing general and autism-relevant quality of life in autistic adults: A psychometric investigation using item response theory. Autism Res 2021; 14:1633-1644. [PMID: 33876550; PMCID: PMC8647037; DOI: 10.1002/aur.2519]
Abstract
Although many interventions and services for autistic people have the ultimate goal of improving quality of life (QoL), there is relatively little research on how best to assess this construct in the autistic population, and existing scales designed for non-autistic individuals may not assess all meaningful facets of QoL in the autistic population. To address this need, the autism spectrum QoL form (ASQoL) was recently developed as a measure of autism-relevant quality of life. However, the psychometrics of the ASQoL have not been examined beyond the authors' initial validation study, and important properties such as measurement invariance/differential item functioning (DIF) have not yet been tested. Using data from 700 autistic adults recruited from the Simons Foundation's SPARK cohort, the current study sought to perform a comprehensive independent psychometric evaluation of the ASQoL using item response theory, comparing its performance to a newly proposed brief measure of general QoL (the WHOQOL-4). Our models revealed substantial DIF by sex and gender in the ASQoL, which caused ASQoL scores to grossly underestimate the self-reported QoL of autistic women. Based on a comparison of latent variable means, we demonstrated that observed sex/gender differences in manifest ASQoL scores were the result of statistical artifacts, a claim that was further supported by the lack of significant group differences on the sex/gender-invariant WHOQOL-4. Our findings indicate that the ASQoL composite score is psychometrically problematic in its current form, and substantial revisions may be necessary before valid and meaningful inferences can be made regarding autism-relevant aspects of QoL.
LAY SUMMARY: Quality of life (QoL) is an extremely important outcome for autistic people, but many of the tools used to measure it do not take into account how QoL may be different for autistic people. Using data from 700 autistic adults, we examined the measurement properties of the autism spectrum quality of life form (ASQoL), a new measure of QoL designed specifically for autistic people. Our results indicate that the ASQoL shows a pronounced sex/gender bias, which causes it to underestimate QoL in autistic women. This bias needs to be eliminated before the ASQoL can be successfully used to measure QoL in the autistic population.
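A sketch of this kind of DIF analysis using mirt's multiple-group machinery; the data frame asqol, the grouping vector sex_gender, the anchor strategy, and the tested parameter names are all illustrative:

```r
# Hedged sketch: multiple-group graded response model with all items
# anchored, then test items one at a time for sex/gender DIF.
library(mirt)

mg <- multipleGroup(asqol, 1, group = sex_gender, itemtype = "graded",
                    invariance = c("free_means", "free_var",
                                   colnames(asqol)))  # all items anchored
DIF(mg, which.par = c("a1", "d1", "d2"),  # slope + intercepts; names
    scheme = "drop")                      # depend on category counts
```

Comparing the freed group means from such a model against manifest score differences is one way to separate true latent differences from the statistical artifacts the abstract describes.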
Affiliation(s)
- Zachary J. Williams
- Medical Scientist Training Program, Vanderbilt University School of Medicine, Nashville, TN, USA
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Frist Center for Autism and Innovation, Vanderbilt University, Nashville, TN, USA
9
Zhang Z. Asymptotic Standard Errors of Generalized Partial Credit Model True Score Equating Using Characteristic Curve Methods. Appl Psychol Meas 2021; 45:331-345. [PMID: 34565939; PMCID: PMC8361376; DOI: 10.1177/01466216211013101]
Abstract
In this study, the delta method was applied to estimate the standard errors of true score equating when using the characteristic curve methods with the generalized partial credit model, in the context of the common-item nonequivalent groups equating design. Simulation studies were conducted to compare the performance of the delta method with that of the bootstrap method and the multiple imputation method. The results indicated that the standard errors produced by the delta method were very close to the criterion empirical standard errors, as well as to those yielded by the bootstrap and multiple imputation methods, under all manipulated conditions.
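The delta method underlying the derived standard errors takes the familiar general form (a standard result, stated here for orientation rather than quoted from the paper):

```latex
% For an equating function f of the estimated item parameters
% \hat{\beta} with asymptotic covariance matrix \Sigma:
\operatorname{Var}\!\left( f(\hat{\beta}) \right) \approx
  \nabla f(\hat{\beta})^{\top} \, \Sigma \, \nabla f(\hat{\beta}) .
```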
Affiliation(s)
- Zhonghua Zhang
- The University of Melbourne, Carlton, Victoria, Australia
10
Williams ZJ, Gotham KO. Improving the measurement of alexithymia in autistic adults: a psychometric investigation and refinement of the twenty-item Toronto Alexithymia Scale. Mol Autism 2021; 12:20. [PMID: 33653400; PMCID: PMC7971146; DOI: 10.1186/s13229-021-00427-9]
Abstract
BACKGROUND: Alexithymia, a personality trait characterized by difficulties interpreting one's own emotional states, is commonly elevated in autistic adults, and a growing body of literature suggests that this trait underlies a number of cognitive and emotional differences previously attributed to autism, such as difficulties in facial emotion recognition and reduced empathy. Although questionnaires such as the twenty-item Toronto Alexithymia Scale (TAS-20) are frequently used to measure alexithymia in the autistic population, few studies have attempted to determine the psychometric properties of these questionnaires in autistic adults, including whether differential item functioning (I-DIF) exists between autistic and general population adults.
METHODS: We conducted an in-depth psychometric analysis of the TAS-20 in a large sample of 743 verbal autistic adults recruited from the Simons Foundation SPARK participant pool and 721 general population controls enrolled in a large international psychological study (the Human Penguin Project). The factor structure of the TAS-20 was examined using confirmatory factor analysis, and item response theory was used to further refine the scale based on local model misfit and I-DIF between the groups. Correlations between alexithymia and other clinical outcomes such as autistic traits, anxiety, and quality-of-life were used to assess the nomological validity of the revised alexithymia scale in the SPARK sample.
RESULTS: The TAS-20 did not exhibit adequate global model fit in either the autistic or general population samples. Empirically driven item reduction was undertaken, resulting in an eight-item unidimensional scale (TAS-8) with sound psychometric properties and practically ignorable I-DIF between diagnostic groups. Correlational analyses indicated that TAS-8 scores meaningfully predict autistic trait levels, anxiety and depression symptoms, and quality of life, even after controlling for trait neuroticism.
LIMITATIONS: Limitations of the current study include a sample of autistic adults that was overwhelmingly female, later-diagnosed, and well-educated; clinical and control groups drawn from different studies with variable measures; and an inability to test several other important psychometric characteristics of the TAS-8, including sensitivity to change and I-DIF across multiple administrations.
CONCLUSIONS: These results indicate the potential of the TAS-8 as a psychometrically robust tool to measure alexithymia in both autistic and non-autistic adults. A free online score calculator has been created to facilitate the use of norm-referenced TAS-8 latent trait scores in research applications (available at http://asdmeasures.shinyapps.io/TAS8_Score).
Affiliation(s)
- Zachary J. Williams
- Medical Scientist Training Program, Vanderbilt University School of Medicine, Nashville, TN, USA
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, 1215 21st Avenue South, Medical Center East, Room 8310, Nashville, TN 37232, USA
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Frist Center for Autism and Innovation, Vanderbilt University, Nashville, TN, USA
11
Zhang Z. Asymptotic Standard Errors of Parameter Scale Transformation Coefficients in Test Equating Under the Nominal Response Model. Appl Psychol Meas 2021; 45:134-138. [PMID: 33627919; PMCID: PMC7876637; DOI: 10.1177/0146621620965740]
Abstract
Researchers have developed a characteristic curve procedure to estimate the parameter scale transformation coefficients in test equating under the nominal response model. In this study, the delta method was applied to derive standard error expressions for the estimates of the parameter scale transformation coefficients. This brief report presents the results of a simulation study that examined the accuracy of the derived formulas and compared the performance of this analytical method with that of the multiple imputation method. The results indicated that the standard errors produced by the delta method were very close to the criterion standard errors, as well as to those yielded by the multiple imputation method, under all simulation conditions.
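For orientation, the scale transformation in question is the standard linear one, derivable directly from the model's category logits:

```latex
% With \theta^* = A\theta + B, the nominal response model's category
% logits z_{jk}(\theta) = a_{jk}\theta + c_{jk} are preserved when
a_{jk}^{*} = \frac{a_{jk}}{A},
\qquad
c_{jk}^{*} = c_{jk} - \frac{B}{A}\, a_{jk} .
```

The delta method then propagates the sampling uncertainty in the estimated coefficients (A, B) into standard errors.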
12
Liu CW, Chalmers RP. A note on computing Louis' observed information matrix identity for IRT and cognitive diagnostic models. Br J Math Stat Psychol 2021; 74:118-138. [PMID: 32757460; DOI: 10.1111/bmsp.12207]
Abstract
Using Louis' formula, it is possible to obtain the observed information matrix and the corresponding large-sample standard error estimates after the expectation-maximization (EM) algorithm has converged. However, Louis' formula is commonly de-emphasized because of its relatively complex integral representation, particularly when studying latent variable models. This paper provides a holistic overview demonstrating how Louis' formula can be applied efficiently to item response theory (IRT) models and other popular latent variable models, such as cognitive diagnostic models (CDMs). After presenting the algebraic components required for Louis' formula, two real data analyses with accompanying numerical illustrations are given. A Monte Carlo simulation then compares the computational efficiency of Louis' formula with that of previously existing methods. The results suggest that Louis' formula should be adopted as a standard method for computing the observed information matrix of IRT models and CDMs fitted with the EM algorithm, owing to its computational efficiency and flexibility.
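Louis' identity itself is a standard result and can be stated compactly (notation here is generic, not quoted from the paper):

```latex
% \ell_c is the complete-data log-likelihood and S_c its score vector;
% expectations are over the latent data given the observed data y.
I(\theta \mid y) =
  \mathbb{E}\!\left[ -\frac{\partial^{2} \ell_c}{\partial \theta \, \partial \theta^{\top}} \,\middle|\, y \right]
  - \mathbb{E}\!\left[ S_c S_c^{\top} \,\middle|\, y \right]
  + \mathbb{E}\!\left[ S_c \,\middle|\, y \right]
    \mathbb{E}\!\left[ S_c \,\middle|\, y \right]^{\top}
% (the last term vanishes at the MLE, where the observed-data score is zero)
```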
Affiliation(s)
- Chen-Wei Liu
- Department of Educational Psychology and Counseling, National Taiwan Normal University, Taipei, Taiwan
13
Schneider L, Chalmers RP, Debelak R, Merkle EC. Model Selection of Nested and Non-Nested Item Response Models Using Vuong Tests. Multivariate Behav Res 2020; 55:664-684. [PMID: 31530187; DOI: 10.1080/00273171.2019.1664280]
Abstract
In this paper, we apply Vuong's general approach to model selection to the comparison of nested and non-nested unidimensional and multidimensional item response theory (IRT) models. Vuong's approach is useful because it allows formal statistical tests of both nested and non-nested models; however, only the test of non-nested models has been applied in the context of IRT models to date. After summarizing the statistical theory underlying the tests, we investigate the performance of all three distinct Vuong tests in the context of IRT models using simulation studies and real data. In the non-nested case, we observed that the tests can reliably distinguish between the graded response model and the generalized partial credit model. In the nested case, the tests typically performed as well as, or sometimes better than, the traditional likelihood ratio test. Based on these results, we argue that Vuong's approach provides a useful set of tools for researchers and practitioners to effectively compare competing nested and non-nested IRT models.
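A sketch of the non-nested comparison reported in the paper; that nonnest2::vuongtest() accepts mirt model objects directly is an assumption here, and 'resp' is a hypothetical data frame of polytomous responses:

```r
# Hedged sketch: compare the graded response model and the generalized
# partial credit model, two non-nested models for ordinal items.
library(mirt)
library(nonnest2)

grm  <- mirt(resp, 1, itemtype = "graded")  # graded response model
gpcm <- mirt(resp, 1, itemtype = "gpcm")    # generalized partial credit
vuongtest(grm, gpcm)  # variance test, then non-nested LR test
```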
14
Falk CF, Ju U. Estimation of Response Styles Using the Multidimensional Nominal Response Model: A Tutorial and Comparison With Sum Scores. Front Psychol 2020; 11:72. [PMID: 32116902; PMCID: PMC7017717; DOI: 10.3389/fpsyg.2020.00072]
Abstract
Recent years have seen a dramatic increase in item response models for measuring response styles on Likert-type items. These model-based approaches stand in contrast to traditional sum-score-based methods where researchers count the number of times that participants selected certain response options. The multidimensional nominal response model (MNRM) offers a flexible model-based approach that may be intuitive to those familiar with sum score approaches. This paper presents a tutorial on the model along with code for estimating it using three different software packages: flexMIRT®, mirt, and Mplus. We focus on specification and interpretation of response functions. In addition, we provide analytical details on how sum score to scale score conversion can be done with the MNRM. In the context of a real data example, three different scoring approaches are then compared. This example illustrates how sum-score-based approaches can sometimes yield scores that are confounded with substantive content. We expect that the current paper will facilitate further investigations as to whether different substantive conclusions are reached under alternative approaches to measuring response styles.
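A minimal sketch of the building block of this approach in mirt, assuming 'likert' holds Likert-type item responses; full multidimensional response-style specifications additionally require editing mirt's parameter table to fix the category scoring vectors, as the tutorial describes:

```r
# Hedged sketch: fit a nominal response model and inspect the estimated
# per-category slopes that the MNRM builds on.
library(mirt)

fit <- mirt(likert, 1, itemtype = "nominal")
coef(fit, simplify = TRUE)  # per-category slopes (ak) and intercepts (d)
head(fscores(fit))          # latent trait (scale score) estimates
```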
Affiliation(s)
- Carl F. Falk
- Department of Psychology, McGill University, Montreal, QC, Canada
- Unhee Ju
- Riverside Insights, Itasca, IL, USA
15
Liu Y, Xin T, Andersson B, Tian W. Information matrix estimation procedures for cognitive diagnostic models. Br J Math Stat Psychol 2019; 72:18-37. [PMID: 29508383; DOI: 10.1111/bmsp.12134]
Abstract
Two new methods to estimate the asymptotic covariance matrix for marginal maximum likelihood estimation of cognitive diagnosis models (CDMs), the inverse of the observed information matrix and the sandwich-type estimator, are introduced. Unlike several previous covariance matrix estimators, the new methods take into account both the item and structural parameters. The relationships between the observed information matrix, the empirical cross-product information matrix, the sandwich-type covariance matrix and the two approaches proposed by de la Torre (2009, J. Educ. Behav. Stat., 34, 115) are discussed. Simulation results show that, for a correctly specified CDM and Q-matrix or with a slightly misspecified probability model, the observed information matrix and the sandwich-type covariance matrix exhibit good performance with respect to providing consistent standard errors of item parameter estimates. However, with substantial model misspecification only the sandwich-type covariance matrix exhibits robust performance.
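The two estimators compared take the standard forms (stated generically here, with both item and structural parameters collected in one vector):

```latex
% A is the observed information matrix and B the empirical
% cross-product information matrix:
\widehat{\Sigma}_{\mathrm{obs}} = A^{-1},
\qquad
\widehat{\Sigma}_{\mathrm{sandwich}} = A^{-1} B \, A^{-1} .
```

Under correct model specification A and B agree asymptotically, which is why the two estimators only diverge, and the sandwich form retains its robustness, under misspecification.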
Affiliation(s)
- Yanlou Liu
- Chinese Academy of Education Big Data, Qufu Normal University, Shandong, China
- School of Psychology, Beijing Normal University, China
- Tao Xin
- Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, China
- Björn Andersson
- Centre for Educational Measurement, University of Oslo, Norway
- Wei Tian
- Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, China
16
Chalmers RP. Model-Based Measures for Detecting and Quantifying Response Bias. Psychometrika 2018; 83:696-732. [PMID: 29907891; DOI: 10.1007/s11336-018-9626-9]
Abstract
This paper proposes a model-based family of detection and quantification statistics to evaluate response bias in item bundles of any size. Compensatory (CDRF) and non-compensatory (NCDRF) response bias measures are proposed, along with their sample realizations and large-sample variability when models are fitted using multiple-group estimation. Based on the underlying connection to item response theory estimation methodology, it is argued that these new statistics provide a powerful and flexible approach to studying response bias for categorical response data, over and above methods that have previously appeared in the literature. To evaluate their practical utility, CDRF and NCDRF are compared to the closely related SIBTEST family of statistics and to likelihood-based detection methods through a series of Monte Carlo simulations. Results indicate that the new statistics are better effect size estimates of marginal response bias than the SIBTEST family, are competitive with a selection of likelihood-based methods when studying item-level bias, and perform best when studying differential bundle and test bias.
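A sketch of computing such statistics in practice; mirt's DRF() function implements differential response functioning measures along these lines, though 'resp', 'grp', the anchor set, and the exact arguments are assumptions here:

```r
# Hedged sketch: fit a multiple-group model with assumed anchor items,
# then compute DRF statistics for the remaining (studied) items.
library(mirt)

anchors <- c("item1", "item2", "item3")  # hypothetical invariant items
mg <- multipleGroup(resp, 1, group = grp,
                    invariance = c("free_means", "free_var", anchors))
DRF(mg, draws = 1000)  # DRF estimates with simulation-based intervals
```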
Affiliation(s)
- R. Philip Chalmers
- Department of Educational Psychology, The University of Georgia, 323 Aderhold Hall, Athens, GA 30602, USA
17
Liu CW, Chalmers RP. Fitting item response unfolding models to Likert-scale data using mirt in R. PLoS One 2018; 13:e0196292. [PMID: 29723217; PMCID: PMC5933773; DOI: 10.1371/journal.pone.0196292]
Abstract
While a large family of unfolding models for Likert-scale response data has been developed over the past decades, few applications of these models have appeared in practice. There may be several reasons why they have not featured more widely in published research; one obvious limitation is the absence of suitable software for model estimation. In this article, the authors demonstrate how the mirt package can be used to estimate parameters for various unidimensional and multidimensional unfolding models. To concretely demonstrate the concepts and recommendations, a tutorial and examples of R syntax are provided as practical guidelines. Finally, the performance of mirt is evaluated via parameter-recovery simulation studies to demonstrate its effectiveness. The authors argue that, armed with the mirt package, fitting unfolding models to Likert-scale data is not only possible but can be accomplished on real datasets with little difficulty.
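A minimal sketch of the article's approach, assuming 'likert' is a data frame of Likert-scale responses; the unidimensional generalized graded unfolding model shown is one of the variants the article covers:

```r
# Hedged sketch: fit a GGUM with mirt and inspect its single-peaked
# category response curves.
library(mirt)

fit <- mirt(likert, 1, itemtype = "ggum")
coef(fit, simplify = TRUE)  # discrimination, location, and thresholds
plot(fit, type = "trace")   # single-peaked category response curves
```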
Affiliation(s)
- Chen-Wei Liu
- Faculty of Education, The Chinese University of Hong Kong, Hong Kong
- R. Philip Chalmers
- Quantitative Methodology, University of Georgia, Athens, GA, USA