1. Morales-Vives F, Ferrando PJ, Hernández-Dorado A. Modeling maladaptive personality traits with unipolar item response theory: The case of Callousness. The Journal of General Psychology 2024:1-28. [PMID: 39291963 DOI: 10.1080/00221309.2024.2404398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 09/09/2024] [Indexed: 09/19/2024]
Abstract
Most IRT applications in personality assume that the measured trait is a bipolar dimension, normally distributed in the population. These assumptions, however, could be questionable for maladaptive, (quasi) pathological traits that still fall in the normal range. This study focuses on one such trait, Callousness, and uses two different instruments and samples to determine whether there is a basis for modeling it as a unipolar trait instead of a bipolar one. Two community samples recruited in several Spanish high schools were used: (a) 719 adolescents (13-19 years old, 55.8% girls) and (b) 681 adolescents (13-19 years old, 44.9% girls). Callousness was assessed with the Inventory of Callous-unemotional traits and Antisocial behavior in the first sample and with the Inventory of Callous Unemotional traits in the second sample. We compared the outcomes of fitting the Graded-Response model (a bipolar-trait model) and the Log-Logistic model (a unipolar-trait model) in these community samples and found that they differed considerably at the scoring level. In terms of accuracy, the conditional reliability functions had opposite patterns: reliability was highest at high trait levels under the Graded-Response model and at low trait levels under the Log-Logistic model. In terms of validity, the models showed different results regarding the prediction of indirect aggressiveness and non-planning impulsiveness.
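The contrast between a bipolar and a unipolar trait model can be illustrated with a minimal sketch. It assumes the binary log-logistic form in which the odds of endorsement are a power function of a strictly positive trait, which makes it equivalent to a two-parameter logistic model applied to the logarithm of the trait. The study itself fits graded versions, so the function names and parameter values below are illustrative assumptions, not the authors' implementation.

```python
import math


def two_pl(x, a, b):
    """Bipolar 2PL response probability for a trait x on the whole real line."""
    return 1.0 / (1.0 + math.exp(-a * (x - b)))


def log_logistic(theta, a, b):
    """Unipolar log-logistic response probability for a trait theta > 0.

    Assumed form: odds = exp(-a * b) * theta ** a, so the trait has a
    natural zero point (theta -> 0 gives probability -> 0).
    """
    if theta <= 0:
        raise ValueError("a unipolar trait must be strictly positive")
    odds = math.exp(-a * b) * theta ** a
    return odds / (1.0 + odds)


# Mapping the bipolar trait through exp() gives the same response probability,
# but theta = exp(x) is lognormal (right-skewed) when x is normal.
x = 0.7  # hypothetical bipolar trait value
p_bipolar = two_pl(x, a=1.3, b=-0.2)
p_unipolar = log_logistic(math.exp(x), a=1.3, b=-0.2)
```

Under this assumed form the two models describe the same item responses; what changes is the metric of the trait, which is one plausible reason trait estimates and their precision can differ between the two scalings.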
Affiliation(s)
- Fabia Morales-Vives
  - Psychology Department, Research Center for Behavior Assessment, Universitat Rovira i Virgili, Tarragona, Spain
- Pere J Ferrando
  - Psychology Department, Research Center for Behavior Assessment, Universitat Rovira i Virgili, Tarragona, Spain
- Ana Hernández-Dorado
  - Psychology Department, Research Center for Behavior Assessment, Universitat Rovira i Virgili, Tarragona, Spain
2. Magnus BE. Item Response Modeling of Clinical Instruments With Filter Questions: Disentangling Symptom Presence and Severity. Applied Psychological Measurement 2024; 48:235-256. [PMID: 39166184 PMCID: PMC11331747 DOI: 10.1177/01466216241261709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/22/2024]
Abstract
Clinical instruments that use a filter/follow-up response format often produce data with excess zeros, especially when administered to nonclinical samples. When the unidimensional graded response model (GRM) is then fit to these data, parameter estimates and scale scores tend to suggest that the instrument measures individual differences only among individuals with severe levels of the psychopathology. In such scenarios, alternative item response models that explicitly account for excess zeros may be more appropriate. The multivariate hurdle graded response model (MH-GRM), which has been previously proposed for handling zero-inflated questionnaire data, includes two latent variables: susceptibility, which underlies responses to the filter question, and severity, which underlies responses to the follow-up question. Using both simulated and empirical data, the current research shows that compared to unidimensional GRMs, the MH-GRM is better able to capture individual differences across a wider range of psychopathology, and that when unidimensional GRMs are fit to data from questionnaires that include filter questions, individual differences at the lower end of the severity continuum largely go unmeasured. Practical implications are discussed.
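The two-part structure described above can be sketched as follows. This is a generic hurdle formulation, assuming a logistic model for the filter question and a graded response model for the follow-up; the parameter names (`a_filter`, `b_filter`, `a_sev`, `thresholds`) are hypothetical and the paper's exact MH-GRM parameterization may differ.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def grm_probs(theta, a, thresholds):
    """Graded response model category probabilities.

    thresholds must be in ascending order; K - 1 thresholds give K categories.
    """
    cumulative = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


def hurdle_grm_probs(susceptibility, severity, a_filter, b_filter, a_sev, thresholds):
    """Probabilities for the combined item: index 0 is 'symptom absent'.

    Susceptibility drives the filter question; severity drives the
    follow-up rating, which is only observed once the hurdle is cleared.
    """
    p_present = logistic(a_filter * (susceptibility - b_filter))
    follow_up = grm_probs(severity, a_sev, thresholds)
    return [1.0 - p_present] + [p_present * p for p in follow_up]


# Hypothetical respondent: low susceptibility, moderate severity.
probs = hurdle_grm_probs(-1.0, 0.5, a_filter=1.5, b_filter=0.0,
                         a_sev=1.2, thresholds=[-0.5, 0.8])
```

Separating the two latent variables lets a large "absent" probability arise from low susceptibility alone, rather than forcing the severity thresholds upward to absorb the excess zeros.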
3. Shim H, Bonifay W, Wiedermann W. Parsimonious item response theory modeling with the negative log-log link: The role of inflection point shift. Behav Res Methods 2024; 56:4385-4402. [PMID: 37537489 DOI: 10.3758/s13428-023-02189-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/30/2023] [Indexed: 08/05/2023]
Abstract
In item response theory (IRT) modeling, the magnitude of the lower and upper asymptote parameters determines the degree to which the inflection point shifts above or below P = 0.50. The current study examines the one-parameter negative log-log model (NLLM), which is characterized by a downward shift in the inflection point, among other distinctive psychometric properties. After detailing the statistical foundations of the NLLM, we present a series of simulation studies to establish item and person parameter estimation accuracy and to demonstrate that this parsimonious model addresses the "slipping" effect (i.e., unexpectedly incorrect answers) via an inflection point < 0.50 rather than through computationally difficult estimation of the upper asymptote. We then provide further support for these simulation results through empirical data analysis. Finally, we discuss how the NLLM contributes to recent methodological literature on the utility of asymmetric IRT models.
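The downward shift of the inflection point can be checked numerically. The sketch below assumes the usual negative log-log response function, P(θ) = exp(-exp(-(θ - b))), with a single location parameter `b`; whether this matches the paper's exact parameterization is an assumption. Under that form the curve is steepest at θ = b, where P = exp(-1) ≈ 0.368, whereas a logistic curve is steepest at P = 0.50.

```python
import math


def nll_irf(theta, b=0.0):
    """Assumed one-parameter negative log-log item response function."""
    return math.exp(-math.exp(-(theta - b)))


def logistic_irf(theta, b=0.0):
    """One-parameter logistic item response function, for comparison."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def inflection_height(irf, lo=-6.0, hi=6.0, steps=12001, h=1e-4):
    """Return P at the grid point where the curve is steepest.

    The slope is approximated by central differences on an evenly spaced grid.
    """
    best_theta, best_slope = lo, float("-inf")
    for i in range(steps):
        theta = lo + (hi - lo) * i / (steps - 1)
        slope = (irf(theta + h) - irf(theta - h)) / (2.0 * h)
        if slope > best_slope:
            best_theta, best_slope = theta, slope
    return irf(best_theta)


nll_height = inflection_height(nll_irf)            # close to exp(-1)
logistic_height = inflection_height(logistic_irf)  # close to 0.5
```

Because the asymmetric curve approaches its upper asymptote more slowly than a logistic curve, high-trait respondents retain a non-negligible chance of an incorrect answer without a separate upper-asymptote parameter.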
Affiliation(s)
- Hyejin Shim
  - Laureate Institute for Brain Research, Tulsa, OK, USA
- Wes Bonifay
  - Department of Educational, School, and Counseling Psychology, University of Missouri, Columbia, MO, USA
  - Missouri Prevention Science Institute, Columbia, MO, USA
- Wolfgang Wiedermann
  - Department of Educational, School, and Counseling Psychology, University of Missouri, Columbia, MO, USA
  - Missouri Prevention Science Institute, Columbia, MO, USA
4. Ferrando PJ, Morales-Vives F, Hernández-Dorado A. Measuring Unipolar Traits With Continuous Response Items: Some Methodological and Substantive Developments. Educational and Psychological Measurement 2024; 84:425-449. [PMID: 38756459 PMCID: PMC11095320 DOI: 10.1177/00131644231181889] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/18/2024]
Abstract
In recent years, some models for binary and graded format responses have been proposed to assess unipolar variables or "quasi-traits." These studies have mainly focused on clinical variables that have traditionally been treated as bipolar traits. In the present study, we have made a proposal for unipolar traits measured with continuous response items. The proposed log-logistic continuous unipolar model (LL-C) is remarkably simple and is more similar to the original binary formulation than the graded extensions, which is an advantage. Furthermore, considering that irrational, extreme, or polarizing beliefs could be another domain of unipolar variables, we have applied this proposal to an empirical example of superstitious beliefs. The results suggest that, in certain cases, the standard linear model can be a good approximation to the LL-C model in terms of parameter estimation and goodness of fit, but not trait estimates and their accuracy. The results also show the importance of considering the unipolar nature of this kind of trait when predicting criterion variables, since the validity results were clearly different.
5. Deutscher D, Kallen MA, Hayes D, Werneke MW, Mioduski JE, Toczylowski T, Petitti JM, Cook KF. The Stroke Upper and Lower Extremity Physical Function Measures Were Supported for Score Reliability, Validity, and Administration Efficiency for Patients Poststroke. Phys Ther 2023; 103:pzad107. [PMID: 37572106 DOI: 10.1093/ptj/pzad107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 03/05/2023] [Accepted: 05/15/2023] [Indexed: 08/14/2023]
Abstract
OBJECTIVE: The aims of this study were to (1) evaluate the suitability of newly developed items for calibration into 2 item banks for stroke upper extremity (SUE) and stroke lower extremity (SLE) physical function (PF) patient-reported outcome measures (PROMs) and to (2) assess score reliability and validity and PROM administration efficiency based on computerized adaptive testing (CAT).
METHODS: A retrospective longitudinal study involving patients poststroke who were treated in outpatient rehabilitation clinics and responded to 28 and 25 region-specific candidate items addressing tasks related to upper or lower extremity PF, respectively, was conducted. Item response theory (IRT) model assumptions of unidimensionality, local independence, item fit, and presence of differential item functioning were evaluated. CAT-generated scores were assessed for reliability, validity, and administration efficiency, and 10-item short forms were assessed for reliability.
RESULTS: Cohorts consisted of 2017 patients with stroke involving the upper extremity and 2107 patients with stroke involving the lower extremity (mean age [SD]: SUE = 62 [14] and SLE = 63 [14]; range = 14-89). Two solutions (SUE: 28-item; SLE: 24-item) supported unidimensionality and fit to the IRT model, with reliability estimates >0.93 for all administration modes. No items demonstrated differential item functioning. Scores discriminated among multiple patient groups in clinically logical ways, with better outcomes observed for patients who were younger, were male, had less chronicity, and had fewer comorbidities. The SUE and SLE, respectively, had 1 and 0.3% floor effects and 4.3 and 1.1% ceiling effects. Change score effect sizes were 0.5 (SUE) and 0.6 (SLE). Simulated CAT scores required an average of 6 (SUE) and 5.6 (SLE) items (median = 5).
CONCLUSION: The stroke upper extremity and stroke lower extremity PROM scores were reliable, valid, and efficient and had moderate change effect sizes for assessing PF as perceived by patients poststroke with upper and lower extremity impairments. Scores had negligible floor and acceptable ceiling effects. Based on these results, the stroke PROMs are suitable for research and routine clinical practice.
IMPACT: As IRT-based measures, these PROMs support clinical practice guideline recommendations for the use of outcome measures in neurologic physical therapy and the administration of condition-specific functional questions with low response burden for patients. The 10-item short forms offer a feasible alternative administration mode when CAT administration is not available.
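The administration efficiency reported above comes from adaptive item selection. The sketch below shows one common CAT scheme, maximum-information selection with a standard-error stopping rule, using a dichotomous two-parameter logistic model as a simplification. The item bank, the stopping values (`se_target`, `max_items`), and the function names are hypothetical; the study's own CAT engine and polytomous model are not described in the abstract.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Pick the unused item that is most informative at the current estimate.

    bank is a list of (a, b) tuples; administered is a set of item indices.
    """
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))


def should_stop(total_information, n_administered, se_target=0.3, max_items=10):
    """Stop once the standard error is small enough or the item cap is hit."""
    if total_information > 0 and 1.0 / math.sqrt(total_information) <= se_target:
        return True
    return n_administered >= max_items


# Hypothetical three-item bank with equal discriminations.
bank = [(1.0, -1.0), (1.0, 0.5), (1.0, 2.0)]
first = select_next_item(0.4, bank, administered=set())
second = select_next_item(0.4, bank, administered={first})
```

With equal discriminations the most informative item is the one whose location is closest to the current trait estimate, which is why a well-targeted bank can reach a given precision with only a handful of items.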
Affiliation(s)
- Daniel Deutscher
  - Net Health Systems, Inc, Pittsburgh, Pennsylvania, USA
  - Maccabitech Institute for Research & Innovation, Maccabi Healthcare Services, Tel-Aviv, Israel
- Michael A Kallen
  - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Deanna Hayes
  - Net Health Systems, Inc, Pittsburgh, Pennsylvania, USA
- Theresa Toczylowski
  - Department of Physical Therapy, Moss Rehabilitation Hospital, Elkins Park, Pennsylvania, USA
- Jessica M Petitti
  - Department of Neurologic Rehabilitation, Ohio Health, Columbus, Ohio, USA
6. Saavedra LM, Morgan-López AA, West SG, Alegría M, Silverman WK. Mitigating Multiple Sources of Bias in a Quasi-Experimental Integrative Data Analysis: Does Treating Childhood Anxiety Prevent Substance Use Disorders in Late Adolescence/Young Adulthood? Prevention Science: The Official Journal of the Society for Prevention Research 2023; 24:1622-1635. [PMID: 36057023 DOI: 10.1007/s11121-022-01422-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/05/2022] [Indexed: 11/26/2022]
Abstract
Psychiatric epidemiologists, developmental psychopathologists, prevention scientists, and treatment researchers have long speculated that treating child anxiety disorders could prevent alcohol and other drug use disorders in young adulthood. A primary challenge in examining long-term effects of anxiety disorder treatment from randomized controlled trials is that all participants receive an immediate or delayed study-related treatment prior to long-term follow-up assessment. Thus, if a long-term follow-up is conducted, a comparison condition no longer exists within the trial. Quasi-experimental designs (QEDs) pairing such clinical samples with comparable untreated epidemiological samples offer a method of addressing this challenge. Selection bias, often a concern in QEDs, can be mitigated by propensity score weighting. A second challenge may arise because the clinical and epidemiological studies may not have used identical measures, necessitating Integrative Data Analysis (IDA) for measure harmonization and scale score estimation. The present study uses a combination of propensity score weighting, zero-inflated mixture moderated nonlinear factor analysis (ZIM-MNLFA), and potential outcomes mediation in a child anxiety treatment QED/IDA (n = 396). Under propensity score-weighted potential outcomes mediation, CBT led to reductions in substance use disorder severity, the effects of which were mediated by reductions in anxiety severity in young adulthood. Sensitivity analyses highlighted the importance of attending to multiple types of bias. This study illustrates how hybrid QED/IDAs can be used in secondary prevention contexts for improved measurement and causal inference, particularly when control participants in clinical trials receive study-related treatment prior to long-term assessment.
Affiliation(s)
- Lissette M Saavedra
  - Community Health Research Division, RTI International, Research Triangle Park, NC, USA
- Stephen G West
  - Department of Psychology, Arizona State University, Tempe, AZ, USA
- Margarita Alegría
  - Disparities Research Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Wendy K Silverman
  - Child Study Center, Yale University School of Medicine, New Haven, CT, USA
7. Wittkopf S, Langmann A, Roessner V, Roepke S, Poustka L, Nenadić I, Stroth S, Kamp-Becker I. Conceptualization of the latent structure of autism: further evidence and discussion of dimensional and hybrid models. Eur Child Adolesc Psychiatry 2023; 32:2247-2258. [PMID: 36006478 PMCID: PMC10576682 DOI: 10.1007/s00787-022-02062-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Accepted: 08/01/2022] [Indexed: 11/29/2022]
Abstract
Autism spectrum disorder (ASD) might be conceptualized with an essentially dimensional, categorical, or hybrid model. Yet current empirical studies are inconclusive, and the latent structure of ASD has been examined explicitly in only a few studies. The aim of our study was to identify and discuss the latent model structure of behavioral symptoms related to ASD and to address the question of whether categories and/or dimensions best represent ASD symptoms. We included data from 2920 participants (1-72 years of age), evaluated with the Autism Diagnostic Observation Schedule (Modules 1-4). We applied latent class analysis, confirmatory factor analysis, and factor mixture modeling and evaluated the model fit by a combination of criteria. Based on the model selection criteria, model fit, interpretability, and clinical utility, we conclude that the hybrid model serves best for the conceptualization and assessment of ASD symptoms. It is grounded in both empirical evidence and clinical usefulness, is in line with the current classification system (DSM-5), and has the potential to be more specific than the dimensional approach (decreasing false-positive diagnoses).
Affiliation(s)
- Sarah Wittkopf
  - Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Medical Clinic, Philipps-University, Marburg, Germany
- Anika Langmann
  - Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Medical Clinic, Philipps-University, Marburg, Germany
- Veit Roessner
  - Department of Child and Adolescent Psychiatry, Technical University Dresden, Dresden, Germany
- Stefan Roepke
  - Department of Psychiatry, Charité Universitätsmedizin Berlin, Campus Benjamin Franklin, Berlin, Germany
- Luise Poustka
  - Department of Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, Göttingen, Germany
- Igor Nenadić
  - Department of Psychiatry and Psychotherapy, Medical Clinic, Philipps-University Marburg, Marburg, Germany
- Sanna Stroth
  - Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Medical Clinic, Philipps-University, Marburg, Germany
- Inge Kamp-Becker
  - Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Medical Clinic, Philipps-University, Marburg, Germany
8. Deutscher D, Kallen MA, Hayes D, Werneke MW, Mioduski JE, Levenhagen K, Pfarr M, Cook KF. Lower Quadrant Edema Patient-Reported Outcome Measure Is Reliable, Valid, and Efficient for Patients With Lymphatic and Venous Disorders. Phys Ther 2023; 103:pzad083. [PMID: 37682087 DOI: 10.1093/ptj/pzad083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 01/20/2023] [Accepted: 04/15/2023] [Indexed: 09/09/2023]
Abstract
OBJECTIVE: The main aims of this study were: (1) to create a patient-reported outcome measure (PROM) item bank for measuring the impact of lower quadrant edema (LQE) on physical function using item response theory and (2) to assess reliability, validity, and administration efficiency of LQE PROM scores based on computerized adaptive test (CAT) and the reliability of a 10-item short form (SF).
METHODS: This retrospective study included data from patients treated in outpatient rehabilitation clinics for lower quadrant edema who responded to all 30 candidate items at intake. Item response theory model assumptions of unidimensionality, local item independence, item fit, and presence of differential item functioning (DIF) were evaluated. LQE-CAT-generated scores were assessed for reliability, validity, and administration efficiency. LQE-SF-generated scores were assessed for reliability.
RESULTS: The total cohort included 4894 patients (mean [SD] age = 65 [14] years; range = 14-89 years). A set of 20 items was selected for the item bank based on support for its unidimensionality and fit to the item response theory model, with reliability estimates greater than 0.92 for CAT and SF administration modes. No items demonstrated DIF with respect to tested variables. After controlling for scores at intake, scores discriminated among multiple patient groups in clinically logical ways with better outcomes observed for patients who were younger with less chronic symptoms and fewer comorbidities. Scores were responsive to change but the effect size was small (0.4). There were negligible floor and ceiling effects. CAT administration of the item bank required an average of 6.1 items (median = 5). Scores correlated highly with full-bank scores (Pearson correlation coefficient = 0.98).
CONCLUSION: Scores on the LQE PROM were reliable, valid, and efficient for assessing perceived physical function of patients with lower quadrant edema. The LQE, CAT, and SF are suitable for research and routine clinical care. Reasons for the small effect size for change scores should be studied.
IMPACT: The newly developed LQE PROM was reliable and valid and offered efficient administration modes for assessing perceived physical function of patients with LQE, both for research and routine clinical care in busy outpatient rehabilitation settings. As an item response theory-based measure, the LQE PROM allows administration of condition-specific functional questions with low response burden for patients. The 10-item LQE-SF offers a feasible alternative administration mode when CAT administration is not available. This study supports a transition to PROMs that are based on modern measurement approaches to achieve the combined benefits of high accuracy and efficiency.
Affiliation(s)
- Daniel Deutscher
  - Net Health Systems, Inc., Pittsburgh, Pennsylvania, USA
  - Maccabitech Institute for Research & Innovation, Maccabi Healthcare Services, Tel-Aviv, Israel
- Michael A Kallen
  - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Deanna Hayes
  - Net Health Systems, Inc., Pittsburgh, Pennsylvania, USA
- Kim Levenhagen
  - Program in Physical Therapy, Saint Louis University, St. Louis, Missouri, USA
- Megan Pfarr
  - HSHS Wisconsin & Prevea Health, Green Bay, Wisconsin, USA
9. Williams ZJ, Schaaf R, Ausderau KK, Baranek GT, Barrett DJ, Cascio CJ, Dumont RL, Eyoh EE, Failla MD, Feldman JI, Foss-Feig JH, Green HL, Green SA, He JL, Kaplan-Kahn EA, Keçeli-Kaysılı B, MacLennan K, Mailloux Z, Marco EJ, Mash LE, McKernan EP, Molholm S, Mostofsky SH, Puts NAJ, Robertson CE, Russo N, Shea N, Sideris J, Sutcliffe JS, Tavassoli T, Wallace MT, Wodka EL, Woynaroski TG. Examining the latent structure and correlates of sensory reactivity in autism: a multi-site integrative data analysis by the autism sensory research consortium. Mol Autism 2023; 14:31. [PMID: 37635263 PMCID: PMC10464466 DOI: 10.1186/s13229-023-00563-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/05/2023] [Accepted: 08/11/2023] [Indexed: 08/29/2023] [Open access]
Abstract
BACKGROUND: Differences in responding to sensory stimuli, including sensory hyperreactivity (HYPER), hyporeactivity (HYPO), and sensory seeking (SEEK) have been observed in autistic individuals across sensory modalities, but few studies have examined the structure of these "supra-modal" traits in the autistic population.
METHODS: Leveraging a combined sample of 3868 autistic youth drawn from 12 distinct data sources (ages 3-18 years and representing the full range of cognitive ability), the current study used modern psychometric and meta-analytic techniques to interrogate the latent structure and correlates of caregiver-reported HYPER, HYPO, and SEEK within and across sensory modalities. Bifactor statistical indices were used to both evaluate the strength of a "general response pattern" factor for each supra-modal construct and determine the added value of "modality-specific response pattern" scores (e.g., Visual HYPER). Bayesian random-effects integrative data analysis models were used to examine the clinical and demographic correlates of all interpretable HYPER, HYPO, and SEEK (sub)constructs.
RESULTS: All modality-specific HYPER subconstructs could be reliably and validly measured, whereas certain modality-specific HYPO and SEEK subconstructs were psychometrically inadequate when measured using existing items. Bifactor analyses supported the validity of a supra-modal HYPER construct (ωH = .800) but not a supra-modal HYPO construct (ωH = .653), and supra-modal SEEK models suggested a more limited version of the construct that excluded some sensory modalities (ωH = .800; 4/7 modalities). Modality-specific subscales demonstrated significant added value for all response patterns. Meta-analytic correlations varied by construct, although sensory features tended to correlate most with other domains of core autism features and co-occurring psychiatric symptoms (with general HYPER and speech HYPO demonstrating the largest numbers of practically significant correlations).
LIMITATIONS: Conclusions may not be generalizable beyond the specific pool of items used in the current study, which was limited to caregiver report of observable behaviors and excluded multisensory items that reflect many "real-world" sensory experiences.
CONCLUSION: Of the three sensory response patterns, only HYPER demonstrated sufficient evidence for valid interpretation at the supra-modal level, whereas supra-modal HYPO/SEEK constructs demonstrated substantial psychometric limitations. For clinicians and researchers seeking to characterize sensory reactivity in autism, modality-specific response pattern scores may represent viable alternatives that overcome many of these limitations.
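The ωH values quoted above are omega-hierarchical coefficients, which express how much of the total score variance is attributable to the general factor of a bifactor model. The sketch below uses the standard formula for standardized loadings and orthogonal factors; the loadings are invented for illustration and are not taken from the study.

```python
def omega_hierarchical(general, specific_groups):
    """Omega hierarchical for a bifactor model with orthogonal factors.

    general: standardized general-factor loading for each item.
    specific_groups: one dict per specific factor, mapping item index to
        that item's standardized loading on the specific factor.
    """
    n_items = len(general)
    specific_sq = [0.0] * n_items
    for group in specific_groups:
        for item, loading in group.items():
            specific_sq[item] += loading ** 2
    # Unique variance is what remains after general and specific factors.
    uniqueness = [1.0 - general[i] ** 2 - specific_sq[i] for i in range(n_items)]
    general_var = sum(general) ** 2
    specific_var = sum(sum(group.values()) ** 2 for group in specific_groups)
    total_var = general_var + specific_var + sum(uniqueness)
    return general_var / total_var


# Hypothetical four-item scale with two specific (modality-like) factors.
omega_h = omega_hierarchical(
    general=[0.7, 0.7, 0.7, 0.7],
    specific_groups=[{0: 0.3, 1: 0.3}, {2: 0.3, 3: 0.3}],
)
```

For these invented loadings the general factor accounts for 7.84 of 10.24 units of total variance, so the coefficient is about 0.766; stronger specific factors lower the value even when the general loadings are unchanged.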
Collapse
Affiliation(s)
- Zachary J Williams
- Medical Scientist Training Program, Vanderbilt University School of Medicine, Nashville, TN, USA.
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, 1215 21st Avenue South, Medical Center East, South Tower, Room 8310, Nashville, TN, 37232, USA.
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA.
- Frist Center for Autism and Innovation, Vanderbilt University, Nashville, TN, USA.
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA.
| | - Roseann Schaaf
- Department of Occupational Therapy, College of Rehabilitation Sciences, Thomas Jefferson University, Philadelphia, PA, USA
- Jefferson Autism Center of Excellence, Farber Institute of Neuroscience, Thomas Jefferson University, Philadelphia, PA, USA
| | - Karla K Ausderau
- Department of Kinesiology, Occupational Therapy Program, University of Wisconsin-Madison, Madison, WI, USA
- Waisman Center, University of Wisconsin-Madison, Madison, WI, USA
| | - Grace T Baranek
- Mrs. T.H. Chan Division of Occupational Science and Occupational Therapy, University of Southern California, Los Angeles, CA, USA
| | - D Jonah Barrett
- Neuroscience Undergraduate Program, Vanderbilt University, Nashville, TN, USA
- School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Carissa J Cascio
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Frist Center for Autism and Innovation, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Rachel L Dumont
- Department of Occupational Therapy, College of Rehabilitation Sciences, Thomas Jefferson University, Philadelphia, PA, USA
| | - Ekomobong E Eyoh
- Institute of Child Development, University of Minnesota, Minneapolis, MN, USA
| | | | - Jacob I Feldman
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, 1215 21st Avenue South, Medical Center East, South Tower, Room 8310, Nashville, TN, 37232, USA
- Frist Center for Autism and Innovation, Vanderbilt University, Nashville, TN, USA
| | - Jennifer H Foss-Feig
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Heather L Green
- Department of Radiology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Shulamite A Green
- Department of Psychiatry and Biobehavioral Sciences, University of California - Los Angeles, Los Angeles, CA, USA
| | - Jason L He
- Department of Forensic and Neurodevelopmental Sciences, Sackler Institute for Translational Neurodevelopment, Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, UK
| | - Elizabeth A Kaplan-Kahn
- Department of Psychology, Syracuse University, Syracuse, NY, USA
- Department of Child and Adolescent Psychiatry and Behavioral Sciences, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Bahar Keçeli-Kaysılı
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, 1215 21st Avenue South, Medical Center East, South Tower, Room 8310, Nashville, TN, 37232, USA
| | - Keren MacLennan
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Department of Psychology, Durham University, Durham, UK
| | - Zoe Mailloux
- Department of Occupational Therapy, College of Rehabilitation Sciences, Thomas Jefferson University, Philadelphia, PA, USA
| | - Elysa J Marco
- Department of Neurodevelopmental Medicine, Cortica Healthcare, San Rafael, CA, USA
| | - Lisa E Mash
- Division of Psychology, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
| | - Elizabeth P McKernan
- Department of Psychology, Syracuse University, Syracuse, NY, USA
- Department of Child and Adolescent Psychiatry and Behavioral Sciences, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Sophie Molholm
- Department of Pediatrics, Albert Einstein College of Medicine, Bronx, NY, USA
- Dominick P. Purpura Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Stewart H Mostofsky
- Center for Neurodevelopmental and Imaging Research, Kennedy Krieger Institute, Baltimore, MD, USA
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Psychiatry and Behavioral Science, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Nicolaas A J Puts
- Department of Forensic and Neurodevelopmental Sciences, Sackler Institute for Translational Neurodevelopment, Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, UK
- MRC Centre for Neurodevelopmental Disorders, King's College London, London, UK
| | - Caroline E Robertson
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Natalie Russo
- Department of Psychology, Syracuse University, Syracuse, NY, USA
| | - Nicole Shea
- Department of Psychology, Syracuse University, Syracuse, NY, USA
- Division of Pulmonology and Sleep Medicine, Department of Pediatrics, Kaleida Health, Buffalo, NY, USA
| | - John Sideris
- Mrs. T.H. Chan Division of Occupational Science and Occupational Therapy, University of Southern California, Los Angeles, CA, USA
| | - James S Sutcliffe
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN, USA
| | - Teresa Tavassoli
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
| | - Mark T Wallace
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Frist Center for Autism and Innovation, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
| | - Ericka L Wodka
- Department of Psychiatry and Behavioral Science, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Center for Autism and Related Disorders, Kennedy Krieger Institute, Baltimore, MD, USA
| | - Tiffany G Woynaroski
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, 1215 21st Avenue South, Medical Center East, South Tower, Room 8310, Nashville, TN, 37232, USA
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Frist Center for Autism and Innovation, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Communication Sciences and Disorders, John A. Burns School of Medicine, University of Hawaii, Honolulu, HI, USA
| |
|
10
|
Deutscher D, Kallen MA, Werneke MW, Mioduski JE, Hayes D. Reliability, Validity, and Efficiency of an Item Response Theory-Based Balance Confidence Patient-Reported Outcome Measure. Phys Ther 2023; 103:pzad058. [PMID: 37265368 DOI: 10.1093/ptj/pzad058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Revised: 01/22/2023] [Accepted: 05/29/2023] [Indexed: 06/03/2023]
Abstract
OBJECTIVE The aims of this study were to calibrate the original 16 items from the Activities-Specific Balance Confidence (ABC) Scale to create an item response theory (IRT)-based item bank and scoring metric of balance confidence (BC) and to assess psychometric properties of a computerized adaptive test (BC-CAT) and 6-item short-form (BC-SF) administration modes. METHODS This retrospective study included data from patients who were treated in outpatient rehabilitation clinics and assessed for balance impairments by responding to the full ABC Scale at intake. IRT model assumptions of unidimensionality, local item independence, item fit, and presence of differential item functioning (DIF) were evaluated. BC-CAT-generated scores were assessed for reliability, validity, and administration efficiency, and the newly developed BC-SF was assessed for reliability. RESULTS Total cohort included 20,354 patients (mean age [SD] = 66 [16] years; range = 14-89). All 16 items were retained in the final item bank based on support for unidimensionality and fit to the IRT model. No items demonstrated DIF. Reliability estimates were 0.95, 0.96, and 0.98 for the BC-SF, BC-CAT, and the full item bank, respectively. Scores discriminated among patient groups in clinically logical ways. After controlling for scores at intake, better outcomes were achieved for patients who were younger, had more acute symptoms, exercised more, and had fewer comorbidities. Scores were responsive to change with a moderate effect size, with negligible floor and ceiling effects. CAT scores were generated using an average of 4.7 items (median = 4) and correlated highly with full-bank scores (Pearson correlation coefficient = 0.99). CONCLUSION The IRT-based BC patient-reported outcome measure (PROM) was reliable, valid, moderately responsive to change, and efficient, with excellent score coverage. 
The measure is suitable for research and routine clinical administration using the BC-CAT or BC-SF administration modes. The full ABC Scale can be administered for increased clinical content when appropriate. IMPACT The newly developed BC-PROM was reliable and valid for assessing perceived BC. In addition, the BC-PROM has efficient administration modes with low patient response burden, which enhances feasibility and promotes use during routine clinical practice in busy rehabilitation settings. This study supports a transition to PROMs that are based on modern measurement approaches to achieve the combined benefits of high accuracy and efficiency.
Affiliation(s)
- Daniel Deutscher
- Net Health Systems, Inc., Pittsburgh, Pennsylvania, USA
- Maccabitech Institute for Research & Innovation, Maccabi Healthcare Services, Tel-Aviv, Israel
| | - Michael A Kallen
- Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | | | | | - Deanna Hayes
- Net Health Systems, Inc., Pittsburgh, Pennsylvania, USA
| |
|
11
|
Shim H, Bonifay W, Wiedermann W. Parsimonious asymmetric item response theory modeling with the complementary log-log link. Behav Res Methods 2023; 55:200-219. [PMID: 35355241 DOI: 10.3758/s13428-022-01824-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 03/04/2022] [Indexed: 11/08/2022]
Abstract
Traditional item response theory (IRT) models assume a symmetric error distribution and rely on symmetric (logit or probit) link functions to model the response probabilities. As an alternative, we investigated the one-parameter complementary log-log model (CLLM), which is founded on an asymmetric error distribution and results in an asymmetric item response function with important psychometric properties. In a series of simulation studies, we demonstrate that the CLLM (a) is estimable in small sample sizes, (b) facilitates item-weighted scoring, and (c) accounts for the effect of guessing, despite the presence of a single parameter. We then provide further evidence for these claims by applying the CLLM to empirical data. Finally, we discuss how this work contributes to the growing psychometric literature on model complexity.
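The contrast between a symmetric and an asymmetric link can be sketched in a few lines. The following is a minimal illustration, assuming the usual one-parameter forms P = 1 / (1 + exp(-(θ - b))) for the logistic model and P = 1 - exp(-exp(θ - b)) for the complementary log-log model; the function names and the difficulty parameter `b` are illustrative choices and are not taken from the article.

```python
import math


def logistic_irf(theta: float, b: float) -> float:
    """Symmetric one-parameter logistic item response function (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def cloglog_irf(theta: float, b: float) -> float:
    """Asymmetric one-parameter complementary log-log item response function (assumed form)."""
    return 1.0 - math.exp(-math.exp(theta - b))


if __name__ == "__main__":
    # Tabulate both curves for an item with b = 0 (hypothetical value).
    for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"theta={theta:+.1f}  logistic={logistic_irf(theta, 0.0):.3f}  "
              f"cloglog={cloglog_irf(theta, 0.0):.3f}")
```

Under these assumed forms the logistic curve satisfies P(b + d) + P(b - d) = 1 for any d, whereas the complementary log-log curve does not: it approaches 1 faster than it approaches 0, which is the asymmetry the abstract refers to.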
|
12
|
Cui Y, Lu J, Zhang J, Shi N, Liu J, Meng X. A stochastic approximation expectation maximization algorithm for estimating Ramsay-curve three-parameter normal ogive model with non-normal latent trait distributions. Front Psychol 2022; 13:971126. [PMID: 36506999 PMCID: PMC9730697 DOI: 10.3389/fpsyg.2022.971126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 11/02/2022] [Indexed: 11/25/2022] Open
Abstract
In the estimation of item response models, the normality of latent traits is frequently assumed. However, this assumption may be untenable in real testing. In contrast to the conventional three-parameter normal ogive (3PNO) model, a 3PNO model incorporating Ramsay-curve item response theory (RC-IRT), denoted as the RC-3PNO model, allows for flexible latent trait distributions. We propose a stochastic approximation expectation maximization (SAEM) algorithm to estimate the RC-3PNO model with non-normal latent trait distributions. The simulation studies of this work reveal that the SAEM algorithm produces more accurate item parameters for the RC-3PNO model than those of the 3PNO model, especially when the latent density is not normal, such as in the cases of a skewed or bimodal distribution. Three model selection criteria are used to select the optimal number of knots and the degree of the B-spline functions in the RC-3PNO model. A real data set from the PISA 2018 test is used to demonstrate the application of the proposed algorithm.
Affiliation(s)
- Yuzheng Cui
- Key Laboratory of Applied Statistics of Ministry of Education, School of Mathematics and Statistics, Northeast Normal University, Changchun, China
| | - Jing Lu
- Key Laboratory of Applied Statistics of Ministry of Education, School of Mathematics and Statistics, Northeast Normal University, Changchun, China; *Correspondence: Jing Lu
| | - Jiwei Zhang
- Faculty of Education, Northeast Normal University, Changchun, China
| | - Ningzhong Shi
- Key Laboratory of Applied Statistics of Ministry of Education, School of Mathematics and Statistics, Northeast Normal University, Changchun, China
| | - Jia Liu
- Key Laboratory of Applied Statistics of Ministry of Education, School of Mathematics and Statistics, Northeast Normal University, Changchun, China
| | - Xiangbin Meng
- Key Laboratory of Applied Statistics of Ministry of Education, School of Mathematics and Statistics, Northeast Normal University, Changchun, China
| |
|
13
|
Manapat PD, Edwards MC. Examining the Robustness of the Graded Response and 2-Parameter Logistic Models to Violations of Construct Normality. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 2022; 82:967-988. [PMID: 35989729 PMCID: PMC9386882 DOI: 10.1177/00131644211063453] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
When fitting unidimensional item response theory (IRT) models, the population distribution of the latent trait (θ) is often assumed to be normally distributed. However, some psychological theories would suggest a nonnormal θ. For example, some clinical traits (e.g., alcoholism, depression) are believed to follow a positively skewed distribution where the construct is low for most people, medium for some, and high for few. Failure to account for nonnormality may compromise the validity of inferences and conclusions. Although corrections have been developed to account for nonnormality, these methods can be computationally intensive and have not yet been widely adopted. Previous research has recommended implementing nonnormality corrections when θ is not "approximately normal." This research focused on examining how far θ can deviate from normal before the normality assumption becomes untenable. Specifically, our goal was to identify the type(s) and degree(s) of nonnormality that result in unacceptable parameter recovery for the graded response model (GRM) and 2-parameter logistic model (2PLM).
|
14
|
de Beurs E, Boehnke JR, Fried EI. Common measures or common metrics? A plea to harmonize measurement results. Clin Psychol Psychother 2022; 29:1755-1767. [PMID: 35421265 PMCID: PMC9796399 DOI: 10.1002/cpp.2742] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/06/2021] [Revised: 03/26/2022] [Accepted: 04/11/2022] [Indexed: 01/01/2023]
Abstract
OBJECTIVE There is a great variety of measurement instruments to assess similar constructs in clinical research and practice. This complicates the interpretation of test results and hampers the implementation of measurement-based care. METHOD For reporting and discussing test results with patients, we suggest converting test results into universally applicable common metrics. Two well-established metrics are reviewed: T scores and percentile ranks. Their calculation is explained, their merits and drawbacks are discussed, and recommendations for the most convenient reference group are provided. RESULTS We propose to express test results as T scores with the general population as reference group. To elucidate test results to patients, T scores may be supplemented with percentile ranks, based on data from a clinical sample. The practical benefits are demonstrated using the published data of four frequently used instruments for measuring depression: the CES-D, PHQ-9, BDI-II and the PROMIS depression measure. DISCUSSION Recent initiatives have proposed to mandate a limited set of outcome measures to harmonize clinical measurement. However, the selected instruments are not without flaws and, potentially, this directive may hamper future instrument development. We recommend using common metrics as an alternative approach to harmonize test results in clinical practice, as this will facilitate the integration of measures in day-to-day practice.
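The two common metrics named in this abstract can be illustrated with a short sketch. It assumes the conventional linear T-score transformation (mean 50, SD 10 in the reference group) and a mid-rank percentile rank computed from an observed sample; the function names and all numbers are hypothetical and are not taken from the article, which may use normalized rather than linear T scores.

```python
from typing import Sequence


def t_score(raw: float, ref_mean: float, ref_sd: float) -> float:
    """Linear T score: 50 + 10 * z, with z computed against a reference group."""
    z = (raw - ref_mean) / ref_sd
    return 50.0 + 10.0 * z


def percentile_rank(score: float, sample: Sequence[float]) -> float:
    """Mid-rank percentile rank of `score` within an observed sample."""
    below = sum(1 for s in sample if s < score)
    equal = sum(1 for s in sample if s == score)
    return 100.0 * (below + 0.5 * equal) / len(sample)


if __name__ == "__main__":
    # Hypothetical general-population norms and clinical-sample scores.
    print(t_score(raw=20.0, ref_mean=10.0, ref_sd=5.0))
    print(percentile_rank(20.0, [8.0, 14.0, 20.0, 23.0, 31.0]))
```

Following the abstract's recommendation, the reference mean and SD would come from the general population, while the sample passed to `percentile_rank` would be a clinical one.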
Affiliation(s)
- Edwin de Beurs
- Department of Clinical Psychology, Leiden University & Arkin GGZ, Amsterdam, The Netherlands
| | | | - Eiko I. Fried
- Department of Clinical Psychology, Leiden University, Leiden, Zuid-Holland, The Netherlands
| |
|
15
|
Lee S, Han S, Choi SW. DIF Detection With Zero-Inflation Under the Factor Mixture Modeling Framework. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 2022; 82:678-704. [PMID: 35754619 PMCID: PMC9228697 DOI: 10.1177/00131644211028995] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/28/2023]
Abstract
Response data containing an excessive number of zeros are referred to as zero-inflated data. When differential item functioning (DIF) detection is of interest, zero-inflation can attenuate DIF effects in the total sample and lead to underdetection of DIF items. The current study presents a DIF detection procedure for response data with excess zeros due to the existence of unobserved heterogeneous subgroups. The suggested procedure utilizes the factor mixture modeling (FMM) with MIMIC (multiple-indicator multiple-cause) to address the compromised DIF detection power via the estimation of latent classes. A Monte Carlo simulation was conducted to evaluate the suggested procedure in comparison to the well-known likelihood ratio (LR) DIF test. Our simulation study results indicated the superiority of FMM over the LR DIF test in terms of detection power and illustrated the importance of accounting for latent heterogeneity in zero-inflated data. The empirical data analysis results further supported the use of FMM by flagging additional DIF items over and above the LR test.
Affiliation(s)
| | - Suhwa Han
- University of Texas at Austin, TX, USA
| | | |
|
16
|
Morales-Vives F, Ferrando PJ, Dueñas JM. Should suicidal ideation be regarded as a dimension, a unipolar trait or a mixture? A model-based analysis at the score level. CURRENT PSYCHOLOGY 2022. [DOI: 10.1007/s12144-022-03224-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
Screening questionnaires administered in community samples may allow early identification of suicidal ideation (S.I.). Although the results found in these samples suggest that S.I. behaves like a unipolar trait or a quasi-trait, it is routinely assessed using procedures developed for bipolar traits. Therefore, the main aim of this study is to determine whether there is a basis for modelling S.I. as a bipolar trait, a unipolar trait, or a quasi-trait with two classes of individuals (symptomatic and asymptomatic). In a community sample and mainly at the scoring level, we compare the results provided by fitting three models based on different assumptions: GRM (bipolar traits), LL-GRM (unipolar traits) and FMA (quasi-traits). A total of 773 Spanish participants completed an S.I. questionnaire and a life satisfaction questionnaire. GRM and LL-GRM provided equivalent results at the structural level, but not at the scoring level, especially in the conditional and marginal accuracy of the estimated scores. While the GRM scores are highly accurate only in a narrow range well above the mean, the LL-GRM scores are highly accurate in a much wider range around the mean. They also have different implications for the prediction of life satisfaction. FMA results suggest that an asymptomatic and a symptomatic class could not be clearly differentiated. In conclusion, LL-GRM would make it possible to accurately measure a larger number of subjects in a community sample than GRM, leaving fewer cases of vulnerable people unidentified. These results should be considered by researchers and professionals when deciding which model to use for screening purposes.
|
17
|
Morgan-López AA, McDaniel HL, Bradshaw CP, Saavedra LM, Lochman JE, Kaihoi CA, Powell NP, Qu L, Yaros AC. Design and methodology for an integrative data analysis of coping power: Direct and indirect effects on adolescent suicidality. Contemp Clin Trials 2022; 115:106705. [PMID: 35176503 PMCID: PMC9018598 DOI: 10.1016/j.cct.2022.106705] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/21/2021] [Revised: 02/08/2022] [Accepted: 02/08/2022] [Indexed: 01/02/2023]
Abstract
As suicide rates have risen in the last decade, there has been greater emphasis on targeting early risk conditions for suicidality among youth and adolescents as a form of suicide "inoculation". Two particular needs that have been raised in this nascent literature are a) the dearth of examination of early intervention effects on distal suicide risk that target externalizing behaviors and b) the need to harmonize multiple existing intervention datasets for greater precision in modeling intervention effects on low base rate outcomes such as suicidal behaviors. This project, entitled "Integrative Data Analysis of Coping Power (CP): Effects on Adolescent Suicidality", funded by the National Institute of Mental Health (NIMH), will harmonize and analyze data from 11 randomized controlled trials of CP (total individual-level N = 3183, total school-level N = 189). CP is an empirically-supported, child- and family-focused preventive intervention that focuses on reducing externalizing more broadly among youth who exhibit early aggression, which makes it ideally suited to targeting externalizing pathways to suicidality. The project utilizes three measurement and data analysis frameworks that have emerged across multiple independent disciplines: integrative data analysis (IDA), random treatment effects multilevel modeling (RTE-MLM), and propensity score weighting (PSW). If successful, the project will a) provide initial evidence that CP would have gender-specific indirect effects on suicidality through reductions in externalizing for boys and reductions in internalizing for girls and b) identify optimal conditions under which CP is delivered (e.g., groups, individuals, online) across participants on reductions in suicidality and other key intermediate endpoints.
Affiliation(s)
- Antonio A Morgan-López
- RTI International, Community Health Research Division, Research Triangle Park, NC, United States of America.
| | - Heather L McDaniel
- School of Education and Human Development, University of Virginia, Charlottesville, VA, United States of America
| | - Catherine P Bradshaw
- School of Education and Human Development, University of Virginia, Charlottesville, VA, United States of America
| | - Lissette M Saavedra
- RTI International, Community Health Research Division, Research Triangle Park, NC, United States of America
| | - John E Lochman
- Center for Youth Development and Intervention, University of Alabama, Tuscaloosa, AL, United States of America
| | - Chelsea A Kaihoi
- School of Education and Human Development, University of Virginia, Charlottesville, VA, United States of America
| | - Nicole P Powell
- Center for Youth Development and Intervention, University of Alabama, Tuscaloosa, AL, United States of America
| | - Lixin Qu
- Center for Youth Development and Intervention, University of Alabama, Tuscaloosa, AL, United States of America
| | - Anna C Yaros
- RTI International, Community Health Research Division, Research Triangle Park, NC, United States of America
| |
|
18
|
Seitz HH, Grady JG. Measuring veterinary client preferences for autonomy and information when making medical decisions for their pets. J Am Vet Med Assoc 2021; 259:1471-1480. [PMID: 34757930 DOI: 10.2460/javma.19.12.0630] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022]
Abstract
OBJECTIVE To adapt the 3 scales of the Autonomy Preference Index to veterinary medicine and validate the 3 new scales to measure pet owner preferences for autonomy and information when making medical decisions for their pets. SAMPLE 10 small-animal veterinarians and 10 small-animal clients at a veterinary school-based community practice (pilot study) and 311 small-animal clients of the practice (validation study), of which 47 participated in a follow-up survey. PROCEDURES Wording of items in the Autonomy Preference Index was adapted, and instrument wording was finalized on the basis of feedback obtained in the pilot study to create 3 scales: the Veterinary General Decision-Making Preferences Scale (VGDMPS), Veterinary Clinical Decision-Making Preferences Scale (VCDMPS), and Veterinary Information-Seeking Preferences Scale (VISPS). The 3 scales were then validated by means of administering them to small-animal clients in a clinical setting. RESULTS The 3 scales had acceptable reliability and validity, but clients expressed concern over item wording in the VGDMPS during the pilot study. Overall, results showed that clients had a very high preference for information (mean ± SD VISPS score, 4.78 ± 0.36 on a scale from 1 to 5). Preferences for autonomy varied, but mean values reflected a low-to-moderate desire for autonomy in clinical decision-making (mean ± SD VCDMPS score, 2.04 ± 0.62 on a scale from 1 to 5). CONCLUSIONS AND CLINICAL RELEVANCE The VCDMPS was a reliable and valid instrument for measuring client preferences for autonomy in clinical decision-making. Veterinarians could potentially use this instrument to better understand pet owner preferences and tailor their communication approach accordingly.
Affiliation(s)
- Holli H Seitz
- From the Department of Communication and Social Science Research Center, Mississippi State University, Mississippi State, MS 39762
| | - Jesse G Grady
- From the Department of Clinical Sciences, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS 39762
| |
|
19
|
Reise SP, Du H, Wong EF, Hubbard AS, Haviland MG. Matching IRT Models to Patient-Reported Outcomes Constructs: The Graded Response and Log-Logistic Models for Scaling Depression. PSYCHOMETRIKA 2021; 86:800-824. [PMID: 34463910 PMCID: PMC8437930 DOI: 10.1007/s11336-021-09802-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/03/2021] [Revised: 06/12/2021] [Indexed: 06/13/2023]
Abstract
Item response theory (IRT) model applications extend well beyond cognitive ability testing, and various patient-reported outcomes (PRO) measures are among the more prominent examples. PRO (and like) constructs differ from cognitive ability constructs in many ways, and these differences have model fitting implications. With a few notable exceptions, however, most IRT applications to PRO constructs rely on traditional IRT models, such as the graded response model. We review some notable differences between cognitive and PRO constructs and how these differences can present challenges for traditional IRT model applications. We then apply two models (the traditional graded response model and an alternative log-logistic model) to depression measure data drawn from the Patient-Reported Outcomes Measurement Information System project. We do not claim that one model is "a better fit" or more "valid" than the other; rather, we show that the log-logistic model may be more consistent with the construct of depression as a unipolar phenomenon. Clearly, the graded response and log-logistic models can lead to different conclusions about the psychometrics of an instrument and the scaling of individual differences. We underscore, too, that, in general, explorations of which model may be more appropriate cannot be decided only by fit index comparisons; these decisions may require the integration of psychometrics with theory and research findings on the construct of interest.
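How a unipolar log-logistic model relates to a conventional logistic model can be sketched for the simpler binary case. The snippet below assumes the log-logistic form P = λθ^η / (1 + λθ^η) on a positive trait θ > 0, and a two-parameter logistic form P = 1 / (1 + exp(-(aθ* + c))) on an unbounded trait θ*; the parameter names are illustrative, and the article itself works with graded (polytomous) versions of both models.

```python
import math


def log_logistic_irf(theta: float, lam: float, eta: float) -> float:
    """Binary log-logistic item response function on a unipolar trait theta > 0 (assumed form)."""
    if theta <= 0.0:
        raise ValueError("theta must be positive for a unipolar trait")
    x = lam * theta ** eta
    return x / (1.0 + x)


def two_pl_irf(theta_star: float, a: float, c: float) -> float:
    """Binary two-parameter logistic item response function in slope-intercept form."""
    return 1.0 / (1.0 + math.exp(-(a * theta_star + c)))


if __name__ == "__main__":
    # With theta = exp(theta_star), slope a = eta and intercept c = log(lam),
    # the two functions return the same probability (hypothetical values).
    theta_star, lam, eta = 0.7, 2.0, 1.5
    print(log_logistic_irf(math.exp(theta_star), lam, eta))
    print(two_pl_irf(theta_star, a=eta, c=math.log(lam)))
```

Because the two forms coincide under θ = exp(θ*), they produce the same response probabilities but place individuals on different scales. This is consistent with the abstract's point that fit indices alone cannot decide which model is more appropriate.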
Affiliation(s)
- Steven P Reise
- Department of Psychology, University of California, Los Angeles, Los Angeles, USA.
| | - Han Du
- Department of Psychology, University of California, Los Angeles, Los Angeles, USA
| | - Emily F Wong
- Department of Psychology, University of California, Los Angeles, Los Angeles, USA
| | - Anne S Hubbard
- Department of Psychology, University of California, Los Angeles, Los Angeles, USA
| | - Mark G Haviland
- Department of Psychiatry, Loma Linda University, Los Angeles, USA
| |
|
20
|
Schalet BD, Lim S, Cella D, Choi SW. Linking Scores with Patient-Reported Health Outcome Instruments: A Validation Study and Comparison of Three Linking Methods. PSYCHOMETRIKA 2021; 86:717-746. [PMID: 34173935 DOI: 10.1007/s11336-021-09776-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 06/14/2020] [Revised: 03/03/2021] [Accepted: 05/19/2021] [Indexed: 06/13/2023]
Abstract
The psychometric process used to establish a relationship between the scores of two (or more) instruments is generically referred to as linking. When two instruments with the same content and statistical test specifications are linked, these instruments are said to be equated. Linking and equating procedures have long been used for practical benefit in educational testing. In recent years, health outcome researchers have increasingly applied linking techniques to patient-reported outcome (PRO) data. However, these applications have some noteworthy purposes and associated methodological questions. Purposes for linking health outcomes include the harmonization of data across studies or settings (enabling increased power in hypothesis testing), the aggregation of summed score data by means of score crosswalk tables, and score conversion in clinical settings where new instruments are introduced, but an interpretable connection to historical data is needed. When two PRO instruments are linked, assumptions for equating are typically not met and the extent to which those assumptions are violated becomes a decision point around how (and whether) to proceed with linking. We demonstrate multiple linking procedures (equipercentile, unidimensional IRT calibration, and calibrated projection) with the Patient-Reported Outcomes Measurement Information System Depression bank and the Patient Health Questionnaire-9. We validate this link across two samples and simulate different instrument correlation levels to provide guidance around which linking method is preferred. Finally, we discuss some remaining issues and directions for psychometric research in linking PRO instruments.
Affiliation(s)
- Benjamin D Schalet
- Department of Medical Social Sciences, Northwestern University, Feinberg School of Medicine, 625 N Michigan Ave, 21st Floor, Chicago, IL, 60611, USA.
| | - Sangdon Lim
- Department of Educational Psychology, The University of Texas at Austin, 1912 Speedway, Stop D5800, Austin, TX, 78712-1289, USA
| | - David Cella
- Department of Medical Social Sciences, Northwestern University, Feinberg School of Medicine, 625 N Michigan Ave, 21st Floor, Chicago, IL, 60611, USA
| | - Seung W Choi
- Department of Educational Psychology, The University of Texas at Austin, 1912 Speedway, Stop D5800, Austin, TX, 78712-1289, USA
| |
|
21
|
Santos‐Fernandez E, Mengersen K. Understanding the reliability of citizen science observational data using item response models. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Edgar Santos‐Fernandez
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Qld, Australia
- Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), Parkville, Vic., Australia
| | - Kerrie Mengersen
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Qld, Australia
- Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), Parkville, Vic., Australia
| |
|
22
|
Abstract
Purpose The aims of this cross-sectional study were to explore reliability and validity of the Norwegian version of the Patient-Reported Outcome Measurement System®—Profile 57 (PROMIS-57) questionnaire in a general population sample, n = 408, and to examine Item Response properties and factor structure.
Methods Reliability measures were obtained from factor analysis and item response theory (IRT) methods. Correlations between PROMIS-57 and RAND-36-item health survey (RAND36) were examined for concurrent and discriminant validity. Factor structure and IRT assumptions were examined with factor analysis methods. IRT Item and model fit and graphic plots were inspected, and differential item functioning (DIF) for language, age, gender, and education level were examined.
Results PROMIS-57 demonstrated excellent reliability and satisfactory concurrent and discriminant validity. Factor structure of seven domains was supported. IRT assumptions were met for unidimensionality, local independence, monotonicity, and invariance with no DIF of consequence for language or age groups. Estimated common variance (ECV) per domain and confirmatory factor analysis (CFA) model fit supported unidimensionality for all seven domains. The GRM IRT Model demonstrates acceptable model fit.
Conclusions The psychometric properties and factor structure of Norwegian PROMIS-57 were satisfactory. Hence, the 57-item questionnaire along with PROMIS-29, and the corresponding 8 and 4 item short forms for physical function, anxiety, depression, fatigue, sleep disturbance, social participation ability and pain interference, are considered suitable for use in research and clinical care in Norwegian populations. Further studies on longitudinal reliability and sensitivity in patient populations and for Norwegian item calibration and/or reference scores are needed.
Supplementary Information The online version contains supplementary material available at 10.1007/s11136-021-02906-1.
|
23
|
Anselmi P, Colledani D, Andreotti A, Robusto E, Fabbris L, Vian P, Genetti B, Mortali C, Minutillo A, Mastrobattista L, Pacifici R. An Item Response Theory-Based Scoring of the South Oaks Gambling Screen-Revised Adolescents. Assessment 2021; 29:1381-1391. [PMID: 34036842 DOI: 10.1177/10731911211017657] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
The South Oaks Gambling Screen-Revised Adolescent (SOGS-RA) is one of the most widely used screening tools for problem gambling among adolescents. In this study, item response theory was used for computing measures of problem gambling severity that took into account how much information the endorsed items provided about the presence of problem gambling. A zero-inflated mixture two-parameter logistic model was estimated on the responses of 4,404 adolescents to the South Oaks Gambling Screen-Revised Adolescent to compute the difficulty and discrimination of each item, and the problem gambling severity level (θ score) of each respondent. Receiver operating characteristic curve analysis was used to identify the cutoff on the θ scores that best distinguished daily and nondaily gamblers. This cutoff outperformed the common cutoff defined on the sum scores in identifying daily gamblers but fell behind it in identifying nondaily gamblers. When screening adolescents to be subjected to further investigations, the cutoff on the θ scores must be preferred to that on the sum scores.
|
24
|
Smits N, Öğreden O, Garnier-Villarreal M, Terwee CB, Chalmers RP. A study of alternative approaches to non-normal latent trait distributions in item response theory models used for health outcome measurement. Stat Methods Med Res 2020; 29:1030-1048. [PMID: 32156195 PMCID: PMC7221458 DOI: 10.1177/0962280220907625] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/04/2022]
Abstract
It is often unrealistic to assume normally distributed latent traits in the
measurement of health outcomes. If normality is violated, the item response
theory (IRT) models that are used to calibrate questionnaires may yield
parameter estimates that are biased. Recently, IRT models were developed for
dealing with specific deviations from normality, such as zero-inflation (“excess
zeros”) and skewness. However, these models have not yet been evaluated under
conditions representative of item bank development for health outcomes,
characterized by a large number of polytomous items. A simulation study was
performed to compare the bias in parameter estimates of the graded response
model (GRM), polytomous extensions of the zero-inflated mixture IRT (ZIM-GRM),
and Davidian Curve IRT (DC-GRM). In the case of zero-inflation, the GRM showed
high bias overestimating discrimination parameters and yielding estimates of
threshold parameters that were too high and too close to one another, while
ZIM-GRM showed no bias. In the case of skewness, the GRM and DC-GRM showed
little bias with the GRM showing slightly better results. Consequences for the
development of health outcome measures are discussed.
Affiliation(s)
- Niels Smits
- Research Institute of Child Development and Education, University of Amsterdam, Amsterdam, the Netherlands
- Oğuzhan Öğreden
- Department of Epidemiology and Biostatistics, VU University Amsterdam, Amsterdam, the Netherlands
- Caroline B Terwee
- Department of Epidemiology and Biostatistics, VU University Amsterdam, Amsterdam, the Netherlands
- R Philip Chalmers
- Quantitative Methods, Faculty of Psychology, York University, Toronto, Canada
|
25
|
Stover AM, McLeod LD, Langer MM, Chen WH, Reeve BB. State of the psychometric methods: patient-reported outcome measure development and refinement using item response theory. J Patient Rep Outcomes 2019; 3:50. [PMID: 31359210 PMCID: PMC6663947 DOI: 10.1186/s41687-019-0130-5] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 02/04/2019] [Accepted: 06/19/2019] [Indexed: 02/02/2023]
Abstract
BACKGROUND This paper is part of a series comparing different psychometric approaches to evaluate patient-reported outcome (PRO) measures using the same items and dataset. We provide an overview and example application to demonstrate 1) using item response theory (IRT) to identify poor and well performing items; 2) testing if items perform differently based on demographic characteristics (differential item functioning, DIF); and 3) balancing IRT and content validity considerations to select items for short forms.
METHODS Model fit, local dependence, and DIF were examined for 51 items initially considered for the Patient-Reported Outcomes Measurement Information System® (PROMIS®) Depression item bank. Samejima's graded response model was used to examine how well each item measured severity levels of depression and how well it distinguished between individuals with high and low levels of depression. Two short forms were constructed based on psychometric properties and consensus discussions with instrument developers, including psychometricians and content experts. Calibrations presented here are for didactic purposes and are not intended to replace official PROMIS parameters or to be used for research.
RESULTS Of the 51 depression items, 14 exhibited local dependence, 3 exhibited DIF for gender, and 9 exhibited misfit, and these items were removed from consideration for short forms. Short form 1 prioritized content, and thus items were chosen to meet DSM-V criteria rather than being discarded for lower discrimination parameters. Short form 2 prioritized well performing items, and thus fewer DSM-V criteria were satisfied. Short forms 1-2 performed similarly for model fit statistics, but short form 2 provided greater item precision.
CONCLUSIONS IRT is a family of flexible models providing item- and scale-level information, making it a powerful tool for scale construction and refinement. Strengths of IRT models include placing respondents and items on the same metric, testing DIF across demographic or clinical subgroups, and facilitating creation of targeted short forms. Limitations include large sample sizes to obtain stable item parameters, and necessary familiarity with measurement methods to interpret results. Combining psychometric data with stakeholder input (including people with lived experiences of the health condition and clinicians) is highly recommended for scale development and evaluation.
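As a rough illustration of how Samejima's graded response model links a trait level to ordered response categories, the following sketch computes category probabilities as differences of adjacent cumulative logistic curves. The function name `grm_category_probs` and the item parameters (discrimination `a`, ordered thresholds) are hypothetical values chosen for the example, not PROMIS calibrations.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    theta      : latent trait level of the respondent
    a          : item discrimination (assumed positive)
    thresholds : increasing list of K-1 threshold parameters for K categories
    """
    # Cumulative probabilities P(X >= k); by convention P(X >= 0) = 1
    # and P(X >= K) = 0.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical 4-category item evaluated at theta = 0.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

A larger `a` makes the cumulative curves steeper, so the item separates respondents just below and just above each threshold more sharply, which is the sense in which a higher discrimination distinguishes better between low and high trait levels.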
Affiliation(s)
- Angela M. Stover
- Department of Health Policy and Management, University of North Carolina at Chapel Hill, 1101-G McGavran-Greenberg Hall (CB# 7411), Chapel Hill, NC 27599 USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill School of Medicine, 101 Manning Drive, Chapel Hill, NC 27599 USA
- Lori D. McLeod
- RTI Health Solutions, 3040 Cornwallis Road, Research Triangle Park, NC 27709-2194 USA
- Michelle M. Langer
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill School of Medicine, 101 Manning Drive, Chapel Hill, NC 27599 USA
- Current affiliation: Medical Social Sciences; Feinberg School of Medicine, Northwestern University, 625 N Michigan Ave Suite 2700, Chicago, IL 60611 USA
- Wen-Hung Chen
- RTI Health Solutions, 3040 Cornwallis Road, Research Triangle Park, NC 27709-2194 USA
- Bryce B. Reeve
- Department of Health Policy and Management, University of North Carolina at Chapel Hill, 1101-G McGavran-Greenberg Hall (CB# 7411), Chapel Hill, NC 27599 USA
- Current affiliation: Center for Health Measurement Department of Population Health Sciences and Pediatrics, Duke University School of Medicine, 2200 West Main St, Suite 720A, Durham, NC 27707 USA
|