51
Huang PH. Accelerating item factor analysis on GPU with Python package xifa. Behav Res Methods 2023; 55:4403-4418. [PMID: 36627436 DOI: 10.3758/s13428-022-02024-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/08/2022] [Indexed: 01/11/2023]
Abstract
Item parameter estimation is a crucial step when conducting item factor analysis (IFA). From the view of frequentist estimation, marginal maximum likelihood (MML) seems to be the gold standard. However, fitting a high-dimensional IFA model by MML is still a challenging task. The current study demonstrates that with the help of a GPU (graphics processing unit) and carefully designed vectorization, the computational time of MML could be largely reduced for large-scale IFA applications. In particular, a Python package called xifa (accelerated item factor analysis) is developed, which implements a vectorized Metropolis-Hastings Robbins-Monro (VMHRM) algorithm. Our numerical experiments show that the VMHRM on a GPU may run 33 times faster than its CPU version. When the number of factors is at least five, VMHRM (on GPU) is much faster than the Bock-Aitkin expectation maximization, MHRM implemented by mirt (on CPU), and the importance-weighted autoencoder (on GPU). The GPU-implemented VMHRM is most appropriate for high-dimensional IFA with large data sets. We believe that GPU computing will play a central role in large-scale psychometric modeling in the near future.
52
Du J, Wang Y, Wu A, Jiang Y, Duan Y, Geng W, Wan L, Li J, Hu J, Jiang J, Shi L, Wei J. The validity and IRT psychometric analysis of Chinese version of Difficult Doctor-Patient Relationship Questionnaire (DDPRQ-10). BMC Psychiatry 2023; 23:900. [PMID: 38041038 PMCID: PMC10693043 DOI: 10.1186/s12888-023-05385-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 11/18/2023] [Indexed: 12/03/2023] Open
Abstract
OBJECTIVE The doctor-patient relationship (DPR) plays a crucial role in the Chinese healthcare system, functioning to improve medical quality and reduce medical costs. This study examined the psychometric properties of the Chinese version of the Difficult Doctor-Patient Relationship Questionnaire (DDPRQ-10) among general hospital inpatients in China. METHODS The research recruited 38 resident doctors responsible for 120 participants, and factor analyses were used to assess the construct validity of the scale. Convergent validity was evaluated by examining the correlation between DDPRQ-10 and depressive symptoms, burnout, and self-efficacy, using the Patient Health Questionnaire Depression Scale-9 item (PHQ-9) and the Maslach Burnout Inventory (MBI). Both multidimensional item response theory (MIRT) and unidimensional item response theory (IRT) frameworks were used to estimate the parameters of each item. RESULTS The Chinese version of DDPRQ-10 showed satisfactory internal consistency (Cronbach's alpha = 0.931) and fitted a modified two-factor model of positive feelings and negative feelings (χ2/df = 1.494, GFI = 0.925, RMSEA = 0.071, SRMR = 0.008, CFI = 0.985, NFI = 0.958, NNFI = 0.980, TLI = 0.980, IFI = 0.986). Significant correlations of PHQ-9 with DDPRQ-10 and both subscales were revealed (r = 0.293 ~ 0.333, p < .001), while DDPRQ-10 score also significantly correlated with doctors' MBI score (r = -0.467, p < .001). The MIRT model of the full scale and the IRT models of both subscales showed high discrimination of all items (a = 2.30 ~ 10.18), and the test information within the range of low-quality relationship was relatively high. CONCLUSION The Chinese version of DDPRQ-10 displayed satisfactory reliability and validity and thus was appropriate for measuring the DPR in Chinese medical settings.
53
Hodgson CG, Bonifay W, Yang W, Herman KC. Establishing the measurement precision of the patient health questionnaire in an adolescent sample. J Affect Disord 2023; 342:76-84. [PMID: 37708980 DOI: 10.1016/j.jad.2023.09.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 08/30/2023] [Accepted: 09/08/2023] [Indexed: 09/16/2023]
Abstract
BACKGROUND Technically sound measures are necessary for accurately identifying youth at risk for depression, but many studies rely on classical test theory metrics or adult samples to evaluate measures. This study examined the use of the PHQ-8, a common and freely available pediatric depression screener, in an adolescent sample using item response theory (IRT). METHODS Secondary analyses were conducted on data from a study conducted in Midwestern middle schools in which 1224 youth completed the PHQ-8 as part of a battery of surveys. Polytomous IRT analyses (a Graded Response Model) were used to evaluate the PHQ-8. Items were examined for their ability to distinguish between respondents of different latent depression severity and for differential item functioning (DIF) across demographic categories. RESULTS All PHQ-8 items had adequate discriminative abilities. Items measuring anhedonia and psychomotor disturbances performed relatively poorly, and items measuring somatic symptoms (appetite and sleep) were most informative when respondents endorsed extreme response options ("not at all" or "nearly every day"). No DIF was found across grade level or race, but several items were flagged for DIF by gender and student income level. LIMITATIONS These results might not be generalizable to a broader youth population due to administration setting and the unique demographic characteristics of this sample (76.0 % African American). CONCLUSIONS Tools such as the PHQ-8 are appropriate to quickly screen for depression in adolescents, but further scrutiny of adolescent response patterns is warranted. Future research should examine items measuring anhedonia and psychomotor and somatic disturbances in adolescents.
54
Aßmann C, Gaasch JC, Stingl D. A Bayesian Approach Towards Missing Covariate Data in Multilevel Latent Regression Models. PSYCHOMETRIKA 2023; 88:1495-1528. [PMID: 36418780 PMCID: PMC10656345 DOI: 10.1007/s11336-022-09888-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2021] [Revised: 08/29/2022] [Accepted: 09/20/2022] [Indexed: 06/16/2023]
Abstract
The measurement of latent traits and investigation of relations between these and a potentially large set of explaining variables is typical in psychology, economics, and the social sciences. Corresponding analysis often relies on surveyed data from large-scale studies involving hierarchical structures and missing values in the set of considered covariates. This paper proposes a Bayesian estimation approach based on the device of data augmentation that addresses the handling of missing values in multilevel latent regression models. Population heterogeneity is modeled via multiple groups enriched with random intercepts. Bayesian estimation is implemented in terms of a Markov chain Monte Carlo sampling approach. To handle missing values, the sampling scheme is augmented to incorporate sampling from the full conditional distributions of missing values. We suggest to model the full conditional distributions of missing values in terms of non-parametric classification and regression trees. This offers the possibility to consider information from latent quantities functioning as sufficient statistics. A simulation study reveals that this Bayesian approach provides valid inference and outperforms complete cases analysis and multiple imputation in terms of statistical efficiency and computation time involved. An empirical illustration using data on mathematical competencies demonstrates the usefulness of the suggested approach.
55
Zhou T, Wang Y, Chen J, Huang Q, Wu F, Zhang H, Yuan C, Cai T. Psychometric properties of the Chinese version of the PROMIS-Cancer-Anxiety item bank assessed using a graded response model. Asia Pac J Oncol Nurs 2023; 10:100312. [PMID: 38106438 PMCID: PMC10724486 DOI: 10.1016/j.apjon.2023.100312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 09/24/2023] [Indexed: 12/19/2023] Open
Abstract
Objective This study aimed to examine the psychometric properties of the Chinese version of the Patient-Reported Outcome Measurement Information System (PROMIS)-Cancer-Anxiety item bank using a graded response model in a sample of patients with cancer. Methods A cross-sectional study was conducted and the Chinese version of the PROMIS-Cancer-Anxiety item bank was used to measure anxiety in patients with cancer. The unidimensional structure of the item bank was evaluated using principal component analysis. Residual correlations and the graphs of item mean scores conditional on the rest scores were examined to evaluate the local independence and monotonicity of the items, respectively. Item characteristics were described using item parameter estimates and item information. Operating characteristic curves (OCCs) and test information curve (TIC) were also plotted. Measurement invariance across age, gender, and education level was assessed to identify possible differential item functioning (DIF). Results A total of 1075 patients with cancer were enrolled. Under the assumptions of unidimensionality, local independence, and monotonicity, the discrimination parameters a ranged from 2.30 to 5.47, and the threshold parameters b ranged from b1 = -2.87 to b4 = 3.21 with proper intervals. Completely overlapped category curves were not observed among the OCCs of any items. Item information and TIC showed that the item bank had a wide measurement range. The DIFs for age, gender, and education level for all items were not remarkable. Conclusions The results supported using the Chinese version of the PROMIS-Cancer-Anxiety item bank to measure anxiety and develop a computerized adaptive testing (CAT) system for anxiety in patients with cancer.
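The abstract above names the graded response model but does not spell it out. As a minimal sketch, assuming the standard logistic (Samejima) parameterization, category probabilities can be obtained by differencing cumulative curves. The function name and the two middle thresholds are hypothetical; only a = 2.30, b1 = -2.87, and b4 = 3.21 are taken from the reported parameter ranges, and they are not claimed to belong to a single item in the study.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta: latent trait value; a: discrimination; thresholds: ascending b_k.
    Assumes the standard logistic form P(X >= k) = 1 / (1 + exp(-a * (theta - b_k))).
    """
    # Cumulative probabilities, padded with P(X >= lowest) = 1 and P(X > highest) = 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Illustrative item: a, b1, and b4 come from the reported ranges;
# the middle thresholds (-1.0, 1.0) are made up for this example.
probs = grm_category_probs(theta=0.0, a=2.30, thresholds=[-2.87, -1.0, 1.0, 3.21])
print([round(p, 3) for p in probs])
```

Four thresholds yield five response categories, and the probabilities sum to one for any value of theta.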
56
Liu CW. Multidimensional item response theory models for testlet-based doubly bounded data. Behav Res Methods 2023:10.3758/s13428-023-02272-5. [PMID: 37985636 DOI: 10.3758/s13428-023-02272-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/13/2023] [Indexed: 11/22/2023]
Abstract
A testlet-based visual analogue scale (VAS) is a doubly bounded scaling approach (e.g., from 0% to 100% or from 0 to 1) composed of multiple adjectives, nouns, or sentences (statements/items) within testlets for measuring individuals' attitudes, opinions, or career interests. While testlet-based VASs have many advantages over Likert scales, such as reducing response style effects, the development of proper statistical models for analyzing testlet-based VAS data lags behind. This paper proposes a novel beta copula model and a competing logit-normal model based on the item response theory framework, assessed by Bayesian parameter estimation, model comparison, and goodness-of-fit statistics. An empirical career interest dataset based on a testlet-based VAS design was analyzed using the proposed models. Simulation studies were conducted to assess the two models' parameter recovery. The results show that the beta copula model had superior fit in the empirical data analysis, and also exhibited good parameter recovery in the simulation studies, suggesting that it is a promising statistical approach to testlet-based doubly bounded responses.
57
Lv K, Sun R, Chen X, Lan Y. The development and evaluation of the worker-occupation fit inventory. BMC Public Health 2023; 23:2163. [PMID: 37926813 PMCID: PMC10626709 DOI: 10.1186/s12889-023-17080-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Accepted: 10/27/2023] [Indexed: 11/07/2023] Open
Abstract
BACKGROUND Person-environment fit (PEF) theory, one of the foundational theories of occupational stress, has primarily found applications in organizational behavior and human resource management. Given the alignment between the definition of occupational stress and the essence of PEF, we introduced the concept of worker-occupation fit (WOF). To validate our theoretical model, the development of an instrument to measure WOF becomes imperative. METHODS The Worker-Occupation Fit Inventory (WOFI) comprises three dimensions: personal trait fit (PTF), need-supply fit (NSF) and demand-ability fit (DAF). Job-related mental disorders (JRMDs) were assessed using the DASS-21. During the pre-investigation, items of the WOFI underwent screening through classic test theory (CTT) analysis. In the formal investigation, item response theory (IRT) analysis was employed to evaluate the selected items. The relationship between WOF and JRMD was verified by Pearson's correlation analysis and multiple logistic regression analysis. RESULTS The initial version consisted of 26 items. Three common factors were extracted by exploratory factor analysis (EFA): 6 items were included in the PTF, 6 items were included in the NSF, 4 items were included in the DAF, and 10 items were deleted because of unacceptable factor loadings. The confirmatory factor analysis (CFA) verified the structure of the WOFI with χ2/df = 1.822, CFI = 0.947, and SRMSR = 0.056. The Cronbach's alpha coefficients of the PTF, NSF, and DAF were 0.91, 0.92, and 0.80, respectively. In IRT analysis, the discrimination values of all items ranged from 1.25 to 2.53, and the difficulty values of all items ranged from -6.28 to 1.30 (with no reversal of difficulty). The WOF was negatively related to job-related stress (r = -0.34, p<0.001), anxiety (r = -0.37, p<0.001), and depression (r = -0.41, p<0.001). The multiple logistic regression analysis suggested that a high level of WOF was a protective factor against job-related mental disorders, with ORs all less than 1 (p<0.001), and a low level of WOF was a risk factor for job-related mental disorders, with ORs all more than 1.0 (p<0.001). CONCLUSIONS The results of CTT and IRT analysis indicated that the WOFI exhibits reliability and validity. The WOF effectively predicted job-related mental disorders. Subsequent studies will delve into the influence of WOFI on diverse professions and various health outcomes.
58
Carlozzi NE, Kallen MA, Morin KG, Fyffe DC, Wecht JM. Item Banks for Measuring the Effect of Blood Pressure Dysregulation on Health-Related Quality of Life in Persons With Spinal Cord Injury. Arch Phys Med Rehabil 2023; 104:1872-1881. [PMID: 37172674 DOI: 10.1016/j.apmr.2023.04.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/15/2022] [Revised: 03/24/2023] [Accepted: 04/15/2023] [Indexed: 05/15/2023]
Abstract
OBJECTIVE To report on the development and calibration of the new Blood Pressure Dysregulation Measurement System (BPD-MS) item banks that assess the effect of BPD on health-related quality of life (HRQOL) and the daily activities of Veterans and non-Veterans with spinal cord injury (SCI). DESIGN Cross-sectional survey study. SETTING Two Veteran Affairs medical centers and a SCI model system site. PARTICIPANTS 454 respondents with SCI (n=262 American Veterans and n=192 non-Veterans; N=454). INTERVENTIONS Not applicable. MAIN OUTCOME MEASURES The BPD-MS item banks. RESULTS BPD item pools were developed and refined using literature reviews, qualitative data from focus groups, and cognitive debriefing of persons with SCI and professional caregivers. The item banks then underwent expert review, reading level assessment, and translatability review prior to field testing. The item pools consisted of 180 unique questions (items). Exploratory and confirmatory factor analyses, item response theory modeling, and differential item function investigations resulted in item banks that included a total of 150 items: 75 describing the effect of autonomic dysreflexia on HRQOL, 55 describing the effect of low blood pressure (LBP) on HRQOL, and 20 describing the effect of LBP on daily activities. In addition, 10-item short forms were constructed based on item response theory-derived item information values and the clinical relevance of item content. CONCLUSIONS The new BPD-MS item banks and corresponding 10-item short forms were developed using established rigorous measurement development standards and represent the first BPD-specific patient-reported outcomes measurement system unique for use in the SCI population.
59
Harrison CJ, Hossain A, Bruce J, Rodrigues JN. Psychometric sensitivity analyses can identify bias related to measurement properties in trials that use patient-reported outcome measures: a secondary analysis of a clinical trial using the disabilities of the arm, shoulder, and hand questionnaire. J Clin Epidemiol 2023; 163:21-28. [PMID: 37774956 DOI: 10.1016/j.jclinepi.2023.09.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 06/23/2023] [Accepted: 09/21/2023] [Indexed: 10/01/2023]
Abstract
OBJECTIVES Demonstrate psychometric sensitivity analyses for testing the stability of study findings to assumptions made about patient-reported outcome measures. STUDY DESIGN AND SETTING We performed secondary analyses of Disability of Arm, Shoulder, and Hand (DASH) data collected within the Prevention of Shoulder Problems clinical trial, which compared upper limb function scores in women who had undergone breast cancer surgery, randomized to either an exercise program or usual care. We repeated the principal trial analyses after grouping DASH items into subscales suggested by factorial analyses in this dataset and applied item response theory to account for unequal item weighting. We checked for measurement invariance by participant age and response shift bias using established techniques. RESULTS Our analyses suggested that the DASH measured two constructs: motor function and sensory symptoms. The majority of the six-month difference in DASH score was driven by motor function. With item response theory scoring, we found differences in both constructs at 12 months (P = 0.019 and P = 0.007), but in neither construct at 6 months, contrary to the original trial results. We found no differential item function by age or between baseline and 12-month measurements. CONCLUSIONS Psychometric sensitivity analyses aid in the interpretation of the Prevention of Shoulder Problems trial's results.
60
Chatton A, Khazaal Y, Penzenstadler L. A 13-item Health of the Nation Outcome Scale (HoNOS-13): validation by item response theory (IRT) in patients with substance use disorder. Addict Sci Clin Pract 2023; 18:64. [PMID: 37876018 PMCID: PMC10594779 DOI: 10.1186/s13722-023-00416-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Accepted: 10/06/2023] [Indexed: 10/26/2023] Open
Abstract
BACKGROUND The Health of the Nation Outcome Scale (HoNOS) is a widely used 12-item tool to assess mental health and social functioning. The French version has an added 13th item measuring adherence to psychotropic medication. The aim of the current study is to uncover the unknown pattern of the new item 13 and to compare the unidimensional and multidimensional fit of the new HoNOS-13 using Item Response Theory (IRT). This research question was studied among inpatients with substance use disorder (SUD). METHODS Six hundred and nine valid questionnaires of HoNOS-13 were analyzed using unidimensional (one-factor) and multidimensional (two-factor) IRT modeling. RESULTS The multidimensional model suggesting a first factor capturing psychiatric/impairment-related issues and a second factor reflecting social-related issues yielded better goodness-of-fit values compared to the unidimensional solution. This resulted in an improvement of all slope parameters, which in turn translates to better discriminative power. Significant improvements in item location parameters were observed as well. The new item 13 had a good discriminative power (1.17) and covered a wide range of the latent trait (-0.14 to 2.64). CONCLUSIONS We were able to validate the 13-item questionnaire including medication compliance and suggest that the HoNOS-13 can be recommended as a clinical evaluation tool to assess the problems and treatment needs for inpatients with SUD. Interestingly, the majority of item response categories are endorsed by respondents who are below and above the average levels of HoNOS. This indicates that the scale is able to discriminate between participants both at the low and at the high ends of the latent trait continuum. More importantly, the new item 13 has a good discriminative power and covers a broad range of the latent trait below and above the mean. It therefore has the desired profile of a good item and is a useful measure for the assessment of mental health and social functioning. Trial registration ClinicalTrials.gov, Identifier: NCT03551301. Registered: 11.06.2018. Retrospectively registered, https://clinicaltrials.gov/ct2/show/NCT03551301.
61
Nishio M, Ota E, Matsuo H, Matsunaga T, Miyazaki A, Murakami T. Comparison between pystan and numpyro in Bayesian item response theory: evaluation of agreement of estimated latent parameters and sampling performance. PeerJ Comput Sci 2023; 9:e1620. [PMID: 37869462 PMCID: PMC10588711 DOI: 10.7717/peerj-cs.1620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 09/06/2023] [Indexed: 10/24/2023]
Abstract
Purpose The purpose of this study is to compare two libraries dedicated to the Markov chain Monte Carlo method: pystan and numpyro. In the comparison, we mainly focused on the agreement of estimated latent parameters and the performance of sampling using the Markov chain Monte Carlo method in Bayesian item response theory (IRT). Materials and methods Bayesian 1PL-IRT and 2PL-IRT were implemented with pystan and numpyro. Then, the Bayesian 1PL-IRT and 2PL-IRT were applied to two types of medical data obtained from a published article. The same prior distributions of latent parameters were used in both pystan and numpyro. Estimation results of latent parameters of 1PL-IRT and 2PL-IRT were compared between pystan and numpyro. Additionally, the computational cost of the Markov chain Monte Carlo method was compared between the two libraries. To evaluate the computational cost of IRT models, simulation data were generated from the medical data and numpyro. Results For all the combinations of IRT types (1PL-IRT or 2PL-IRT) and medical data types, the mean and standard deviation of the estimated latent parameters were in good agreement between pystan and numpyro. In most cases, the sampling time using the Markov chain Monte Carlo method was shorter in numpyro than that in pystan. When the large-sized simulation data were used, numpyro with a graphics processing unit was useful for reducing the sampling time. Conclusion Numpyro and pystan were useful for applying the Bayesian 1PL-IRT and 2PL-IRT. Our results show that the two libraries yielded similar estimation results and that, regarding sampling time, the fastest library differed based on the dataset size.
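The abstract above refers to 1PL and 2PL IRT models without writing them down. The following is a minimal sketch of the usual logistic item response functions and the log-likelihood that a sampler would target, assuming the common parameterization with discrimination a and difficulty b. It uses only the standard library, all function names are hypothetical, and it is not the authors' pystan or numpyro code.

```python
import math

def irf_2pl(theta, a, b):
    """Probability of a positive response under a 2PL model (assumed logistic form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def irf_1pl(theta, b):
    """1PL model, treated here as the 2PL with discrimination fixed at 1."""
    return irf_2pl(theta, 1.0, b)

def log_likelihood(responses, theta, items):
    """Log-likelihood of one respondent's binary responses.

    responses: list of 0/1 values; items: list of (a, b) pairs, one per item.
    """
    total = 0.0
    for x, (a, b) in zip(responses, items):
        p = irf_2pl(theta, a, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total

# Illustrative values only; these are not parameters from the study.
print(irf_2pl(theta=0.5, a=1.2, b=0.0))
print(log_likelihood([1, 0, 1], 0.5, [(1.2, 0.0), (0.8, 1.0), (1.5, -0.5)]))
```

In a Bayesian treatment, priors on theta, a, and b would be combined with this likelihood, and the Markov chain Monte Carlo library would draw from the resulting posterior.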
62
Barnard-Brak L, Yang Z. A 4pL item response theory examination of perceived stigma in the screening of eating disorders with the SCOFF among college students. Eat Weight Disord 2023; 28:79. [PMID: 37792143 PMCID: PMC10550868 DOI: 10.1007/s40519-023-01604-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Accepted: 08/27/2023] [Indexed: 10/05/2023] Open
Abstract
We examined the psychometric properties of the SCOFF, a screening instrument for eating disorders, with consideration of the perceived stigma of items that can produce socially desirable responding among a sample of college students. The results of the current study suggest evidence of the sufficient psychometric properties of the SCOFF in terms of confirmatory factor and item response theory analyses. However, individuals who endorsed other items of the SCOFF were less likely to endorse two items, Fat and Food. It is hypothesized that this is the result of perceived stigma regarding those two items that prompts individuals to respond in a socially desirable way. A weighted scoring procedure was developed to counteract the performance of these two items, but the psychometric performance was only slightly better and there would be a clear tradeoff of specificity over sensitivity if utilized. Future research should consider other ways to counteract such perceived stigma. Level of evidence Level III: Evidence obtained from cohort or case-control analytic studies.
63
Chapron SA, Kervran C, Da Rosa M, Fournet L, Shmulewitz D, Hasin D, Denis C, Collombat J, Monsaingeon M, Fatseas M, Gatta-Cherifi B, Serre F, Auriacombe M. Does food use disorder exist? Item response theory analyses of a food use disorder adapted from the DSM-5 substance use disorder criteria in a treatment seeking clinical sample. Drug Alcohol Depend 2023; 251:110937. [PMID: 37666092 DOI: 10.1016/j.drugalcdep.2023.110937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 08/09/2023] [Accepted: 08/10/2023] [Indexed: 09/06/2023]
Abstract
BACKGROUND Increased consumption of foods that are high in energy and sugar has been pointed to as a major factor in the obesity epidemic. Impaired control of food intake and the concept of food addiction have been developed as potential contributors. Our objective was to evaluate the dimensionality and psychometric validity of diagnostic criteria for food addiction adapted from the 11 DSM-5 substance use disorder (SUD) criteria (i.e., Food Use Disorder (FUD) criteria), and to evaluate the influence of age, gender, and body mass index (BMI). METHODS Cross-sectional observational study including 508 participants (56.1% male; mean age 42.2) from outpatient treatment clinics for obesity or addiction disorders at time of admission. FUD diagnostic criteria were analyzed using confirmatory factor and 2-parameter item response theory analyses. Differential Item and Test Functioning analyses were performed across age, gender, and BMI. RESULTS We demonstrated the one-factor dimensionality of the criteria set. The criterion "craving" presented the strongest factor loading and discrimination parameter and the second-lowest difficulty. We found some significant uniform differential item functioning for body mass index. We found some differential test functioning for gender and BMI. CONCLUSIONS This study reports, for the first time, the validity of a potential Food Use Disorder (derived from the 11 DSM-5 SUD criteria adapted to food) in a sample of treatment-seeking adults. This has great implications both at the clinical level and in terms of public health policy in the context of the global obesity epidemic.
64
Wind S, Wang Y. Using Mokken scaling techniques to explore carelessness in survey research. Behav Res Methods 2023; 55:3370-3415. [PMID: 36131197 DOI: 10.3758/s13428-022-01960-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/15/2022] [Indexed: 11/08/2022]
Abstract
Careless responding is a pervasive issue that impacts the interpretation and use of responses from survey instruments. Researchers have proposed numerous useful methods for detecting carelessness in survey research, including relatively simple summary statistics such as the frequency of adjacent responses in the same category (e.g., "long-string" analysis) and outlier statistics (e.g., Mahalanobis distance). Researchers have also used methods based on item response theory (IRT) models to identify examinees whose response patterns are unexpected given item parameters. However, researchers have not fully considered the use of nonparametric IRT methods based on Mokken scale analysis (MSA) to detect carelessness in survey research. MSA is a promising framework in which to consider participant carelessness because it is well suited to contexts in which parametric IRT models may not be appropriate, while still maintaining a focus on fundamental measurement requirements. We used a real data analysis and a simulation study to examine the sensitivity of MSA indicators of response quality to examinee carelessness and compared the results to those from standalone indicators. We also examined the impact of carelessness on the sensitivity of MSA item quality indicators. Numeric and graphical indicators of response quality from MSA indicators were sensitive to examinee carelessness. Graphical displays of nonparametric person response functions (PRFs) provided supplementary insight that can alert researchers to potentially problematic responses. Our results also indicated that MSA indicators of item quality are robust to the presence of participant carelessness. We consider the implications of our findings for research and practice.
65
Uto M. A Bayesian many-facet Rasch model with Markov modeling for rater severity drift. Behav Res Methods 2023; 55:3910-3928. [PMID: 36284065 PMCID: PMC10615980 DOI: 10.3758/s13428-022-01997-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/30/2022] [Indexed: 11/08/2022]
Abstract
Fair performance assessment requires consideration of the effects of rater severity on scoring. The many-facet Rasch model (MFRM), an item response theory model that incorporates rater severity parameters, has been widely used for this purpose. Although a typical MFRM assumes that rater severity does not change during the rating process, in actuality rater severity is known to change over time, a phenomenon called rater severity drift. To investigate this drift, several extensions of the MFRM have been proposed that incorporate time-specific rater severity parameters. However, these previous models estimate the severity parameters under the assumption of temporal independence. This introduces inefficiency into the parameter estimation because severities between adjacent time points tend to have temporal dependency in practice. To resolve this problem, we propose a Bayesian extension of the MFRM that incorporates time dependency for the rater severity parameters, based on a Markov modeling approach. The proposed model can improve the estimation accuracy of the time-specific rater severity parameters, resulting in improved estimation accuracy for the other rater parameters and for model fitting. We demonstrate the effectiveness of the proposed model through simulation experiments and application to actual data.
|
66
|
Zanini DS, Peixoto EM, de Andrade JM, Fernandes IA, da Silva MPP. European health literacy survey questionnaire short form (HLS-Q12): adaptation and evidence of validity for the Brazilian context. PSICOLOGIA-REFLEXAO E CRITICA 2023; 36:25. [PMID: 37672100 PMCID: PMC10482809 DOI: 10.1186/s41155-023-00263-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 07/21/2023] [Indexed: 09/07/2023] Open
Abstract
Health literacy (HL) refers to the knowledge, motivation, and skills needed to understand, evaluate, and apply health information, enabling appropriate decision making in daily life on health care and health promotion. Studies show that HL is associated with several social determinants, health outcomes, and health promotion. In Brazil, studies on the topic are still scarce. Thus, the present study aimed to adapt the European Health Literacy Survey Questionnaire Short Form (HLS-Q12) for the Brazilian context, seek evidence of its validity and reliability, and estimate its item parameters. A total of 770 individuals participated, recruited through advertisements in the media and social networks; 82.1% were female, ages ranged from 18 to 83 (M = 35.5, SD = 13.52), and participants came from 21 Federative Units of Brazil and the Federal District. The subjects answered the HLS-Q12 and a sociodemographic questionnaire. Exploratory factor analysis indicated a unifactorial structure with good psychometric characteristics (GFI = 0.98; CFI = 0.98; RMSEA = 0.08; RMSR = 0.07). Cronbach's alpha, Guttman's lambda 2 and McDonald's omega reliability indicators were equal to 0.87. We conclude that the HLS-Q12 is an adequate instrument to assess the level of HL in the Brazilian population.
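For readers unfamiliar with the first reliability indicator reported above, Cronbach's alpha can be sketched from a respondents-by-items score matrix as k/(k-1) times one minus the ratio of summed item variances to total-score variance. The function name and the toy data below are hypothetical; this is not the authors' code or data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    item_columns = list(zip(*scores))                     # one tuple per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in scores])    # variance of sum scores
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical responses of five people to four Likert-type items.
toy_scores = [
    [1, 2, 2, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 3, 4, 4],
    [4, 4, 4, 3],
]
print(round(cronbach_alpha(toy_scores), 3))
```

The sketch uses sample variances throughout; because the same denominator appears in both variance terms, using population variances instead would give the same alpha.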
|
67
|
Williams ZJ, Schaaf R, Ausderau KK, Baranek GT, Barrett DJ, Cascio CJ, Dumont RL, Eyoh EE, Failla MD, Feldman JI, Foss-Feig JH, Green HL, Green SA, He JL, Kaplan-Kahn EA, Keçeli-Kaysılı B, MacLennan K, Mailloux Z, Marco EJ, Mash LE, McKernan EP, Molholm S, Mostofsky SH, Puts NAJ, Robertson CE, Russo N, Shea N, Sideris J, Sutcliffe JS, Tavassoli T, Wallace MT, Wodka EL, Woynaroski TG. Examining the latent structure and correlates of sensory reactivity in autism: a multi-site integrative data analysis by the autism sensory research consortium. Mol Autism 2023; 14:31. [PMID: 37635263 PMCID: PMC10464466 DOI: 10.1186/s13229-023-00563-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 08/11/2023] [Indexed: 08/29/2023] Open
Abstract
BACKGROUND Differences in responding to sensory stimuli, including sensory hyperreactivity (HYPER), hyporeactivity (HYPO), and sensory seeking (SEEK) have been observed in autistic individuals across sensory modalities, but few studies have examined the structure of these "supra-modal" traits in the autistic population. METHODS Leveraging a combined sample of 3868 autistic youth drawn from 12 distinct data sources (ages 3-18 years and representing the full range of cognitive ability), the current study used modern psychometric and meta-analytic techniques to interrogate the latent structure and correlates of caregiver-reported HYPER, HYPO, and SEEK within and across sensory modalities. Bifactor statistical indices were used to both evaluate the strength of a "general response pattern" factor for each supra-modal construct and determine the added value of "modality-specific response pattern" scores (e.g., Visual HYPER). Bayesian random-effects integrative data analysis models were used to examine the clinical and demographic correlates of all interpretable HYPER, HYPO, and SEEK (sub)constructs. RESULTS All modality-specific HYPER subconstructs could be reliably and validly measured, whereas certain modality-specific HYPO and SEEK subconstructs were psychometrically inadequate when measured using existing items. Bifactor analyses supported the validity of a supra-modal HYPER construct (ωH = .800) but not a supra-modal HYPO construct (ωH = .653), and supra-modal SEEK models suggested a more limited version of the construct that excluded some sensory modalities (ωH = .800; 4/7 modalities). Modality-specific subscales demonstrated significant added value for all response patterns. Meta-analytic correlations varied by construct, although sensory features tended to correlate most with other domains of core autism features and co-occurring psychiatric symptoms (with general HYPER and speech HYPO demonstrating the largest numbers of practically significant correlations). 
LIMITATIONS Conclusions may not be generalizable beyond the specific pool of items used in the current study, which was limited to caregiver report of observable behaviors and excluded multisensory items that reflect many "real-world" sensory experiences. CONCLUSION Of the three sensory response patterns, only HYPER demonstrated sufficient evidence for valid interpretation at the supra-modal level, whereas supra-modal HYPO/SEEK constructs demonstrated substantial psychometric limitations. For clinicians and researchers seeking to characterize sensory reactivity in autism, modality-specific response pattern scores may represent viable alternatives that overcome many of these limitations.
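The ωH values reported in this abstract are omega hierarchical coefficients. As a hedged sketch, assuming a standardized bifactor solution with orthogonal factors, ωH is the share of total-score variance attributable to the general factor. The function name and all loadings below are invented for illustration and do not come from the study.

```python
def omega_hierarchical(general, specific, uniqueness):
    """Omega hierarchical from standardized bifactor loadings (orthogonal factors assumed).

    general:    general-factor loading for each item
    specific:   one list of loadings per specific (group) factor
    uniqueness: residual variance for each item
    """
    general_var = sum(general) ** 2
    specific_var = sum(sum(loadings) ** 2 for loadings in specific)
    return general_var / (general_var + specific_var + sum(uniqueness))

# Hypothetical six-item scale with two specific factors of three items each.
general = [0.6] * 6
specific = [[0.4] * 3, [0.4] * 3]
uniqueness = [1 - 0.6 ** 2 - 0.4 ** 2] * 6

print(round(omega_hierarchical(general, specific, uniqueness), 3))
```

Larger specific-factor loadings shrink ωH, which is the sense in which a low value argues against interpreting a single supra-modal total score.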
|
68
|
Zhang J, Wang C, Lu J. Modeling item revisiting behavior in computer-based testing: Exploring the effect of item revisitations as collateral information. Behav Res Methods 2023:10.3758/s13428-023-02209-y. [PMID: 37608234 DOI: 10.3758/s13428-023-02209-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/26/2023] [Indexed: 08/24/2023]
Abstract
Item revisiting behavior is one of the most frequently occurring test-taking strategies, and it can decrease test anxiety and improve test validity. Examinees either confirm their initial answers because their beliefs persist or change to different answers after carefully reconsidering each part of the question. Item revisiting sequences, as collateral information, reveal the examinees' underlying psychological processes, such as motivation, effort, and engagement, which supports policy makers in taking further steps to facilitate instruction for the examinees. Item revisiting behavior is commonly correlated with the latent traits of examinees, and it needs to be properly analyzed in order to make valid statistical inferences. In this paper, we propose a novel item revisiting model, in which a monotonicity assumption is considered based on the observation that examinees are more likely to revisit the current item if more revisiting behavior has occurred previously. Three simulation studies were conducted: (1) to evaluate the performance of the proposed Bayesian estimation algorithm for the new model; (2) to show that ignoring item revisiting sequences induces biased parameter estimates; (3) to assess the model fit of the proposed model with the ignorable and nonignorable item revisiting behavior assumptions. The results indicate that item revisiting behavior can be effectively utilized in conjunction with responses and response times to improve parameter estimation precision. A real data example is provided to illustrate the application of the proposed model.
|
69
|
Ogawa M, Sago T, Furukawa H, Saito A. Psychometric evaluation of the Japanese version of the fear of pain questionnaire-III and its association with dental anxiety: a cross-sectional study. BMC Oral Health 2023; 23:559. [PMID: 37573290 PMCID: PMC10422720 DOI: 10.1186/s12903-023-03273-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 07/31/2023] [Indexed: 08/14/2023] Open
Abstract
BACKGROUND Fear of pain is a significant concern related to chronic pain and its impact on daily functioning. It is also associated with dental anxiety, highlighting its relevance in dental practice. This study aimed to validate the Japanese version of the Fear of Pain Questionnaire-III (FPQ-III) and explore its relationship with dental anxiety. METHODS 400 participants completed the Japanese version of the FPQ-III, with 100 participants re-evaluated after one month. Convergent validity was tested against dental anxiety and pain catastrophizing, while discriminant validity was assessed by examining general anxiety and depression correlations. Confirmatory factor analysis was used to examine the factorial validity of the FPQ-III and a shortened version of the FPQ-III (FPQ-9). Item response theory was applied for each subscale to estimate the discriminative power of each item and draw a test information curve. Structural equation modeling (SEM) was used to investigate the relationship between fear of pain and dental anxiety. RESULTS Data from 400 participants (200 women, 44.9 ± 14.5 years) were analyzed. The FPQ-III showed good internal validity, intra-examiner reliability, discriminant validity, and convergent validity. Confirmatory factor analysis results supported a three-factor structure, and the FPQ-9 showed a good fit. Test information curves demonstrated that the FPQ-9 maintained high accuracy over a similarly wide range as the FPQ-III. SEM revealed that fear of minor pain was associated with dental anxiety via fear of medical pain even in individuals without painful medical or dental experiences (indirect effect 0.48 [95% CI: 0.32-0.81]). Fear of severe pain tended to be higher in individuals with chronic pain compared to those without (latent mean values 0 vs. 0.27, p = 0.002) and was also associated with dental anxiety via fear of medical pain in women (indirect effect 0.15 [95% CI: 0.01-0.34]). 
CONCLUSION The Japanese version of the FPQ-9 demonstrated high reliability and validity, making it a valuable tool in dental clinical and research settings. It provides insights into the fear of pain among individuals with chronic pain and dental anxiety, informing potential intervention strategies.
|
70
|
Shim H, Bonifay W, Wiedermann W. Parsimonious item response theory modeling with the negative log-log link: The role of inflection point shift. Behav Res Methods 2023:10.3758/s13428-023-02189-z. [PMID: 37537489 DOI: 10.3758/s13428-023-02189-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/30/2023] [Indexed: 08/05/2023]
Abstract
In item response theory (IRT) modeling, the magnitude of the lower and upper asymptote parameters determines the degree to which the inflection point shifts above or below P = 0.50. The current study examines the one-parameter negative log-log model (NLLM), which is characterized by a downward shift in the inflection point, among other distinctive psychometric properties. After detailing the statistical foundations of the NLLM, we present a series of simulation studies to establish item and person parameter estimation accuracy and to demonstrate that this parsimonious model addresses the "slipping" effect (i.e., unexpectedly incorrect answers) via an inflection point < 0.50 rather than through computationally difficult estimation of the upper asymptote. We then provide further support for these simulation results through empirical data analysis. Finally, we discuss how the NLLM contributes to recent methodological literature on the utility of asymmetric IRT models.
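The downward shift of the inflection point can be checked numerically. The sketch below assumes the usual negative log-log response function, P = exp(-exp(-(theta - b))), with a single location parameter b; the exact parameterization used in the paper may differ, and the function names are hypothetical.

```python
import math

def nll_irf(theta, b):
    """Assumed one-parameter negative log-log response function."""
    return math.exp(-math.exp(-(theta - b)))

def nll_slope(theta, b):
    """First derivative of nll_irf with respect to theta."""
    z = math.exp(-(theta - b))
    return z * math.exp(-z)

def logistic_irf(theta, b):
    """One-parameter logistic response function, for comparison."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

b = 0.0
# The slope z * exp(-z) is largest at z = 1, i.e. at theta = b,
# so the curve's inflection point sits at probability exp(-1).
print(nll_irf(b, b))       # exp(-1), about 0.368
print(logistic_irf(b, b))  # 0.5
```

Under this assumed form the curve is steepest where P is about 0.37 rather than 0.50, which is the asymmetry the abstract describes.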
|
71
|
Player L, Hanel PH, Whitmarsh L, Shah P. The 19-Item Environmental Knowledge Test (EKT-19): A short, psychometrically robust measure of environmental knowledge. Heliyon 2023; 9:e17862. [PMID: 37609389 PMCID: PMC10440470 DOI: 10.1016/j.heliyon.2023.e17862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Revised: 06/11/2023] [Accepted: 06/29/2023] [Indexed: 08/24/2023] Open
Abstract
Environmental knowledge is considered an important precursor to pro-environmental behaviour. Though several tools have been designed to measure environmental knowledge, there remains no concise, psychometrically grounded measure. We validated an existing measure in a British sample, confirming that it had good one- and three-factor structures in line with previous literature. For the first time in this field, we built upon previous Classical Test Theory approaches and used discrimination values derived from Item Response Theory to select the best items, resulting in the 19-Item Environmental Knowledge Test (EKT-19). This measure retained a clear factor structure and had moderate-to-good internal reliability, indicating that it is a parsimonious and psychometrically robust measure for the assessment of overall and specific types of environmental knowledge. The theoretical implications and real-world applications of this measure are discussed.
|
72
|
Geiger SJ, Vintr J, Rachev NR. A reassessment of the Resistance to Framing scale. Behav Res Methods 2023; 55:2320-2332. [PMID: 35851678 PMCID: PMC10439025 DOI: 10.3758/s13428-022-01876-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/10/2022] [Indexed: 11/08/2022]
Abstract
Risky-choice and attribute framing effects are well-known cognitive biases, where choices are influenced by the way information is presented. To assess susceptibility to these framing types, the Resistance to Framing scale is often used, although its performance has rarely been extensively tested. In an online survey among university students from Bulgaria (N = 245) and North America (N = 261), we planned to examine the scale's psychometric properties, structural validity, and measurement invariance. However, some of these examinations were not possible because the scale displayed low and mostly non-significant inter-item correlations as well as low item-total correlations. Subsequently, exploratory item response theory analyses indicated that the scale's reliability was low, especially for high levels of resistance to framing. This suggests problems with the scale at a basic level of conceptualization, namely that the items may not represent the same content domain. Overall, the scale in its current version is of limited use, at least in university student samples, due to the identified problems. We discuss potential remedies to these problems and provide open code and data (https://osf.io/j5n6f), which facilitate testing the scale in other samples (e.g., general population, different languages and countries) to obtain a comprehensive picture of its performance.
|
73
|
Kaat AJ, Croen LA, Constantino J, Newshaffer CJ, Lyall K. Modifying the social responsiveness scale for adaptive administration. Qual Life Res 2023; 32:2353-2360. [PMID: 36943606 PMCID: PMC11034771 DOI: 10.1007/s11136-023-03397-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/10/2023] [Indexed: 03/23/2023]
Abstract
PURPOSE The social responsiveness scale (SRS) is frequently used to quantify the autism-related phenotype and is gaining use in health outcomes research. However, it has a high respondent burden (65 items) for large-scale studies. Further, most evaluations of it have focused on the school-age form, not the preschool form. More validity evidence of shortened forms is necessary in the general population to support the broader health outcomes context of use. METHODS We evaluated the psychometrics of the SRS in 7030 individuals from multiple predominantly neurotypical samples in order to shorten it based on non-autistic sample metrics. Analyses included item factor analysis, differential item functioning (DIF), and multiple-group item response theory (IRT) to place the SRS items on a comparable scale, which was then simulated via computer adaptive testing (CAT) administration. RESULTS The SRS was broadly unidimensional with few methodological residual dependencies. On average, males had more autistic characteristics than females, and preschoolers had fewer characteristics than school-age children. The final IRT calibration included 45 items equated across forms, and each form had 11 items with significant wording discrepancies and 9 items with near-identical wording that exhibited form-related DIF. The CAT simulation suggested a median of 14 items was sufficient to reach a reliable score, demonstrating its feasibility across the range of impairments. CONCLUSION IRT allows practitioners to obtain highly reliable scores with fewer items than the full-length SRS. This supports the future application of the SRS in a computer adaptive testing mode in both neurotypical and ASD samples.
|
74
|
Gerstenecker A, Kennedy R, Zhang Y, Martin RC, Mackin RS, Weiner MW, Howell T, Petersen RC, Roberson ED, Marson DC. Item Response Analysis of the Financial Capacity Instrument-Short Form. Arch Clin Neuropsychol 2023; 38:739-758. [PMID: 36644855 PMCID: PMC10369359 DOI: 10.1093/arclin/acac112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/29/2022] [Indexed: 06/17/2023] Open
Abstract
OBJECTIVE The Financial Capacity Instrument-Short Form (FCI-SF) is a performance-based measure of everyday financial skills that takes 15 min to administer. Although the FCI-SF has demonstrated excellent psychometric properties, advanced psychometric methods such as item response theory (IRT) can provide important information on the performance of individual test items in measuring financial capacity and in distinguishing between healthy and cognitively impaired individuals. METHOD Participants were 272 older adults diagnosed with mild cognitive impairment (MCI) and 1,344 cognitively healthy controls recruited from the Mayo Clinic Study of Aging at the Mayo Clinic in Rochester, Minnesota and also from the Cognitive Observations in Seniors study at the University of Alabama at Birmingham. Participants in each study were administered the FCI-SF, which evaluates coin/currency calculation, financial conceptual knowledge, use of a checkbook/register, and use of a bank statement. RESULTS A unidimensional two-parameter logistic model best fit the 37 FCI-SF Test items, and most FCI-SF items fit the unidimensional two-parameter model well. The results indicated that all FCI-SF items robustly distinguished cognitively healthy controls from persons with MCI. CONCLUSIONS The study results showed that the FCI-SF performed well under IRT analysis, further highlighted the psychometric properties of the FCI-SF as a valid and reliable measure of financial capacity, and demonstrated the clinical utility of the FCI-SF in distinguishing between cognitively normal and cognitively impaired individuals.
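To make the two-parameter logistic (2PL) model named in this abstract concrete, the sketch below evaluates its response function and the corresponding item information for a few items. The item parameters are invented for illustration and are not FCI-SF estimates; the function names are hypothetical.

```python
import math

def irf_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = irf_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical (discrimination a, difficulty b) pairs.
items = [(0.8, -1.0), (1.5, 0.0), (2.2, 1.0)]

for a, b in items:
    # Each item is most informative at theta = b, where information equals a^2 / 4.
    print(a, b, round(item_information(b, a, b), 3))
```

Items with larger discrimination separate respondents near their difficulty more sharply, which is the property an IRT analysis uses to judge how well individual items distinguish groups.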
|
75
|
Fu Y, Zhan P, Chen Q, Jiao H. Joint modeling of action sequences and action time in computer-based interactive tasks. Behav Res Methods 2023:10.3758/s13428-023-02178-2. [PMID: 37429984 DOI: 10.3758/s13428-023-02178-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 06/16/2023] [Indexed: 07/12/2023]
Abstract
Process data refers to data recorded in computer-based assessments that reflect the problem-solving processes of participants and provide greater insight into how they solve problems. Action time, namely the amount of time required to complete a state transition, is also included in such data along with actions. In this study, an action-level joint model of action sequences and action time is proposed, in which the sequential response model (SRM) is used as the measurement model for action sequences, and a new log-normal action time model is proposed as the measurement model for action time. The proposed model can be regarded as an extension of the SRM by incorporating action time within the joint-hierarchical modeling framework and as an extension of the conventional item-level joint models in process data analysis. Results of the empirical and simulation studies demonstrated that the model setup was justified, model parameters could be interpreted, parameter estimates were accurate, and additionally taking participants' action time into account was beneficial for obtaining a deep understanding of participants' behavioral patterns. Overall, the proposed action-level joint model provides an innovative modeling framework for analyzing process data in computer-based assessments from the latent variable modeling perspective.
|