1. Ensuring the "health" of a curricular program evaluation: Alignment and analytic quality of two instruments for use in evaluating the effectiveness of an interprofessional collaboration curriculum. Evaluation and Program Planning 2024;102:102377. PMID: 37783173. DOI: 10.1016/j.evalprogplan.2023.102377.
Abstract
To cultivate competencies in interprofessional collaboration (IPC) for patient-centered, team-based care, a multi-faceted training enhancement initiative was implemented at our academic primary care residency site. The activities were evaluated from previously collected survey data at a 2-year review. First, the evaluation team scrutinized the instruments for alignment with, and appropriateness for, the planned IPC learning and behavior objectives. We found that the two instruments were well supported by the literature, with appropriate validation evidence, but were not well aligned to the objectives of this IPC training initiative, limiting the inferences that could appropriately be drawn in this context. Second, the team assessed the analytic quality of the survey results in terms of item difficulty distribution and item fit to the requirements of a Rasch measurement model. This revealed low person separation due to high overall item agreement: most residents agreed with most items, so the measures lacked the precision necessary to capture change in residents' IPC competency. Our instrument review serves as a reminder of the need to gather validity evidence for the use of any existing tool in a new context, and offers a generalizable strategy for evaluating data sources for appropriateness and quality within a specific program.
2. Using Rasch measurement for instrument rating scale refinement. Currents in Pharmacy Teaching & Learning 2023;15:110-118. PMID: 36898895. DOI: 10.1016/j.cptl.2023.02.015.
Abstract
OUR SITUATION Rasch measurement is an analysis tool that can provide validity evidence for instruments that attempt to measure student learning or other psychosocial behaviors, regardless of whether the tools are newly created, modified, or previously developed. Rating scales are exceedingly common among psychosocial instruments, and properly functioning rating scales are critical to effective measurement; Rasch measurement can help investigate this. METHODOLOGICAL LITERATURE REVIEW Beyond using Rasch measurement from the outset to create rigorous new measurement instruments, researchers can also benefit from applying it to previously developed instruments whose development did not include it. This article focuses on Rasch measurement's unique analysis of rating scales: it can help examine whether and how an instrument's rating scale functions among newly studied respondents (who will likely differ from the originally researched sample). OUR RECOMMENDATIONS AND THEIR APPLICATION After reviewing this article, the reader should be able to describe Rasch measurement, including its focus on fundamental measurement and how it differs from classical test theory and item response theory, and reflect on situations in their own research where a Rasch analysis might help generate validation evidence for a previously developed instrument. POTENTIAL IMPACT In the end, Rasch measurement offers a helpful, unique, rigorous approach to further developing instruments that measure accurately and precisely.
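The contrast drawn above between the Rasch model and the more general IRT models can be made concrete. A minimal sketch (illustrative only, not from the article): the Rasch model fixes discrimination at 1 for every item, so response probability depends only on the difference between person ability and item difficulty, while the 2PL adds an item-specific discrimination parameter.

```python
import math

def rasch_p(theta, b):
    """Dichotomous Rasch model: response probability depends only on
    (theta - b), person ability minus item difficulty, in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def irt_2pl_p(theta, b, a):
    """2PL IRT model: adds an item-specific discrimination a; with
    a = 1 for every item it reduces to the Rasch model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

Because every item shares the same discrimination, Rasch measures preserve the ordering of persons regardless of which items are administered, which is the "fundamental measurement" property the article emphasizes.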
3. Evidentiary Standards for Patient-Centered Core Impact (PC-CIS) Value Claims. Innov Pharm 2022;13(3). PMID: 36627906. PMCID: PMC9815867. DOI: 10.24926/iip.v13i3.5016.
Abstract
Proposals for a patient-centered core impact set (PC-CIS) are of little relevance to formulary and health system decisions, let alone to patients and providers, unless the elements included in the data set meet the standards of normal science and fundamental measurement. Adhering to these standards will focus attention on the adequacy of proposed core impact measures, with a filter in place to accept only those that meet the standards not only of the physical sciences but also of mainstream economics and health economics. Fortunately, we are well aware of what the criteria for acceptance and rejection of core impacts within disease states should be, in terms of their required attributes and their relevance for supporting evaluable value claims; notably, for patient-reported outcomes, Rasch or modern measurement theory. Care must be taken to delineate the core impact elements: separately identifying those that are purely clinical from core patient-centric impacts, which in turn should be separated from impacts defined in terms of drug utilization and resource utilization. The purpose of this brief commentary is to set out the required standards for core impact patient-centric value claims and the framework for evaluating those claims. The critical issue for patient-centered core impacts is to recognize the constraints imposed by the standards of fundamental measurement for target patient populations within disease areas; unless these constraints are recognized, we will fail. This leads to the role of Rasch or modern measurement theory calibration as the framework for patient-centric measures of latent traits or attributes. From these perspectives, PC-CIS is premature; until we have agreed standards for measuring clinical, patient-centric, and resource-utilization impacts or outcomes as a core set of disease-specific instruments, it seems pointless to push toward a wider scope when the present evidentiary foundation is so weak.
4. Development, validation and item reduction of a food literacy questionnaire (IFLQ-19) with Australian adults. Int J Behav Nutr Phys Act 2022;19:113. PMID: 36050778. PMCID: PMC9438317. DOI: 10.1186/s12966-022-01351-8.
Abstract
Background Food literacy is theorised to improve diet quality, nutrition behaviours, social connectedness and food security. The definition and conceptualisation by Vidgen & Gallegos, consisting of 11 theoretical components within the four domains of planning and managing, selecting, preparing and eating, is currently the most highly cited framework. However, a valid and reliable questionnaire is needed to comprehensively measure this conceptualisation. Therefore, this study draws on existing item pools to develop a comprehensive food literacy questionnaire using item response theory. Methods Five hundred Australian adults were recruited in Study 1 to refine a food literacy item pool using principal component analysis (PCA) and item response theory (IRT), which involved detailed item analysis of targeting, responsiveness, validity and reliability. Another 500 participants were recruited in Study 2 to replicate the validity and reliability analyses on the refined item pool, and 250 of these participants re-completed the questionnaire to determine its test–retest reliability. Results The PCA reduced the 171-item pool to 100 items across 19 statistical components of food literacy. After the thresholds of 26 items were combined, responses to the food literacy questionnaire had ordered thresholds (targeting), acceptable item locations (−0.01 to +1.53) and appropriateness of the measurement model (92% expected responses) (responsiveness), met outfit mean-square (MSQ) criteria (0.48–1.42) (validity), and had high person and item separation (> 0.99) and test–retest (ICC(2,1) 0.55–0.88) scores (reliability). Conclusions We developed a 100-item food literacy questionnaire, the IFLQ-19, which comprehensively addresses the Vidgen & Gallegos theoretical domains and components with good targeting, responsiveness, reliability and validity in a diverse sample of Australian adults.
Supplementary Information The online version contains supplementary material available at 10.1186/s12966-022-01351-8.
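The outfit mean-square criterion reported above (0.48–1.42) is the average squared standardized residual for an item across persons, with values near 1.0 indicating good fit. A hedged sketch for the dichotomous case (the IFLQ-19 uses polytomous items, so this is a simplified illustration, not the study's computation):

```python
import math

def rasch_p(theta, b):
    """Dichotomous Rasch model probability of endorsement."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def outfit_msq(responses, thetas, b):
    """Outfit mean-square for one dichotomous item: the mean squared
    standardized residual (x - E)^2 / Var over persons, where E = p
    and Var = p(1 - p) under the Rasch model."""
    zsq = []
    for x, theta in zip(responses, thetas):
        p = rasch_p(theta, b)
        zsq.append((x - p) ** 2 / (p * (1.0 - p)))
    return sum(zsq) / len(zsq)
```

Values well above 1 flag noisy, unexpected responses (misfit); values well below 1 flag overly deterministic responses (overfit).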
5. Psychometric properties of the Leisure Satisfaction Scale (LSS)-short form: a Rasch rating model calibration approach. BMC Psychol 2022;10:151. PMID: 35706062. PMCID: PMC9202183. DOI: 10.1186/s40359-022-00861-1.
Abstract
Background Leisure satisfaction has been one of the primary variables used to explain individuals' choices about participation in leisure and recreational activities. The Leisure Satisfaction Scale (LSS)-short form has been widely utilized to measure leisure and recreation participants' satisfaction levels. However, little research on the LSS-short form provides sufficient evidence for its use in measuring individual leisure satisfaction levels. Thus, the purpose of the study was to determine whether the LSS-short form is appropriate for measuring individuals' leisure satisfaction levels. Method Convenience sampling was used to recruit participants from the south-central United States. The LSS-short form questionnaire yielded 436 usable responses after 20 surveys were removed due to incomplete answers. The WINSTEPS computer program was utilized to analyze rating scale fit, item fit, differential item functioning (DIF), and the person-item map under the Rasch rating scale model. Results The results indicated that the five-point Likert-type LSS-short form was appropriate to utilize. Two of the 24 LSS-short form items showed overfit or misfit and were eliminated. DIF analysis indicated that all 22 remaining items were suitable for measuring leisure satisfaction levels, and these 22 items were selected for the reconstructed version of the LSS-short form. In addition, the person-item map showed that person ability and item difficulty were well matched. Conclusions As the importance of leisure has increased, the newly reconstructed LSS-short form is recommended for evaluating individual leisure satisfaction levels in future studies. Furthermore, leisure and recreation professionals can develop effective leisure activities and programs by measuring individuals' leisure satisfaction with the new version of the LSS-short form.
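The Rasch rating scale model used above assigns each Likert category a probability driven by step (threshold) parameters shared across items. A minimal sketch with illustrative threshold values (not the LSS estimates):

```python
import math

def rsm_probs(theta, b, thresholds):
    """Category probabilities under Andrich's rating scale model.
    `thresholds` are step parameters shared by all items, as with a
    Likert-type scale; returns P(category 0) .. P(category m)."""
    m = len(thresholds)
    nums = []
    for k in range(m + 1):
        # Cumulative logit for responding in category k.
        s = k * (theta - b) - sum(thresholds[:k])
        nums.append(math.exp(s))
    total = sum(nums)
    return [n / total for n in nums]
```

Checking that the estimated thresholds come out ordered (each successive category peaks at a higher ability) is one of the rating scale diagnostics such studies report.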
6. Re-Developing the Adversity Response Profile for Chinese University Students. Int J Environ Res Public Health 2022;19:6389. PMID: 35681973. PMCID: PMC9180553. DOI: 10.3390/ijerph19116389.
Abstract
Adversity response is fundamental to dealing with adversity. This paper reports the re-development and subsequent psychometric evaluation of the Adversity Response Profile for Chinese University Students (ARP-CUS). Data were collected from a Chinese university student sample (n = 474). Factor analysis and Rasch analysis were used to examine the psychometric properties of the ARP-CUS. Exploratory factor analysis revealed a six-factor model; subsequent confirmatory factor analysis supported a five-factor solution. Rasch analysis provided further evidence of the psychometric quality of the instrument in terms of dimensionality, rating scale effectiveness, and item fit statistics for these dimensions. The final version of the ARP-CUS contains 24 items across five subscales for assessing students' responses to adversity: control, attribution, reach, endurance, and transcendence. Overall, the ARP-CUS demonstrates satisfactory psychometric properties for quantifying the adversity quotient of Chinese university students.
7. Evaluating the reliability and validity of a questionnaire used to measure experiences of teamwork among student pharmacists in a quality improvement course. Currents in Pharmacy Teaching & Learning 2022;14:552-560. PMID: 35715095. DOI: 10.1016/j.cptl.2022.04.015.
Abstract
INTRODUCTION The psychometric properties of instruments used to capture student pharmacists' perspectives on teamwork have not been well assessed. This study measured the reliability and validity of an instrument designed to assess teamwork experiences among student pharmacists in a quality improvement (QI) class at one United States pharmacy school. METHODS The psychometric properties of a previously administered 17-item questionnaire (response options: "strongly agree," "agree," "disagree," or "strongly disagree") about second-year student pharmacists' teamwork experiences were assessed. A Rasch rating scale model was used to construct measures of teamwork experience. Principal component analysis (PCA) assessed unidimensionality. Item- and person-fit statistics were assessed. Construct and content validity and reliability were estimated using student and item separation indices (SI) and reliability coefficients (RC). RESULTS Sixty student pharmacists were included. PCA indicated a unidimensional construct. Four items with infit and outfit mean-square values outside the suggested range were removed. The item responses "disagree" and "strongly disagree" were merged to improve scale functionality. The average person measure was 1.74 ± 2.03 logits. Student and item RC were 0.81 (SI = 2.04) and 0.97 (SI = 2.17), respectively. The easiest item to endorse concerned the team's ability to reach consensus, while the most difficult concerned interest in doing collaborative work again. A mismatch between student experience and item difficulty on the continuum suggested that additional items are needed to match students' teamwork experience. CONCLUSION The instrument demonstrated evidence of reliability and validity for measuring student pharmacists' teamwork experience in a QI class, but further instrument modifications are recommended.
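The separation indices (SI) and reliability coefficients (RC) reported above are related quantities: separation G is the ratio of "true" spread in the measures to average measurement error, and reliability is G²/(1 + G²). A sketch with made-up measures and standard errors (not the study's data):

```python
import math

def separation_and_reliability(measures, std_errors):
    """Person (or item) separation index and reliability from Rasch
    measures and their standard errors. True variance is observed
    variance minus mean error variance; G = sqrt(true/error), and
    reliability = G^2 / (1 + G^2)."""
    n = len(measures)
    mean = sum(measures) / n
    obs_var = sum((m - mean) ** 2 for m in measures) / n
    err_var = sum(se ** 2 for se in std_errors) / n
    true_var = max(obs_var - err_var, 0.0)
    g = math.sqrt(true_var / err_var)
    return g, g * g / (1.0 + g * g)
```

A person separation around 2 (reliability about 0.8), as in this study, is commonly read as the instrument distinguishing roughly two to three strata of respondents.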
8. Sophisticated Statistics Cannot Compensate for Method Effects If Quantifiable Structure Is Compromised. Front Psychol 2022;13:812963. PMID: 35250744. PMCID: PMC8888847. DOI: 10.3389/fpsyg.2022.812963.
Abstract
Researchers rely on psychometric principles when trying to gain understanding of unobservable psychological phenomena disconfounded from the methods used. Psychometric models provide tools to support this endeavour, but they are agnostic to the meaning researchers intend to attribute to the data. We define method effects as resulting from actions which weaken the psychometric structure of measurement, and argue that the solution to this confounding will ultimately rest on testing whether the data collected fit a psychometric model based on a substantive theory, rather than on a search for the model that best fits the data. We highlight the importance of taking the notions of fundamental measurement seriously by reviewing distinctions between the Rasch measurement model and the more generalised 2PL and 3PL IRT models. We then present two lines of research that highlight considerations for making method effects explicit in experimental designs. First, we contrast the use of experimental manipulations to study measurement reactivity during the assessment of metacognitive processes with factor-analytic research of the same. The former suggests differential performance-facilitating and -inhibiting reactivity as a function of other individual differences, whereas the factor-analytic research suggests a ubiquitous, monotonically predictive confidence factor. Second, we evaluate differential effects of context and source on within-individual variability indices of personality derived from multiple observations, highlighting again the importance of a structured and theoretically grounded observational framework. We conclude by arguing that substantive variables can act as method effects and should be considered at the time of design rather than after the fact, and without compromising measurement ideals.
9. Validation of the Exercise Motivations Inventory-2 (EMI-2) scale for college students. J Am Coll Health 2022;70:114-121. PMID: 32150522. DOI: 10.1080/07448481.2020.1726929.
Abstract
Objective The purpose of this study was to determine whether the Exercise Motivations Inventory-2 (EMI-2) scale is appropriate for measuring college students' exercise motivation. Participants: The EMI-2 questionnaire was administered to 325 college students in the southwestern U.S. Method: The WINSTEPS program was used to analyze rating scale fit, differential item functioning (DIF), and item fit by applying Rasch rating scale model calibration. Results: A 5-point Likert-type rating scale was found more appropriate for investigating college students' exercise motivation with the EMI-2. Seventeen of the 51 items were flagged for DIF, and one item exceeded standard item fit criteria. Overall, 33 items were finally selected for a new version of the EMI-2 scale for college students. Additionally, the person-item map showed that person ability and item difficulty were well matched. Conclusions: This reconstructed EMI-2 scale can be utilized to assess the exercise motivations of college students.
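The DIF screening used in studies like this one is often the Winsteps-style rule of thumb: flag an item when the between-group difficulty contrast is at least about 0.5 logits and its t statistic is at least about 2. A sketch under those conventional thresholds (an illustration of the common rule, not necessarily this study's exact procedure):

```python
import math

def dif_flag(b_group1, se1, b_group2, se2,
             contrast_min=0.5, t_min=2.0):
    """Winsteps-style DIF screen for one item: compare the item's
    difficulty estimated separately in two groups. Flag when the
    contrast exceeds ~0.5 logits AND |t| exceeds ~2; both cutoffs
    are common conventions, not fixed rules."""
    contrast = b_group1 - b_group2
    t = contrast / math.sqrt(se1 ** 2 + se2 ** 2)
    return abs(contrast) >= contrast_min and abs(t) >= t_min
```

Requiring both a large contrast and a significant t guards against flagging trivially small but precisely estimated differences, and against flagging large but noisy ones.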
10. Evaluating item difficulty patterns for assessing student misconceptions in science across physics, chemistry, and biology concepts. Heliyon 2021;7:e08352. PMID: 34825081. PMCID: PMC8605188. DOI: 10.1016/j.heliyon.2021.e08352.
Abstract
Understanding item difficulty in science concepts is essential for teachers seeking to avoid student misconceptions in teaching and learning. This study aims to evaluate patterns of item difficulty estimates for science concepts prone to student misconceptions across physics, biology, and chemistry, and to explore differential item functioning (DIF) in the developed diagnostic test on the basis of gender and grade. Participants were 856 students (52.3% female and 47.7% male) comprising senior high school students in the 11th and 12th grades and pre-service science teachers in West Kalimantan province, Indonesia. Out of 16 categorized science concepts, those commonly causing misconceptions among students were investigated using Rasch measurement to understand item difficulty patterns. The findings show that the 32 developed items are valid and reliable, with item difficulty estimates ranging from −5.13 to 5.06 logits. Chemistry had the highest mean logit measure of the three disciplines, but there was no significant difference in item difficulty estimates across them. We also found DIF in one item based on gender and in four items based on grade. This study contributes to mapping item difficulty patterns in science concepts, helping teachers assess and teach these concepts and improve students' science performance. Future studies and limitations are also discussed.
11. Measuring Quality of Life in Carers of People With Dementia: Development and Psychometric Evaluation of Scales measuring the Impact of DEmentia on CARers (SIDECAR). The Gerontologist 2021;61:e1-e11. PMID: 31688902. PMCID: PMC8023371. DOI: 10.1093/geront/gnz136.
Abstract
Background and Objectives A 2008 European consensus on research outcome measures in dementia care concluded that measurement of carer quality of life (QoL) was limited. Three systematic reviews (2012, 2017, and 2018) of dementia carer outcome measures found existing instruments wanting. In 2017, recommendations were published for developing reliable measurement tools of carers’ needs for research and clinical application. The aim of this study was to develop a new instrument to measure the QoL of dementia carers (family/friends). Methods Items were generated directly from carers following an inductive needs-led approach. Carers (n = 566) from 22 English and Welsh locations then completed the items and comparator measures at three time points. Rasch, factor, and psychometric (reliability, validity, responsiveness, and minimally important differences [MIDs]) analyses were undertaken. Results Following factor analysis, the pool of 70 items was refined to three independent scales: primary SIDECAR-D (direct impact of caring upon carer QoL, 18 items), secondary SIDECAR-I (indirect impact, 10 items), and SIDECAR-S (support and information, 11 items). All three scales satisfy Rasch model assumptions. SIDECAR-D, I, S psychometrics: reliability (internal ≥ .70; test–retest ≥ .85); convergent validity (as hypothesized); responsiveness (effect sizes: D: moderate; I and S: small); MIDs (D = 9/100, I = 10/100, S = 11/100). Discussion and Implications SIDECAR scales demonstrate robust measurement properties, meeting COSMIN quality standards for study design and psychometrics. SIDECAR provides a theoretically based needs-led QoL profile specifically for dementia carers. SIDECAR is free for use in public health, social care, and voluntary sector services, and not-for-profit organizations.
12. The Validation and Further Development of the Multidimensional Cognitive Load Scale for Physical and Online Lectures (MCLS-POL). Front Psychol 2021;12:642084. PMID: 33815228. PMCID: PMC8014070. DOI: 10.3389/fpsyg.2021.642084.
Abstract
Cognitive load theory (CLT) has been widely used to help understand the process of learning and to design teaching interventions. The Cognitive Load Scale (CLS) developed by Leppink and colleagues has emerged as one of the most validated and widely used self-report measures of intrinsic load (IL), extraneous load (EL), and germane load (GL). In this paper we investigated an expansion of the CLS using a multidimensional conceptualization of the EL construct that is relevant for physical and online teaching environments. The Multidimensional Cognitive Load Scale for Physical and Online Lectures (MCLS-POL) goes beyond the CLS's operationalization of EL, which originally covered factors related to instructions/explanations, by adding sub-dimensions for EL stemming from noise and for EL stemming from media and devices within the environment. Across three studies, we investigated the reliability and the internal and external validity of the MCLS-POL using the Partial Credit Model, confirmatory factor analysis, and comparisons between students attending a lecture physically or online (Studies 2 and 3). The results of Study 1 (N = 250) provided initial evidence for the validity and reliability of the MCLS-POL within a higher education sample, but also highlighted several potential improvements to the measure. These changes were made before re-evaluating the validity and reliability of the measure in a new sample of higher education psychology students (N = 140, Study 2) and psychological testing students (N = 119, Study 3). Together, the studies support a multidimensional conceptualization of cognitive load, provide evidence of the validity, reliability, and sensitivity of the MCLS-POL, and suggest directions for future research.
13. Development and validation of the mental health professionals' attitude towards people living with HIV/AIDS scale (MHP-PLHIV-AS). AIDS Care 2020;32:10-18. PMID: 32951447. DOI: 10.1080/09540121.2020.1822503.
Abstract
This study focused on the creation and validation of an instrument to measure mental health professionals' attitudes towards people living with HIV/AIDS. Rasch analyses (Rasch, 1960, 1980) provided evidence to support a two-dimensional (societal and personal) measurement of this attitude construct.
14. The National Pharmaceutical Council: Endorsing the Construction of Imaginary Worlds in Health Technology Assessment. Pharmacy 2020;8:E119. PMID: 32668706. PMCID: PMC7557741. DOI: 10.3390/pharmacy8030119.
Abstract
All too often, organizations embrace standards for health technology assessment that fail to meet those of normal science. A value assessment framework has been endorsed that is patently in the realm of pseudoscience. If a value assessment framework is to be accepted, then claims for the value of competing products must be credible, evaluable and replicable. If not, for example, when the assessment relies on the construction of an imaginary lifetime incremental cost-per-quality-adjusted-life-year (QALY) world, then that assessment should be rejected. Such an assessment would fail one of the central roles of normal science: the discovery of new facts through an ongoing process of conjecture and refutation where provisional claims can be continually challenged. It is no good defending an endorsement of a value framework that fails expected standards on the grounds that it has been endorsed by professional groups and reflects decades of development. This is intellectually lazy. If this is the case, then the scientific revolution of the 17th century need not have happened. The purpose of this commentary is to consider the recommended standards for health technology assessment of the National Pharmaceutical Council (NPC), with particular reference to proposed methodological standards in value assessment and the commitment to mathematically impossible QALYs.
15. Psychometric properties of the Assessment Tool for Perceived Agency (ATPA-22) - utility for the rehabilitation of young adults not in education, employment or training (NEETs). Scand J Occup Ther 2020;28:97-109. PMID: 32589859. DOI: 10.1080/11038128.2020.1782983.
Abstract
BACKGROUND Promoting and supporting agency have been at the heart of multidisciplinary debate. To promote self-awareness of young people's agency and to identify persons in need of support, the Assessment Tool for Perceived Agency (ATPA-22) was developed. AIM This study aims to evaluate the psychometric properties of the ATPA-22. Participants were young adults not in education, employment or training (NEETs) and students in higher education institutions (HEI). MATERIALS AND METHODS The main data analysis used many-facet Rasch (MFR) analysis. RESULTS The ATPA-22 items defined a unidimensional construct with reasonable internal consistency and separation ability. The ATPA-22 was capable of detecting differences between HEI students and young adult NEETs. Nine differentially functioning items emerged between the groups. CONCLUSIONS The ATPA-22 shows promise as a tool for assessing young adults' perceived agency. However, as individual life situations strongly affect perceived agency, research on the stability of the ATPA-22 across different populations is needed. SIGNIFICANCE The purpose of the ATPA-22 is to measure individuals' perceived agency and to identify aspects of agency in need of support. The ATPA-22 can be used as a tool for promoting self-awareness of occupational challenges.
16. A Scientometric Review of Rasch Measurement: The Rise and Progress of a Specialty. Front Psychol 2019;10:2197. PMID: 31695632. PMCID: PMC6817464. DOI: 10.3389/fpsyg.2019.02197.
Abstract
A recent review of the literature concluded that Rasch measurement is an influential approach in psychometric modeling. Despite the major contributions of Rasch measurement to the growth of scientific research across various fields, there is currently no research on the trends and evolution of Rasch measurement research. The present study used co-citation techniques and a multiple perspectives approach to investigate 5,365 publications on Rasch measurement between 01 January 1972 and 03 May 2019 and their 108,339 unique references downloaded from the Web of Science (WoS). Several methods of network development involving visualization and text-mining were used to analyze these data: author co-citation analysis (ACA), document co-citation analysis (DCA), journal co-citation analysis (JCA), and keyword analysis. In addition, to investigate the inter-domain trends that link the Rasch measurement specialty to other specialties, we used a dual-map overlay to investigate specialty-to-specialty connections. Influential authors, publications, journals, and keywords were identified. Multiple research frontiers or sub-specialties were detected and the major ones were reviewed, including “visual function questionnaires”, “non-parametric item response theory”, “valid measures (validity)”, “latent class models”, and “many-facet Rasch model”. One of the outstanding patterns identified was the dominance and impact of publications written for general groups of practitioners and researchers. In personal communications, the authors of these publications stressed their mission as being “teachers” who aim to promote Rasch measurement as a conceptual model with real-world applications. Based on these findings, we propose that sociocultural and ethnographic factors have a huge capacity to influence fields of science and should be considered in future investigations of psychometrics and measurement.
As the first scientometric review of the Rasch measurement specialty, this study will be of interest to researchers, graduate students, and professors seeking to identify research trends, topics, major publications, and influential scholars.
Collapse
|
17
|
Supporting construct validity of the Evaluation of Daily Activity Questionnaire using Linear Logistic Test Models. Qual Life Res 2019; 28:1627-1639. [PMID: 30852765 DOI: 10.1007/s11136-019-02146-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 02/21/2019] [Indexed: 10/27/2022]
Abstract
PURPOSE Construct validity is commonly assessed by applying statistical methods to data. However, purely empirical methods cannot explain what happens between the attribute and the instrument scores, which is the core of construct validity. Linear Logistic Test Models (LLTMs) can provide such an explanation by decomposing item difficulties into a weighted sum of theoretical item properties. In this study, we aim to support the construct validity of the Evaluation of Daily Activity Questionnaire (EDAQ) by using item properties to account for item difficulties. METHODS Dichotomized responses to the EDAQ were analyzed with (1) the Rasch model (to estimate item difficulties) and (2) LLTMs (to predict item difficulties). Seven properties of the items were identified and rated on ordinal scales by 39 Occupational Therapists worldwide. Aggregated metric estimates (the weights used to predict item difficulties in LLTMs) were derived from the ratings using seven cumulative link mixed models. Estimated and predicted item difficulties were compared. RESULTS The Rasch model showed acceptable fit and unidimensionality for a sample of 42 locally independent EDAQ items. The LLTM plus error showed significantly better fit than the LLTM. In the former, three of the seven properties were not significant, and the corresponding model including only the significant properties was used to predict item difficulties; the remaining properties explained 77.5% of the variance in estimated item difficulties. CONCLUSION A satisfactory theoretical explanation of what makes one activity of daily living task more difficult than another has been provided by an LLTM plus error model, therefore supporting the construct validity of the EDAQ.
Collapse
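The LLTM described in the abstract above decomposes each Rasch item difficulty into a weighted sum of item-property values. The following simulation is a minimal sketch of that decomposition only; all property values, weights, and the error level are invented for illustration, and a full analysis would estimate the weights within the measurement model itself rather than by regression on pre-estimated difficulties.

```python
import numpy as np

# Hypothetical item-property matrix Q: rows = items, columns = ordinal
# property ratings (e.g., physical demand, task complexity, ...).
rng = np.random.default_rng(0)
n_items, n_props = 42, 4
Q = rng.integers(1, 5, size=(n_items, n_props)).astype(float)

# Hypothetical "true" property weights (eta) plus item-specific error,
# mirroring the LLTM-plus-error idea: beta_i = sum_k q_ik * eta_k + e_i
eta_true = np.array([0.6, -0.3, 0.4, 0.1])
beta = Q @ eta_true + rng.normal(0, 0.3, n_items)  # Rasch item difficulties

# Recover the weights by least squares and compute variance explained
eta_hat, *_ = np.linalg.lstsq(Q, beta, rcond=None)
pred = Q @ eta_hat
r2 = 1 - np.sum((beta - pred) ** 2) / np.sum((beta - beta.mean()) ** 2)
print(f"Recovered weights: {np.round(eta_hat, 2)}, R^2 = {r2:.2f}")
```

The printed R² plays the same role as the 77.5% of variance in estimated item difficulties reported in the abstract: it quantifies how much of the Rasch difficulty hierarchy the item properties can account for.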
|
18
|
Scale Separation Reliability: What Does It Mean in the Context of Comparative Judgment? APPLIED PSYCHOLOGICAL MEASUREMENT 2018; 42:428-445. [PMID: 30787486 PMCID: PMC6373854 DOI: 10.1177/0146621617748321] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Comparative judgment (CJ) is an alternative method for assessing competences based on Thurstone's law of comparative judgment. Assessors are asked to compare pairs of students' work (representations) and judge which one is better on a certain competence. These judgments are analyzed using the Bradley-Terry-Luce model, resulting in logit estimates for the representations. In this context, the Scale Separation Reliability (SSR), which comes from Rasch modeling, is typically used as the reliability measure. To the knowledge of the authors, however, it has never been systematically investigated whether the meaning of the SSR can be transferred from Rasch to CJ. As the meaning of reliability is an important question for both assessment theory and practice, the current study looks into this. A meta-analysis was performed on 26 CJ assessments. For every assessment, split halves were formed based on assessor. The rank orders of the whole assessment and the halves were correlated and compared with SSR values using Bland-Altman plots. The correlation between the halves of an assessment was compared with the SSR of the whole assessment, showing that the SSR is a good measure of split-half reliability. Comparing the SSR of one of the halves with the correlation between the two respective halves showed that the SSR can also be interpreted as an interrater correlation. Regarding the SSR as expressing a correlation with the truth, the results are mixed.
Collapse
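The SSR at the center of the study above has a simple closed form: the share of observed variance in the logit estimates that is not attributable to measurement-error variance. A small sketch, with estimates and standard errors invented for illustration (not taken from the 26 assessments analysed in the study):

```python
import numpy as np

def scale_separation_reliability(estimates, std_errors):
    """SSR = (observed variance - mean error variance) / observed variance,
    i.e., the 'true' variance share of the logit estimates."""
    observed_var = np.var(estimates, ddof=1)
    error_var = np.mean(np.asarray(std_errors) ** 2)
    return (observed_var - error_var) / observed_var

# Invented logit estimates for six representations and their standard errors
est = np.array([-1.8, -0.9, -0.2, 0.4, 1.1, 1.9])
se = np.array([0.40, 0.35, 0.33, 0.34, 0.36, 0.41])
print(round(scale_separation_reliability(est, se), 2))  # → 0.93
```

Wide spread in the estimates relative to their standard errors pushes the SSR toward 1; when the error variance approaches the observed variance, the SSR collapses toward 0.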
|
19
|
The Stabilizing Influences of Linking Set Size and Model-Data Fit in Sparse Rater-Mediated Assessment Networks. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 2018; 78:679-707. [PMID: 30147122 PMCID: PMC6096472 DOI: 10.1177/0013164417703733] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Previous research includes frequent admonitions regarding the importance of establishing connectivity in data collection designs prior to the application of Rasch models. However, details regarding the influence of characteristics of the linking sets used to establish connections among facets, such as locations on the latent variable, model-data fit, and sample size, have not been thoroughly explored. These considerations are particularly important in assessment systems that involve large proportions of missing data (i.e., sparse designs) and are associated with high-stakes decisions, such as teacher evaluations based on teaching observations. The purpose of this study is to explore the influence of characteristics of linking sets in sparsely connected rating designs on examinee, rater, and task estimates. A simulation design whose characteristics were intended to reflect practical large-scale assessment networks with sparse connections was used to consider the influence of locations on the latent variable, model-data fit, and sample size within linking sets on the stability and model-data fit of estimates. Results suggested that parameter estimates for examinee and task facets are quite robust to modifications in the size, model-data fit, and latent-variable location of the link. Parameter estimates for the rater, while still quite robust, are more sensitive to reductions in link size. The implications are discussed as they relate to research, theory, and practice.
Collapse
|
20
|
The development of the physical fitness construct across childhood. Scand J Med Sci Sports 2017; 28:212-219. [PMID: 28376240 DOI: 10.1111/sms.12889] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/28/2017] [Indexed: 11/28/2022]
Abstract
The measurement of physical fitness (PF) is an important factor from many different perspectives. PF is a determinant of healthy child development as it is related to several health outcomes. However, existing taxonomies of the construct and frequently used fitness assessments vary concerning their theoretical assumptions and practical implications. From a theoretical perspective, the construct of physical fitness covers a variety of motor domains, such as cardiovascular endurance, strength, coordination, or flexibility (eg, Caspersen et al., 1985). However, most fitness assessments provide a single (composite) score including all items as the test outcome. This implicitly assumes a one-dimensional structure of physical fitness, which has been shown for other motor performance assessments in early childhood (eg, Utesch et al., 2016). This study investigated this one-dimensional structure for 6- to 9-year-old children within the item response theory framework (Partial Credit Model). Seven fitness subtests covering a variety of motor dimensions (6-minute run, pushups, sit-ups, standing broad jump, 20 m sprint, jumping sideways, and balancing backwards) were administered to a total of 790 six-year-olds, 1371 seven-year-olds, 1331 eight-year-olds, and 925 nine-year-olds (48.2% females). Each item was transformed into five performance categories controlling for sex and age. This study indicates that one-dimensional testing of PF is feasible across middle childhood. Furthermore, for 6- and 7-year-olds, all seven items including balancing backwards can be combined into one factor. From about 8 to 9 years of age, balancing backwards seems to become too easy. Altogether, the analyses show no diversification of PF across childhood.
Collapse
|
21
|
Hospital nurses' attitudes, negative perceptions, and negative acts regarding workplace bullying. Ann Gen Psychiatry 2017; 16:33. [PMID: 28936227 PMCID: PMC5603093 DOI: 10.1186/s12991-017-0156-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/27/2017] [Accepted: 09/01/2017] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Workplace bullying is a prevalent problem in today's workplaces that has adverse effects on both bullying victims and organizations. Investigating the predictors of workplace bullying is an important task for preventing nurses in hospitals from becoming victims. OBJECTIVE This study aims to explore the relationships among nurses' attitudes, negative perceptions, and negative acts regarding workplace bullying under the framework of the theory of planned behavior (TPB). METHODS A total of 811 nurses from three hospitals in Taiwan were surveyed. Nurses' responses to the 201 items of 10 scales were calibrated using Rasch analysis and then subjected to path analysis with partial least-squares structural equation modeling (PLS-SEM). RESULTS Instrumental attitude was a significant predictor of nurses' negative perceptions of being bullied in the workplace. In contrast, the other TPB components, subjective norm and perceived behavioral control, were not effective predictors of nurses' negative acts regarding workplace bullying. CONCLUSIONS The findings provide hospital nurse management with important implications for the prevention of bullying, particularly for those tasked with providing safer and more productive workplaces for hospital nurses. Further study of workplace bullying awareness in other kinds of workplaces is recommended.
Collapse
|
22
|
Is it possible to develop a cross-country test of social interaction? Scand J Occup Ther 2016; 24:421-430. [PMID: 27809634 DOI: 10.1080/11038128.2016.1250813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Abstract
OBJECTIVE The Evaluation of Social Interaction (ESI) is used in Asia, Australia, North America and Europe. What is considered to be appropriate social interaction, however, differs amongst countries. If social interaction varies, the relative difficulty of the ESI items and types of social exchange also could vary, resulting in differential item functioning (DIF) and test bias in the form of differential test functioning (DTF). Yet, because the ESI scoring criteria are designed to account for culture, the ESI should be free of DIF and DTF. The purpose, therefore, was to determine whether the ESI demonstrates DIF or DTF related to country. METHODS A retrospective, descriptive, cross-sectional study of 9811 participants aged 2-102 years (55% female) from 12 countries was conducted using many-facet Rasch analyses. DIF analyses compared paired item and social exchange type values by country against a critical effect size (±0.55 logit). DTF analyses compared paired ESI measures by country to 95% confidence intervals. RESULTS All paired social exchange types and 98.3% of paired items differed by less than ±0.55 logit. All persons fell within 95% confidence intervals. CONCLUSIONS Minimal DIF resulted in no test bias, supporting the cross-country validity of the ESI.
Collapse
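The DIF procedure summarized above, pairing item calibrations by country and flagging contrasts beyond a ±0.55 logit critical effect size, can be sketched as follows. The item names and values here are invented, not ESI calibrations, and a real analysis would first anchor both calibrations on a common scale before comparing them:

```python
# Hypothetical paired item difficulties (logits), calibrated separately
# for two countries; only a few illustrative items are shown.
critical = 0.55  # critical effect size used in the study (±0.55 logit)

country_a = {"item1": -0.40, "item2": 0.10, "item3": 0.85, "item4": -1.20}
country_b = {"item1": -0.35, "item2": 0.75, "item3": 0.80, "item4": -1.10}

# Flag any item whose between-country contrast exceeds the threshold
flagged = {item: round(country_a[item] - country_b[item], 2)
           for item in country_a
           if abs(country_a[item] - country_b[item]) > critical}
print(flagged)  # → {'item2': -0.65}
```

In the study itself, 98.3% of paired items fell inside the ±0.55 logit band, which is why the minimal item-level DIF produced no detectable test-level bias.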
|
23
|
Rasch analysis of professional behavior in medical education. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2015; 20:1179-94. [PMID: 25737275 DOI: 10.1007/s10459-015-9594-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/25/2014] [Accepted: 02/19/2015] [Indexed: 05/25/2023]
Abstract
The use of students' "consumer feedback" to assess faculty behavior and improve the process of medical education is a significant challenge. We used quantitative Rasch measurement to analyze pre-categorized student comments listed by 385 graduating medical students. We found that students differed little with respect to the number of comments they provided and that their comments indeed form a probabilistic Rasch hierarchy. However, different hierarchies were found across medical departments and faculty. An analysis of these interactions provides valuable, detailed, and quantitative information that can augment qualitative research approaches. In addition, we suggest how the Rasch scaling of student comments can assist researchers in the design and implementation of new faculty evaluation instruments. Finally, the interactions between student and department identified a subset of behaviors that appear to guide and possibly elicit students' comments.
Collapse
|
24
|
Development and validation of a generic scale for use in transition programmes to measure self-management skills in adolescents with chronic health conditions: the TRANSITION-Q. Child Care Health Dev 2015; 41:547-58. [PMID: 25351414 DOI: 10.1111/cch.12207] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 09/18/2014] [Indexed: 12/01/2022]
Abstract
AIM To develop a generic self-management skills scale for use with adolescents diagnosed with a chronic health condition who are aged 12 to 18 years. BACKGROUND There is a lack of methodologically sound scales for healthcare teams to use to measure self-management skills in adolescents with chronic conditions transitioning to adult care. METHODS Adolescents aged 12 to 18 years with a broad range of chronic health conditions, including neurodevelopmental conditions, were recruited from May to August 2013 from nine outpatient clinics at McMaster Children's Hospital (Canada). Thirty-two participated in a cognitive interview, and 337 completed a questionnaire booklet. Interviews were used to develop the TRANSITION-Q. Rasch measurement theory (RMT) analysis was used to identify items that represent the best indicators of self-management skills. Traditional psychometric tests of measurement performance were also conducted. RESULTS The response rate was 92% (32/32 cognitive; 337/371 field test). RMT analysis resulted in a 14-item scale with three response options. The overall fit of the observed data to that expected by the Rasch model was non-significant, providing support that this new scale measured a unidimensional construct. Other tests supported the scale as scientifically sound, e.g. Person Separation Index = 0.82; good item fit statistics; no differential item function by age or gender; low residual correlations between items; Cronbach's alpha = 0.85; test-retest reliability = 0.90; and tests of construct validity that showed, as hypothesized, fewer skills in younger participants and in participants who required assistance to complete the scale. Finally, participants who agreed they are ready to transfer to adult healthcare reported higher TRANSITION-Q scores than did participants who disagreed. CONCLUSIONS The TRANSITION-Q is a short, clinically meaningful and psychometrically sound scale. This generic scale can be used in research and in paediatric and adolescent clinics to help evaluate readiness for transition.
Collapse
|
25
|
Traditional and Rasch psychometric analyses of the Quality of Life in Adult Cancer Survivors (QLACS) questionnaire in shorter-term cancer survivors 15 months post-diagnosis. J Psychosom Res 2014; 77:322-9. [PMID: 25190179 DOI: 10.1016/j.jpsychores.2014.07.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/18/2013] [Revised: 06/08/2014] [Accepted: 07/06/2014] [Indexed: 12/13/2022]
Abstract
OBJECTIVE The aim of this paper is to provide new insights into the psychometrics of the Quality of Life in Adult Cancer Survivors (QLACS) questionnaire, originally developed for longer-term survivors 5+years post-diagnosis. Specifically, to examine the classic psychometric properties of QLACS in a sample of shorter-term survivors, and to undertake Rasch analysis to explore the extent to which the Generic and Cancer-Specific summary scales (and separately-analysed Benefits of cancer domain) are unidimensional, with linear measurement properties and no differential item functioning (DIF). METHODS Patients with potentially curable breast, colorectal or prostate cancer completed QLACS 15 months post-diagnosis (N=407). Score distributions, floor and ceiling effects, internal reliability, and feasibility (completion time and missing data) were examined. Rasch analysis included examination of item fit, DIF and unidimensionality. RESULTS The QLACS domains and summary scales had very similar score distributions and classic psychometric properties (no ceiling effects, majority no floor effects, acceptable reliability) to those found in development work with longer-term survivors. Median completion time was 10 min and total missing data 2.3%. The Generic summary scale contained several misfitting items and exhibited multidimensionality. The Cancer-Specific summary scale and Benefits domain showed fit to the Rasch model and demonstrated unidimensionality and no DIF, with just one or no item modifications respectively. CONCLUSION QLACS demonstrates similarly good classic psychometric properties among shorter-term as among longer-term survivors, and has good feasibility. The Cancer-Specific summary scale and Benefits domain showed an impressive degree of fit to the Rasch model, although the validity of computing the Generic summary score was not supported.
Collapse
|
26
|
Development of item bank to measure deliberate self-harm behaviours: facilitating tailored scales and computer adaptive testing for specific research and clinical purposes. Psychiatry Res 2014; 217:240-7. [PMID: 24703572 DOI: 10.1016/j.psychres.2014.03.015] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/26/2013] [Revised: 02/07/2014] [Accepted: 03/11/2014] [Indexed: 11/23/2022]
Abstract
The purpose of this study was to investigate the application of item banking to questionnaire items intended to measure Deliberate Self-Harm (DSH) behaviours. The Rasch measurement model was used to evaluate behavioural items extracted from seven published DSH scales administered to 568 Australians aged 18-30 years (62% university students, 21% mental health patients, and 17% community members). Ninety-four items were calibrated in the item bank (including 12 items with differential item functioning for gender and age). Tailored scale construction was demonstrated by extracting scales covering different combinations of DSH methods but with the same raw score for each person location on the latent DSH construct. A simulated computer adaptive test (starting with common self-harm methods to minimise presentation of extreme behaviours) demonstrated that 11 items (on average) were needed to achieve a standard error of measurement of 0.387 (corresponding to a Cronbach's Alpha of 0.85). This study lays the groundwork for advancing DSH measurement to an item bank approach with the flexibility to measure a specific definitional orientation (e.g., non-suicidal self-injury) or a broad continuum of self-harmful acts, as appropriate to a particular research/clinical purpose.
Collapse
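The pairing in the abstract above of a 0.387 standard error of measurement with a Cronbach's Alpha of 0.85 follows from the classical relation reliability = 1 - SEM^2/SD^2, under the assumption (ours, not stated in the abstract) that the person standard deviation is about 1 logit:

```python
sem = 0.387       # CAT stopping rule: target standard error of measurement (logits)
person_sd = 1.0   # assumed person SD in logits (hypothetical, chosen to match)

# Classical relation: SEM = SD * sqrt(1 - reliability)
# => reliability = 1 - SEM^2 / SD^2
reliability = 1 - (sem / person_sd) ** 2
print(round(reliability, 2))  # → 0.85
```

This is the same logic used when a computer adaptive test stops administering items once the estimated SEM falls below the target: the stopping rule implicitly fixes the reliability of the resulting measures.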
|
27
|
The development of scales to measure childhood cancer survivors' readiness for transition to long-term follow-up care as adults. Health Expect 2014; 18:1941-55. [PMID: 25052198 PMCID: PMC5810698 DOI: 10.1111/hex.12241] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open
Abstract
Purpose To develop and validate scales to measure constructs that survivors of childhood cancer report as barriers and/or facilitators to the process of transitioning from paediatric to adult-oriented long-term follow-up (LTFU) care. Methods Qualitative interviews provided a dataset that was used to develop items for three new scales that measure cancer worry, self-management skills and expectations about adult care. These scales were field-tested in a sample of 250 survivors aged 15-26 years recruited from three Canadian hospitals between July 2011 and January 2012. Rasch Measurement Theory (RMT) analysis was used to identify the items that represent the best indicators of each scale using tests of validity (i.e. thresholds for item response options, item fit statistics, item locations, differential item function) and reliability (Person Separation Index). Traditional psychometric tests of measurement performance were also conducted. Results RMT led to the refinement of a 6-item Cancer Worry scale (focused on worry about cancer-related issues such as late effects), a 15-item Self-Management Skills scale (focused on skills an adolescent needs to acquire to manage their own health care), and a 12-item Expectations scale (about the nature of adult LTFU care). Our study provides preliminary evidence about the reliability and validity of these new scales (e.g. Person Separation Index ≥ 0.81; Cronbach's α ≥ 0.81; test-retest reliability ≥ 0.85). Conclusion There is limited knowledge about the transition experience of childhood cancer survivors. These scales can be used to investigate barriers survivors face in the process of transition from paediatric to adult care.
Collapse
|
28
|
Factors influencing student perceptions of high-school science laboratory environments. LEARNING ENVIRONMENTS RESEARCH 2013; 16:37-41. [PMID: 23950693 PMCID: PMC3740975 DOI: 10.1007/s10984-012-9107-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
Science laboratory learning has been lauded for decades for its role in fostering positive student attitudes about science and developing students' interest in science and ability to use equipment. An expanding body of research has demonstrated the significant influence of laboratory environment on student learning. Further research has demonstrated differences in student perceptions based on giftedness. To explore the relationship between giftedness and students' perceptions of their learning environment, we examined students' perceptions of their laboratory learning environment in biology courses, including courses designated for high-achieving versus regular-achieving students. In addition, to explore the relationship between students' perceptions and the extent of their experience with laboratory learning in a particular discipline, we examined students' perceptions of their laboratory learning environment in first-year biology courses versus elective biology courses that require first-year biology as a prerequisite. We found that students in high-achieving courses had a more favourable perception of all aspects of their learning environment when compared with students in regular courses. In addition, student perceptions of their laboratory appeared to be influenced by the extent of their experience in learning science. Perceptions were consistent amongst regular- and high-achieving students regardless of grade level. In addition, perceptions of students in first year and beyond were consistent regardless of grade level. These findings have critical applications in curriculum development as well as in the classroom. Teachers can use student perceptions of their learning environment to emphasize critical pedagogical approaches and modify other areas that enable enhancement of the science laboratory learning environment.
Collapse
|