1
Habes E, Kolk J, Van Brunschot M, Bouwes A. Development of script concordance test for assessment of clinical reasoning in nursing: Lessons learned regarding construct validity. Heliyon 2024;10:e35151. [PMID: 39161805; PMCID: PMC11332874; DOI: 10.1016/j.heliyon.2024.e35151]
Abstract
Background: The script concordance test (SCT) has been shown to be an effective tool for assessing the clinical reasoning skills of nursing students, and various nursing studies have demonstrated its construct validity. However, studies on the barriers that may impede construct validity during the development process are limited. Objective: This evaluation describes the barriers encountered in developing an SCT for Bachelor's nursing students and the lessons learned regarding construct validity. Methods: We conducted a descriptive evaluation of the SCT development and validation process. The evaluation was based on written comments during the assessment (N = 327), a Student's Perspective Questionnaire (N = 100), and student feedback during three live review sessions (N = 27). Results: Despite following the guidelines during SCT development, we encountered three main barriers that may impede construct validity. We underestimated the effort required to recruit an appropriate expert panel. We overestimated the experts' and students' understanding of the SCT methodology. Additionally, four potential causes of invalid item construction were identified: 'questionable intervention, hypothesis, or investigation', 'blurred data in new information', 'regression to the middle', and 'misinterpretation of the midpoint'. Conclusion: The three lessons learned are as follows: 1) the recruitment of an appropriate expert panel must not be underestimated; besides clinical expertise, experts need training in SCT methodology, including awareness of possible pitfalls; 2) SCT training is a prerequisite for using the SCT as an assessment; and 3) student feedback may offer a deeper understanding of potential hidden script errors and causes of misinterpretation of the SCT. Further studies are necessary to identify additional factors that may impede the construct validity of the SCT in nursing education.
Affiliation(s)
- E.V. Habes: Institute of Nursing Studies, HU University of Applied Sciences Utrecht, Utrecht, the Netherlands
- J.E.M. Kolk: Institute of Nursing Studies, HU University of Applied Sciences Utrecht, Utrecht, the Netherlands
- A. Bouwes: Institute of Nursing Studies, HU University of Applied Sciences Utrecht, Utrecht, the Netherlands
2
Torre D, Daniel M, Ratcliffe T, Durning SJ, Holmboe E, Schuwirth L. Programmatic Assessment of Clinical Reasoning: New Opportunities to Meet an Ongoing Challenge. Teach Learn Med 2024:1-9. [PMID: 38794865; DOI: 10.1080/10401334.2024.2333921]
Abstract
Issue: Clinical reasoning is essential to physicians' competence, yet its assessment remains a significant challenge. Clinical reasoning is a complex, evolving, non-linear, context-driven, and content-specific construct which arguably cannot be assessed at one point in time or with a single method. This has posed challenges for educators for many decades, despite significant development of individual assessment methods. Evidence: Programmatic assessment is a systematic assessment approach that is gaining momentum across health professions education. Programmatic assessment, and in particular assessment for learning, is well suited to address the challenges of clinical reasoning assessment. Several key principles of programmatic assessment align particularly well with developing a system to assess clinical reasoning: longitudinality, triangulation, use of a mix of assessment methods, proportionality, implementation of intermediate evaluations/reviews with faculty coaches, use of assessment for feedback, and increase in learners' agency. Repeated exposure and measurement are critical to developing a clinical reasoning assessment narrative; thus the assessment approach should optimally be longitudinal, providing multiple opportunities for growth and development. Triangulation provides a lens to assess the multidimensionality and contextuality of clinical reasoning and of its different, yet related, components, using a mix of different assessment methods. Proportionality ensures that the richness of information on which conclusions are drawn is commensurate with the stakes of the decision. Coaching facilitates the development of a feedback culture and allows growth to be assessed over time, while enhancing learners' agency. Implications: A programmatic assessment model of clinical reasoning that is developmentally oriented, optimizes learning through feedback and coaching, uses multiple assessment methods, and provides opportunity for meaningful triangulation of data can help address some of the challenges of clinical reasoning assessment.
Affiliation(s)
- Dario Torre: Department of Medical Education, University of Central Florida, Orlando, FL, USA
- Michelle Daniel: Department of Emergency Medicine, University of California, San Diego, CA, USA
- Temple Ratcliffe: Department of Medicine, The Joe R and Teresa Lozano Long School of Medicine at University of Texas Health, Texas, USA
- Steven J Durning: Center for Health Professions Education, Uniformed Services University Center for Neuroscience and Regenerative Medicine, Bethesda, Maryland, USA
- Eric Holmboe: Milestones Development and Evaluation, Accreditation Council for Graduate Medical Education, Chicago, IL, USA
3
Hudon A, Kiepura B, Pelletier M, Phan V. Using ChatGPT in Psychiatry to Design Script Concordance Tests in Undergraduate Medical Education: Mixed Methods Study. JMIR Med Educ 2024;10:e54067. [PMID: 38596832; PMCID: PMC11007379; DOI: 10.2196/54067]
Abstract
Background: Undergraduate medical studies offer a wide range of learning opportunities delivered through various teaching-learning modalities. Clinical scenarios are frequently used, followed by multiple-choice and open-ended questions, among other learning and teaching methods. Script concordance tests (SCTs) can be used to promote a higher level of clinical reasoning. Recent technological developments have made generative artificial intelligence (AI)-based systems such as ChatGPT (OpenAI) available to assist clinician-educators in creating instructional materials. Objective: The main objective of this project was to explore how SCTs generated by ChatGPT compared to SCTs produced by clinical experts on 3 major elements: the scenario (stem), clinical questions, and expert opinion. Methods: This mixed methods study compared 3 ChatGPT-generated SCTs with 3 expert-created SCTs using a predefined framework. Clinician-educators and resident doctors in psychiatry involved in undergraduate medical education in Quebec, Canada, evaluated the 6 SCTs via a web-based survey on 3 criteria: the scenario, clinical questions, and expert opinion. They were also asked to describe the strengths and weaknesses of the SCTs. Results: A total of 102 respondents assessed the SCTs. There were no significant distinctions between the 2 types of SCTs concerning the scenario (P=.84), clinical questions (P=.99), and expert opinion (P=.07), as interpreted by the respondents. Indeed, respondents struggled to differentiate between ChatGPT- and expert-generated SCTs. ChatGPT showed promise in expediting SCT design, aligning well with Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition criteria, albeit with a tendency toward caricatured scenarios and simplistic content. Conclusions: This study is the first to concentrate on AI-supported SCT design at a time when medicine is changing swiftly and AI-generated technologies are expanding even faster. It suggests that ChatGPT can be a valuable tool for creating educational materials, though further validation is essential to ensure educational efficacy and accuracy.
Affiliation(s)
- Alexandre Hudon: Department of Psychiatry and Addictology, University of Montreal, Montreal, QC, Canada
- Barnabé Kiepura: Department of Psychiatry and Addictology, University of Montreal, Montreal, QC, Canada
- Véronique Phan: Department of Pediatrics, Université de Montréal, Montreal, QC, Canada
4
Silva Ríos AP, del Campo Rivas MN, Kuncar Uarac PK, Calvo Sprovera VA. Reliability of a script agreement test for undergraduate speech-language therapy students. CoDAS 2023;35:e20220098. [PMID: 37970957; PMCID: PMC10688298; DOI: 10.1590/2317-1782/20232022098es]
Abstract
PURPOSE: To estimate the reliability of scripts designed for undergraduate speech-language therapy students. METHODS: A descriptive cross-sectional study was carried out. Qualitative variables were summarized by frequency or proportion, and quantitative variables by means (95% CI). Reliability was estimated with Cronbach's α coefficient, and inter-rater agreement was determined using Fleiss' kappa index. The analytical tests used a significance level of p < 0.05. RESULTS: 80 scripts organized in four areas of speech-language therapy were validated by 41 speech-language pathologists. The professionals' average experience was 17.1 years. The reliability of the corpus was α = 0.67 (min = 0.34; max = 0.84), and the inter-rater agreement was κ = 0.29 (min = 0.07; max = 0.45). CONCLUSION: The corpus's reliability scores were similar to those reported by previous studies in different health professions. Validated strategies aimed at developing proficiency and supporting classic training activities in undergraduate courses will contribute to increasing the quality of future health professionals.
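For readers who want to see what these two statistics involve, here is a minimal sketch of computing Cronbach's α by hand and Fleiss' κ via statsmodels; the rating matrix and scale below are hypothetical illustrations, not the study's data.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical panel data: rows = raters, columns = script items,
# values = rating chosen on a 0..2 scale.
ratings = np.array([
    [2, 2, 1, 2, 2],
    [2, 1, 1, 2, 1],
    [1, 1, 0, 1, 1],
    [2, 2, 0, 2, 1],
])

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item across raters
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the raters' total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Fleiss' kappa expects a subjects-by-categories count table; aggregate_raters
# builds it from a subjects-by-raters matrix, so transpose: items act as subjects.
counts, _ = aggregate_raters(ratings.T)
print(f"alpha = {cronbach_alpha(ratings):.2f}")
print(f"kappa = {fleiss_kappa(counts):.2f}")
```

Note that κ treats the ratings as categorical agreement while α treats them as continuous scores, which is one reason the two coefficients can diverge, as they do in the abstract above.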
Affiliation(s)
- Angélica Pilar Silva Ríos: Escuela de Fonoaudiología, Facultad de Salud, Universidad Santo Tomás, Chile; Centro Interdisciplinario de Innovación Educativa, Universidad Santo Tomás, Chile
- Manuel Nibaldo del Campo Rivas: Escuela de Fonoaudiología, Facultad de Ciencias de la Salud, Universidad Católica Silva Henríquez, Santiago, Región Metropolitana, Chile
5
Deschênes MF, Maheu-Cadotte MA, Fontaine G, Dionne É. Scoring Methods in Script Concordance Tests: An Exploratory Psychometric Study. J Nurs Educ 2023;62:549-555. [PMID: 37812827; DOI: 10.3928/01484834-20230815-05]
Abstract
BACKGROUND: Despite the increasingly popular role of script concordance test (SCT) scoring methods in the evaluation of clinical reasoning, studies examining these methods in nursing are relatively scarce. This study explored the psychometric properties of five SCT scoring methods. METHOD: An SCT was administered to 12 experts and 43 learners. Scores were calculated using five methods and summarized with descriptive statistics. Differences in scores were assessed with the Mann-Whitney U test, and Spearman correlation coefficients were calculated between the methods. RESULTS: The median scores of both experts and learners differed substantially according to the scoring method used. Learners' scores were statistically different from experts' scores (p < .01) for each method. Spearman coefficients between the methods were positive (range, 0.44 to 0.95). CONCLUSION: Further research is needed to clarify the influence of SCT scoring methods before they are used in certifying assessment of clinical reasoning in nursing.
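As context for what a "scoring method" means here: the most widely used scheme is the aggregate (partial-credit) method usually attributed to Charlin and colleagues, in which each response earns the fraction of panelists who chose it relative to the modal response. A minimal sketch follows; the panel answers and the -2..+2 scale are hypothetical, not taken from this study.

```python
from collections import Counter

# Hypothetical panel answers for three SCT items (one list per item),
# on the usual five-point Likert-type scale from -2 to +2.
panel = [
    [-1, -1, 0, -1, 1, -1, 0, -1, -1, 0, -1, -1],
    [ 2,  1, 2,  2, 2,  1, 2,  2,  1, 2,  2,  2],
    [ 0,  0, 1, -1, 0,  1, 0,  0,  1, 0,  0,  0],
]

def aggregate_key(panel_answers):
    """Credit per response: n(panelists choosing it) / n(choosing the modal response)."""
    counts = Counter(panel_answers)
    modal = max(counts.values())
    return {resp: n / modal for resp, n in counts.items()}

def score_test(keys, student_answers):
    """Sum the credit each answer earns; answers no panelist chose earn 0."""
    return sum(key.get(ans, 0.0) for key, ans in zip(keys, student_answers))

keys = [aggregate_key(item) for item in panel]
print(score_test(keys, [-1, 2, 0]))   # modal answer on every item -> full marks: 3.0
print(score_test(keys, [0, 1, -2]))   # partial credit on items 1-2, none on item 3
```

Alternative methods of the kind compared in studies like this one typically vary this key, for example giving credit only for the modal answer or collapsing the scale, which is why median scores can shift substantially with the method chosen.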
6
Mok SF, Tan TMD, Seow CJ. Modified endocrinology script concordance test: evaluating the reliability and construct validity for assessing clinical reasoning. Singapore Med J 2023:384045. [PMID: 37675672; DOI: 10.4103/singaporemedj.smj-2021-230]
Affiliation(s)
- Shao Feng Mok: Department of Medicine, National University Hospital, Singapore
- Cherng Jye Seow: Department of Endocrinology, Tan Tock Seng Hospital, Singapore
7
Baudou E, Guilbeau-Frugier C, Tack I, Muscari F, Claudet I, Mas E, Taillefer A, Breinig S, Bréhin C. Clinical decision-making training using the Script Concordance Test and simulation: A pilot study for pediatric residents. Arch Pediatr 2023:S0929-693X(23)00056-8. [PMID: 37147153; DOI: 10.1016/j.arcped.2023.03.007]
Abstract
BACKGROUND: Each year, new pediatric residents begin their shifts in the pediatric emergency room. While technical skills are often acquired during workshops, non-technical skills such as communication, professionalism, situational awareness, and decision-making are rarely tested. Simulation enables non-technical skills to be developed in situations frequently encountered in pediatric emergencies. Adopting an innovative approach, we combined two pedagogical methods, the Script Concordance Test (SCT) and simulation, to improve the clinical reasoning and non-technical skills of first-year pediatric residents dealing with clinical situations involving febrile seizures. The aim of this work is to report the feasibility of such combined training. METHODS: First-year pediatric residents participated in a training session on managing a child attending the emergency department with a febrile seizure. At the beginning of the session, the trainees completed the SCT (seven clinical situations) and then participated in three simulation scenarios. Student satisfaction was assessed by means of a questionnaire at the end of the session. RESULTS: In this pilot study, 20 residents participated in the training. The SCT scores of the first-year pediatric residents were lower and more widely distributed than those of the experts, with better concordance for diagnostic items than for investigation or treatment items. All were satisfied with the teaching methods employed. Further sessions on additional topics relating to the management of pediatric emergency cases were requested. CONCLUSION: Although limited by its small size, this study shows that combining these teaching methods is feasible and seems promising for developing the non-technical skills of pediatric residents. These methods are in line with the changes being made to the third cycle of medical studies in France and can be adapted to other situations and other specialties.
Affiliation(s)
- E. Baudou: Unité de Neurologie Pédiatrique, Hôpital des Enfants, CHU de Toulouse, Toulouse, France
- I. Tack: Explorations Fonctionnelles Physiologiques, Hôpital Rangueil, CHU de Toulouse, Toulouse, France
- F. Muscari: Unité de Chirurgie Digestive, CHU de Toulouse, Toulouse, France
- I. Claudet: Unité d'Urgences et Infectiologie Pédiatrique, Hôpital des Enfants, CHU de Toulouse, Toulouse, France
- E. Mas: Unité de Gastroentérologie, Hépatologie, Nutrition, Diabétologie et Maladies Héréditaires du Métabolisme, Hôpital des Enfants, CHU de Toulouse, F-31300, IRSD, Université de Toulouse, INSERM, INRAE, ENVT, UPS, Toulouse, France
- A. Taillefer: Unité d'Urgences et Infectiologie Pédiatrique, Hôpital des Enfants, CHU de Toulouse, Toulouse, France
- S. Breinig: Unité de Réanimation Pédiatrique, Hôpital des Enfants, CHU de Toulouse, Toulouse, France
- C. Bréhin: Unité d'Urgences et Infectiologie Pédiatrique, Hôpital des Enfants, CHU de Toulouse, Toulouse, France
8
Ganesan S, Bhandary S, Thulasingam M, Chacko TV, Zayapragassarazan Z, Ravichandran S, Raja K, Ramasamy K, Alexander A, Penubarthi LK. Developing Script Concordance Test Items in Otolaryngology to Improve Clinical Reasoning Skills: Validation using Consensus Analysis and Psychometrics. Int J Appl Basic Med Res 2023;13:64-69. [PMID: 37614842; PMCID: PMC10443453; DOI: 10.4103/ijabmr.ijabmr_604_22]
Abstract
Background: Script concordance testing is widely used to foster and assess clinical reasoning. Our study aimed to develop a script concordance test (SCT) in the specialty of otolaryngology and to test its validity using panel response patterns and a consensus index. Materials and Methods: The methodology followed an iterative pattern of constructing SCT items, administering them to the panel members, and optimizing the item set using response patterns and the consensus index. The final items were then administered to students. Results: We developed 98 SCT items and administered them to 20 panel members. The panel members' mean score on these 98 items was 79.5 (standard deviation [SD] = 4.4). The consensus index for the 98-item SCT ranged from 25.81 to 100. Sixteen items had bimodal or uniform response patterns; the consensus index improved when these were eliminated. We administered the remaining 82 items to 30 undergraduate and ten postgraduate students. The mean score of undergraduate students was 61.1 (SD = 7.5) and that of postgraduate students was 67.7 (SD = 6.3). Cronbach's alpha for the 82-item SCT was 0.74. Excluding 22 poorly performing items, the final 60-item SCT instrument had a Cronbach's alpha of 0.82. Conclusion: Our study found that a consensus index above 60 was associated with good item-total correlation and could be used to optimize items against panel responses, although further studies on this aspect are needed. It also showed that the clustering pattern of panel responses can be used to categorize items, though bimodal and uniform distribution patterns need further differentiation.
Affiliation(s)
- Sivaraman Ganesan: Department of ENT, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
- Shital Bhandary: Department of Public Health and Medical Education, Patan Academy of Health Sciences, Lalitpur, Nepal
- Mahalakshmy Thulasingam: Department of Preventive and Social Medicine, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
- Thomas Vengail Chacko: Department of Community Medicine, Believers Church Medical College Hospital, Thiruvalla, Kerala, India
- Z. Zayapragassarazan: Department of Medical Education, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
- Surya Ravichandran: Department of ENT, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
- Kalaiarasi Raja: Department of ENT, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
- Karthikeyan Ramasamy: Department of ENT, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
- Arun Alexander: Department of ENT, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
- Lokesh Kumar Penubarthi: Department of ENT, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India
9
Pusic MV, Cook DA, Friedman JL, Lorin JD, Rosenzweig BP, Tong CK, Smith S, Lineberry M, Hatala R. Modeling Diagnostic Expertise in Cases of Irreducible Uncertainty: The Decision-Aligned Response Model. Acad Med 2023;98:88-97. [PMID: 36576770; PMCID: PMC9780042; DOI: 10.1097/acm.0000000000004918]
Abstract
PURPOSE Assessing expertise using psychometric models usually yields a measure of ability that is difficult to generalize to the complexity of diagnoses in clinical practice. However, using an item response modeling framework, it is possible to create a decision-aligned response model that captures a clinician's decision-making behavior on a continuous scale that fully represents competing diagnostic possibilities. In this proof-of-concept study, the authors demonstrate the necessary statistical conceptualization of this model using a specific electrocardiogram (ECG) example. METHOD The authors collected a range of ECGs with elevated ST segments due to either ST-elevation myocardial infarction (STEMI) or pericarditis. Based on pilot data, 20 ECGs were chosen to represent a continuum from "definitely STEMI" to "definitely pericarditis," including intermediate cases in which the diagnosis was intentionally unclear. Emergency medicine and cardiology physicians rated these ECGs on a 5-point scale ("definitely STEMI" to "definitely pericarditis"). The authors analyzed these ratings using a graded response model showing the degree to which each participant could separate the ECGs along the diagnostic continuum. The authors compared these metrics with the discharge diagnoses noted on chart review. RESULTS Thirty-seven participants rated the ECGs. As desired, the ECGs represented a range of phenotypes, including cases where participants were uncertain in their diagnosis. The response model showed that participants varied both in their propensity to diagnose one condition over another and in where they placed the thresholds between the 5 diagnostic categories. The most capable participants were able to meaningfully use all categories, with precise thresholds between categories. CONCLUSIONS The authors present a decision-aligned response model that demonstrates the confusability of a particular ECG and the skill with which a clinician can distinguish 2 diagnoses along a continuum of confusability. These results have broad implications for testing and for learning to manage uncertainty in diagnosis.
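For orientation, the graded response model used in this analysis is conventionally written in Samejima's form, in which the probability of responding in category k or above is a logistic function of the latent trait. The notation below is the standard IRT formulation, not taken from the paper itself:

```latex
P(X \ge k \mid \theta) = \frac{1}{1 + e^{-a(\theta - b_k)}},
\qquad
P(X = k \mid \theta) = P(X \ge k \mid \theta) - P(X \ge k + 1 \mid \theta).
```

Read against this abstract, \(\theta\) would be an ECG's location on the STEMI-to-pericarditis continuum, \(a\) a participant's ability to separate the two diagnoses, and \(b_1 < b_2 < b_3 < b_4\) that participant's thresholds between the five response categories.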
Affiliation(s)
- Martin V. Pusic: associate professor of emergency medicine, Departments of Pediatrics and Emergency Medicine, Harvard Medical School, Boston, Massachusetts; ORCID: https://orcid.org/0000-0001-5236-6598
- David A. Cook: professor of medicine and medical education, chair, Mayo Clinic Multidisciplinary Simulation Center Research Committee, and consultant, Division of General Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota; ORCID: https://orcid.org/0000-0003-2383-4633
- Julie L. Friedman: assistant professor of clinical medicine, Department of Medicine, Weill Cornell Medical College, New York, New York
- Jeffrey D. Lorin: assistant professor, Department of Medicine, NYU Grossman School of Medicine, New York, New York
- Barry P. Rosenzweig: associate professor, Department of Medicine, associate director for educational affairs, Leon H. Charney Division of Cardiology, and assistant dean for graduate medical education, NYU Grossman School of Medicine, New York, New York
- Calvin K.W. Tong: cardiologist and codirector, Heart Failure Services, Surrey Memorial Hospital, Surrey, British Columbia, Canada
- Silas Smith: associate professor of emergency medicine, Department of Emergency Medicine, NYU Grossman School of Medicine, New York, New York
- Matthew Lineberry: associate professor of population health, Department of Population Health, University of Kansas Medical Center and Health System, Kansas City, Kansas; ORCID: https://orcid.org/0000-0002-0177-5305
- Rose Hatala: professor, Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada; ORCID: https://orcid.org/0000-0003-0521-2590
10
Tayce JD, Saunders AB. The Use of a Modified Script Concordance Test in Clinical Rounds to Foster and Assess Clinical Reasoning Skills. J Vet Med Educ 2022;49:556-559. [PMID: 34784257; DOI: 10.3138/jvme-2021-0090]
Abstract
The development of clinical reasoning skills is a high priority during clinical service, but an unpredictable caseload and limited time for formal instruction make it challenging for faculty to foster and assess students' individual clinical reasoning skills. We developed an assessment-for-learning activity that helps students build their clinical reasoning skills, based on a modified version of the script concordance test (SCT). To modify the standard SCT, we simplified it by limiting students to a 3-point Likert scale instead of a 5-point scale and added a free-text box for students to justify their answers. Students completed the modified SCT during clinical rounds to prompt a group discussion with the instructor. Student feedback was positive, and the instructor gained valuable insight into the students' thought processes. A modified SCT can be adopted as part of a multimodal approach to teaching on the clinic floor. The purpose of this article is to describe our modifications to the standard SCT and our findings from implementing it in a clinical rounds setting as a method of formative assessment for learning and developing clinical reasoning skills.
11
Khan RN, Siddiqui NA. The Use of Formative Assessment in Postgraduate Urology Training: A Systematic Review. Cureus 2022;14:e27162. [PMID: 36017282; PMCID: PMC9393543; DOI: 10.7759/cureus.27162]
Abstract
Formative assessment is an essential component of surgical training. However, it is not usually a mandatory component of postgraduate curricula. The purpose of this study was to identify and evaluate how formative assessments are integrated into postgraduate urology training in programs across the globe. A systematic review was conducted of how formative assessments are implemented in urology programs globally. A total of 427 articles were identified in the literature search. Of these, only 10 were included and critically appraised. These studies explored various approaches to formative assessment in urology training programs, including established tools, such as portfolio reviews and direct observation of procedural skills (DOPS); novel tools, including the Dutch urology practical skills (D-UPS) program and the Ottawa surgical competency operating room evaluation (O-SCORE); and curricular models. Nine of the 10 articles supported the tools' potential utility for formative assessment. The current literature on formative assessment in postgraduate urology programs is scarce, and the available studies are highly heterogeneous. More structured formative assessments need to be incorporated into surgical training programs, and affiliated training institutions should be encouraged to integrate them into their curricula.
12
Kün-Darbois JD, Annweiler C, Lerolle N, Lebdai S. Script concordance test acceptability and utility for assessing medical students' clinical reasoning: a user's survey and an institutional prospective evaluation of students' scores. BMC Med Educ 2022;22:277. [PMID: 35418078; PMCID: PMC9008989; DOI: 10.1186/s12909-022-03339-1]
Abstract
Script concordance testing (SCT) is a method for assessing clinical reasoning in health-care training. Our aim was to assess SCT acceptability and utility with a user survey and an institutional prospective evaluation of students' scores. Through an online survey, we collected the opinions and satisfaction data of all graduate students and teachers involved in the SCT setting. We also performed a prospective analysis comparing the scores obtained with SCT to those obtained by the same students with the national standard evaluation modality (PCC). General opinions about the SCT were mostly negative, and students tended to express more negative opinions and perceptions than teachers: the teachers' satisfaction survey showed a lower proportion of negative responses, a higher proportion of neutral responses, and a higher proportion of positive positions towards all questions. PCC scores significantly increased each year, whereas SCT scores increased only between the first and second tests, and PCC scores were significantly higher than SCT scores on the second and third tests. In summary, medical students' and teachers' global opinion of the SCT was negative; SCT scores were initially quite similar to PCC scores, but PCC scores progressed more over time.
Affiliation(s)
- Jean-Daniel Kün-Darbois: Maxillofacial Surgery Department, University Hospital of Angers, 49933 Angers Cedex, France; Faculty for Health Sciences and Medicine, University of Angers, Angers, France
- Cédric Annweiler: Faculty for Health Sciences and Medicine, University of Angers, Angers, France; Geriatric Department, University Hospital of Angers, Angers, France
- Nicolas Lerolle: Faculty for Health Sciences and Medicine, University of Angers, Angers, France; Intensive Care Department, University Hospital of Angers, Angers, France
- Souhil Lebdai: Faculty for Health Sciences and Medicine, University of Angers, Angers, France; Urology Department, University Hospital of Angers, Angers, France
13
Patel R. General practice trainees' learning experiences of formative think-aloud script concordance testing. Educ Prim Care 2022;33:229-236. [DOI: 10.1080/14739879.2022.2057240]
Affiliation(s)
- Rajan Patel: Academic Clinical Fellow, Nuffield Department of Primary Care Health Sciences, University of Oxford Medical Sciences Division, Radcliffe Primary Care Building, Radcliffe Observatory Quarter, Oxford, United Kingdom
14
Kelkar A, Bhandary S, Chacko T. Addressing the need to develop critical thinking skills in the new competency-based medical education postgraduate curriculum in pathology: Experience-sharing of the process of development and validation of script concordance test. Arch Med Health Sci 2022. [DOI: 10.4103/amhs.amhs_227_22]
15
Bryant GA, Dy-Boarman EA, Herring MS, Witry MJ. Use of a script concordance test to evaluate the impact of a targeted educational strategy on clinical reasoning in advanced pharmacy practice experiential students. Curr Pharm Teach Learn 2021;13:1024-1031. [PMID: 34294243; DOI: 10.1016/j.cptl.2021.06.015]
Abstract
BACKGROUND AND PURPOSE: It is unclear how clinical reasoning is impacted by a single advanced pharmacy practice experience (APPE) and how preceptors can further develop these skills. EDUCATIONAL ACTIVITY AND SETTING: Students completing an APPE at four sites were invited to participate. To assess clinical reasoning skills, students completed a 30-item script concordance test (SCT) during week 1 and week 5 of a rotation. Students were divided into control and intervention groups. The intervention group participated in a clinical reasoning discussion, during which students presented a case and led a discussion on how to reason through treatment options. FINDINGS: Changes in mean SCT scores between week 1 and week 5 were 0.84 (2.8%) in the control group (n = 15) and 1.23 (4.1%) in the intervention group (n = 28). The change was not statistically significant in the control group (P = .07; CI -0.34 to 2.01) but was in the intervention group (P = .02; CI 0.23 to 2.23). An independent-samples t-test comparing the score change between the control and intervention groups showed no significant difference (P = .62; CI -1.18 to 1.96). SUMMARY: This study demonstrated the feasibility of implementing an SCT in experiential education. SCT scores did not improve significantly beyond the standard APPE in response to the focused educational intervention, but the investigators found that the discussion facilitated rich conversations about patient cases and was valuable for assessing students' thinking patterns.
Affiliation(s)
- Ginelle A. Bryant: Department of Clinical Sciences, Drake University College of Pharmacy and Health Sciences, 2507 University Avenue, Des Moines, IA 50311-4505, United States
- Eliza A. Dy-Boarman: Department of Clinical Sciences, Drake University College of Pharmacy and Health Sciences, 2507 University Avenue, Des Moines, IA 50311-4505, United States
- Morgan S. Herring: Department of Pharmacy Practice and Science, Division of Applied Clinical Sciences, University of Iowa College of Pharmacy, 180 South Grand Avenue, Iowa City, Iowa 52242, United States
- Matthew J. Witry: Department of Pharmacy Practice and Science, Division of Health Services Research, University of Iowa College of Pharmacy, 180 South Grand Avenue 342 CPB, Iowa City, Iowa 52242, United States
17
Gawad N, Wood TJ, Malvea A, Cowley L, Raiche I. The Impact of Surgeon Experience on Script Concordance Test Scoring. J Surg Res 2021;265:265-271. [PMID: 33964636; DOI: 10.1016/j.jss.2021.03.057]
Abstract
OBJECTIVE: The Script Concordance Test (SCT) is a test of clinical decision-making that relies on an expert panel to create its scoring key. Existing literature demonstrates the value of specialty-specific experts, but the effect of experience within the expert panel is unknown. The purpose of this study was to explore the role of surgeon experience in SCT scoring. DESIGN: An SCT was administered to 29 general surgery residents and 14 staff surgeons. Staff surgeons were stratified as junior or senior experts based on years since completing residency training (<15 versus >25 years). The SCT was scored using the full expert panel, the senior panel, the junior panel, and a subgroup junior panel in practice <5 years. A one-way ANOVA was used to compare the scores of first-year (R1) and fifth-year (R5) residents under each scoring scheme. Cognitive interviews were analyzed for differences between junior and senior expert panelists' responses. RESULTS: There was no statistically significant difference between the mean scores of six R1s and five R5s using the full expert panel (R1 69.08 versus R5 67.06, F(1,9) = 0.10, P = 0.76), the junior panel (R1 66.73 versus R5 62.50, F(1,9) = 0.35, P = 0.57), or the subgroup panel in practice <5 years (R1 61.07 versus R5 58.79, F(1,9) = 0.18, P = 0.75). However, the average score of R1s was significantly lower than that of R5s when using the senior faculty panel (R1 52.04 versus R5 63.26, F(1,9) = 26.90, P = 0.001). Cognitive interview data suggest that some responses of junior experts demonstrate less confidence than those of senior experts. CONCLUSIONS: SCT scores are significantly affected by the responses of the expert panel. Differences between first- and fifth-year residents were demonstrated only when using an expert panel consisting of senior faculty members. Confidence may play a role in the response selections of junior experts. When constructing an SCT expert panel, consideration must be given to the experience of panel members.
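To make the panel dependence concrete, the rescore-and-compare step can be sketched as follows: build an aggregate key from each expert subset, rescore the same resident answers, and compare cohorts with a one-way ANOVA. All names, panel splits, and data below are hypothetical illustrations, not the study's.

```python
from collections import Counter

import numpy as np
from scipy.stats import f_oneway

def aggregate_key(panel_answers):
    """Aggregate SCT key: credit = n(choosing response) / n(choosing modal response)."""
    counts = Counter(panel_answers)
    modal = max(counts.values())
    return {resp: n / modal for resp, n in counts.items()}

# Hypothetical single-item panel answers, split by experience.
junior_panel = [-1, 0, -1, 1, 0, -1, 0]
senior_panel = [-1, -1, -1, 0, -1, -1, -1]

# Hypothetical answers from first-year (R1) and fifth-year (R5) residents.
r1_answers = [0, 1, 0, -1, 1, 0]
r5_answers = [-1, -1, 0, -1, -1, 0]

for name, panel in [("junior", junior_panel), ("senior", senior_panel)]:
    key = aggregate_key(panel)
    r1 = [key.get(a, 0.0) for a in r1_answers]
    r5 = [key.get(a, 0.0) for a in r5_answers]
    f, p = f_oneway(r1, r5)   # one-way ANOVA across the two cohorts
    print(f"{name} key: R1 mean {np.mean(r1):.2f}, R5 mean {np.mean(r5):.2f}, p = {p:.3f}")
```

In this toy data, the junior panel's answers are spread across the scale, so its key awards broad partial credit and barely separates the cohorts, while the senior panel's tighter consensus produces a key under which R5 answers clearly outscore R1 answers; this mirrors the direction of the effect reported above.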
Affiliation(s)
- Nada Gawad: Division of General Surgery, Department of Surgery, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada; Department of Innovation in Medical Education (DIME), University of Ottawa, Ottawa, Ontario, Canada
- Timothy J. Wood: Department of Innovation in Medical Education (DIME), University of Ottawa, Ottawa, Ontario, Canada
- Anahita Malvea: Division of General Surgery, Department of Surgery, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Lindsay Cowley: Department of Innovation in Medical Education (DIME), University of Ottawa, Ottawa, Ontario, Canada
- Isabelle Raiche: Division of General Surgery, Department of Surgery, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada; Department of Innovation in Medical Education (DIME), University of Ottawa, Ottawa, Ontario, Canada
18
Ottolini MC, Chua I, Campbell J, Ottolini M, Goldman E. Pediatric Hospitalists' Performance and Perceptions of Script Concordance Testing for Self-Assessment. Acad Pediatr 2021;21:252-258. [PMID: 33065290; DOI: 10.1016/j.acap.2020.10.003]
Abstract
OBJECTIVES: The cognitive expertise of pediatric hospitalists (PH) lies not in standard knowledge but in making decisions under conditions of uncertainty. To maintain expertise, PH should engage in deliberate practice via self-assessments that promote the higher-level cognitive processes necessary to address problems with missing or ambiguous information. Script Concordance Test (SCT) questions are purported to elicit higher levels of cognition than multiple-choice questions (MCQs). Our objectives were to determine whether PH use higher levels of cognition when answering SCT versus MCQ questions, and to analyze participants' perceptions of the utility of SCT self-assessment for deliberate practice in addressing clinical problems encountered in daily practice. METHODS: This mixed methods study compared the cognitive level expressed, according to Bloom's taxonomy, by PH answering MCQ versus SCT questions during a "think aloud" exercise, followed by qualitative analysis of interviews conducted afterward. RESULTS: A significantly greater percentage of comments were coded as higher-order cognitive processes (apply, analyze, evaluate, and create) for SCT versus MCQ questions (74% vs 19%) compared with lower-order processes (remember, understand); chi-square P < .00001. Analysis of interviews revealed six themes. CONCLUSION: SCT questions elicited the higher-level cognition essential to clinical reasoning compared to MCQ questions. PH indicated that MCQ questions measure standard knowledge, while SCT questions better measure decision-making under conditions of uncertainty. PH perceived that SCT could be useful for deliberate practice in pediatric hospital medicine decision-making if they could compare their rationale for answering questions with that of experts.
Affiliation(s)
- Mary C. Ottolini: Department of Pediatrics, The Barbara Bush Children's Hospital at Maine Medical Center, Portland, Maine
- Ian Chua: Department of Pediatrics, Children's National Medical Center, George Washington University School of Medicine and Health Sciences, Washington, DC
- Joyce Campbell: Department of Pediatrics, Children's National Medical Center, George Washington University School of Medicine and Health Sciences, Washington, DC
- Martin Ottolini: Department of Pediatrics, Uniformed Services University of the Health Sciences, Bethesda, Md
- Ellen Goldman: George Washington University Graduate School of Education and Human Development, George Washington University School of Medicine and Health Sciences, Washington, DC
19
Gawad N, Wood TJ, Cowley L, Raiche I. How do cognitive processes influence script concordance test responses? Med Educ 2021;55:354-364. [PMID: 33185303; DOI: 10.1111/medu.14416]
Abstract
INTRODUCTION: The script concordance test (SCT) is a test of clinical decision-making (CDM) that compares the thought process of learners to that of experts to determine to what extent their cognitive 'scripts' align. Without understanding test-takers' cognitive process, however, it is unclear what influences their responses. The objective of this study was to gather response process validity evidence by studying the cognitive process of test-takers, to determine whether the SCT tests CDM and what cognitive processes may influence SCT responses. METHODS: Cases from an SCT used in a national validation study were administered, and semi-structured cognitive interviews were conducted with ten residents and five staff surgeons. A retrospective verbal probing technique was used. Data were independently analysed and coded by two analysts. Themes were identified as factors that influenced SCT responses during the cognitive interview. RESULTS: Cognitive interviews demonstrated variability in CDM among test-takers. Consistent with dual process theory, test-takers relied on scripts formed through past experience, when available, to make decisions, and used conscious deliberation in the absence of experience. However, test-takers' response process was also influenced by their comprehension of specific terms, desire for additional information, disagreement with the planned management, underlying knowledge gaps, and desire to demonstrate confidence or humility. CONCLUSION: The rationale behind SCT answers may be influenced by comprehension, underlying knowledge, and social desirability, in addition to formed scripts and/or conscious deliberation. Having test-takers verbalise their rationale provides a depth of assessment that is otherwise lost in the SCT's current format. Improving the SCT construction process and combining the SCT question format with verbal responses may improve the use of the SCT for CDM assessment.
Affiliation(s)
- Nada Gawad: Division of General Surgery, Department of Surgery, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada; Department of Innovation in Medical Education (DIME), University of Ottawa, Ottawa, ON, Canada
- Timothy J. Wood: Department of Innovation in Medical Education (DIME), University of Ottawa, Ottawa, ON, Canada
- Lindsay Cowley: Department of Innovation in Medical Education (DIME), University of Ottawa, Ottawa, ON, Canada
- Isabelle Raiche: Division of General Surgery, Department of Surgery, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada; Department of Innovation in Medical Education (DIME), University of Ottawa, Ottawa, ON, Canada
20
Cohen Aubart F, Papo T, Hertig A, Renaud MC, Steichen O, Amoura Z, Braun M, Palombi O, Duguet A, Roux D. Are script concordance tests suitable for the assessment of undergraduate students? A multicenter comparative study. Rev Med Interne 2020;42:243-250. [PMID: 33288231; DOI: 10.1016/j.revmed.2020.11.001]
Abstract
INTRODUCTION: Script concordance tests (SCTs) have been developed to assess clinical reasoning in uncertain situations. Their reliability for the evaluation of undergraduate medical students has not been established. METHODS: Twenty internal medicine SCT cases were administered to undergraduate students in two programs. The results obtained on the SCTs were compared to those obtained by the same students on clinically based classical multiple-choice questions (MCQs). RESULTS: A total of 551/883 students (62%) answered the SCTs. The mean aggregate score (out of 20 points) was 11.54 (3.29). The success rate and mean score for each question did not differ depending on the modal response, but the discrimination rate did. The students' SCT results correlated with their scores on the MCQ tests. Among students, 446/517 (86%) considered the SCTs more difficult than classical MCQs, although the mean score did not differ between the SCT and MCQ tests. CONCLUSION: The use of SCTs is a feasible option for the evaluation of undergraduate students. SCT scores correlated with those obtained on classical MCQ tests.
Affiliation(s)
- F. Cohen Aubart: Service de médecine interne 2, Centre national de référence maladies systémiques rares et histiocytoses, hôpital Pitié-Salpêtrière, Sorbonne université, Assistance publique-Hôpitaux de Paris, 75013 Paris, France
- T. Papo: Département de médecine interne, hôpital Bichat, université de Paris, Assistance publique-Hôpitaux de Paris, 75018 Paris, France
- A. Hertig: Service de néphrologie et transplantation rénale, hôpital Pitié-Salpêtrière, Sorbonne université, Assistance publique-Hôpitaux de Paris, 75013 Paris, France
- M.-C. Renaud: Faculté de médecine, Sorbonne université, 75013 Paris, France
- O. Steichen: Service de médecine interne, hôpital Tenon, Sorbonne université, Assistance publique-Hôpitaux de Paris, 75020 Paris, France
- Z. Amoura: Service de médecine interne 2, Centre national de référence maladies systémiques rares et histiocytoses, hôpital Pitié-Salpêtrière, Sorbonne université, Assistance publique-Hôpitaux de Paris, 75013 Paris, France
- M. Braun: Service de neuroradiologie, université de Lorraine, CHRU de Nancy, 54035 Nancy, France
- O. Palombi: Service de neurochirurgie, université Grenoble Alpes, CHU de Grenoble, 38000 Grenoble, France
- A. Duguet: Service de pneumologie, hôpital Pitié-Salpêtrière, Sorbonne université, Assistance publique-Hôpitaux de Paris, 75013 Paris, France
- D. Roux: Service de médecine intensive réanimation, hôpital Louis-Mourier, université de Paris, Assistance publique-Hôpitaux de Paris, 92700 Colombes, France; Inserm, IAME, UMR-1137, 75018 Paris, France
21
Schuwirth LWT, Durning SJ, King SM. Assessment of clinical reasoning: three evolutions of thought. Diagnosis (Berl) 2020;7:191-196. [PMID: 32182208; DOI: 10.1515/dx-2019-0096]
Abstract
Although assessing clinical reasoning is almost universally considered central to medical education, it is not a straightforward issue. In the past decades, our insights into clinical reasoning as a phenomenon, and consequently into the best ways to assess it, have undergone significant changes. In this article, we describe how the interplay between fundamental research, practical applications, and evaluative research has pushed the evolution of our thinking and our practices in assessing clinical reasoning.
Affiliation(s)
- Lambert W.T. Schuwirth: Prideaux Centre for Research in Health Professions Education, Flinders University, Adelaide, South Australia, Australia
- Svetlana M. King: Prideaux Centre for Research in Health Professions Education, Flinders University, Adelaide, South Australia, Australia
22
Steinberg E, Cowan E, Lin MP, Sielicki A, Warrington S. Assessment of Emergency Medicine Residents' Clinical Reasoning: Validation of a Script Concordance Test. West J Emerg Med 2020;21:978-984. [PMID: 32726273; PMCID: PMC7390545; DOI: 10.5811/westjem.2020.3.46035]
Abstract
INTRODUCTION: A primary aim of residency training is to develop competence in clinical reasoning. However, there are few instruments that can accurately, reliably, and efficiently assess residents' clinical decision-making ability. This study aimed to externally validate the script concordance test in emergency medicine (SCT-EM), an assessment tool designed for this purpose. METHODS: Using established methodology for the SCT-EM, we compared EM residents' performance on the SCT-EM to that of an expert panel of emergency physicians at three urban academic centers. We performed adjusted pairwise t-tests to compare differences between all residents and attending physicians, as well as among resident postgraduate year (PGY) levels. Correlation between SCT-EM and Accreditation Council for Graduate Medical Education Milestone scores was tested using Pearson's correlation coefficients. Inter-item covariances for SCT items were calculated using Cronbach's alpha statistic. RESULTS: The SCT-EM was administered to 68 residents and 13 attendings. There was a significant difference in mean scores among all groups (mean ± standard deviation: PGY-1, 59 ± 7; PGY-2, 62 ± 6; PGY-3, 60 ± 8; PGY-4, 61 ± 8; attendings, 73 ± 8; p < 0.01). Post hoc pairwise comparisons demonstrated that significant differences in mean scores occurred only between each PGY level and the attendings (p < 0.01 for PGY-1 to PGY-4 vs the attending group). Performance on the SCT-EM and EM Milestones was not significantly correlated (r = 0.12, p = 0.35). Internal reliability of the exam, determined using Cronbach's alpha, was 0.67 for all examinees and 0.89 in the expert-only group. CONCLUSION: The SCT-EM has limited utility in reliably assessing clinical reasoning among EM residents. Although it differentiated clinical reasoning ability between residents and expert faculty, it did not differentiate between PGY levels or correlate with Milestones scores. Furthermore, several limitations threaten the validity of the SCT-EM, suggesting further study is needed in more diverse settings.
Affiliation(s)
- Eric Steinberg: St. Joseph's University Medical Center, Department of Emergency Medicine, Paterson, New Jersey
- Ethan Cowan: Mount Sinai Beth Israel, Icahn School of Medicine at Mount Sinai, Department of Emergency Medicine, New York, New York
- Michelle P. Lin: Mount Sinai Beth Israel, Icahn School of Medicine at Mount Sinai, Department of Emergency Medicine, New York, New York
- Anthony Sielicki: Mount Sinai Beth Israel, Icahn School of Medicine at Mount Sinai, Department of Emergency Medicine, New York, New York
- Steven Warrington: Orange Park Medical Center, Department of Emergency Medicine, Orange Park, Florida
23
Gawad N, Wood TJ, Cowley L, Raiche I. The cognitive process of test takers when using the script concordance test rating scale. Med Educ 2020;54:337-347. [PMID: 31912562; DOI: 10.1111/medu.14056]
Abstract
CONTEXT Clinical decision making (CDM) skills are important to learn and assess in order to establish competence in trainees. A common tool for assessing CDM is the script concordance test (SCT), which asks test takers to indicate how a new clinical finding influences a proposed plan using a Likert-type scale. Most criticisms of the SCT relate to its rating scale but are largely theoretical. The cognitive process of test takers when selecting their responses using the SCT rating scale remains understudied, but is essential to gathering validity evidence for use of the SCT in CDM assessment. METHODS Cases from an SCT used in a national validation study were administered to 29 residents and 14 staff surgeons. Semi-structured cognitive interviews were then conducted with 10 residents and five staff surgeons based on the SCT results. Cognitive interview data were independently coded by two data analysts, who specifically sought to elucidate how participants mapped their internally generated responses to any of the rating scale options. RESULTS Five major issues were identified with the response matching cognitive process: (a) the meaning of the '0' response option; (b) which response corresponds to agreement with the planned management; (c) the rationale for picking '±1' versus '±2'; (d) which response indicates the desire to undertake the planned management plus an additional procedure, and (e) the influence of time on response selection. CONCLUSIONS Studying how test takers (experts and trainees) interpret the SCT rating scale has revealed several issues related to inconsistent and unintended use. Revising the scale to address the variety of interpretations could help to improve the response process validity of the SCT and therefore improve the SCT's ability to be used in CDM skills assessments.
Affiliation(s)
- Nada Gawad: Division of General Surgery, Department of Surgery, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada; Department of Innovation in Medical Education, University of Ottawa, Ottawa, Ontario, Canada
- Timothy J. Wood: Department of Innovation in Medical Education, University of Ottawa, Ottawa, Ontario, Canada
- Lindsay Cowley: Department of Innovation in Medical Education, University of Ottawa, Ottawa, Ontario, Canada
- Isabelle Raiche: Division of General Surgery, Department of Surgery, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada; Department of Innovation in Medical Education, University of Ottawa, Ottawa, Ontario, Canada
24
Monteiro SD, Sherbino J, Schmidt H, Mamede S, Ilgen J, Norman G. It's the destination: diagnostic accuracy and reasoning. Adv Health Sci Educ Theory Pract 2020;25:19-29. [PMID: 31332589; DOI: 10.1007/s10459-019-09903-7]
Abstract
While multiple theories exist to explain the diagnostic process, there are few available assessments that reliably determine diagnostic competence in trainees. Most methods focus on aspects of the process of diagnostic reasoning, such as the relation between case features and diagnostic hypotheses. Inevitably, detailed elucidation of aspects of the process requires substantial time per case and limits the number of cases that can be examined in a limited testing time. Shifting assessment to the outcome of diagnostic reasoning, accuracy of the diagnosis, may serve as a reliable measure of diagnostic competence and would allow increased sampling across cases. The present study is a retrospective analysis of 7 large studies, conducted by 3 research teams, that all used a series of brief written cases to examine the outcome of diagnostic reasoning: the diagnosis. The studies involved over 600 clinicians ranging from final-year medical students to practicing emergency physicians. For the 4 studies with usable reliability data, reliability for a 2 h test ranged from .63 to .94. On average, speeded tests were more reliable than unspeeded tests (.85 vs. .73). To achieve a reliability of .75 required an average test time of 1.11 h for speeded tests and 1.99 h for unspeeded tests. The measure was positively correlated with both written knowledge tests and measures of problem solving derived from OSCE performance tests. This retrospective analysis provides evidence to support the implementation of outcome-based assessments of clinical reasoning.
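The abstract does not state how reliability was projected onto test time; the standard psychometric step is the Spearman-Brown prophecy formula, and under that assumption the reported figures are easy to approximate:

```latex
% Spearman-Brown prophecy: reliability of a test lengthened by a factor k
% relative to a base test with reliability \rho_1.
\rho_k = \frac{k\,\rho_1}{1 + (k-1)\,\rho_1}
\qquad\Longrightarrow\qquad
k = \frac{\rho_k\,(1-\rho_1)}{\rho_1\,(1-\rho_k)}
% Worked example: from the average speeded reliability of .85 on a 2 h test,
% the time needed to reach a reliability of .75 is
k = \frac{0.75 \times 0.15}{0.85 \times 0.25} \approx 0.53,
\qquad T \approx 0.53 \times 2\,\mathrm{h} \approx 1.06\,\mathrm{h}.
```

This lands close to the 1.11 h reported for speeded tests; the small gap plausibly reflects averaging across the individual studies rather than applying the formula to pooled means.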
Affiliation(s)
- Sandra D Monteiro: Department of Health Research Methods, Evidence and Impact, McMaster University, 1280 Main Street West, Hamilton, ON, L8S 4L8, Canada; McMaster Faculty of Health Sciences Program in Education Research, Innovation and Theory (MERIT), McMaster University, Hamilton, ON, Canada
- Jonathan Sherbino: McMaster Faculty of Health Sciences Program in Education Research, Innovation and Theory (MERIT), McMaster University, Hamilton, ON, Canada; Division of Emergency Medicine, McMaster University, Hamilton, ON, Canada
- Henk Schmidt: Department of Psychology, Erasmus School of Social and Behavioural Sciences, Erasmus University, Rotterdam, The Netherlands; Institute of Medical Education Research Rotterdam, Erasmus Medical Center, Rotterdam, The Netherlands
- Silvia Mamede: Department of Psychology, Erasmus School of Social and Behavioural Sciences, Erasmus University, Rotterdam, The Netherlands; Institute of Medical Education Research Rotterdam, Erasmus Medical Center, Rotterdam, The Netherlands
- Jonathan Ilgen: Division of Emergency Medicine, School of Medicine, Center for the Leadership and Innovation in Medical Education, University of Washington, Seattle, WA, USA
- Geoff Norman: Department of Health Research Methods, Evidence and Impact, McMaster University, 1280 Main Street West, Hamilton, ON, L8S 4L8, Canada; McMaster Faculty of Health Sciences Program in Education Research, Innovation and Theory (MERIT), McMaster University, Hamilton, ON, Canada
25
van der Vleuten CPM, Schuwirth LWT. Assessment in the context of problem-based learning. Advances in Health Sciences Education: Theory and Practice 2019;24:903-914. PMID: 31578642. PMCID: PMC6908559. DOI: 10.1007/s10459-019-09909-1.
Abstract
Arguably, constructive alignment has been the major challenge for assessment in the context of problem-based learning (PBL). PBL focuses on promoting abilities such as clinical reasoning, team skills and metacognition. PBL also aims to foster self-directed learning and deep learning as opposed to rote learning. This has incentivized researchers in assessment to find possible solutions. Originally, these solutions were sought in developing the right instruments to measure these PBL-related skills. The search for these instruments was accelerated by the emergence of competency-based education. With competency-based education, assessment moved away from purely standardized testing, relying more heavily on professional judgment of complex skills. Valuable lessons have been learned that are directly relevant for assessment in PBL. Later, solutions were sought in the development of new assessment strategies, initially again with individual instruments such as progress testing, but later through a more holistic approach to the assessment program as a whole. Programmatic assessment is such an integral approach to assessment. It focuses on optimizing learning through assessment, while at the same time gathering rich information that can be used for rigorous decision-making about learner progression. Programmatic assessment comes very close to achieving the desired constructive alignment with PBL, but its wide adoption, just like that of PBL, will take many years.
Affiliation(s)
- Cees P M van der Vleuten: School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
- Lambert W T Schuwirth: Prideaux Centre for Research in Health Professions Education, College of Medicine and Public Health, Flinders University, Sturt Road, Bedford Park, SA, 5042, Australia
26
Thiessen N, Fischer MR, Huwendiek S. Assessment methods in medical specialist assessments in the DACH region - overview, critical examination and recommendations for further development. GMS Journal for Medical Education 2019;36:Doc78. PMID: 31844650. PMCID: PMC6905366. DOI: 10.3205/zma001286.
Abstract
Introduction: Specialist medical assessments fulfil the task of ensuring that physicians have the clinical competence to independently represent their field and provide the best possible care to patients, taking into account the current state of knowledge. To date, there are no comprehensive reports on the status of specialist assessments in the German-speaking countries (DACH). For that reason, the assessment methods used in the DACH region are compiled and critically evaluated in this article, and recommendations for further development are described. Methods: The websites of the following institutions were searched for information on the testing methods used and the organisation of specialist examinations: the Swiss Institute for Medical Continuing Education (SIWF), the Austrian Academy of Physicians, and the German Federal Medical Association (BAEK). Further links were considered and the results were presented in tabular form. The assessment methods used in the specialist assessments are critically examined against established quality criteria, and recommendations for the further development of the specialist assessments are derived from this examination. Results: The following assessment methods are already used in Switzerland and Austria: written examinations with multiple choice and short answer questions, structured oral examinations, the Script Concordance Test (SCT) and the Objective Structured Clinical Examination (OSCE). In some cases, these assessment methods are combined (triangulation). In Germany, by contrast, the oral examination has so far been conducted in an unstructured manner in the form of a 'collegial content discussion'. To assess knowledge and practical and communicative competences equally, it is recommended to implement a triangulation of methods and to follow the further recommendations described in this article. Conclusion: While accepted approaches for quality-assured and competence-based specialist assessments already exist in Switzerland and Austria, there is still a long way to go in Germany. Following the recommendations presented in this article could contribute to improving specialist assessments in the DACH region in line with their objectives.
Affiliation(s)
- Nils Thiessen: EDU - a degree smarter, Digital Education Holdings Ltd., Kalkara, Republic of Malta
- Martin R. Fischer: LMU München, Klinikum der Universität München, Institut für Didaktik und Ausbildungsforschung in der Medizin, München, Germany
- Sören Huwendiek: Universität Bern, Institut für Medizinische Lehre, Abteilung für Assessment und Evaluation, Bern, Switzerland
27
Wan MSH, Tor E, Hudson JN. Construct validity of script concordance testing: progression of scores from novices to experienced clinicians. International Journal of Medical Education 2019;10:174-179. PMID: 31562807. PMCID: PMC6766395. DOI: 10.5116/ijme.5d76.1eee.
Abstract
OBJECTIVES To investigate the construct validity of Script Concordance Testing (SCT) scores as a measure of the clinical reasoning ability of medical students and practising General Practitioners with different levels of clinical experience. METHODS Part I involved a cross-sectional study, where 105 medical students, 19 junior registrars and 13 experienced General Practitioners completed the same set of SCT questions, and their mean scores were compared using one-way ANOVA. In Part II, pooled and matched SCT scores for 5 cohorts of students (2012 to 2017) in Year 3 (N = 584) and Year 4 (N = 598) were retrospectively analysed for evidence of significant progression. RESULTS A significant main effect of clinical experience was observed [F(2, 136) = 6.215, p = 0.003]. The mean SCT score for General Practitioners (M = 70.39, SD = 4.41, N = 13) was significantly higher (p = 0.011) than that of students (M = 64.90, SD = 6.30, N = 105). Year 4 students (M = 68.90, SD = 7.79, N = 584) scored a significantly higher mean score [t(552) = 12.78, p < 0.001] than Year 3 students (M = 64.03, SD = 7.98, N = 598). CONCLUSIONS The finding that candidate scores increased with increasing level of clinical experience adds to current evidence in the international literature in support of the construct validity of Script Concordance Testing. Prospective longitudinal studies with larger sample sizes are recommended to further test and build confidence in the construct validity of SCT scores.
Affiliation(s)
- Elina Tor: School of Medicine, University of Notre Dame, Australia
- Judith N. Hudson: Faculty of Health and Medical Sciences, University of Adelaide, Australia
28
Cook DA, Durning SJ, Sherbino J, Gruppen LD. Management Reasoning: Implications for Health Professions Educators and a Research Agenda. Academic Medicine 2019;94:1310-1316. PMID: 31460922. DOI: 10.1097/acm.0000000000002768.
Abstract
Substantial research has illuminated the clinical reasoning processes involved in diagnosis (diagnostic reasoning). Far less is known about the processes entailed in patient management (management reasoning), including decisions about treatment, further testing, follow-up visits, and allocation of limited resources. The authors' purpose is to articulate key differences between diagnostic and management reasoning, implications for health professions education, and areas of needed research. Diagnostic reasoning focuses primarily on classification (i.e., assigning meaningful labels to a pattern of symptoms, signs, and test results). Management reasoning involves negotiation of a plan and ongoing monitoring/adjustment of that plan. A diagnosis can usually be established as correct or incorrect, whereas there are typically multiple reasonable management approaches. Patient preferences, clinician attitudes, clinical contexts, and logistical constraints should not influence diagnosis, whereas management nearly always involves prioritization among such factors. Diagnostic classifications do not necessarily require direct patient interaction, whereas management prioritizations require communication and negotiation. Diagnoses can be defined at a single time point (given enough information), whereas management decisions are expected to evolve over time. Finally, management is typically more complex than diagnosis. Management reasoning may require educational approaches distinct from those used for diagnostic reasoning, including teaching distinct skills (e.g., negotiating with patients, tolerating uncertainty, and monitoring treatment) and developing assessments that account for underlying reasoning processes and multiple acceptable solutions. Areas of needed research include if and how cognitive processes differ for management and diagnostic reasoning, how and when management reasoning abilities develop, and how to support management reasoning in clinical practice.
Affiliation(s)
- D.A. Cook is professor of medicine and medical education, director of education science, Office of Applied Scholarship and Education Science, and consultant, Division of General Internal Medicine, Mayo Clinic College of Medicine and Science, Rochester, Minnesota; ORCID: http://orcid.org/0000-0003-2383-4633. S.J. Durning is professor of medicine and director, Division of Health Professions Education, Uniformed Services University of the Health Sciences, Bethesda, Maryland. J. Sherbino is assistant dean, Health Professions Education Research, Faculty of Health Sciences, and professor, Department of Medicine, McMaster University, Hamilton, Ontario, Canada. L.D. Gruppen is professor, Department of Learning Health Sciences, and director, Master of Health Professions Education Program, University of Michigan Medical School, Ann Arbor, Michigan
29
Lineberry M, Hornos E, Pleguezuelos E, Mella J, Brailovsky C, Bordage G. Experts' responses in script concordance tests: a response process validity investigation. Medical Education 2019;53:710-722. PMID: 30779204. DOI: 10.1111/medu.13814.
Abstract
CONTEXT The script concordance test (SCT), designed to measure clinical reasoning in complex cases, has recently been the subject of several critical research studies. Amongst other issues, response process validity evidence remains lacking. We explored the response processes of experts on an SCT scoring panel to better understand their seemingly divergent beliefs about how new clinical data alter the suitability of proposed actions within simulated patient cases. METHODS A total of 10 Argentine gastroenterologists who served as the expert panel on an existing SCT re-answered 15 cases 9 months after their original panel participation. They then answered questions probing their reasoning and reactions to other experts' perspectives. RESULTS The experts sometimes noted they would not ordinarily consider the actions proposed for the cases at all (30/150 instances [20%]) or would collect additional data first (54/150 instances [36%]). Even when groups of experts agreed about how new clinical data in a case affected the suitability of a proposed action, there was often disagreement (118/133 instances [89%]) about the suitability of the proposed action before the new clinical data had been introduced. Experts reported confidence in their responses, but showed limited consistency with the responses they had given 9 months earlier (linear weighted kappa = 0.33). Qualitative analyses showed nuanced and complex reasons behind experts' responses, revealing, for example, that experts often considered the unique affordances and constraints of their varying local practice environments when responding. Experts generally found other experts' alternative responses moderately compelling (mean ± standard deviation 2.93 ± 0.80 on a 5-point scale, where 3 = moderately compelling). Experts switched their own preferred responses after seeing others' reasoning in 30 of 150 (20%) instances. CONCLUSIONS Expert response processes were not consistent with the classical interpretation and use of SCT scores. However, several fruitful and justifiable alternatives for the use of SCT-like methods are proposed, such as using them to guide assessment for learning.
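The test-retest statistic reported here, linear weighted kappa, penalises disagreements in proportion to their distance on the ordinal response scale. A minimal sketch under the assumption of the usual linear weighting, with invented responses rather than the study's data:

```python
from collections import Counter

def linear_weighted_kappa(first: list[int], second: list[int],
                          categories: list[int]) -> float:
    """Linear weighted kappa between two ratings of the same items,
    e.g. one expert's SCT responses at two time points."""
    idx = {c: i for i, c in enumerate(categories)}
    dmax = len(categories) - 1
    n = len(first)

    # Observed disagreement: mean linear distance between paired responses.
    d_obs = sum(abs(idx[a] - idx[b]) / dmax for a, b in zip(first, second)) / n

    # Expected disagreement if the two sets of responses were independent.
    p1, p2 = Counter(first), Counter(second)
    d_exp = sum((p1[a] / n) * (p2[b] / n) * abs(idx[a] - idx[b]) / dmax
                for a in p1 for b in p2)
    return 1 - d_obs / d_exp

# Invented example: one expert's answers to 10 items, 9 months apart.
t0 = [-2, -1, 0, 0, 1, 2, 0, -1, 1, 0]
t9 = [-1, -1, 0, 1, 1, 1, 0, 0, 2, -1]
print(round(linear_weighted_kappa(t0, t9, [-2, -1, 0, 1, 2]), 2))  # 0.48
```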
Affiliation(s)
- Matthew Lineberry: Zamierowski Institute for Experiential Learning, University of Kansas Medical Center and University of Kansas Health System, Kansas City, Kansas, USA
- Eduardo Hornos: Practicum Institute of Applied Research in Health Sciences Education, Madrid, Spain
- Eduardo Pleguezuelos: Practicum Institute of Applied Research in Health Sciences Education, Madrid, Spain
- Jose Mella: Practicum Institute of Applied Research in Health Sciences Education, Madrid, Spain
- Georges Bordage: Department of Medical Education, College of Medicine, University of Illinois at Chicago, Chicago, Illinois, USA
30
Wan SH, Tor E, Hudson JN. Commentary: expert responses in script concordance tests: a response process validity investigation. Medical Education 2019;53:644-646. PMID: 30989693. DOI: 10.1111/medu.13889.
Affiliation(s)
- Siu Hong Wan: School of Medicine, University of Notre Dame, Sydney, New South Wales, Australia
- Elina Tor: School of Medicine, University of Notre Dame, Sydney, New South Wales, Australia
- Judith N Hudson: Faculty of Health and Medical Sciences, University of Adelaide, Adelaide, South Australia, Australia
31
Daniel M, Rencic J, Durning SJ, Holmboe E, Santen SA, Lang V, Ratcliffe T, Gordon D, Heist B, Lubarsky S, Estrada CA, Ballard T, Artino AR, Sergio Da Silva A, Cleary T, Stojan J, Gruppen LD. Clinical Reasoning Assessment Methods: A Scoping Review and Practical Guidance. Academic Medicine 2019;94:902-912. PMID: 30720527. DOI: 10.1097/acm.0000000000002618.
Abstract
PURPOSE An evidence-based approach to assessment is critical for ensuring the development of clinical reasoning (CR) competence. The wide array of CR assessment methods creates challenges for selecting assessments fit for the purpose; thus, a synthesis of the current evidence is needed to guide practice. A scoping review was performed to explore the existing menu of CR assessments. METHOD Multiple databases were searched from their inception to 2016 following PRISMA guidelines. Articles of all study design types were included if they studied a CR assessment method. The articles were sorted by assessment methods and reviewed by pairs of authors. Extracted data were used to construct descriptive appendixes, summarizing each method, including common stimuli, response formats, scoring, typical uses, validity considerations, feasibility issues, advantages, and disadvantages. RESULTS A total of 377 articles were included in the final synthesis. The articles broadly fell into three categories: non-workplace-based assessments (e.g., multiple-choice questions, extended matching questions, key feature examinations, script concordance tests); assessments in simulated clinical environments (objective structured clinical examinations and technology-enhanced simulation); and workplace-based assessments (e.g., direct observations, global assessments, oral case presentations, written notes). Validity considerations, feasibility issues, advantages, and disadvantages differed by method. CONCLUSIONS There are numerous assessment methods that align with different components of the complex construct of CR. Ensuring competency requires the development of programs of assessment that address all components of CR. Such programs are ideally constructed of complementary assessment methods to account for each method's validity and feasibility issues, advantages, and disadvantages.
Affiliation(s)
- M. Daniel is assistant dean for curriculum and associate professor of emergency medicine and learning health sciences, University of Michigan Medical School, Ann Arbor, Michigan; ORCID: http://orcid.org/0000-0001-8961-7119. J. Rencic is associate program director of the internal medicine residency program and associate professor of medicine, Tufts University School of Medicine, Boston, Massachusetts; ORCID: http://orcid.org/0000-0002-2598-3299. S.J. Durning is director of graduate programs in health professions education and professor of medicine and pathology, Uniformed Services University of the Health Sciences, Bethesda, Maryland. E. Holmboe is senior vice president of milestone development and evaluation, Accreditation Council for Graduate Medical Education, and adjunct professor of medicine, Northwestern Feinberg School of Medicine, Chicago, Illinois; ORCID: http://orcid.org/0000-0003-0108-6021. S.A. Santen is senior associate dean and professor of emergency medicine, Virginia Commonwealth University, Richmond, Virginia; ORCID: http://orcid.org/0000-0002-8327-8002. V. Lang is associate professor of medicine, University of Rochester School of Medicine and Dentistry, Rochester, New York; ORCID: http://orcid.org/0000-0002-2157-7613. T. Ratcliffe is associate professor of medicine, University of Texas Long School of Medicine at San Antonio, San Antonio, Texas. D. Gordon is medical undergraduate education director, associate residency program director of emergency medicine, and associate professor of surgery, Duke University School of Medicine, Durham, North Carolina. B. Heist is clerkship codirector and assistant professor of medicine, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania. S. Lubarsky is assistant professor of neurology, McGill University, and faculty of medicine and core member, McGill Center for Medical Education, Montreal, Quebec, Canada; ORCID: http://orcid.org/0000-0001-5692-1771. C.A. Estrada is staff physician, Birmingham Veterans Affairs Medical Center, and director, Division of General Internal Medicine, and professor of medicine, University of Alabama, Birmingham, Alabama; ORCID: https://orcid.org/0000-0001-6262-7421. T. Ballard is plastic surgeon, Ann Arbor Plastic Surgery, Ann Arbor, Michigan. A.R. Artino Jr is deputy director for graduate programs in health professions education and professor of medicine, preventive medicine, and biometrics pathology, Uniformed Services University of the Health Sciences, Bethesda, Maryland; ORCID: http://orcid.org/0000-0003-2661-7853. A. Sergio Da Silva is senior lecturer in medical education and director of the masters in medical education program, Swansea University Medical School, Swansea, United Kingdom; ORCID: http://orcid.org/0000-0001-7262-0215. T. Cleary is chair, Applied Psychology Department, CUNY Graduate School and University Center, New York, New York, and associate professor of applied and professional psychology, Rutgers University, New Brunswick, New Jersey. J. Stojan is associate professor of internal medicine and pediatrics, University of Michigan Medical School, Ann Arbor, Michigan. L.D. Gruppen is director of the master of health professions education program and professor of learning health sciences, University of Michigan Medical School, Ann Arbor, Michigan; ORCID: http://orcid.org/0000-0002-2107-0126
32
Ten Cate O, Regehr G. The Power of Subjectivity in the Assessment of Medical Trainees. Academic Medicine 2019;94:333-337. PMID: 30334840. DOI: 10.1097/acm.0000000000002495.
Abstract
Objectivity in the assessment of students and trainees has been a hallmark of quality since the introduction of multiple-choice items in the 1960s. In medical education, this has extended to the structured examination of clinical skills and workplace-based assessment. Competency-based medical education, a pervasive movement that started roughly around the turn of the century, similarly calls for rigorous, objective assessment to ensure that all medical trainees meet standards to assure quality of health care. At the same time, measures of objectivity, such as reliability, have consistently shown disappointing results. This raises questions about the extent to which objectivity in such assessments can be ensured. In fact, the legitimacy of "objective" assessment of individual trainees, particularly in the clinical workplace, may be questioned. Workplaces are highly dynamic and ratings by observers are inherently subjective, as they are based on expert judgment, and experts do not always agree, for good, idiosyncratic reasons. Thus, efforts to "objectify" these assessments may be problematically distorting the assessment process itself. In addition, "competence" must meet standards, but it is also context dependent. Educators are now arriving at the insight that subjective expert judgments by medical professionals are not only unavoidable but actually should be embraced as the core of assessment of medical trainees. This paper elaborates on the case for subjectivity in assessment.
Affiliation(s)
- O. ten Cate is professor of medical education and senior scientist, Center for Research and Development of Education, University Medical Center Utrecht, Utrecht, the Netherlands; ORCID: https://orcid.org/0000-0002-6379-8780. G. Regehr is professor, Department of Surgery, and associate director of research, Centre for Health Education Scholarship, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada; ORCID: http://orcid.org/0000-0002-3144-331X
33
Development and psychometrics of script concordance test (SCT) in midwifery. Med J Islam Repub Iran 2018;32:75. PMID: 30643750. PMCID: PMC6325274. DOI: 10.14196/mjiri.32.75.
Abstract
Background: Clinical reasoning plays an important role in the accurate diagnosis and treatment of diseases. The script concordance test (SCT) is one of the tools that assess clinical reasoning skills. This study was conducted to determine the reliability and the concurrent and predictive validity of an SCT used alongside the final exam of the gynecology course for undergraduate midwifery students.
Methods: At first, 20 clinical scenarios, each followed by 3 questions, were designed by 2 experienced midwives. Then, after examining the content validity, 15 scenarios were selected. The test was administered to 55 midwifery students. The correlation of SCT results with grade point average (GPA) was measured. To evaluate the concurrent validity of the SCT, the correlation between SCT scores and the final exam of the gynecology course was measured. To measure predictive validity, the correlation of SCT scores with the comprehensive midwifery exam was calculated. Data were analyzed using SPSS software; descriptive statistics, Pearson correlation, and the Cronbach's alpha coefficient were used. The test's item difficulty level (IDL) and item discriminative index (IDI) were determined using Whitney and Sabers' method.
Results: The internal consistency of the test (Cronbach's alpha) was 0.74. All questions were positively correlated with the total score. The highest correlation coefficient (0.91) was between GPA and the comprehensive exam. The correlation coefficient between the SCT and the final exam (concurrent validity) was 0.654, and between the SCT and the comprehensive exam (predictive validity) was 0.721. The item discriminative index ranged from 0.39 to 0.59, and the item difficulty level from 0.32 to 0.66.
Conclusion: The SCT showed relatively high internal consistency and predicted students' success in the comprehensive midwifery exam; it also showed high concurrent validity against the final exam of the gynecology course. The test could be a good alternative for formative and summative assessment in clinical courses.
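For reference, the internal-consistency figure reported above is Cronbach's alpha; a minimal sketch of its computation on an examinees-by-items score matrix, with invented scores rather than the study's data:

```python
def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a matrix of scores[examinee][item]."""
    n_items = len(scores[0])

    def variance(xs: list[float]) -> float:  # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = sum(variance([row[i] for row in scores]) for i in range(n_items))
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - item_vars / total_var)

# Invented example: 4 examinees x 3 items.
print(round(cronbach_alpha([[1, 1, 1], [1, 0, 1], [0, 0, 1], [0, 0, 0]]), 2))  # 0.75
```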
34
Elvén M, Hochwälder J, Dean E, Hällman O, Söderlund A. Criterion scores, construct validity and reliability of a web-based instrument to assess physiotherapists' clinical reasoning focused on behaviour change: 'Reasoning 4 Change'. AIMS Public Health 2018;5:235-259. PMID: 30280115. PMCID: PMC6141557. DOI: 10.3934/publichealth.2018.3.235.
Abstract
Background and aim: 'Reasoning 4 Change' (R4C) is a newly developed instrument, including four domains (D1-D4), to assess clinical practitioners' and students' clinical reasoning with a focus on clients' behaviour change in a physiotherapy context. To establish its use in education and research, its psychometric properties needed to be evaluated. The aim of the study was to generate criterion scores and evaluate the reliability and construct validity of a web-based version of the R4C instrument. Methods: Fourteen physiotherapy experts and 39 final-year physiotherapy students completed the R4C instrument and the Pain Attitudes and Beliefs Scale for Physiotherapists (PABS-PT). Twelve experts and 17 students completed the R4C instrument on a second occasion. The R4C instrument was evaluated with regard to: internal consistency (five subscales of D1); test-retest reliability (D1-D4); inter-rater reliability (D2-D4); and construct validity in terms of convergent validity (D1.4, D2, D4). Criterion scores were generated based on the experts' responses to identify the scores of qualified practitioners' clinical reasoning abilities. Results: For the expert and student samples, the analyses demonstrated satisfactory internal consistency (α range: 0.67-0.91), satisfactory test-retest reliability (ICC range: 0.46-0.94) except for D3 for the experts and D4 for the students. The inter-rater reliability demonstrated excellent agreement within the expert group (ICC range: 0.94-1.0). The correlations between the R4C instrument and PABS-PT (r range: 0.06-0.76) supported acceptable construct validity. Conclusions: The web-based R4C instrument shows satisfactory psychometric properties and could be useful in education and research. The use of the instrument may contribute to a deeper understanding of physiotherapists' and students' clinical reasoning, valuable for curriculum development and improvements of competencies in clinical reasoning related to clients' behavioural change.
Affiliation(s)
- Maria Elvén: Division of Physiotherapy, School of Health, Care and Social Welfare, Mälardalen University, Västerås, Sweden
- Jacek Hochwälder: Division of Psychology, School of Health, Care and Social Welfare, Mälardalen University, Eskilstuna, Sweden
- Elizabeth Dean: Division of Physiotherapy, School of Health, Care and Social Welfare, Mälardalen University, Västerås, Sweden; Department of Physical Therapy, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Olle Hällman: Department of Information Technology, Uppsala University, Uppsala, Sweden
- Anne Söderlund: Division of Physiotherapy, School of Health, Care and Social Welfare, Mälardalen University, Västerås, Sweden
35
Lubarsky S, Dory V, Meterissian S, Lambert C, Gagnon R. Examining the effects of gaming and guessing on script concordance test scores. Perspectives on Medical Education 2018;7:174-181. PMID: 29904900. PMCID: PMC6002294. DOI: 10.1007/s40037-018-0435-8.
Abstract
INTRODUCTION In a script concordance test (SCT), examinees are asked to judge the effect of a new piece of clinical information on a proposed hypothesis. Answers are collected using a Likert-type scale (ranging from -2 to +2, with '0' indicating no effect) and compared with those of a reference panel of 'experts'. It has been argued, however, that the SCT may be susceptible to the influences of gaming and guesswork. This study aims to address some of the mounting concern over the response process validity of SCT scores. METHOD Using published datasets from three independent SCTs, we investigated examinee response patterns and computed the score a hypothetical examinee would obtain on each of the tests if they (1) guessed random answers or (2) deliberately answered '0' on all test items. RESULTS A simulated random-guessing strategy led to scores 2 SDs below the mean scores of actual respondents (Z-scores -3.6 to -2.1). A simulated 'all-0' strategy led to scores at least 1 SD above those obtained by random guessing (Z-scores -2.2 to -0.7). In one dataset, stepwise exclusion of items whose modal panel response was '0', down to fewer than 10% of the total number of test items, brought the hypothetical scores back to 2 SDs below the mean scores of actual respondents. DISCUSSION Random guessing was not an advantageous response strategy. An 'all-0' response strategy, however, showed evidence of artificial score inflation. Our findings pose a significant threat to the SCT's validity argument. 'Testwiseness' is a potential hazard to all testing formats, and appropriate countermeasures must be established. We propose an approach that might mitigate a potentially real and troubling phenomenon in script concordance testing; its impact on the content validity of SCTs merits further discussion.
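Both simulated strategies are straightforward to reproduce once the panel data are in hand. A sketch under the assumption of standard aggregate scoring (see entry 23 above); the panels here are invented:

```python
import random
from collections import Counter

OPTIONS = [-2, -1, 0, 1, 2]

def item_score(answer: int, panel: list[int]) -> float:
    # Aggregate scoring: credit proportional to panel votes for the answer,
    # normalised by the modal (most frequent) panel answer.
    counts = Counter(panel)
    return counts.get(answer, 0) / max(counts.values())

def test_score(answers: list[int], panels: list[list[int]]) -> float:
    return sum(item_score(a, p) for a, p in zip(answers, panels))

# Invented 3-item test, each item judged by a panel of 10 experts.
panels = [
    [-1, -1, 0, 0, 0, 0, 1, 1, 1, 2],
    [0, 0, 0, 0, 0, 1, 1, 2, 2, 2],
    [-2, -2, -1, -1, -1, -1, 0, 0, 1, 1],
]

random.seed(0)
guessing = [random.choice(OPTIONS) for _ in panels]  # strategy 1: random guessing
all_zero = [0] * len(panels)                         # strategy 2: always answer '0'
print(test_score(guessing, panels), test_score(all_zero, panels))
```

Note how the 'all-0' strategy collects partial credit on every item whose panel clusters near the midpoint, which is exactly the inflation mechanism the study describes.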
Affiliation(s)
- Stuart Lubarsky: Centre for Medical Education, McGill University, Montreal, Canada
- Valérie Dory: Centre for Medical Education, McGill University, Montreal, Canada
- Carole Lambert: Centre de pédagogie appliquée aux sciences de la santé (CPASS), Université de Montréal, Montreal, Canada
- Robert Gagnon: Centre de pédagogie appliquée aux sciences de la santé (CPASS), Université de Montréal, Montreal, Canada
36
Escudier MP, Woolford MJ, Tricio JA. Assessing the application of knowledge in clinical problem-solving: The structured professional reasoning exercise. European Journal of Dental Education 2018;22:e269-e277. PMID: 28804939. DOI: 10.1111/eje.12286.
Abstract
INTRODUCTION Clinical reasoning is a fundamental and core clinical competence of healthcare professionals. The study aimed to investigate the utility of the Structured Professional Reasoning Exercise (SPRE), a new competence assessment method designed to measure dental students' clinical reasoning in simulated scenarios covering the clinical areas of Oral Disease; Primary Dental Care and Restorative Dentistry; Child Dental Health; and Dental Practice and Clinical Governance. MATERIALS AND METHODS A total of 313 year-5 students sat the assessment. Students spent 45 minutes assimilating the scenarios before rotating through four pairs of examiners (drawn from 39 trained examiners), each pair independently assessing a single scenario over a ten-minute period using a structured marking sheet. After the assessment, all students and examiners were invited to complete an anonymous perception questionnaire about the exercise. These questionnaires and the examination scores were statistically analysed. RESULTS AND DISCUSSION Oral Disease showed the lowest scores; Dental Practice and Governance the highest. The overall Intraclass Correlation Coefficient (ICC) was 0.770, and examiner training helped to increase the ICC from 0.716 in 2013 to 0.835 in 2014. Exploratory factor analysis revealed one major factor with an eigenvalue of 2.75 (68.8% of total variance). The Generalizability coefficient was consistent at 0.806. A total of 295 students and 32 examiners completed the perception questionnaire. Students rated the examination lowest as an "Unpleasant" and "Unenjoyable" experience, and highest as "Interesting", "Valuable" and "Important". The majority of students and examiners reported the assessment as acceptable, fair and valid. CONCLUSION The SPRE offers a reliable, valid and acceptable assessment method, provided it comprises at least four scenarios, each marked independently by two trained assessors.
Affiliation(s)
- M P Escudier: King's College London Dental Institute, London, UK
- M J Woolford: King's College London Dental Institute, London, UK
- J A Tricio: King's College London Dental Institute, London, UK; Faculty of Dentistry, University of the Andes, Santiago, Chile
37
Wan MS, Tor E, Hudson JN. Improving the validity of script concordance testing by optimising and balancing items. Medical Education 2018;52:336-346. PMID: 29318646. DOI: 10.1111/medu.13495.
Abstract
BACKGROUND A script concordance test (SCT) is a modality for assessing clinical reasoning. Concerns had been raised about a plausible validity threat to SCT scores if students deliberately avoided the extreme answer options to obtain higher scores. The aims of the study were, first, to investigate whether students' avoidance of the extreme answer options could result in higher scores and, second, to determine whether a 'balanced approach' to the construction of SCT items (including extreme as well as median options as modal responses) would improve the validity of an SCT. METHODS Using the paired-sample t-test, the actual average student scores for 10 SCT papers from 2012-2016 were compared with simulated scores. The latter were generated by recoding all '-2' responses to '-1' and all '+2' responses to '+1' for the whole cohort and for the bottom 10% of the cohort (simulation 1), and by scoring as if all students had chosen '0' for their responses (simulation 2). The actual and simulated average scores in 2012 (before the 'balanced approach') were compared with those from 2013-2016, when papers had a good balance of modal responses from the expert reference panel. RESULTS In 2012, a score increase was seen in simulation 1 in the third-year cohort, from 50.2% to 55.6% (t(10) = 4.818; p = 0.001). Since 2013, with the 'balanced approach', the actual SCT scores (57.4%) have been significantly higher than the scores in both simulation 1 and simulation 2 (46.7% and 23.9%, respectively). CONCLUSIONS When constructing SCT examinations, apart from rigorous pre-examination optimisation, it is desirable to achieve a balance between items that attract extreme responses and items that attract median responses. This could mitigate the validity threat to SCT scores, especially for low-performing students, who have previously been shown to select only median responses and avoid the extreme ones.
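The two simulations amount to transforming each answer vector before scoring. A sketch, again assuming the standard aggregate scoring rule and entirely invented data:

```python
from collections import Counter

def item_score(answer: int, panel: list[int]) -> float:
    counts = Counter(panel)
    return counts.get(answer, 0) / max(counts.values())

def recode_extremes(answers: list[int]) -> list[int]:
    # Simulation 1: a student who avoids the extremes (-2 -> -1, +2 -> +1).
    return [max(-1, min(1, a)) for a in answers]

def all_zero(answers: list[int]) -> list[int]:
    # Simulation 2: answering '0' on every item.
    return [0 for _ in answers]

# Invented data: one student's answers and a 'balanced' set of panels,
# mixing extreme-modal and midpoint-modal items.
answers = [-2, 0, 2, 1]
panels = [
    [-2, -2, -2, -2, -1, -1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, -1, -1, 1, 1, 2],
    [2, 2, 2, 2, 1, 1, 0, 0, -1, -1],
    [1, 1, 1, 0, 0, 0, 0, -1, -1, -2],
]

for strategy in (answers, recode_extremes(answers), all_zero(answers)):
    print(sum(item_score(a, p) for a, p in zip(strategy, panels)))
# -> 3.75, 2.75, 3.25: with balanced items, honest extreme answers beat
#    both the extreme-avoiding and the 'all-0' strategies.
```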
Affiliation(s)
- Michael Sh Wan: School of Medicine, University of Notre Dame, Sydney, New South Wales, Australia
- Elina Tor: School of Medicine, University of Notre Dame, Sydney, New South Wales, Australia
- Judith Nicky Hudson: Adelaide Medical School, University of Adelaide, Adelaide, South Australia, Australia
38
Abstract
OBJECTIVES Script concordance testing (SCT) is used to assess clinical decision-making. We explore the use of SCT to (1) quantify practice variations in infant lumbar puncture (LP) and (2) analyze physicians' characteristics affecting LP decision making. METHODS Using standard SCT processes, a panel of pediatric subspecialty physicians constructed 15 infant LP case vignettes, each with 2 to 4 SCT questions (47 in total). The vignettes were distributed to pediatric attending physicians and fellows at 10 hospitals within the INSPIRE Network. We determined both raw scores (tendency to perform LP) and SCT scores (agreement with the reference panel), as well as their variation with participant factors. RESULTS Two hundred twenty-six respondents completed all 47 SCT questions. Pediatric emergency medicine physicians tended to select LP more frequently than general pediatricians, with significantly higher raw scores (20.2 ± 10.2 vs 13 ± 15; 95% confidence interval for the difference: 1 to 13). Concordance with the reference panel varied among subspecialties and with the frequency with which practitioners perform LPs in their practices. CONCLUSION Script concordance testing questions can be used as a tool to detect subspecialty practice variation. We were able to detect significant practice variation in self-reported use of LP for infants among different pediatric subspecialties.
39
ten Cate O, Durning SJ. Approaches to Assessing the Clinical Reasoning of Preclinical Students. Innovation and Change in Professional Education 2018. DOI: 10.1007/978-3-319-64828-6_5.
40
Chew KS, van Merrienboer JJG, Durning SJ. Investing in the use of a checklist during differential diagnoses consideration: what's the trade-off? BMC Medical Education 2017;17:234. PMID: 29187172. PMCID: PMC5707798. DOI: 10.1186/s12909-017-1078-x.
Abstract
BACKGROUND A key challenge clinicians face when considering differential diagnoses is whether the patient data have been adequately collected. Insufficient data may inadvertently lead to premature closure of the diagnostic process. This study aimed to test the hypothesis that the application of a mnemonic checklist helps to stimulate more patient data collection, thus leading to better diagnostic consideration. METHODS A total of 88 final-year medical students were assigned to either an educational intervention group or a control group in a non-equivalent-groups post-test-only design. Participants in the intervention group received a tutorial on the use of a mnemonic checklist aimed at minimizing cognitive errors in clinical decision-making. Two weeks later, the participants in both groups were given a script concordance test consisting of 10 cases, with 3 items per case, to assess their clinical decisions when additional data are given in the case scenarios. RESULTS The Mann-Whitney U-test performed on the total scores from both groups showed no statistical significance (U = 792, z = -1.408, p = 0.159). When comparisons were made for the first half and the second half of the SCT, participants in the intervention group performed significantly better than participants in the control group in the first half of the test, with median scores of 9.15 (IQR 8.00-10.28) vs. 8.18 (IQR 7.16-9.24), U = 642.5, z = -2.661, p = 0.008. No significant difference was found in the second half of the test, with median scores of 9.58 (IQR 8.90-10.56) and 9.81 (IQR 8.83-11.12) for the intervention and control groups respectively (U = 897.5, z = -0.524, p = 0.60). CONCLUSION Checklist use in differential diagnosis consideration did show some benefit. However, this benefit seems to have been traded off against the time and effort required to use the checklist. More research is needed to determine whether the benefit could be translated into clinical practice after repetitive use.
Affiliation(s)
- Keng Sheng Chew: Faculty of Medicine and Health Sciences, Universiti Malaysia Sarawak, Kota Samarahan, Sarawak, Malaysia
- Steven J Durning: Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA
41
Funk KA, Kolar C, Schweiss SK, Tingen JM, Janke KK. Experience with the script concordance test to develop clinical reasoning skills in pharmacy students. Currents in Pharmacy Teaching & Learning 2017;9:1031-1041. PMID: 29233371. DOI: 10.1016/j.cptl.2017.07.021.
Abstract
BACKGROUND The script concordance test (SCT) is used to assess clinical reasoning and was originally developed for medical learners. The Accreditation Council for Pharmacy Education (ACPE) endorses the need for pharmacy students to develop clinical reasoning skills, but there is little documentation of use of the SCT for pharmacy learners. EDUCATIONAL ACTIVITY A script concordance test activity was designed for a diabetes and metabolic syndrome pharmacotherapy course. Twenty-five cases were created and evaluated by an expert panel of 20 practicing pharmacists. Ten cases were presented as a formative activity in class. The students, design team, teaching team, and expert panel evaluated the activity. CRITICAL ANALYSIS OF THE EDUCATIONAL ACTIVITY The SCT was received positively by the students, design team, teaching team, and expert panel. The design team noted that case writing was different for this approach and that the inclusion of various perspectives from panelists was beneficial. Although the activity was formative in nature, the teaching team scored the students, and this provided insight into areas where the students may struggle. SUMMARY This report provides information on the formative use of the SCT in the classroom, as well as categories of items suitable for pharmacy. The SCT provides an approach to illustrate clinical reasoning and clinical decision making among content experts and can be used to stimulate clinical discussions between student learners and content experts. The SCT could help incorporate clinical reasoning skills in a pharmacy curriculum to meet ACPE standards.
Affiliation(s)
- Kylee A Funk: Pharmaceutical Care & Health Systems, University of Minnesota College of Pharmacy, 7-176 Weaver-Densford Hall, 308 Harvard St. SE, Minneapolis, MN 55455, United States
- Claire Kolar: Fairview Pharmacy Services, 711 Kasota Ave, Minneapolis, MN 55414, United States
- Sarah K Schweiss: Pharmacy Practice and Pharmaceutical Sciences, University of Minnesota College of Pharmacy, 223 Life Science, 1110 Kirby Drive, Duluth, MN 55812, United States
- Jeffrey M Tingen: Department of Pharmacy Services, University of Virginia Health System, Department of Family Medicine, University of Virginia School of Medicine, PO Box 800729, Charlottesville, VA 22908, United States
- Kristin K Janke: Pharmaceutical Care & Health Systems, University of Minnesota College of Pharmacy, 7-125D Weaver Densford Hall, 308 Harvard St SE, Minneapolis, MN 55455, United States
42
St-Onge C, Young M, Eva KW, Hodges B. Validity: one word with a plurality of meanings. Advances in Health Sciences Education: Theory and Practice 2017;22:853-867. PMID: 27696103. DOI: 10.1007/s10459-016-9716-3.
Abstract
Validity is one of the most debated constructs in our field; debates abound about what is legitimate and what is not, and the word continues to be used in ways that are explicitly disavowed by current practice guidelines. The resultant tensions have not been well characterized, yet their existence suggests that different uses may maintain some value for the user that needs to be better understood. We conducted an empirical form of Discourse Analysis to document the multiple ways in which validity is described, understood, and used in the health professions education field. We created and analyzed an archive of texts identified from multiple sources, including formal databases such as PubMED, ERIC and PsycINFO as well as the authors' personal assessment libraries. An iterative analytic process was used to identify, discuss, and characterize emerging discourses about validity. Three discourses of validity were identified. Validity as a test characteristic is underpinned by the notion that validity is an intrinsic property of a tool and could, therefore, be seen as content and context independent. Validity as an argument-based evidentiary chain emphasizes the importance of supporting the interpretation of assessment results with ongoing analysis, such that validity does not belong to the tool/instrument itself; the emphasis is on process-based validation (emphasizing the journey instead of the goal). Validity as a social imperative foregrounds the consequences of assessment at the individual and societal levels, be they positive or negative. The existence of different discourses may explain, in part, results observed in recent systematic reviews that highlighted discrepancies and tensions between recommendations for practice and the validation practices that are actually adopted and reported. Some of these practices, despite contravening accepted validation 'guidelines', may nevertheless respond to different and somewhat unarticulated needs within health professional education.
Affiliation(s)
- Kevin W Eva: University of British Columbia, Vancouver, Canada
43
Schubach F, Goos M, Fabry G, Vach W, Boeker M. Virtual patients in the acquisition of clinical reasoning skills: does presentation mode matter? A quasi-randomized controlled trial. BMC Medical Education 2017;17:165. PMID: 28915871. PMCID: PMC5603058. DOI: 10.1186/s12909-017-1004-2.
Abstract
BACKGROUND The objective of this study is to compare two different instructional methods in the curricular use of computerized virtual patients in undergraduate medical education. We aim to investigate whether using many short and focused cases (the key feature principle) is more effective for the learning of clinical reasoning skills than using few long and systematic cases. METHODS We conducted a quasi-randomized, non-blinded, controlled parallel-group intervention trial in a large medical school in Southwestern Germany. During two seminar sessions, fourth- and fifth-year medical students (n = 56) worked on the differential diagnosis of the acute abdomen. The educational tool (virtual patients) was the same, but the instructional method differed: in one trial arm, students worked on multiple short cases, with the instruction focused only on important elements ("key feature arm", n = 30); in the other, students worked on few long cases, with the instruction being comprehensive and systematic ("systematic arm", n = 26). The overall training time was the same in both arms. The students' clinical reasoning capacity was measured by a specifically developed instrument, a script concordance test. Their motivation and the perceived effectiveness of the instruction were assessed using a structured evaluation questionnaire. RESULTS On the script concordance test, with a reference score of 80 points and a standard deviation of 5 for experts, students in the key feature arm attained a mean of 57.4 points (95% confidence interval: 50.9-63.9) and students in the systematic arm 62.7 points (57.2-68.2), with Cohen's d at 0.337. The difference is statistically non-significant (p = 0.214). In the evaluation survey, students in the key feature arm indicated that they experienced more time pressure and perceived the material as more difficult. CONCLUSIONS In this study, powered for a medium effect, we could not provide empirical evidence for the hypothesis that key feature-based instruction on multiple short cases is superior to systematic instruction on few long cases in the curricular implementation of virtual patients. The results of the evaluation survey suggest that learners should be given enough time to work through case examples, and that caution should be taken to prevent cognitive overload.
Affiliation(s)
- Fabian Schubach: Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Stefan-Meier-Str. 26, 79104 Freiburg i. Br., Germany
- Matthias Goos: Department of General and Visceral Surgery, Helios Klinik Müllheim, Heliosweg, 79379 Müllheim, Germany
- Götz Fabry: Department of Medical Psychology and Medical Sociology, Faculty of Medicine and Medical Center - University of Freiburg, Rheinstr. 12, 79104 Freiburg i. Br., Germany
- Werner Vach: Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Stefan-Meier-Str. 26, 79104 Freiburg i. Br., Germany
- Martin Boeker: Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Stefan-Meier-Str. 26, 79104 Freiburg i. Br., Germany
44
Cooke S, Lemay JF, Beran T. Evolutions in clinical reasoning assessment: The Evolving Script Concordance Test. Medical Teacher 2017;39:828-835. PMID: 28580814. DOI: 10.1080/0142159x.2017.1327706.
Abstract
INTRODUCTION Script concordance testing (SCT) is a method of assessment of clinical reasoning. We developed a new type of SCT case design, the evolving SCT (E-SCT), in which the patient's clinical story evolves and, with thoughtful integration of new information at each stage, the decisions involved in clinical management become increasingly clear. OBJECTIVES We aimed to: (1) determine whether an E-SCT could differentiate clinical reasoning ability among junior residents (JR), senior residents (SR), and pediatricians; (2) evaluate the reliability of an E-SCT; and (3) obtain qualitative feedback from participants to help inform the potential acceptability of the E-SCT. METHODS A 12-case E-SCT, embedded within a 24-case pediatric SCT (PaedSCT), was administered to 91 pediatric residents (JR: n = 50; SR: n = 41). A total of 21 pediatricians served on the panel of experts (POE). A one-way analysis of variance (ANOVA) was conducted across the levels of experience. Participants' feedback on the E-SCT was obtained with a post-test survey and analyzed using two methods: percentage preference and thematic analysis. RESULTS Statistical differences existed across levels of training: F = 19.31 (df = 2); p < 0.001. The POE scored higher than SR (mean difference = 10.34; p < 0.001) and JR (mean difference = 16.00; p < 0.001). SR scored higher than JR (mean difference = 5.66; p < 0.001). Reliability (Cronbach's α) was 0.83. Participants found the E-SCT engaging, easy to follow and true to the daily clinical decision-making process. CONCLUSIONS The E-SCT demonstrated very good reliability and was effective in distinguishing clinical reasoning ability across three levels of experience. Participants found the E-SCT engaging and representative of real-life clinical reasoning and decision-making processes. We suggest that further refinement and utilization of the evolving-style case will enhance SCT as a robust, engaging, and relevant method for the assessment of clinical reasoning.
Affiliation(s)
- Suzette Cooke
- a Department of Paediatrics, Alberta Children's Hospital, University of Calgary, Calgary, Canada
- b Department of Paediatrics, Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - Jean-François Lemay
- a Department of Paediatrics, Alberta Children's Hospital, University of Calgary, Calgary, Canada
- b Department of Paediatrics, Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - Tanya Beran
- a Department of Paediatrics, Alberta Children's Hospital, University of Calgary, Calgary, Canada
- c Department of Community Health Sciences/Medical Education, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
| |
|
45
|
Holmboe ES, Sherbino J, Englander R, Snell L, Frank JR. A call to action: The controversy of and rationale for competency-based medical education. MEDICAL TEACHER 2017; 39:574-581. [PMID: 28598742 DOI: 10.1080/0142159x.2017.1315067] [Citation(s) in RCA: 146] [Impact Index Per Article: 20.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
Although medical education has enjoyed many successes over the last century, there is a recognition that health care is too often unsafe and of poor quality. Errors in diagnosis and treatment, communication breakdowns, poor care coordination, inappropriate use of tests and procedures, and dysfunctional collaboration harm patients and families around the world. These issues reflect on our current model of medical education and raise the question: Are physicians being adequately prepared for twenty-first century practice? Multiple reports have concluded the answer is "no." Concurrent with this concern is an increasing interest in competency-based medical education (CBME) as an approach to help reform medical education. The principles of CBME are grounded in providing better and safer care. As interest in CBME has increased, so have criticisms of the movement. This article summarizes and addresses objections and challenges related to CBME. These can provide valuable feedback to improve CBME implementation and avoid pitfalls. We strongly believe medical education reform should not be reduced to an "either/or" approach, but should blend theories and approaches to suit the needs and resources of the populations served. The incorporation of milestones and entrustable professional activities within existing competency frameworks speaks to the dynamic evolution of CBME, which should not be viewed as a fixed doctrine, but rather as a set of evolving concepts, principles, tools, and approaches that can enable important reforms in medical education that, in turn, enable the best outcomes for patients.
Affiliation(s)
- Eric S Holmboe
- a Accreditation Council for Graduate Medical Education, Chicago, IL, USA
| | - Jonathan Sherbino
- b Division of Emergency Medicine, Department of Medicine, McMaster University, Hamilton, Canada
| | - Robert Englander
- c School of Medicine, University of Minnesota, Minneapolis, MN, USA
| | - Linda Snell
- d Centre for Medical Education and Department of General Internal Medicine, McGill University, Montreal, Quebec, Canada
- e Royal College of Physicians and Surgeons of Canada, Ottawa, Canada
| | - Jason R Frank
- e Royal College of Physicians and Surgeons of Canada, Ottawa, Canada
- f Department of Emergency Medicine, University of Ottawa, Ottawa, Canada
| |
|
46
|
Cooke S, Lemay JF. Transforming Medical Assessment: Integrating Uncertainty Into the Evaluation of Clinical Reasoning in Medical Education. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2017; 92:746-751. [PMID: 28557933 DOI: 10.1097/acm.0000000000001559] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]
Abstract
In an age where practicing physicians have access to an overwhelming volume of clinical information and are faced with increasingly complex medical decisions, the ability to execute sound clinical reasoning is essential to optimal patient care. The authors propose two concepts that are philosophically paramount to the future assessment of clinical reasoning in medicine: assessment in the context of "uncertainty" (when, despite all of the information that is available, there is still significant doubt as to the best diagnosis, investigation, or treatment), and acknowledging that it is entirely possible (and reasonable) to have more than "one correct answer." The purpose of this article is to highlight key elements related to these two core concepts and discuss genuine barriers that currently exist on the pathway to creating such assessments. These include acknowledging situations of uncertainty, creating clear frameworks that define progressive levels of clinical reasoning skills, providing validity evidence to increase the defensibility of such assessments, considering the comparative feasibility with other forms of assessment, and developing strategies to evaluate the impact of these assessment methods on future learning and practice. The authors recommend that concerted efforts be directed toward these key areas to help advance the field of clinical reasoning assessment, improve the clinical care decisions made by current and future physicians, and have positive outcomes for patients. It is anticipated that these and subsequent efforts will aid in reaching the goal of making future assessment in medical education more representative of current-day clinical reasoning and decision making.
Affiliation(s)
- Suzette Cooke
- S. Cooke is clinical associate professor, Department of Paediatrics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada. J.F. Lemay is professor, Department of Paediatrics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
| | | |
|
47
|
De Leng WE, Stegers-Jager KM, Husbands A, Dowell JS, Born MP, Themmen APN. Scoring method of a Situational Judgment Test: influence on internal consistency reliability, adverse impact and correlation with personality? ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2017; 22:243-265. [PMID: 27757558 DOI: 10.1007/s10459-016-9720-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/12/2016] [Accepted: 10/06/2016] [Indexed: 05/16/2023]
Abstract
Situational Judgment Tests (SJTs) are increasingly used for medical school selection. Scoring an SJT is more complicated than scoring a knowledge test because there are no objectively correct answers. The scoring method chosen for an SJT may influence its construct and concurrent validity, as well as its adverse impact with respect to non-traditional students. Previous research has compared only a small number of scoring methods and has not studied the effect of the scoring method on internal consistency reliability. This study compared 28 scoring methods for a rating SJT with respect to internal consistency reliability, adverse impact, and correlation with personality. The scoring methods varied on four aspects: the method of controlling for systematic error, and the type of reference group, distance measure, and central tendency statistic. All scoring methods were applied to a previously validated integrity-based SJT administered to 931 medical school applicants. Internal consistency reliability varied between .33 and .73, which is likely explained by the dependence of coefficient alpha on the total score variance. All scoring methods led to significantly higher scores for the ethnic majority than for the non-Western minorities, with effect sizes ranging from 0.48 to 0.66. Eighteen scoring methods showed a small but significant positive correlation with agreeableness, and four showed a small but significant positive correlation with conscientiousness. The method of controlling for systematic error was the most influential of the four aspects. These results suggest that the increased use of SJTs for selection into medical school must be accompanied by a thorough examination of the scoring method to be used.
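To make the design space concrete, the hypothetical sketch below (Python) combines two of the four aspects: a distance-based score relative to a reference group's central tendency, with optional within-person standardization as one way of controlling for systematic rating tendencies. Function names, data, and the specific formula are illustrative assumptions; the study's exact methods may differ.

    import numpy as np

    def sjt_score(applicant, reference, control_systematic_error=True):
        # applicant: 1-D array of item ratings; reference: 2-D array (experts x items)
        applicant = np.asarray(applicant, dtype=float)
        reference = np.asarray(reference, dtype=float)
        if control_systematic_error:
            # Standardize within person so scale-use tendencies (lenient/harsh,
            # extreme/central) are removed before distances are computed
            applicant = (applicant - applicant.mean()) / applicant.std(ddof=1)
            reference = np.array([(r - r.mean()) / r.std(ddof=1) for r in reference])
        target = np.median(reference, axis=0)       # central tendency: the median here
        return -np.abs(applicant - target).mean()   # negated city-block distance

    experts = np.array([[4, 2, 5, 1], [5, 2, 4, 1], [4, 3, 5, 2]])
    print(round(sjt_score([4, 3, 5, 1], experts), 3))  # about -0.2
    print(round(sjt_score([5, 4, 6, 2], experts), 3))  # same ratings shifted +1 on every
                                                       # item: identical score, because the
                                                       # systematic shift is controlled away

Higher (less negative) scores indicate judgments closer to the reference group; swapping the median for a mean or mode, or the full panel for a subgroup, yields further variants of the kind the study enumerates.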
Affiliation(s)
- W E De Leng
- Institute of Medical Education Research Rotterdam (iMERR), Erasmus MC, Room AE-239, PO Box 2040, 3000 CA, Rotterdam, The Netherlands.
| | - K M Stegers-Jager
- Institute of Medical Education Research Rotterdam (iMERR), Erasmus MC, Room AE-239, PO Box 2040, 3000 CA, Rotterdam, The Netherlands
| | - A Husbands
- Medical School, University of Buckingham, Buckingham, UK
| | - J S Dowell
- School of Medicine, University of Dundee, Dundee, UK
| | - M Ph Born
- Department of Psychology, Erasmus University Rotterdam, Rotterdam, The Netherlands
| | - A P N Themmen
- Institute of Medical Education Research Rotterdam (iMERR), Erasmus MC, Room AE-239, PO Box 2040, 3000 CA, Rotterdam, The Netherlands
- Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands
| |
|
48
|
Nseir S, Elkalioubie A, Deruelle P, Lacroix D, Gosset D. Accuracy of script concordance tests in fourth-year medical students. INTERNATIONAL JOURNAL OF MEDICAL EDUCATION 2017; 8:63-69. [PMID: 28237977 PMCID: PMC5339020 DOI: 10.5116/ijme.5898.2f91] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/17/2016] [Accepted: 02/06/2017] [Indexed: 06/06/2023]
Abstract
OBJECTIVES This study aimed to determine the validity of the script concordance test (SCT), compared with clinical-case-related short-answer management problems (SAMP), in fourth-year medical students. METHODS This retrospective study was conducted at the Medical School of Lille University. The cardiology and gynecology examinations each included 3 SCT and 2 clinical-case-related SAMP. The final score, which did not include the SCT results, was out of 20 points, with a passing score of ≥10/20. Wilcoxon and McNemar tests were used to compare quantitative and qualitative variables, respectively, and the correlation between scores was analyzed. RESULTS A total of 519 and 521 students completed the SAMP and SCT in cardiology and gynecology, respectively. In cardiology, the SCT score was significantly higher than the SAMP score (mean ± SD 13.5±2.4 versus 11.4±2.6, Wilcoxon test, p<0.001); in gynecology, it was significantly lower (10.8±2.6 versus 11.4±2.7, Wilcoxon test, p=0.001). SCT and SAMP scores were significantly correlated (p<0.05, Pearson's correlation). However, the percentage of students with an SCT score ≥10/20 was similar between those who passed and those who failed the SAMP test, both in cardiology (327 of 359 (91%) vs 146 of 160 (91%), χ2=0.004, df=1, p=0.952) and in gynecology (274 of 379 (65%) vs 84 of 142 (59%), χ2=1.614, df=1, p=0.204). Cronbach's alpha was 0.31 for all SCT combined and 0.92 for all SAMP combined. CONCLUSIONS Although significantly correlated, the scores obtained on the SCT and the SAMP differed significantly in fourth-year medical students. These findings suggest that the SCT should not be used for summative purposes in fourth-year medical students.
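The reliability statistic contrasted here (0.31 for the SCT versus 0.92 for the SAMP) is Cronbach's alpha, which can be computed directly from a respondents-by-items score matrix. A minimal sketch (Python) with invented toy data:

    import numpy as np

    def cronbach_alpha(scores):
        # scores: 2-D array, rows = respondents, columns = items
        scores = np.asarray(scores, dtype=float)
        k = scores.shape[1]
        item_variances = scores.var(axis=0, ddof=1).sum()
        total_variance = scores.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances / total_variance)

    # Toy data: six items driven by one common factor plus noise -> high alpha
    rng = np.random.default_rng(0)
    common = rng.normal(size=(100, 1))
    items = common + rng.normal(scale=0.8, size=(100, 6))
    print(f"alpha = {cronbach_alpha(items):.2f}")

Coefficient alpha depends on the total score variance, so low values can reflect heterogeneous items or restricted variance as well as measurement error.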
Affiliation(s)
- Saad Nseir
- University of Lille, School of Medicine, Lille, France
| | | | | | | | - Didier Gosset
- University of Lille, School of Medicine, Lille, France
| |
|
49
|
Kreiter CD. A Bayesian perspective on constructing a written assessment of probabilistic clinical reasoning in experienced clinicians. J Eval Clin Pract 2017; 23:44-48. [PMID: 26486941 DOI: 10.1111/jep.12469] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 09/18/2015] [Indexed: 11/29/2022]
Abstract
RATIONALE Decision-making performance assessments have proven problematic for assessing clinical reasoning. AIMS AND OBJECTIVES A Bayesian approach to designing an advanced clinical reasoning assessment is well grounded in mathematical and cognitive theory and may offer significant psychometric advantages. Probabilistic logic plays an important role in medical problem solving, and performance on Bayesian-type tasks appears to be causally related to the ability to make sound clinical decisions. METHODS A validity argument is used to guide the design of an assessment of medical reasoning using clinical probabilities. RESULTS/CONCLUSIONS The practical advantage of a Bayesian approach to item design is that probability theory provides a rationally optimal method for managing uncertain information and supplies the criteria for objective correct-answer scoring. Potential item formats are discussed.
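A worked example of the kind of probabilistic task such items could pose: Bayes' theorem yields a single, objectively correct posterior probability, which is what makes objective answer keys possible. The prevalence and test characteristics below are invented for illustration (Python).

    def posterior_prob(prior, sensitivity, specificity, test_positive=True):
        # P(disease | test result) via Bayes' theorem
        if test_positive:
            true_pos = prior * sensitivity
            false_pos = (1 - prior) * (1 - specificity)
            return true_pos / (true_pos + false_pos)
        false_neg = prior * (1 - sensitivity)
        true_neg = (1 - prior) * specificity
        return false_neg / (false_neg + true_neg)

    # A disease with 2% pretest probability, tested at 90% sensitivity and
    # 95% specificity: a positive result leaves only about a 27% probability
    print(f"{posterior_prob(0.02, 0.90, 0.95):.2f}")

Items built this way have a computable correct answer, in contrast to consensus-keyed formats such as the SCT.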
|
50
|
Kazour F, Richa S, Zoghbi M, El-Hage W, Haddad FG. Using the Script Concordance Test to Evaluate Clinical Reasoning Skills in Psychiatry. ACADEMIC PSYCHIATRY : THE JOURNAL OF THE AMERICAN ASSOCIATION OF DIRECTORS OF PSYCHIATRIC RESIDENCY TRAINING AND THE ASSOCIATION FOR ACADEMIC PSYCHIATRY 2017; 41:86-90. [PMID: 27178278 DOI: 10.1007/s40596-016-0539-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/24/2015] [Accepted: 03/18/2016] [Indexed: 05/21/2023]
Abstract
OBJECTIVES Although clinical reasoning is a major component of psychiatric training, most evaluation tools do not assess this skill properly. Clinicians mobilize networks of organized knowledge (scripts) to assess ambiguous or uncertain situations, and the Script Concordance Test (SCT) was developed to assess clinical reasoning in a context of uncertainty. The objective of this study was to test the usefulness of the SCT for assessing the reasoning capacities of interns (7th-year medical students) during their psychiatry training. METHODS The authors designed an SCT for psychiatry teaching, adapted to interns. The test contained 20 vignettes of five questions each. A reference panel of senior psychiatrists took the test, and their scores served as the reference for the student group. The SCT assessed the competence of the students at the beginning and the end of their training in psychiatry. RESULTS A panel of 10 psychiatrists and 47 interns participated in this study. As expected, the reference panel performed significantly better (79.4±5.1; p<0.001) than the students. The interns' scores improved significantly (p<0.001) between the beginning (58.5±6.2) and the end (65.0±5.3) of their psychiatry rotation, a mean improvement of 6.4±4.8 points. CONCLUSIONS This is the first study using the SCT in psychiatry. It demonstrates the feasibility of the procedure and its utility for evaluating medical students' clinical reasoning competence in psychiatry, and it can provide a valid alternative to classical evaluation methods.
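The reported pre/post gain can be sanity-checked from the summary statistics alone (mean improvement 6.4 ± 4.8 points over n = 47 interns). The sketch below (Python) assumes a paired t test, which the abstract does not actually name, so treat it as a plausibility check rather than a reproduction.

    from math import sqrt
    from scipy import stats

    n, mean_gain, sd_gain = 47, 6.4, 4.8
    t = mean_gain / (sd_gain / sqrt(n))          # about 9.1
    p = 2 * stats.t.sf(t, df=n - 1)              # two-sided p value
    print(f"t({n - 1}) = {t:.1f}, p = {p:.1e}")  # p << 0.001, consistent with the report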
Affiliation(s)
- François Kazour
- Faculty of Medicine, Saint Joseph University, Beirut, Lebanon.
| | - Sami Richa
- Faculty of Medicine, Saint Joseph University, Beirut, Lebanon
| | - Marouan Zoghbi
- Faculty of Medicine, Saint Joseph University, Beirut, Lebanon
| | - Wissam El-Hage
- Université François-Rabelais de Tours, Inserm UMR U930, Tours, France
| | - Fady G Haddad
- Faculty of Medicine, Saint Joseph University, Beirut, Lebanon
| |
|