1
Cook DA, Hatala R, Brydges R, Zendejas B, Szostek JH, Wang AT, Erwin PJ, Hamstra SJ. Technology-enhanced simulation for health professions education: a systematic review and meta-analysis. JAMA 2011; 306:978-88. [PMID: 21900138] [DOI: 10.1001/jama.2011.1234]
Abstract
CONTEXT Although technology-enhanced simulation has widespread appeal, its effectiveness remains uncertain. A comprehensive synthesis of evidence may inform the use of simulation in health professions education. OBJECTIVE To summarize the outcomes of technology-enhanced simulation training for health professions learners in comparison with no intervention. DATA SOURCE Systematic search of MEDLINE, EMBASE, CINAHL, ERIC, PsycINFO, Scopus, key journals, and previous review bibliographies through May 2011. STUDY SELECTION Original research in any language evaluating simulation compared with no intervention for training practicing and student physicians, nurses, dentists, and other health care professionals. DATA EXTRACTION Reviewers working in duplicate evaluated quality and abstracted information on learners, instructional design (curricular integration, distributing training over multiple days, feedback, mastery learning, and repetitive practice), and outcomes. We coded skills (performance in a test setting) separately for time, process, and product measures, and similarly classified patient care behaviors. DATA SYNTHESIS From a pool of 10,903 articles, we identified 609 eligible studies enrolling 35,226 trainees. Of these, 137 were randomized studies, 67 were nonrandomized studies with 2 or more groups, and 405 used a single-group pretest-posttest design. We pooled effect sizes using random effects. Heterogeneity was large (I² > 50%) in all main analyses. In comparison with no intervention, pooled effect sizes were 1.20 (95% CI, 1.04-1.35) for knowledge outcomes (n = 118 studies), 1.14 (95% CI, 1.03-1.25) for time skills (n = 210), 1.09 (95% CI, 1.03-1.16) for process skills (n = 426), 1.18 (95% CI, 0.98-1.37) for product skills (n = 54), 0.79 (95% CI, 0.47-1.10) for time behaviors (n = 20), 0.81 (95% CI, 0.66-0.96) for other behaviors (n = 50), and 0.50 (95% CI, 0.34-0.66) for direct effects on patients (n = 32).
Subgroup analyses revealed no consistent statistically significant interactions between simulation training and instructional design features or study quality. CONCLUSION In comparison with no intervention, technology-enhanced simulation training in health professions education is consistently associated with large effects for outcomes of knowledge, skills, and behaviors and moderate effects for patient-related outcomes.
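The heterogeneity statistic quoted in the abstract (I² > 50%) can be made concrete with a short sketch. This is a minimal illustration of Cochran's Q and I², using made-up effect sizes and within-study variances rather than data from the review:

```python
def i_squared(effects, variances):
    """Cochran's Q and the I-squared heterogeneity percentage for a set of
    study effect sizes with known within-study variances."""
    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / v for v in variances]
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    # I^2: proportion of total variation attributable to between-study
    # heterogeneity rather than chance, truncated at zero
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# Hypothetical standardized mean differences and variances for five studies
effects = [1.4, 0.6, 1.1, 0.3, 1.8]
variances = [0.04, 0.06, 0.05, 0.08, 0.07]
print(round(i_squared(effects, variances), 1))
```

With these invented inputs the statistic lands above the 50% threshold the review uses to describe heterogeneity as large.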
Meta-Analysis

2
Freedman B, Camm J, Calkins H, Healey JS, Rosenqvist M, Wang J, Albert CM, Anderson CS, Antoniou S, Benjamin EJ, Boriani G, Brachmann J, Brandes A, Chao TF, Conen D, Engdahl J, Fauchier L, Fitzmaurice DA, Friberg L, Gersh BJ, Gladstone DJ, Glotzer TV, Gwynne K, Hankey GJ, Harbison J, Hillis GS, Hills MT, Kamel H, Kirchhof P, Kowey PR, Krieger D, Lee VWY, Levin LÅ, Lip GYH, Lobban T, Lowres N, Mairesse GH, Martinez C, Neubeck L, Orchard J, Piccini JP, Poppe K, Potpara TS, Puererfellner H, Rienstra M, Sandhu RK, Schnabel RB, Siu CW, Steinhubl S, Svendsen JH, Svennberg E, Themistoclakis S, Tieleman RG, Turakhia MP, Tveit A, Uittenbogaart SB, Van Gelder IC, Verma A, Wachter R, Yan BP, Al Awwad A, Al-Kalili F, Berge T, Breithardt G, Bury G, Caorsi WR, Chan NY, Chen SA, Christophersen I, Connolly S, Crijns H, Davis S, Dixen U, Doughty R, Du X, Ezekowitz M, Fay M, Frykman V, Geanta M, Gray H, Grubb N, Guerra A, Halcox J, Hatala R, Heidbuchel H, Jackson R, Johnson L, Kaab S, Keane K, Kim YH, Kollios G, Løchen ML, Ma C, Mant J, Martinek M, Marzona I, Matsumoto K, McManus D, Moran P, Naik N, Ngarmukos T, Prabhakaran D, Reidpath D, Ribeiro A, Rudd A, Savalieva I, Schilling R, Sinner M, Stewart S, Suwanwela N, Takahashi N, Topol E, Ushiyama S, Verbiest van Gurp N, Walker N, Wijeratne T. Screening for Atrial Fibrillation. Circulation 2017; 135:1851-1867. [DOI: 10.1161/circulationaha.116.026693]
Abstract
Approximately 10% of ischemic strokes are associated with atrial fibrillation (AF) first diagnosed at the time of stroke. Detecting asymptomatic AF would provide an opportunity to prevent these strokes by instituting appropriate anticoagulation. The AF-SCREEN international collaboration was formed in September 2015 to promote discussion and research about AF screening as a strategy to reduce stroke and death and to provide advocacy for implementation of country-specific AF screening programs. During 2016, 60 expert members of AF-SCREEN, including physicians, nurses, allied health professionals, health economists, and patient advocates, were invited to prepare sections of a draft document. In August 2016, 51 members met in Rome to discuss the draft document and consider the key points arising from it using a Delphi process. These key points emphasize that screen-detected AF found at a single timepoint or by intermittent ECG recordings over 2 weeks is not a benign condition and, with additional stroke risk factors, carries sufficient risk of stroke to justify consideration of anticoagulation. With regard to the methods of mass screening, handheld ECG devices have the advantage of providing a verifiable ECG trace that guidelines require for AF diagnosis and would therefore be preferred as screening tools. Certain patient groups, such as those with recent embolic stroke of uncertain source (ESUS), require more intensive monitoring for AF. Settings for screening include various venues in both the community and the clinic, but they must be linked to a pathway for appropriate diagnosis and management for screening to be effective. It is recognized that health resources vary widely between countries and health systems, so the setting for AF screening should be both country- and health system-specific.
Based on current knowledge, this white paper provides a strong case for AF screening now while recognizing that large randomized outcomes studies would be helpful to strengthen the evidence base.
3
Cook DA, Brydges R, Ginsburg S, Hatala R. A contemporary approach to validity arguments: a practical guide to Kane's framework. MEDICAL EDUCATION 2015; 49:560-75. [PMID: 25989405] [DOI: 10.1111/medu.12678]
Abstract
CONTEXT Assessment is central to medical education and the validation of assessments is vital to their use. Earlier validity frameworks suffer from a multiplicity of types of validity or failure to prioritise among sources of validity evidence. Kane's framework addresses both concerns by emphasising key inferences as the assessment progresses from a single observation to a final decision. Evidence evaluating these inferences is planned and presented as a validity argument. OBJECTIVES We aim to offer a practical introduction to the key concepts of Kane's framework that educators will find accessible and applicable to a wide range of assessment tools and activities. RESULTS All assessments are ultimately intended to facilitate a defensible decision about the person being assessed. Validation is the process of collecting and interpreting evidence to support that decision. Rigorous validation involves articulating the claims and assumptions associated with the proposed decision (the interpretation/use argument), empirically testing these assumptions, and organising evidence into a coherent validity argument. Kane identifies four inferences in the validity argument: Scoring (translating an observation into one or more scores); Generalisation (using the score[s] as a reflection of performance in a test setting); Extrapolation (using the score[s] as a reflection of real-world performance), and Implications (applying the score[s] to inform a decision or action). Evidence should be collected to support each of these inferences and should focus on the most questionable assumptions in the chain of inference. Key assumptions (and needed evidence) vary depending on the assessment's intended use or associated decision. Kane's framework applies to quantitative and qualitative assessments, and to individual tests and programmes of assessment. 
CONCLUSIONS Validation focuses on evaluating the key claims, assumptions and inferences that link assessment scores with their intended interpretations and uses. The Implications and associated decisions are the most important inferences in the validity argument.
4
Cook DA, Hamstra SJ, Brydges R, Zendejas B, Szostek JH, Wang AT, Erwin PJ, Hatala R. Comparative effectiveness of instructional design features in simulation-based education: systematic review and meta-analysis. MEDICAL TEACHER 2013; 35:e867-98. [PMID: 22938677] [DOI: 10.3109/0142159x.2012.714886]
Abstract
BACKGROUND Although technology-enhanced simulation is increasingly used in health professions education, features of effective simulation-based instructional design remain uncertain. AIMS Evaluate the effectiveness of instructional design features through a systematic review of studies comparing different simulation-based interventions. METHODS We systematically searched MEDLINE, EMBASE, CINAHL, ERIC, PsycINFO, Scopus, key journals, and previous review bibliographies through May 2011. We included original research studies that compared one simulation intervention with another and involved health professions learners. Working in duplicate, we evaluated study quality and abstracted information on learners, outcomes, and instructional design features. We pooled results using random effects meta-analysis. RESULTS From a pool of 10,903 articles we identified 289 eligible studies enrolling 18,971 trainees, including 208 randomized trials. Inconsistency was usually large (I² > 50%). For skills outcomes, pooled effect sizes (positive numbers favoring the instructional design feature) were 0.68 for range of difficulty (20 studies; p < 0.001), 0.68 for repetitive practice (7 studies; p = 0.06), 0.66 for distributed practice (6 studies; p = 0.03), 0.65 for interactivity (89 studies; p < 0.001), 0.62 for multiple learning strategies (70 studies; p < 0.001), 0.52 for individualized learning (59 studies; p < 0.001), 0.45 for mastery learning (3 studies; p = 0.57), 0.44 for feedback (80 studies; p < 0.001), 0.34 for longer time (23 studies; p = 0.005), 0.20 for clinical variation (16 studies; p = 0.24), and -0.22 for group training (8 studies; p = 0.09). CONCLUSIONS These results confirm quantitatively the effectiveness of several instructional design features in simulation-based education.
Meta-Analysis

5
Murad MH, Montori VM, Ioannidis JPA, Jaeschke R, Devereaux PJ, Prasad K, Neumann I, Carrasco-Labra A, Agoritsas T, Hatala R, Meade MO, Wyer P, Cook DJ, Guyatt G. How to read a systematic review and meta-analysis and apply the results to patient care: users' guides to the medical literature. JAMA 2014; 312:171-9. [PMID: 25005654] [DOI: 10.1001/jama.2014.5559]
Abstract
Clinical decisions should be based on the totality of the best evidence and not the results of individual studies. When clinicians apply the results of a systematic review or meta-analysis to patient care, they should start by evaluating the credibility of the methods of the systematic review, ie, the extent to which these methods have likely protected against misleading results. Credibility depends on whether the review addressed a sensible clinical question; included an exhaustive literature search; demonstrated reproducibility of the selection and assessment of studies; and presented results in a useful manner. For reviews that are sufficiently credible, clinicians must decide on the degree of confidence in the estimates that the evidence warrants (quality of evidence). Confidence depends on the risk of bias in the body of evidence; the precision and consistency of the results; whether the results directly apply to the patient of interest; and the likelihood of reporting bias. Shared decision making requires understanding of the estimates of magnitude of beneficial and harmful effects, and confidence in those estimates.
6
Hamstra SJ, Brydges R, Hatala R, Zendejas B, Cook DA. Reconsidering fidelity in simulation-based training. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2014; 89:387-92. [PMID: 24448038] [DOI: 10.1097/acm.0000000000000130]
Abstract
In simulation-based health professions education, the concept of simulator fidelity is usually understood as the degree to which a simulator looks, feels, and acts like a human patient. Although this can be a useful guide in designing simulators, this definition emphasizes technological advances and physical resemblance over principles of educational effectiveness. In fact, several empirical studies have shown that the degree of fidelity appears to be independent of educational effectiveness. The authors confronted these issues while conducting a recent systematic review of simulation-based health professions education, and in this Perspective they use their experience in conducting that review to examine key concepts and assumptions surrounding the topic of fidelity in simulation. Several concepts typically associated with fidelity are more useful in explaining educational effectiveness, such as transfer of learning, learner engagement, and suspension of disbelief. Given that these concepts more directly influence properties of the learning experience, the authors make the following recommendations: (1) abandon the term fidelity in simulation-based health professions education and replace it with terms reflecting the underlying primary concepts of physical resemblance and functional task alignment; (2) make a shift away from the current emphasis on physical resemblance to a focus on functional correspondence between the simulator and the applied context; and (3) focus on methods to enhance educational effectiveness using principles of transfer of learning, learner engagement, and suspension of disbelief. These recommendations clarify underlying concepts for researchers in simulation-based health professions education and will help advance this burgeoning field.
7
Cook DA, Brydges R, Zendejas B, Hamstra SJ, Hatala R. Mastery learning for health professionals using technology-enhanced simulation: a systematic review and meta-analysis. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2013; 88:1178-86. [PMID: 23807104] [DOI: 10.1097/acm.0b013e31829a365d]
Abstract
PURPOSE Competency-based education requires individualization of instruction. Mastery learning, an instructional approach requiring learners to achieve a defined proficiency before proceeding to the next instructional objective, offers one approach to individualization. The authors sought to summarize the quantitative outcomes of mastery learning simulation-based medical education (SBME) in comparison with no intervention and nonmastery instruction, and to determine what features of mastery SBME make it effective. METHOD The authors searched MEDLINE, EMBASE, CINAHL, ERIC, PsycINFO, Scopus, key journals, and previous review bibliographies through May 2011. They included original research in any language evaluating mastery SBME, in comparison with any intervention or no intervention, for practicing and student physicians, nurses, and other health professionals. Working in duplicate, they abstracted information on trainees, instructional design (interactivity, feedback, repetitions, and learning time), study design, and outcomes. RESULTS They identified 82 studies evaluating mastery SBME. In comparison with no intervention, mastery SBME was associated with large effects on skills (41 studies; effect size [ES] 1.29 [95% confidence interval, 1.08-1.50]) and moderate effects on patient outcomes (11 studies; ES 0.73 [95% CI, 0.36-1.10]). In comparison with nonmastery SBME instruction, mastery learning was associated with large benefit in skills (3 studies; effect size 1.17 [95% CI, 0.29-2.05]) but required more time. Pretraining and additional practice improved outcomes but, again, took longer. Studies exploring enhanced feedback and self-regulated learning in the mastery model showed mixed results. CONCLUSIONS Limited evidence suggests that mastery learning SBME is superior to nonmastery instruction but takes more time.
Meta-Analysis

8
Abstract
OBJECTIVE To compare the efficacy, nephrotoxicity, and ototoxicity of once-daily aminoglycoside dosing with those of standard aminoglycoside regimens in immunocompetent adults. DATA SOURCES A structured MEDLINE search from 1966 to April 1995 using the keywords aminoglycosides, drug administration schedule, and adult; bibliographic searching of review articles, position papers, and references of the selected articles; contact with primary authors of selected articles to obtain information not in the published reports and lists of potentially relevant articles. STUDY SELECTION Randomized, controlled trials that 1) compared an intravenous once-daily aminoglycoside regimen with a standard aminoglycoside regimen in infected immunocompetent adults and 2) examined efficacy, mortality, or toxicity. DATA EXTRACTION For each selected study, two independent reviewers assessed methodologic quality and abstracted data. The heterogeneity of individual study risk ratios was assessed and data were pooled using a random-effects model. RESULTS Forty-two studies were reviewed for possible inclusion. Thirteen independent studies met the selection criteria, and their results were pooled. The trials had a mean methodologic quality score of 0.69 (range, 0.50 to 0.91). Heterogeneity exists among the individual risk ratios for clinical cure (P = 0.07); significant heterogeneity does not exist for the other outcomes. For the pooled efficacy outcomes, the risk ratio for bacteriologic cure is 1.02 (95% CI, 0.99 to 1.05), and the risk ratio for mortality is 0.91 (CI, 0.63 to 1.31). For the pooled toxicity outcomes, the risk ratio for nephrotoxicity is 0.87 (CI, 0.60 to 1.26), and the risk ratio for ototoxicity is 0.67 (CI, 0.35 to 1.28). CONCLUSIONS Standard and once-daily aminoglycoside dosing regimens are equivalent with regard to bacteriologic cure, and once-daily dosing shows a trend toward reduced mortality and toxicity.
However, additional studies are needed for more precise estimates of mortality and toxicity risk ratios. The equivalency of the dosing regimens, the ease of administration, reduced nursing time, and reduced variability in the timing of drug administration that are associated with once-daily dosing may mean that the once-daily regimen is clinically advantageous.
Clinical Trial

9
Ilgen JS, Ma IWY, Hatala R, Cook DA. A systematic review of validity evidence for checklists versus global rating scales in simulation-based assessment. MEDICAL EDUCATION 2015; 49:161-73. [PMID: 25626747] [DOI: 10.1111/medu.12621]
Abstract
CONTEXT The relative advantages and disadvantages of checklists and global rating scales (GRSs) have long been debated. To compare the merits of these scale types, we conducted a systematic review of the validity evidence for checklists and GRSs in the context of simulation-based assessment of health professionals. METHODS We conducted a systematic review of multiple databases including MEDLINE, EMBASE and Scopus to February 2013. We selected studies that used both a GRS and checklist in the simulation-based assessment of health professionals. Reviewers working in duplicate evaluated five domains of validity evidence, including correlation between scales and reliability. We collected information about raters, instrument characteristics, assessment context, and task. We pooled reliability and correlation coefficients using random-effects meta-analysis. RESULTS We found 45 studies that used a checklist and GRS in simulation-based assessment. All studies included physicians or physicians in training; one study also included nurse anaesthetists. Topics of assessment included open and laparoscopic surgery (n = 22), endoscopy (n = 8), resuscitation (n = 7) and anaesthesiology (n = 4). The pooled GRS-checklist correlation was 0.76 (95% confidence interval [CI] 0.69-0.81, n = 16 studies). Inter-rater reliability was similar between scales (GRS 0.78, 95% CI 0.71-0.83, n = 23; checklist 0.81, 95% CI 0.75-0.85, n = 21), whereas GRS inter-item reliabilities (0.92, 95% CI 0.84-0.95, n = 6) and inter-station reliabilities (0.80, 95% CI 0.73-0.85, n = 10) were higher than those for checklists (0.66, 95% CI 0-0.84, n = 4 and 0.69, 95% CI 0.56-0.77, n = 10, respectively). Content evidence for GRSs usually referenced previously reported instruments (n = 33), whereas content evidence for checklists usually described expert consensus (n = 26). Checklists and GRSs usually had similar evidence for relations to other variables. 
CONCLUSIONS Checklist inter-rater reliability and trainee discrimination were more favourable than suggested in earlier work, but each task requires a separate checklist. Compared with the checklist, the GRS has higher average inter-item and inter-station reliability, can be used across multiple tasks, and may better capture nuanced elements of expertise.
Comparative Study

10
Abstract
CONTEXT Early clinical recognition of meningitis is imperative to allow clinicians to efficiently complete further tests and initiate appropriate therapy. OBJECTIVE To review the accuracy and precision of the clinical examination in the diagnosis of adult meningitis. DATA SOURCES A comprehensive review of English- and French-language literature was conducted by searching MEDLINE for 1966 to July 1997, using a structured search strategy. Additional references were identified by reviewing reference lists of pertinent articles. STUDY SELECTION The search yielded 139 potentially relevant studies, which were reviewed by the first author. Studies were included if they described the clinical examination in the diagnosis of objectively confirmed bacterial or viral meningitis. Studies were excluded if they enrolled predominantly children or immunocompromised adults or focused only on metastatic meningitis or meningitis of a single microbial origin. A total of 10 studies met the criteria and were included in the analysis. DATA EXTRACTION Validity of the studies was assessed by a critical appraisal of several components of the study design. These components included an assessment of the reference standard used to diagnose meningitis (lumbar puncture or autopsy), the completeness of patient ascertainment, and whether the clinical examination was described in sufficient detail to be reproducible. DATA SYNTHESIS Individual items of the clinical history have low accuracy for the diagnosis of meningitis in adults (pooled sensitivity for headache, 50% [95% confidence interval [CI], 32%-68%]; for nausea/vomiting, 30% [95% CI, 22%-38%]). On physical examination, the absence of fever, neck stiffness, and altered mental status effectively eliminates meningitis (sensitivity, 99%-100% for the presence of 1 of these findings). Of the classic signs of meningeal irritation, only 1 study has assessed Kernig sign; no studies subsequent to the original report have evaluated Brudzinski sign. 
Among patients with fever and headache, jolt accentuation of headache is a useful adjunctive maneuver, with a sensitivity of 100%, specificity of 54%, positive likelihood ratio of 2.2, and negative likelihood ratio of 0 for the diagnosis of meningitis. CONCLUSIONS Among adults with a clinical presentation that is low risk for meningitis, the clinical examination aids in excluding the diagnosis. However, given the seriousness of this infection, clinicians frequently need to proceed directly to lumbar puncture in high-risk patients. Many of the signs and symptoms of meningitis have been inadequately studied, and further prospective research is needed.
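The jolt-accentuation figures above can be reproduced with standard likelihood-ratio arithmetic. A minimal sketch follows; the helper names are ours, while the sensitivity (100%) and specificity (54%) come from the abstract:

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from test characteristics."""
    lr_pos = sensitivity / (1 - specificity)   # how much a positive finding raises suspicion
    lr_neg = (1 - sensitivity) / specificity   # how much a negative finding lowers it
    return lr_pos, lr_neg

def post_test_probability(pre_test_prob, lr):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

# Jolt accentuation of headache: sensitivity 1.00, specificity 0.54
lr_pos, lr_neg = likelihood_ratios(1.00, 0.54)
print(round(lr_pos, 1), round(lr_neg, 1))  # 2.2 0.0, matching the reported values
```

A negative likelihood ratio of 0 is what makes the maneuver useful for ruling out meningitis: a negative result drives the post-test probability to zero regardless of the starting estimate.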
Meta-Analysis

11
Cook DA, Zendejas B, Hamstra SJ, Hatala R, Brydges R. What counts as validity evidence? Examples and prevalence in a systematic review of simulation-based assessment. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2014; 19:233-50. [PMID: 23636643] [DOI: 10.1007/s10459-013-9458-4]
Abstract
Ongoing transformations in health professions education underscore the need for valid and reliable assessment. The current standard for assessment validation requires evidence from five sources: content, response process, internal structure, relations with other variables, and consequences. However, researchers remain uncertain regarding the types of data that contribute to each evidence source. We sought to enumerate the validity evidence sources and supporting data elements for assessments using technology-enhanced simulation. We conducted a systematic literature search including MEDLINE, ERIC, and Scopus through May 2011. We included original research that evaluated the validity of simulation-based assessment scores using two or more evidence sources. Working in duplicate, we abstracted information on the prevalence of each evidence source and the underlying data elements. Among 217 eligible studies, only six (3%) referenced the five-source framework, and 51 (24%) made no reference to any validity framework. The most common evidence sources and data elements were: relations with other variables (94% of studies; reported most often as variation in simulator scores across training levels), internal structure (76%; supported by reliability data or item analysis), and content (63%; reported as expert panels or modification of existing instruments). Evidence of response process and consequences were each present in <10% of studies. We conclude that relations with training level appear to be overrepresented in this field, while evidence of consequences and response process are infrequently reported. Validation science will be improved as educators use established frameworks to collect and interpret evidence from the full spectrum of possible sources and elements.
Review

12
Cook DA, Hatala R. Validation of educational assessments: a primer for simulation and beyond. Adv Simul (Lond) 2016; 1:31. [PMID: 29450000] [PMCID: PMC5806296] [DOI: 10.1186/s41077-016-0033-y]
Abstract
BACKGROUND Simulation plays a vital role in health professions assessment. This review provides a primer on assessment validation for educators and education researchers. We focus on simulation-based assessment of health professionals, but the principles apply broadly to other assessment approaches and topics. KEY PRINCIPLES Validation refers to the process of collecting validity evidence to evaluate the appropriateness of the interpretations, uses, and decisions based on assessment results. Contemporary frameworks view validity as a hypothesis, and validity evidence is collected to support or refute the validity hypothesis (i.e., that the proposed interpretations and decisions are defensible). In validation, the educator or researcher defines the proposed interpretations and decisions, identifies and prioritizes the most questionable assumptions in making these interpretations and decisions (the "interpretation-use argument"), empirically tests those assumptions using existing or newly-collected evidence, and then summarizes the evidence as a coherent "validity argument." A framework proposed by Messick identifies potential evidence sources: content, response process, internal structure, relationships with other variables, and consequences. Another framework proposed by Kane identifies key inferences in generating useful interpretations: scoring, generalization, extrapolation, and implications/decision. We propose an eight-step approach to validation that applies to either framework: Define the construct and proposed interpretation, make explicit the intended decision(s), define the interpretation-use argument and prioritize needed validity evidence, identify candidate instruments and/or create/adapt a new instrument, appraise existing evidence and collect new evidence as needed, keep track of practical issues, formulate the validity argument, and make a judgment: does the evidence support the intended use? 
CONCLUSIONS Rigorous validation first prioritizes and then empirically evaluates key assumptions in the interpretation and use of assessment scores. Validation science would be improved by more explicit articulation and prioritization of the interpretation-use argument, greater use of formal validation frameworks, and more evidence informing the consequences and implications of assessment.
research-article | 9 | 193 | 13
Bucher HC, Cook RJ, Guyatt GH, Lang JD, Cook DJ, Hatala R, Hunt DL. Effects of dietary calcium supplementation on blood pressure. A meta-analysis of randomized controlled trials. JAMA 1996; 275:1016-22. [PMID: 8596234 DOI: 10.1001/jama.1996.03530370054031] [Citation(s) in RCA: 190] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]
Abstract
OBJECTIVE To review the effect of supplemental calcium on blood pressure. DATA SOURCE We searched MEDLINE and EMBASE for 1966 to May 1994. We contacted authors of eligible trials to ensure accuracy and completeness of data and to identify unpublished trials. STUDY SELECTION We included any study in which investigators randomized people to calcium supplementation or placebo and measured blood pressure for at least 2 weeks. Fifty-six articles met the inclusion criteria, and 33 were eligible for analysis, involving a total of 2412 patients. DATA EXTRACTION Two pairs of independent reviewers abstracted data and assessed validity according to six quality criteria. DATA SYNTHESIS We calculated the differences in blood pressure change between the calcium supplementation group and the control group and pooled the estimates, with each trial weighted with the inverse of the variance using a random-effects model. Predictors of blood pressure reduction that we examined included method of supplementation, baseline blood pressure, and the methodological quality of the studies. The pooled analysis showed a reduction in systolic blood pressure of -1.27 mm Hg (95% confidence interval [CI], -2.25 to -0.29 mm Hg; P=.01) and in diastolic blood pressure of -0.24 mm Hg (95% CI, -0.92 to 0.44 mm Hg; P=.49). None of the possible mediators of blood pressure reduction explained differences in treatment effects. CONCLUSIONS Calcium supplementation may lead to a small reduction in systolic but not diastolic blood pressure. The results do not exclude a larger, important effect of calcium on blood pressure in subpopulations. In particular, further studies should address the hypothesis that inadequate calcium intake is associated with increased blood pressure that can be corrected with calcium supplementation.
Meta-Analysis | 29 | 190 | 14
Brydges R, Hatala R, Zendejas B, Erwin PJ, Cook DA. Linking simulation-based educational assessments and patient-related outcomes: a systematic review and meta-analysis. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2015; 90:246-56. [PMID: 25374041 DOI: 10.1097/acm.0000000000000549] [Citation(s) in RCA: 171] [Impact Index Per Article: 17.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/07/2023]
Abstract
PURPOSE To examine the evidence supporting the use of simulation-based assessments as surrogates for patient-related outcomes assessed in the workplace. METHOD The authors systematically searched MEDLINE, EMBASE, Scopus, and key journals through February 26, 2013. They included original studies that assessed health professionals and trainees using simulation and then linked those scores with patient-related outcomes assessed in the workplace. Two reviewers independently extracted information on participants, tasks, validity evidence, study quality, patient-related and simulation-based outcomes, and magnitude of correlation. All correlations were pooled using random-effects meta-analysis. RESULTS Of 11,628 potentially relevant articles, the 33 included studies enrolled 1,203 participants, including postgraduate physicians (n = 24 studies), practicing physicians (n = 8), medical students (n = 6), dentists (n = 2), and nurses (n = 1). The pooled correlation for provider behaviors was 0.51 (95% confidence interval [CI], 0.38 to 0.62; n = 27 studies); for time behaviors, 0.44 (95% CI, 0.15 to 0.66; n = 7); and for patient outcomes, 0.24 (95% CI, -0.02 to 0.47; n = 5). Most reported validity evidence was favorable, though studies often included only correlational evidence. Validity evidence of internal structure (n = 13 studies), content (n = 12), response process (n = 2), and consequences (n = 1) were reported less often. Three tools showed large pooled correlations and favorable (albeit incomplete) validity evidence. CONCLUSIONS Simulation-based assessments often correlate positively with patient-related outcomes. Although these surrogates are imperfect, tools with established validity evidence may replace workplace-based assessments for evaluating select procedural skills.
Meta-Analysis | 10 | 171 | 15
Ginsburg S, Regehr G, Hatala R, McNaughton N, Frohna A, Hodges B, Lingard L, Stern D. Context, conflict, and resolution: a new conceptual framework for evaluating professionalism. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2000; 75:S6-S11. [PMID: 11031159 DOI: 10.1097/00001888-200010001-00003] [Citation(s) in RCA: 163] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Review | 25 | 163 | 16
Editorial | 23 | 154 | 17
Barratt A, Wyer PC, Hatala R, McGinn T, Dans AL, Keitz S, Moyer V, For GG. Tips for learners of evidence-based medicine: 1. Relative risk reduction, absolute risk reduction and number needed to treat. CMAJ 2004; 171:353-8. [PMID: 15313996 PMCID: PMC509050 DOI: 10.1503/cmaj.1021197] [Citation(s) in RCA: 134] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/01/2022] Open
Review | 21 | 134 | 18
Bucher HC, Guyatt GH, Cook RJ, Hatala R, Cook DJ, Lang JD, Hunt D. Effect of calcium supplementation on pregnancy-induced hypertension and preeclampsia: a meta-analysis of randomized controlled trials. JAMA 1996; 275:1113-7. [PMID: 8601931 DOI: 10.1001/jama.1996.03530380055031] [Citation(s) in RCA: 121] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]
Abstract
OBJECTIVE To review the effect of calcium supplementation during pregnancy on blood pressure, preeclampsia, and adverse outcomes of pregnancy. DATA SOURCE We searched MEDLINE and EMBASE for 1966 to May 1994. We contacted authors of eligible trials to ensure accuracy and completeness of data and to identify unpublished trials. STUDY SELECTION Fourteen randomized trials involving 2459 women were eligible. DATA EXTRACTION Reviewers working independently in pairs abstracted data and assessed validity according to six quality criteria. DATA SYNTHESIS Each trial yielded differences in blood pressure change between calcium supplementation and control groups that we weighted by the inverse of the variance. The pooled analysis showed a reduction in systolic blood pressure of -5.40 mm Hg (95% confidence interval [CI], -7.81 to -3.00 mm Hg; P<.001) and in diastolic blood pressure of -3.44 mm Hg (95% CI, -5.20 to -1.68 mm Hg; P<.001). The odds ratio for preeclampsia in women with calcium supplementation compared with placebo was 0.38 (95% CI, 0.22 to 0.65). CONCLUSIONS Calcium supplementation during pregnancy leads to an important reduction in systolic and diastolic blood pressure and preeclampsia. While pregnant women at risk of preeclampsia should consider taking calcium, many more patient events are needed to confirm calcium's impact on maternal and fetal morbidity.
Meta-Analysis | 29 | 121 | 19
Cook DA, Brydges R, Zendejas B, Hamstra SJ, Hatala R. Technology-enhanced simulation to assess health professionals: a systematic review of validity evidence, research methods, and reporting quality. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2013; 88:872-83. [PMID: 23619073 DOI: 10.1097/acm.0b013e31828ffdcf] [Citation(s) in RCA: 117] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/07/2023]
Abstract
PURPOSE To summarize the tool characteristics, sources of validity evidence, methodological quality, and reporting quality for studies of technology-enhanced simulation-based assessments for health professions learners. METHOD The authors conducted a systematic review, searching MEDLINE, EMBASE, CINAHL, ERIC, PsycINFO, Scopus, key journals, and previous reviews through May 2011. They selected original research in any language evaluating simulation-based assessment of practicing and student physicians, nurses, and other health professionals. Reviewers working in duplicate evaluated validity evidence using Messick's five-source framework; methodological quality using the Medical Education Research Study Quality Instrument and the revised Quality Assessment of Diagnostic Accuracy Studies; and reporting quality using the Standards for Reporting Diagnostic Accuracy and Guidelines for Reporting Reliability and Agreement Studies. RESULTS Of 417 studies, 350 (84%) involved physicians at some stage in training. Most focused on procedural skills, including minimally invasive surgery (N=142), open surgery (81), and endoscopy (67). Common elements of validity evidence included relations with trainee experience (N=306), content (142), relations with other measures (128), and interrater reliability (124). Of the 217 studies reporting more than one element of evidence, most were judged as having high or unclear risk of bias due to selective sampling (N=192) or test procedures (132). Only 64% proposed a plan for interpreting the evidence to be presented (validity argument). CONCLUSIONS Validity evidence for simulation-based assessments is sparse and is concentrated within specific specialties, tools, and sources of validity evidence. The methodological and reporting quality of assessment studies leaves much room for improvement.
Review | 12 | 117 | 20
Pusic MV, Boutis K, Hatala R, Cook DA. Learning curves in health professions education. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2015; 90:1034-42. [PMID: 25806621 DOI: 10.1097/acm.0000000000000681] [Citation(s) in RCA: 113] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2023]
Abstract
Learning curves, which graphically show the relationship between learning effort and achievement, are common in published education research but are not often used in day-to-day educational activities. The purpose of this article is to describe the generation and analysis of learning curves and their applicability to health professions education. The authors argue that the time is right for a closer look at using learning curves, given their desirable properties, to inform both self-directed instruction by individuals and education management by instructors. A typical learning curve is made up of a measure of learning (y-axis), a measure of effort (x-axis), and a mathematical linking function. At the individual level, learning curves make manifest a single person's progress towards competence, including his/her rate of learning, the inflection point where learning becomes more effortful, and the remaining distance to mastery attainment. At the group level, overlaid learning curves show the full variation of a group of learners' paths through a given learning domain. Specifically, they make overt the difference between time-based and competency-based approaches to instruction. Additionally, instructors can use learning curve information to more accurately target educational resources to those who most require them. The learning curve approach requires a fine-grained collection of data that will not be possible in all educational settings; however, the increased use of an assessment paradigm that explicitly includes effort and its link to individual achievement could result in increased learner engagement and more effective instructional design.
| 10 | 113 | 21
Hatala R, Keitz S, Wyer P, Guyatt G. Tips for learners of evidence-based medicine: 4. Assessing heterogeneity of primary studies in systematic reviews and whether to combine their results. CMAJ 2005; 172:661-5. [PMID: 15738493 PMCID: PMC550638 DOI: 10.1503/cmaj.1031920] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open
Review | 20 | 106 | 22
Abstract
To examine the effect of clinical history on the electrocardiogram (ECG) interpretation skills of physicians with different levels of expertise, we randomly allocated to an ECG test package 30 final-year medical students, 15 second-year internal medicine residents, and 15 university cardiologists at university-affiliated teaching hospitals. All participants interpreted the same set of 10 ECGs. Each ECG was accompanied by a brief clinical history suggestive of the correct ECG diagnosis, or the most plausible alternative diagnosis, or no history. Provision of a correct history improved accuracy by 4% to 12% compared with no history, depending on level of training. Conversely, a misleading history compared with no history reduced accuracy by 5% for cardiologists, 25% for residents, and 19% for students. Clinical history also affected the participants' frequencies of listing ECG features consistent with the correct diagnosis and features consistent with the alternative diagnosis (all p values < .05). For physicians at all levels of expertise, clinical history has an influence on ECG diagnostic accuracy, both improving accuracy when the history suggests the correct diagnosis, and reducing accuracy when the history suggests an alternative diagnosis.
Clinical Trial | 26 | 104 | 23
Hatala R, Cook DA, Brydges R, Hawkins R. Constructing a validity argument for the Objective Structured Assessment of Technical Skills (OSATS): a systematic review of validity evidence. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2015; 20:1149-75. [PMID: 25702196 DOI: 10.1007/s10459-015-9593-1] [Citation(s) in RCA: 97] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/20/2014] [Accepted: 02/15/2015] [Indexed: 05/28/2023]
Abstract
In order to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane's framework, we conducted a systematic review. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected original research articles in any language evaluating the OSATS as an assessment tool for any health professional. We iteratively and collaboratively extracted validity evidence from included articles to construct and evaluate the validity argument for varied uses of the OSATS. Twenty-nine articles met the inclusion criteria, all focussed on surgical technical skills assessment. We identified three intended uses for the OSATS, namely formative feedback, high-stakes assessment and program evaluation. Following Kane's framework, four inferences in the validity argument were examined (scoring, generalization, extrapolation, decision). For formative feedback and high-stakes assessment, there was reasonable evidence for scoring and extrapolation. However, for high-stakes assessment there was a dearth of evidence for generalization aside from inter-rater reliability data and an absence of evidence linking multi-station OSATS scores to performance in real clinical settings. For program evaluation, the OSATS validity argument was supported by reasonable generalization and extrapolation evidence. There was a complete lack of evidence regarding implications and decisions based on OSATS scores. In general, validity evidence supported the use of the OSATS for formative feedback. Research to provide support for decisions based on OSATS scores is required if the OSATS is to be used for higher-stakes decisions and program evaluation.
Review | 10 | 97 | 24
Hatala R, Cook DA, Zendejas B, Hamstra SJ, Brydges R. Feedback for simulation-based procedural skills training: a meta-analysis and critical narrative synthesis. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2014; 19:251-72. [PMID: 23712700 DOI: 10.1007/s10459-013-9462-8] [Citation(s) in RCA: 95] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/07/2013] [Accepted: 05/13/2013] [Indexed: 05/11/2023]
Abstract
Although feedback has been identified as a key instructional feature in simulation-based medical education (SBME), we remain uncertain as to the magnitude of its effectiveness and the mechanisms by which it may be effective. We employed a meta-analysis and critical narrative synthesis to examine the effectiveness of feedback for SBME procedural skills training and to examine how it works in this context. Our results demonstrate that feedback is moderately effective during procedural skills training in SBME, with a pooled effect size favoring feedback for skill outcomes of 0.74 (95% CI 0.38-1.09; p < .001). Terminal feedback appears more effective than concurrent feedback for novice learners' skill retention. Multiple sources of feedback, including instructor feedback, lead to short-term performance gains although data on long-term effects is lacking. The mechanism by which feedback may be operating is consistent with the guidance hypothesis, with more research needed to examine other mechanisms such as cognitive load theory and social development theory.
Meta-Analysis | 11 | 95 | 25
LaDonna KA, Hatala R, Lingard L, Voyer S, Watling C. Staging a performance: learners' perceptions about direct observation during residency. MEDICAL EDUCATION 2017; 51:498-510. [PMID: 28247495 DOI: 10.1111/medu.13232] [Citation(s) in RCA: 94] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/05/2016] [Revised: 06/29/2016] [Accepted: 09/27/2016] [Indexed: 05/14/2023]
Abstract
CONTEXT Evidence strongly supports that direct observation is a valid and reliable assessment tool; support for its impact on learning is less compelling, and we know that some learners are ambivalent about being observed. However, learners' perceptions about the impact of direct observation on their learning and professional development remain underexplored. To promote learning, we need to understand what makes direct observation valuable for learners. METHODS Informed by constructivist grounded theory, we interviewed 22 learners about their observation experiences. Data collection and analysis occurred iteratively; themes were identified using constant comparative analysis. RESULTS Direct observation was widely endorsed as an important educational strategy, albeit one that created significant anxiety. Opaque expectations exacerbated participants' discomfort, and participants described that being observed felt like being assessed. Consequently, participants exchanged their 'usual' practice for a 'textbook' approach; alterations to performance generated uncertainty about their role, and raised questions about whether observers saw an authentic portrayal of their knowledge and skill. CONCLUSION An 'observer effect' may partly explain learners' ambivalence about direct observation; being observed seemed to magnify learners' role ambiguity, intensify their tensions around professional development and raise questions about the credibility of feedback. In turn, an observer effect may impact learners' receptivity to feedback and may explain, in part, learners' perceptions that useful feedback is scant. For direct observation to be valuable, educators must be explicit about expectations, and they must be aware that how learners perform in the presence of an observer may not reflect what they do as independent practitioners. To nurture learners' professional development, educators must create a culture of observation-based coaching that is divorced from assessment and is tailored to developing learners' identities as practitioners of both the art and the science of medicine.
| 8 | 94