1. Dukhanin V, Gamper MJ, Gleason KT, McDonald KM. Patient-reported outcome and experience domains for diagnostic excellence: a scoping review to inform future measure development. Qual Life Res 2024;33:2883-2897. PMID: 38850395. DOI: 10.1007/s11136-024-03709-w.
Abstract
PURPOSE "Diagnostic excellence," as a relatively new construct centered on the diagnostic process and its health-related outcomes, can be refined by patient reporting and its measurement. We aimed to explore the scope of patient-reported outcome (PRO) and patient-reported experience (PRE) domains that are diagnostically relevant, regardless of the eventually diagnosed condition, and to review the state of measurement of these patient-reported domains. METHODS We conducted an exploratory analysis to identify these domains, employing a scoping review supplemented with internal expert consultations, a 24-member international expert convening, additional environmental scans, and validation of the domains' diagnostic relevance by mapping them onto patient diagnostic journeys. We created a narrative bibliography of the domains, illustrating them with existing measurement examples. RESULTS We identified 41 diagnostically relevant PRO and PRE domains. We classified 10 domains as PRO, 28 as PRE, and 3 as mixed PRO/PRE. Among these domains, 19 were captured in existing instruments, and 20 were captured only in qualitative studies. Two domains were conceptualized during this exploratory analysis, with no examples of their capture identified. For 27 domains, patients and care partners report on a specific encounter; for 14 domains, reporting relates to an entire diagnostic journey over time, which presents particular measurement opportunities and challenges. CONCLUSION The multitude of PRO and PRE domains, if measured rigorously, would allow the diagnostic excellence construct to evolve further in a manner that is patient-centered, prospectively focused, and concentrated on the effectiveness and efficiency of diagnostic care in supporting patients' well-being.
Affiliation(s)
- Vadim Dukhanin: Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, 624 N Broadway, Suite 643, Baltimore, MD 21205, USA
- Mary Jo Gamper: Johns Hopkins University School of Nursing, Baltimore, MD, USA
- Kelly T Gleason: Johns Hopkins University School of Nursing, Baltimore, MD, USA
- Kathryn M McDonald: Johns Hopkins University School of Nursing, Baltimore, MD, USA; Division of General Internal Medicine, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
2. Hirosawa T, Harada Y, Mizuta K, Sakamoto T, Tokumasu K, Shimizu T. Evaluating ChatGPT-4's Accuracy in Identifying Final Diagnoses Within Differential Diagnoses Compared With Those of Physicians: Experimental Study for Diagnostic Cases. JMIR Form Res 2024;8:e59267. PMID: 38924784. PMCID: PMC11237772. DOI: 10.2196/59267.
Abstract
BACKGROUND The potential of artificial intelligence (AI) chatbots, particularly ChatGPT with GPT-4 (OpenAI), to assist with medical diagnosis is an emerging research area. However, it is not yet clear how well AI chatbots can evaluate whether the final diagnosis is included in differential diagnosis lists. OBJECTIVE This study aims to assess the capability of GPT-4 to identify the final diagnosis from differential diagnosis lists and to compare its performance with that of physicians on a case report series. METHODS We used a database of differential diagnosis lists from case reports in the American Journal of Case Reports, each corresponding to a final diagnosis. These lists were generated by 3 AI systems: GPT-4, Google Bard (now Google Gemini), and LLaMA 2 (Meta AI). The primary outcome was whether GPT-4's evaluations identified the final diagnosis within these lists. None of these AI systems received additional medical training or reinforcement. For comparison, 2 independent physicians also evaluated the lists, with any inconsistencies resolved by another physician. RESULTS The 3 AI systems generated a total of 1176 differential diagnosis lists from 392 case descriptions. GPT-4's evaluations concurred with those of the physicians in 966 of the 1176 lists (82.1%). The Cohen κ coefficient was 0.63 (95% CI 0.56-0.69), indicating fair to good agreement between GPT-4's and the physicians' evaluations. CONCLUSIONS GPT-4 demonstrated fair to good agreement with physicians in identifying the final diagnosis from differential diagnosis lists generated for a case report series. Its ability to compare differential diagnosis lists with final diagnoses suggests its potential to support clinical decision-making through diagnostic feedback. While GPT-4 showed fair to good agreement in this evaluation task, its application in real-world scenarios and further validation in diverse clinical environments are essential to fully understand its utility in the diagnostic process.
Affiliation(s)
- Takanobu Hirosawa: Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Tochigi, Japan
- Yukinori Harada: Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Tochigi, Japan
- Kazuya Mizuta: Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Tochigi, Japan
- Tetsu Sakamoto: Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Tochigi, Japan
- Kazuki Tokumasu: Department of General Medicine, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama, Japan
- Taro Shimizu: Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Tochigi, Japan
3. Butler JM, Taft T, Taber P, Rutter E, Fix M, Baker A, Weir C, Nevers M, Classen D, Cosby K, Jones M, Chapman A, Jones BE. Pneumonia diagnosis performance in the emergency department: a mixed-methods study about clinicians' experiences and exploration of individual differences and response to diagnostic performance feedback. J Am Med Inform Assoc 2024;31:1503-1513. PMID: 38796835. PMCID: PMC11187426. DOI: 10.1093/jamia/ocae112.
Abstract
OBJECTIVES We sought to (1) characterize the process of diagnosing pneumonia in an emergency department (ED) and (2) examine clinician reactions to a clinician-facing diagnostic discordance feedback tool. MATERIALS AND METHODS We designed a diagnostic feedback tool, using electronic health record data from ED clinicians' patients to establish concordance or discordance between ED diagnosis, radiology reports, and hospital discharge diagnosis for pneumonia. We conducted semistructured interviews with 11 ED clinicians about pneumonia diagnosis and reactions to the feedback tool. We administered surveys measuring individual differences in mindset beliefs, comfort with feedback, and feedback tool usability. We qualitatively analyzed interview transcripts and descriptively analyzed survey data. RESULTS Thematic results revealed: (1) the diagnostic process for pneumonia in the ED is characterized by diagnostic uncertainty and may be secondary to the goals of treating the patient and determining disposition; (2) clinician diagnostic self-evaluation is a fragmented, inconsistent process of case review and follow-up, a gap that a feedback tool could fill; (3) the feedback tool was described favorably, with task and normative feedback harnessing clinician values of high-quality patient care and personal excellence; and (4) strong reactions to diagnostic feedback varied from implicit trust to profound skepticism about the validity of the concordance metric. Survey results suggested a relationship between clinicians' individual differences in learning and failure beliefs, their feedback experience, and their usability ratings. DISCUSSION AND CONCLUSION Clinicians value feedback on pneumonia diagnoses. Our results highlight the importance of feedback about diagnostic performance and suggest directions for accounting for individual differences in feedback tool design and implementation.
Affiliation(s)
- Jorie M Butler: Department of Biomedical Informatics, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States; Department of Internal Medicine, Division of Geriatrics, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84132, United States; Salt Lake City VA Informatics Decision-Enhancement and Analytic Sciences (IDEAS) Center for Innovation, Salt Lake City, UT 84148, United States; Geriatrics Research, Education, and Clinical Center (GRECC), VA Salt Lake City Health Care System, Salt Lake City, UT 84148, United States
- Teresa Taft: Department of Biomedical Informatics, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
- Peter Taber: Department of Biomedical Informatics, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States; Salt Lake City VA Informatics Decision-Enhancement and Analytic Sciences (IDEAS) Center for Innovation, Salt Lake City, UT 84148, United States
- Elizabeth Rutter: Department of Emergency Medicine, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
- Megan Fix: Department of Emergency Medicine, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
- Alden Baker: Department of Family and Preventive Medicine, Division of Physician Assistant Studies, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
- Charlene Weir: Department of Biomedical Informatics, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
- McKenna Nevers: Department of Internal Medicine, Division of Epidemiology, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
- David Classen: Department of Internal Medicine, Division of Epidemiology, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
- Karen Cosby: Department of Emergency Medicine, Cook County Hospital, Rush Medical College, Chicago, IL 60612, United States
- Makoto Jones: Salt Lake City VA Informatics Decision-Enhancement and Analytic Sciences (IDEAS) Center for Innovation, Salt Lake City, UT 84148, United States; Department of Internal Medicine, Division of Epidemiology, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
- Alec Chapman: Department of Population Health Sciences, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
- Barbara E Jones: Salt Lake City VA Informatics Decision-Enhancement and Analytic Sciences (IDEAS) Center for Innovation, Salt Lake City, UT 84148, United States; Department of Internal Medicine, Division of Pulmonology, University of Utah Spencer Fox Eccles School of Medicine, Salt Lake City, UT 84108, United States
4. Caddick ZA, Fraundorf SH, Rottman BM, Nokes-Malach TJ. Cognitive perspectives on maintaining physicians' medical expertise: II. Acquiring, maintaining, and updating cognitive skills. Cogn Res Princ Implic 2023;8:47. PMID: 37488460. PMCID: PMC10366061. DOI: 10.1186/s41235-023-00497-8.
Abstract
Over the course of training, physicians develop significant knowledge and expertise. We review dual-process theory, the dominant theory explaining medical decision making: physicians use both heuristics from accumulated experience (System 1) and logical deduction (System 2). We then discuss how the accumulation of System 1 clinical experience can have both positive effects (e.g., quick and accurate pattern recognition) and negative ones (e.g., gaps and biases in knowledge arising from physicians' idiosyncratic clinical experience). These idiosyncrasies, biases, and knowledge gaps indicate a need for physicians to engage in appropriate training and study to keep their cognitive skills current lest they decline over time. Indeed, we review converging evidence that physicians further out from training tend to perform worse on tests of medical knowledge and provide poorer patient care. This may reflect a variety of factors, such as specialization of a physician's practice, but is likely to stem at least in part from cognitive factors. Acquired knowledge and skills may not always be readily accessible to physicians for a number of reasons, including an absence of study, cognitive changes with age, and the presence of similar knowledge or skills that compete to be brought to mind. Lastly, we discuss the cognitive challenges of keeping up with standards of care that continuously evolve over time.
Affiliation(s)
- Zachary A Caddick: Learning Research and Development Center, University of Pittsburgh, 3420 Forbes Ave., Pittsburgh, PA 15260, USA; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- Scott H Fraundorf: Learning Research and Development Center, University of Pittsburgh, 3420 Forbes Ave., Pittsburgh, PA 15260, USA; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- Benjamin M Rottman: Learning Research and Development Center, University of Pittsburgh, 3420 Forbes Ave., Pittsburgh, PA 15260, USA; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- Timothy J Nokes-Malach: Learning Research and Development Center, University of Pittsburgh, 3420 Forbes Ave., Pittsburgh, PA 15260, USA; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
5. Rottman BM, Caddick ZA, Nokes-Malach TJ, Fraundorf SH. Cognitive perspectives on maintaining physicians' medical expertise: I. Reimagining Maintenance of Certification to promote lifelong learning. Cogn Res Princ Implic 2023;8:46. PMID: 37486508. PMCID: PMC10366070. DOI: 10.1186/s41235-023-00496-9.
Abstract
Until recently, physicians in the USA who were board-certified in a specialty needed to take a summative test every 6-10 years. However, the 24 Member Boards of the American Board of Medical Specialties are in the process of switching toward much more frequent assessments, which we refer to as longitudinal assessment. The goal of longitudinal assessment is to provide formative feedback that helps physicians learn content they do not know, as well as to serve as an evaluation for board certification. We present five articles collectively covering the science behind this change, the likely outcomes, and some open questions. This initial article introduces the context behind the change and discusses various forms of lifelong learning opportunities that can help physicians stay current, including longitudinal assessment, along with the pros and cons of each.
Affiliation(s)
- Benjamin M Rottman: Learning Research and Development Center, University of Pittsburgh, 3420 Forbes Ave., Pittsburgh, PA 15260, USA; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- Zachary A Caddick: Learning Research and Development Center, University of Pittsburgh, 3420 Forbes Ave., Pittsburgh, PA 15260, USA; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- Timothy J Nokes-Malach: Learning Research and Development Center, University of Pittsburgh, 3420 Forbes Ave., Pittsburgh, PA 15260, USA; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- Scott H Fraundorf: Learning Research and Development Center, University of Pittsburgh, 3420 Forbes Ave., Pittsburgh, PA 15260, USA; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
6. Schaye V, Parsons AS, Graber ML, Olson APJ. The future of diagnosis - where are we going? Diagnosis (Berl) 2023;10:1-3. PMID: 36720463. DOI: 10.1515/dx-2023-0003.
Affiliation(s)
- Verity Schaye: Department of Medicine, NYU Grossman School of Medicine, New York, NY, USA
- Andrew S Parsons: Department of Medicine, Section of Hospital Medicine, University of Virginia School of Medicine, Charlottesville, VA, USA
- Mark L Graber: Founder and President Emeritus, Society to Improve Diagnosis in Medicine, Plymouth, MA, USA; Professor Emeritus, Stony Brook University, NY, USA
- Andrew P J Olson: Division of Hospital Medicine, Department of Medicine, and Division of Pediatric Hospital Medicine, Department of Pediatrics, University of Minnesota Medical School, Minneapolis, MN, USA