1
Rahman H, Anggadiredja K, Sasongko L. Mechanisms of oral ciprofloxacin-induced depressive-like behavior and the potential benefit of lactulose: A correlation analysis. Toxicol Rep 2025; 14:101920. [PMID: 39911318 PMCID: PMC11795828 DOI: 10.1016/j.toxrep.2025.101920] [Received: 12/05/2024] [Revised: 01/06/2025] [Accepted: 01/19/2025] [Indexed: 02/07/2025]
Abstract
Prolonged administration of antibiotics may be associated with depression due to the potential risk of dysbiosis. Thus, the restoration of microbial balance through administration of prebiotics might overcome the problem. This study investigated the mechanisms of antibiotic-induced depression, which were explored through statistical correlation analysis. The potential benefit of lactulose, a prebiotic, on this behavioral disorder was further assessed. Rats were assigned to groups receiving 102.8 mg/kg ciprofloxacin daily for 1, 8, 15, or 22 days. A separate group of rats was given the same regimen for 8 days together with lactulose at 2056 mg/kg. Upon completion of ciprofloxacin administration, the rats were tested for depression-like behavior (forced swimming test, FST; and sucrose preference test, SPT). They were then sacrificed for biochemical assessment in the hippocampus and prefrontal cortex. The mechanism studies revealed significant correlations between SPT and serotonin in the hippocampus, and between SPT and serotonin, cortisol, and NF-κB in the prefrontal cortex. FST was significantly correlated with serotonin in both the hippocampus and the prefrontal cortex, and in the prefrontal cortex it was also significantly correlated with cortisol, NF-κB, and IL-6. Based on the aforementioned results, it was found that lactulose improved FST by targeting serotonin in the hippocampus. This study indicates that ciprofloxacin induces depression-like behavior via modulation of several neurotransmitter systems as well as proinflammatory cytokines in the hippocampus and prefrontal cortex. The results further suggest the potential of lactulose to improve this behavior.
Affiliation(s)
- Havizur Rahman
  - Department of Pharmaceutics, School of Pharmacy, Institut Teknologi Bandung, Bandung 41116, Indonesia
  - Department of Pharmacy, Faculty of Medicine and Health Sciences, University of Jambi, Jambi 36361, Indonesia
- Kusnandar Anggadiredja
  - Department of Pharmacology and Clinical Pharmacy, School of Pharmacy, Institut Teknologi Bandung, Bandung 41116, Indonesia
- Lucy Sasongko
  - Department of Pharmaceutics, School of Pharmacy, Institut Teknologi Bandung, Bandung 41116, Indonesia
2
Chima S, Martinez-Gutierrez J, Hunter B, Laughlin A, Chondros P, Lumsden N, Boyle D, Nelson C, Amores P, Tran-Duy A, Manski-Nankervis JA, Emery J. Future Health Today and patients at risk of undiagnosed cancer: a pragmatic cluster randomised trial of quality-improvement activities in general practice. Br J Gen Pract 2025; 75:e306-e315. [PMID: 39567181 PMCID: PMC12010534 DOI: 10.3399/bjgp.2024.0491] [Received: 08/05/2024] [Accepted: 11/04/2024] [Indexed: 11/22/2024]
Abstract
BACKGROUND Diagnosing cancer in general practice is complex, given the non-specific nature of many presenting symptoms and the overlap of potential diagnoses. AIM This trial aimed to evaluate the effectiveness of Future Health Today (FHT) - a technology that provides clinical decision support, auditing, and quality-improvement monitoring - on the appropriate follow-up of patients at risk of undiagnosed cancer. DESIGN AND SETTING Pragmatic, cluster randomised trial undertaken in general practices in Victoria and Tasmania, Australia. METHOD Practices were randomly assigned to receive recommendations for follow-up investigations for cancer (FHT cancer module) or the active control. Algorithms were applied to the electronic medical record, and used demographic information and abnormal test results that are associated with a risk of undiagnosed cancer (that is, anaemia/iron deficiency, thrombocytosis, and raised prostate-specific antigen) to identify patients requiring further investigation and provide recommendations for care. The intervention consisted of the FHT cancer module, a case-based learning series, and ongoing practice support. Using the intention-to-treat approach, the between-arm difference in the proportion of patients with abnormal test results who were followed up according to guidelines was determined at 12 months. RESULTS In total, 7555 patients were identified as at risk of undiagnosed cancer. At 12 months post-randomisation, 76.0% of patients in the intervention arm had received recommended follow-up (21 practices, n = 2820/3709), compared with 70.0% in the control arm (19 practices, n = 2693/3846; estimated between-arm difference = 2.6% [95% confidence interval (CI) = -2.8% to 7.9%]; odds ratio = 1.15 [95% CI = 0.87 to 1.53]; P = 0.332). CONCLUSION The FHT cancer module intervention did not increase the proportion of patients receiving guideline-concordant care.
The proportion of patients receiving recommended follow-up was high, suggesting a possible ceiling effect for the intervention.
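The percentages reported in the abstract can be reproduced from its counts. The sketch below computes crude (unadjusted) proportions and a crude odds ratio from the 2x2 table. It ignores clustering by practice, so it is not the trial's analysis, and the crude values should be expected to differ from the reported between-arm difference (2.6%) and odds ratio (1.15). All variable names are illustrative.

```python
# Counts taken from the abstract: patients followed up / patients at risk, per arm.
intervention_followed, intervention_total = 2820, 3709
control_followed, control_total = 2693, 3846

p_intervention = intervention_followed / intervention_total
p_control = control_followed / control_total

# Crude odds ratio from the 2x2 table (no adjustment for practice-level clustering).
odds_intervention = intervention_followed / (intervention_total - intervention_followed)
odds_control = control_followed / (control_total - control_followed)
crude_or = odds_intervention / odds_control

print(f"intervention: {p_intervention:.1%}, control: {p_control:.1%}")
print(f"crude difference: {p_intervention - p_control:.1%}, crude OR: {crude_or:.2f}")
```

The gap between the crude and the reported estimates is a reminder that a cluster randomised design is analysed at more than the pooled patient level.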
Affiliation(s)
- Sophie Chima
  - Department of General Practice and Primary Care, University of Melbourne, Melbourne, Australia
- Javiera Martinez-Gutierrez
  - Department of General Practice and Primary Care, University of Melbourne, Melbourne, Australia; Department of Family Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile
- Barbara Hunter
  - Department of General Practice and Primary Care, University of Melbourne, Melbourne, Australia
- Adrian Laughlin
  - Department of General Practice and Primary Care, University of Melbourne, Melbourne, Australia
- Patty Chondros
  - Department of General Practice and Primary Care, University of Melbourne, Melbourne, Australia
- Natalie Lumsden
  - Department of General Practice and Primary Care, University of Melbourne, Melbourne, Australia; Western Health Chronic Disease Alliance, Western Health, Sunshine, Australia
- Douglas Boyle
  - Department of General Practice and Primary Care, University of Melbourne, Melbourne, Australia; Centre for Research Excellence in Interactive Digital Technology to Transform Australia's Chronic Disease Outcomes, Melbourne, Australia
- Craig Nelson
  - Department of Medicine, Western Health, University of Melbourne, Sunshine, Australia; Department of Nephrology, Western Health, Sunshine, Australia
- Paul Amores
  - Centre for Health Policy, University of Melbourne, Melbourne, Australia; Methods and Implementation Support for Clinical Health Research Hub, University of Melbourne, Melbourne, Australia
- An Tran-Duy
  - Centre for Health Policy, University of Melbourne, Melbourne, Australia; Methods and Implementation Support for Clinical Health Research Hub, University of Melbourne, Melbourne, Australia
- Jo-Anne Manski-Nankervis
  - Department of General Practice and Primary Care, University of Melbourne, Melbourne, Australia; Primary Care and Family Medicine, LKC Medicine, Nanyang Technological University, Singapore
- Jon Emery
  - Department of General Practice and Primary Care, University of Melbourne, Melbourne, Australia, and Western Health, Sunshine, Australia
3
Schaye V, DiTullio DJ, Sartori DJ, Hauck K, Haller M, Reinstein I, Guzman B, Burk-Rafel J. Artificial intelligence based assessment of clinical reasoning documentation: an observational study of the impact of the clinical learning environment on resident documentation quality. BMC MEDICAL EDUCATION 2025; 25:591. [PMID: 40264096 PMCID: PMC12016287 DOI: 10.1186/s12909-025-07191-x] [Received: 05/15/2024] [Accepted: 04/17/2025] [Indexed: 04/24/2025]
Abstract
BACKGROUND Objective measures and large datasets are needed to determine aspects of the Clinical Learning Environment (CLE) impacting the essential skill of clinical reasoning documentation. Artificial Intelligence (AI) offers a solution. Here, the authors sought to determine what aspects of the CLE might be impacting resident clinical reasoning documentation quality assessed by AI. METHODS In this observational, retrospective cross-sectional analysis of hospital admission notes from the Electronic Health Record (EHR), all categorical internal medicine (IM) residents who wrote at least one admission note during the study period (July 1, 2018, to June 30, 2023) at two sites of NYU Grossman School of Medicine's IM residency program were included. Clinical reasoning documentation quality of admission notes was classified as low- or high-quality using a supervised machine learning model. From note-level data, the shift (day or night) and note index within shift (whether a note was the first, second, etc. within the shift) were calculated. These aspects of the CLE were included as potential markers of workload, which have been shown to have a strong relationship with resident performance. Patient data were also captured, including age, sex, Charlson Comorbidity Index, and primary diagnosis. The relationship between these variables and clinical reasoning documentation quality was analyzed using generalized estimating equations accounting for resident-level clustering. RESULTS Across 37,750 notes authored by 474 residents, patients who were older, had more pre-existing comorbidities, and presented with certain primary diagnoses (e.g., infectious and pulmonary conditions) were associated with higher clinical reasoning documentation quality.
When controlling for these and other patient factors, variables associated with clinical reasoning documentation quality included academic year (adjusted odds ratio, aOR, for high-quality: 1.10; 95% CI 1.06-1.15; P <.001), night shift (aOR 1.21; 95% CI 1.13-1.30; P <.001), and note index (aOR 0.93; 95% CI 0.90-0.95; P <.001). CONCLUSIONS AI can be used to assess complex skills such as clinical reasoning in authentic clinical notes that can help elucidate the potential impact of the CLE on resident clinical reasoning documentation quality. Future work should explore residency program and systems interventions to optimize the CLE.
Affiliation(s)
- Verity Schaye
  - Department of Medicine, New York University Grossman School of Medicine, New York, NY, USA
  - Institute for Innovations in Medical Education, New York University Grossman School of Medicine, New York, NY, USA
- David J DiTullio
  - Department of Medicine, New York University Grossman School of Medicine, New York, NY, USA
- Daniel J Sartori
  - Department of Medicine, New York University Grossman School of Medicine, New York, NY, USA
- Kevin Hauck
  - Department of Medicine, New York University Grossman School of Medicine, New York, NY, USA
- Matthew Haller
  - Department of Medicine, New York University Grossman School of Medicine, New York, NY, USA
- Ilan Reinstein
  - Institute for Innovations in Medical Education, New York University Grossman School of Medicine, New York, NY, USA
- Benedict Guzman
  - Division of Applied AI Technologies, New York University Langone Health, New York, NY, USA
- Jesse Burk-Rafel
  - Department of Medicine, New York University Grossman School of Medicine, New York, NY, USA
  - Institute for Innovations in Medical Education, New York University Grossman School of Medicine, New York, NY, USA
4
Mirata D, Tiezzi AC, Buffoni L, Pagnini I, Maccora I, Marrani E, Mastrolia MV, Simonini G, Giani T. Learning-Based Models for Predicting IVIG Resistance and Coronary Artery Lesions in Kawasaki Disease: A Review of Technical Aspects and Study Features. Paediatr Drugs 2025:10.1007/s40272-025-00693-7. [PMID: 40180759 DOI: 10.1007/s40272-025-00693-7] [Accepted: 03/04/2025] [Indexed: 04/05/2025]
Abstract
Kawasaki disease (KD) is a common pediatric vasculitis, with coronary artery lesions (CALs) representing its most severe complication. Early identification of high-risk patients, including those with disease resistant to first-line treatments, is essential to guide personalized therapeutic approaches. Given the limited reliability of current scoring systems, there has been growing interest in the development of new prognostic models based on machine learning algorithms and artificial intelligence (AI). AI has the potential to revolutionize the management of KD by improving patient stratification and supporting more targeted treatment strategies. This narrative review examines recent applications of AI in stratifying patients with KD, with a particular focus on the ability of models to predict intravenous immunoglobulin resistance and the risk of CALs. We analyzed studies published between January 2019 and April 2024 that incorporated AI-based predictive models. In total, 21 papers met the inclusion criteria and were subject to technical and statistical review; 90% of these were conducted in patients from Asian hospitals. Most of the studies (18/21; 85.7%) were retrospective, and two-thirds included fewer than 1000 patients. Significant heterogeneity in study design and parameter selection was observed across the studies. Resistance to intravenous immunoglobulin emerged as a key factor in AI-based models for predicting CALs. Only five models demonstrated a sensitivity > 80%, and four studies provided access to the underlying algorithms and datasets. Challenges such as small sample sizes, class imbalance, and the need for multicenter validation currently limit the clinical applicability of machine-learning-based predictive models. The effectiveness of AI models is heavily influenced by the quantity and quality of data, labeling accuracy, and the completeness of the training datasets. 
Additionally, issues such as noise and missing data can negatively affect model performance and generalizability. These limitations highlight the need for rigorous validation and open access to model code to ensure transparency and reproducibility. Collaboration and data sharing will be essential for refining AI algorithms, improving patient stratification, and optimizing treatment strategies.
Affiliation(s)
- Danilo Mirata
  - Pediatric Department, School of Sciences of Human Health, University of Florence, Florence, Italy
- Anna Chiara Tiezzi
  - Pediatric Department, School of Sciences of Human Health, University of Florence, Florence, Italy
- Lorenzo Buffoni
  - Department of Physics and Astronomy, School of Physical, Mathematical and Natural Sciences, University of Florence, Sesto Fiorentino, Italy
- Ilaria Pagnini
  - Rheumatology Unit, ERN ReCONNET Center, Meyer Children's Hospital IRCCS, Firenze, Italy
- Ilaria Maccora
  - Rheumatology Unit, ERN ReCONNET Center, Meyer Children's Hospital IRCCS, Firenze, Italy
- Edoardo Marrani
  - Rheumatology Unit, ERN ReCONNET Center, Meyer Children's Hospital IRCCS, Firenze, Italy
- Gabriele Simonini
  - Rheumatology Unit, ERN ReCONNET Center, Meyer Children's Hospital IRCCS, Firenze, Italy
- Teresa Giani
  - Rheumatology Unit, ERN ReCONNET Center, Meyer Children's Hospital IRCCS, Firenze, Italy
  - AOU Meyer IRCCS, Viale Pieraccini 24, 50139, Florence, Italy
5
Fink W, Kasper O, Kamenski G, Zehetmayer S, Kleinbichler D, Konitzer M. Frequency distribution of health disorders in primary care-its consistency and meaning for diagnostics and nomenclature. Wien Med Wochenschr 2025; 175:99-109. [PMID: 39037633 PMCID: PMC11928369 DOI: 10.1007/s10354-024-01049-5] [Received: 01/29/2024] [Accepted: 06/13/2024] [Indexed: 07/23/2024]
Abstract
RN Braun observed that frequencies of health disorders in general practice are so consistent that he called his discovery the "Case Distribution Law". Our study compares morbidity data from methodologically similar surveys in primary care practices over a period of fifty years. Frequency ranks were determined for each observation period and the first 150 ranks were compared with Spearman's correlation coefficients. All correlations were consistently positive. Frequency ranks were strikingly similar for surveys carried out at approximately the same time, especially when nomenclatural matching had been carried out before data collection. Ranks were also very similar where clear disease classifications were possible, but less so for non-specific symptoms. The consistency of the distribution of health disorders helps develop diagnostic strategies (diagnostic protocols) and appropriate labeling for non-specific, diagnostically open symptom classifications. According to Braun's considerations, the regularity of case distribution plays an important role in the professionalization of primary care.
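The rank comparison described in this abstract can be illustrated with a small sketch. It converts case counts to frequency ranks (rank 1 is the most frequent disorder, ties share the mean rank) and computes Spearman's coefficient as the Pearson correlation of the two rank vectors. The counts are hypothetical and the function names are illustrative; the study's own data and software are not given in the abstract.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) downward, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share the mean 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical case counts for the same five disorders in two survey periods.
period_a = [120, 95, 60, 40, 12]
period_b = [110, 70, 82, 35, 9]
rho = spearman(period_a, period_b)
print(f"Spearman's rho = {rho:.2f}")
```

Only the second and third disorders swap ranks between the two hypothetical periods, so the coefficient stays close to 1.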
Affiliation(s)
- Waltraud Fink
  - Karl Landsteiner Institute for Systematics in General Practice, Straning 153, 3722, Straning, Austria
- Otto Kasper
  - Karl Landsteiner Institute for Systematics in General Practice, Reinöd 26, 3242, Texing, Austria
- Gustav Kamenski
  - Karl Landsteiner Institute for Systematics in General Practice, Ollersbachgasse 144, 2261, Angern/March, Austria
- Sonja Zehetmayer
  - Institute of Medical Statistics-Center for Medical Data Science, Medical University of Vienna, Spitalgasse 23, 1090, Vienna, Austria
- Dietmar Kleinbichler
  - Karl Landsteiner Institute for Systematics in General Practice, Reiterhofgasse 1, 3385, Markersdorf, Austria
- Martin Konitzer
  - Academic Teaching Practice, Hannover Medical School MHH, Hannover, Germany
  - Karl Landsteiner Institute for Systematics in General Practice, Bahnhofstr. 5, 29690, Schwarmstedt, Germany
6
Flash M, Lynch EA, Lacson R, Guenette JP, Desai S, Kapoor N. Predictors of Physician Agreement With Radiologist-Recommended Follow-up Imaging. J Am Coll Radiol 2025; 22:407-416. [PMID: 39551329 DOI: 10.1016/j.jacr.2024.11.006] [Received: 09/04/2024] [Revised: 11/01/2024] [Accepted: 11/12/2024] [Indexed: 11/19/2024]
Abstract
OBJECTIVE Although recommendations for additional imaging are common in radiology reports, completion of follow-up imaging does not always occur, which could reflect disagreement between radiologist and referring provider. We assessed how frequently referring providers agree with radiologists' follow-up recommendations, reasons for disagreement, and factors associated with radiologist-referring provider agreement. METHODS This institutional review board-exempt, retrospective study was performed at a large academic center. A PACS-integrated tool allowed radiologists to send follow-up imaging recommendations to referring providers, who used the tool to document agreement or disagreement with recommendations. The study included recommendations sent for outpatients between October 21, 2019, and October 31, 2022. Multivariable logistic regression analysis was performed to identify patient, radiologist, and imaging examination factors associated with radiologist-referring provider agreement. RESULTS Of the 9,406 recommendations meeting inclusion criteria, 8,331 (88.6%) resulted in agreement. The most common reason for disagreement was that the recommendation was considered not clinically relevant (44.5%, 478 of 1,075). The following factors were associated with low rates of agreement: referring provider being a surgeon (odds ratio [OR] 0.73, P < .001) or recommendation for follow-up nuclear imaging (OR 0.64, P = .012). The odds of agreement were higher for recommendations made by thoracic radiologists (OR 1.41, P = .002) and for recommendations with longer follow-up time frames (weeks) (OR 1.03, P < .001). Patient race, ethnicity, insurance type, and living in a socio-economically disadvantaged neighborhood were not significantly associated with radiologist-referring provider agreement. 
DISCUSSION Referring providers frequently agree with follow-up imaging recommendations made by radiologists for outpatients, and patient demographics and socio-economic factors do not seem to significantly impact radiologist-referring provider agreement.
Affiliation(s)
- Moses Flash
  - Department of Radiology, Hospital of the University of Pennsylvania, University of Pennsylvania Health System, Philadelphia, Pennsylvania
- Elyse A Lynch
  - Center for Evidence-Based Imaging, Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Ronilda Lacson
  - Center for Evidence-Based Imaging, Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Jeffrey P Guenette
  - Center for Evidence-Based Imaging, Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Sonali Desai
  - Vice President, Quality, and Associate Chief Medical Officer, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Neena Kapoor
  - Associate Chair of Quality and Safety, Center for Evidence-Based Imaging, Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
7
Hao S, Tao G, Pearson WS, Rochlin I, Phillips RL, Rehkopf DH, Kamdar N. Treatment of Chlamydia and Gonorrhea in Primary Care and Its Patient-Level Variation: An American Family Cohort Study. Ann Fam Med 2025; 23:136-144. [PMID: 40127987 PMCID: PMC11936364 DOI: 10.1370/afm.240164] [Received: 04/04/2024] [Revised: 11/25/2024] [Accepted: 12/03/2024] [Indexed: 03/26/2025]
Abstract
PURPOSE Chlamydia and gonorrhea are the 2 most common bacterial sexually transmitted infections in the United States. Nonadherence to the Centers for Disease Control and Prevention treatment guidelines remains a concern. We examined how well chlamydia and gonorrhea treatment in primary care settings adhered to guidelines. METHODS We used electronic health records from the PRIME registry to identify patients with diagnosis codes or positive test results for chlamydia and/or gonorrhea from 2018 to 2022. Outcomes were the first dates on which an antibiotic was administered within 30 days after a positive test result for the infection. Descriptive statistics were calculated for patient sociodemographic characteristics. We used a multivariate parametric accelerated failure time analysis with shared frailty modeling to assess associations between these characteristics and time to treatment. RESULTS We identified 6,678 cases of chlamydia confirmed by a positive test and 2,206 cases of gonorrhea confirmed by a positive test; 75.3% and 69.6% of these cases, respectively, were treated. Females, individuals aged 10-29 years, suburban dwellers, and patients with chlamydia-gonorrhea coinfection had higher treatment rates than comparator groups. Chlamydia was infrequently treated with the recommended antibiotic, doxycycline (14.0% of cases), and gonorrhea was infrequently treated with the recommended antibiotic, ceftriaxone (38.7% of cases). Time to treatment of chlamydia was longer for patients aged 50-59 years (time ratio relative to those aged 20-29 years = 1.61; 95% CI, 1.12-2.30) and for non-Hispanic Black patients (time ratio relative to White patients = 1.17; 95% CI, 1.04-1.33). CONCLUSIONS Guideline adherence remains suboptimal for chlamydia and gonorrhea treatment across primary care practices. Efforts are needed to develop interventions to improve quality of care for these sexually transmitted infections.
Affiliation(s)
- Shiying Hao
  - Center for Population Health Sciences, School of Medicine, Stanford University, Stanford, California
- Guoyu Tao
  - Division of STD Prevention, Centers for Disease Control and Prevention, Atlanta, Georgia
- William S Pearson
  - Division of STD Prevention, Centers for Disease Control and Prevention, Atlanta, Georgia
- Ilia Rochlin
  - Inform and Disseminate Division, Office of Public Health Data, Surveillance, and Technology, Centers for Disease Control and Prevention, Atlanta, Georgia
- Robert L Phillips
  - The Center for Professionalism & Value in Health Care, ABFM Foundation, Washington, DC
- David H Rehkopf
  - Center for Population Health Sciences, School of Medicine, Stanford University, Stanford, California
  - Department of Epidemiology and Population Health, School of Medicine, Stanford University, Stanford, California
- Neil Kamdar
  - Center for Population Health Sciences, School of Medicine, Stanford University, Stanford, California
  - Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, Michigan
8
Schaye V, DiTullio D, Guzman BV, Vennemeyer S, Shih H, Reinstein I, Weber DE, Goodman A, Wu DTY, Sartori DJ, Santen SA, Gruppen L, Aphinyanaphongs Y, Burk-Rafel J. Large Language Model-Based Assessment of Clinical Reasoning Documentation in the Electronic Health Record Across Two Institutions: Development and Validation Study. J Med Internet Res 2025; 27:e67967. [PMID: 40117575 PMCID: PMC11971582 DOI: 10.2196/67967] [Received: 10/24/2024] [Revised: 01/31/2025] [Accepted: 02/25/2025] [Indexed: 03/23/2025]
Abstract
BACKGROUND Clinical reasoning (CR) is an essential skill; yet, physicians often receive limited feedback. Artificial intelligence holds promise to fill this gap. OBJECTIVE We report the development of named entity recognition (NER), logic-based and large language model (LLM)-based assessments of CR documentation in the electronic health record across 2 institutions (New York University Grossman School of Medicine [NYU] and University of Cincinnati College of Medicine [UC]). METHODS The note corpus consisted of internal medicine resident admission notes (retrospective set: July 2020-December 2021, n=700 NYU and 450 UC notes; prospective validation set: July 2023-December 2023, n=155 NYU and 92 UC notes). Clinicians rated CR documentation quality in each note using a previously validated tool (Revised-IDEA), on 3-point scales across 2 domains: differential diagnosis (D0, D1, and D2) and explanation of reasoning (EA0, EA1, and EA2). At NYU, the retrospective set was annotated for NER for 5 entities (diagnosis, diagnostic category, prioritization of diagnosis language, data, and linkage terms). Models were developed using different artificial intelligence approaches: (1) the NER, logic-based model, a large word vector model (scispaCy en_core_sci_lg) with model weights adjusted with backpropagation from annotations, developed at NYU with external validation at UC; (2) the NYUTron LLM, an NYU internal 110 million parameter LLM pretrained on 7.25 million clinical notes, only validated at NYU; and (3) the GatorTron LLM, an open source 345 million parameter LLM pretrained on 82 billion words of clinical text, fine-tuned on NYU retrospective sets, then externally validated and further fine-tuned at UC. Model performance was assessed in the prospective sets with F1-scores for the NER, logic-based model and area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC) for the LLMs.
RESULTS At NYU, the NYUTron LLM performed best: the D0 and D2 models had AUROC/AUPRC 0.87/0.79 and 0.89/0.86, respectively. The D1, EA0, and EA1 models had insufficient performance for implementation (AUROC range 0.57-0.80, AUPRC range 0.33-0.63). For the D1 classification, the approach pivoted to a stepwise approach taking advantage of the more performant D0 and D2 models. For the EA model, the approach pivoted to a binary EA2 model (ie, EA2 vs not EA2) with excellent performance, AUROC/AUPRC 0.85/0.80. At UC, the NER, D-logic-based model was the best performing D model (F1-scores 0.80, 0.74, and 0.80 for D0, D1, and D2, respectively). The GatorTron LLM performed best for EA2 scores, with AUROC/AUPRC 0.75/0.69. CONCLUSIONS This is the first multi-institutional study to apply LLMs for assessing CR documentation in the electronic health record. Such tools can enhance feedback on CR. Lessons learned by implementing these models at distinct institutions support the generalizability of this approach.
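For readers unfamiliar with the two metrics reported for the LLMs, the sketch below shows one common way to compute them for a binary classifier such as "EA2 vs not EA2". AUROC is computed as the probability that a randomly chosen positive note scores higher than a randomly chosen negative one, and AUPRC is approximated by average precision. The labels and scores are hypothetical, the function names are illustrative, and the abstract does not state which estimator or software the authors used.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, total = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / k  # precision at each rank where a positive is retrieved
    return total / hits


# Hypothetical data: 1 = note rated high quality by clinicians, score = model output.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
roc = auroc(labels, scores)
ap = average_precision(labels, scores)
print(f"AUROC = {roc:.2f}, AUPRC (average precision) = {ap:.2f}")
```

AUPRC depends on how common the positive class is, which is one reason it is often reported alongside AUROC for imbalanced rating categories.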
Affiliation(s)
- Verity Schaye
  - Department of Medicine, NYU Grossman School of Medicine, New York, NY, United States
  - Institute for Innovations in Medical Education, NYU Grossman School of Medicine, New York, NY, United States
- David DiTullio
  - Department of Medicine, NYU Grossman School of Medicine, New York, NY, United States
- Scott Vennemeyer
  - Department of Biostatistics, Health informatics, and Data Sciences, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Hanniel Shih
  - Department of Biostatistics, Health informatics, and Data Sciences, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Ilan Reinstein
  - Institute for Innovations in Medical Education, NYU Grossman School of Medicine, New York, NY, United States
- Danielle E Weber
  - Division of Hospital Medicine, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, United States
  - Division of Hospital Medicine, Department of Internal Medicine, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Abbie Goodman
  - Division of Hospital Medicine, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, United States
  - Division of Hospital Medicine, Department of Internal Medicine, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Danny T Y Wu
  - Department of Biostatistics, Health informatics, and Data Sciences, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Daniel J Sartori
  - Department of Medicine, NYU Grossman School of Medicine, New York, NY, United States
- Sally A Santen
  - Department of Emergency Medicine, University of Cincinnati College of Medicine, Cincinnati, OH, United States
- Larry Gruppen
  - Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, United States
- Jesse Burk-Rafel
  - Department of Medicine, NYU Grossman School of Medicine, New York, NY, United States
  - Institute for Innovations in Medical Education, NYU Grossman School of Medicine, New York, NY, United States
9
Breithaupt A, Mohan S, Thombley R, Pimentel SD, Douglas VC. Education Research: Exploring the Impact of Standardized, Condition-Specific Note Templates on Quality Metrics and Efficiency in Multiple Resident Clinics. NEUROLOGY. EDUCATION 2025; 4:e200200. [PMID: 40070448 PMCID: PMC11896599 DOI: 10.1212/ne9.0000000000200200] [Received: 08/23/2024] [Accepted: 01/13/2025] [Indexed: 03/14/2025]
Abstract
Background and Objectives Electronic health record documentation burden negatively affects physician satisfaction and patient care. Although well-constructed notes are important for care quality and safety, most note templates are created and maintained by individual physicians, leading to inefficiency and variable note quality. This study aimed to assess whether standardized, condition-specific note templates could enhance the efficiency and quality of notes written by neurology residents in the outpatient setting. Methods In a quality improvement study with a randomized, nonblinded design from July 2021 to June 2022, neurology residents were assigned standardized templates for epilepsy, headache, and Parkinson disease (PD) in 2 outpatient clinics. The standardized templates were created with input from specialists in these disorders. Efficiency was gauged based on the time and characters involved in note writing while quality was assessed by adherence to American Academy of Neurology quality metrics for each condition through chart review. A qualitative survey gathered resident opinions on the templates. Linear regression models were used in the efficiency and quality analyses. Results The study included 23 of 34 neurology residents. Templates were used in 36% of eligible encounters over the first 6 months of the study and 65% over the last 6 months. No significant difference in time spent on note writing was observed between the template and nontemplate groups. While both groups showed similar quality measures across most domains, the template group documented quality measures more consistently for driving status in epilepsy (92% vs 53%, p = 0.002), medication-related motor symptoms in PD (95% vs 50%, p = 0.01), and lifestyle changes in headache management (77% vs 21%, p = 0.005). Resident feedback suggested that the templates facilitated clinic workflows and prompted more thorough patient inquiry. 
Discussion Standardized, condition-specific templates improved documentation of quality metrics without increasing time spent. Despite initial low uptake of template use, an increase was observed over time, indicating potential for wider acceptance with implementation efforts. These templates, updated and maintained by subject matter experts, serve as an opportunity to incorporate quality care checklists and knowledge into a clinician's workflow. This warrants further research into template implementation and its effects on care quality and education for neurologists and generalists.
Affiliation(s)
- Andrew Breithaupt
- Department of Neurology, Emory University School of Medicine, Atlanta, GA
- Sonam Mohan
- Department of Neurology, Kaiser Permanente, San Jose, CA
- Robert Thombley
- Division of Clinical Informatics and Digital Transformation, Department of Medicine, University of California, San Francisco, CA
- Samuel D Pimentel
- Department of Statistics, University of California, Berkeley, CA
- Vanja C Douglas
- Department of Neurology, University of California, San Francisco, CA
10
Mahajan P, White E, Shaw K, Parker SJ, Chamberlain J, Ruddy RM, Alpern ER, Corboy J, Krack A, Ku B, Morrison Ponce D, Payne AS, Freiheit E, Horvath G, Kolenic G, Carney M, Klekowski N, O'Connell KJ, Singh H. Epidemiology of diagnostic errors in pediatric emergency departments using electronic triggers. Acad Emerg Med 2025; 32:226-245. [PMID: 39815759 PMCID: PMC11921087 DOI: 10.1111/acem.15087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2024] [Revised: 12/18/2024] [Accepted: 12/18/2024] [Indexed: 01/18/2025]
Abstract
OBJECTIVES We applied three electronic triggers to study frequency and contributory factors of missed opportunities for improving diagnosis (MOIDs) in pediatric emergency departments (EDs): return visits within 10 days resulting in admission (Trigger 1), care escalation within 24 h of ED presentation (Trigger 2), and death within 24 h of ED visit (Trigger 3). METHODS We created an electronic query and reporting template for the triggers and applied them to electronic health record systems of five pediatric EDs for visits from 2019. Clinician reviewers manually screened identified charts and initially categorized them as "unlikely for MOIDs" or "unable to rule out MOIDs" without a detailed chart review. For the latter category, reviewers performed a detailed chart review using the Revised Safer Dx Instrument to determine the presence of a MOID. RESULTS A total of 2937 ED records met trigger criteria (Trigger 1 1996 [68%], Trigger 2 829 [28%], Trigger 3 112 [4%]), of which 2786 (95%) were categorized as unlikely for MOIDs. The Revised Safer Dx Instrument was applied to 151 (5%) records and 76 (50%) had MOIDs. The overall frequency of MOIDs was 2.6% for the entire cohort, 3.0% for Trigger 1, 1.9% for Trigger 2, and 0% for Trigger 3. Brain lesions, infections, or hemorrhage; pneumonias and lung abscess; and appendicitis were the top three missed diagnoses. The majority (54%) of MOIDs cases resulted in patient harm. Contributory factors were related to patient-provider (52.6%), followed by patient factors (21.1%), system factors (13.2%), and provider factors (10.5%). CONCLUSIONS Using electronic triggers with selective record review is an effective process to screen for harmful diagnostic errors in EDs: detailed review of 5% of charts revealed MOIDs in half, of which half were harmful to the patient. With further refining, triggers can be used as effective patient safety tools to monitor diagnostic quality.
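The percentages quoted in this abstract can be re-derived from its own counts. The following is a minimal sketch for checking that arithmetic; the variable and function names are illustrative and not taken from the study.

```python
# Counts quoted in the abstract above (records meeting each trigger).
trigger_counts = {"Trigger 1": 1996, "Trigger 2": 829, "Trigger 3": 112}
total_records = sum(trigger_counts.values())

detailed_reviews = 151  # records reviewed with the Revised Safer Dx Instrument
moids = 76              # records found to contain a MOID


def pct(numerator, denominator, digits=1):
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)


# Share of trigger-positive records contributed by each trigger.
trigger_share = {name: pct(n, total_records, 0) for name, n in trigger_counts.items()}

# Overall MOID frequency and yield of the detailed review step.
overall_moid_rate = pct(moids, total_records)
review_fraction = pct(detailed_reviews, total_records, 0)
review_yield = pct(moids, detailed_reviews, 0)

print(total_records, trigger_share, overall_moid_rate, review_fraction, review_yield)
```

The per-trigger MOID rates (3.0%, 1.9%, 0%) cannot be checked this way because the abstract does not give MOID counts by trigger.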
Affiliation(s)
- Kathy Shaw
- Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Jacqueline Corboy
- Ann and Robert H. Lurie Children's Hospital of Chicago, Chicago, Illinois, USA
- Andrew Krack
- University of Cincinnati College of Medicine, Cincinnati, Ohio, USA
- Brandon Ku
- Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Hardeep Singh
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas, USA
11
Vaghani V, Gupta A, Mir U, Wei L, Murphy DR, Mushtaq U, Sittig DF, Zimolzak AJ, Singh H. Implementation of Electronic Triggers to Identify Diagnostic Errors in Emergency Departments. JAMA Intern Med 2025; 185:143-151. [PMID: 39621337 PMCID: PMC11612912 DOI: 10.1001/jamainternmed.2024.6214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Accepted: 09/30/2024] [Indexed: 12/06/2024]
Abstract
Importance Missed diagnosis can lead to preventable patient harm. Objective To develop and implement a portfolio of electronic triggers (e-triggers) and examine their performance for identifying missed opportunities in diagnosis (MODs) in emergency departments (EDs). Design, Setting, and Participants In this retrospective medical record review study of ED visits at 1321 Veterans Affairs health care sites, rules-based e-triggers were developed and implemented using a national electronic health record repository. These e-triggers targeted 6 high-risk presentations for MODs in treat-and-release ED visits. A high-risk stroke e-trigger was applied to treat-and-release ED visits from January 1, 2016, to December 31, 2020. A symptom-disease dyad e-trigger was applied to visits from January 1, 2018, to December 31, 2019. High-risk abdominal pain, unexpected ED return, unexpected hospital return, and test result e-triggers were applied to visits from January 1, 2019, to December 31, 2019. At least 100 randomly selected flagged records were reviewed by physician reviewers for each e-trigger. Data were analyzed between January 2024 and April 2024. Exposures Treat-and-release ED visits involving high-risk stroke, symptom-disease dyads, high-risk abdominal pain, unexpected ED return, unexpected hospital return, and abnormal test results not followed up after initial ED visit. Main Outcomes and Measures Trained physician reviewers evaluated the presence/absence of MODs at ED visits and recorded data on patient and clinician characteristics, types of diagnostic process breakdowns, and potential harm from MODs. 
Results The high-risk stroke e-trigger was applied to 8 792 672 treat-and-release ED visits (4 967 283 unique patients); the symptom-disease dyad e-trigger was applied to 3 692 454 visits (2 070 979 patients); and high-risk abdominal pain, unexpected ED return, unexpected hospital return, and test result e-triggers were applied to 1 845 905 visits (1 032 969 patients), overall identifying 203, 1981, 170, 116 785, 14 879, and 2090 trigger-positive records, respectively. Review of 625 randomly selected patient records (mean [SD] age, 62.5 [15.2] years; 553 [88.5%] male) showed the following MOD counts and positive predictive values (PPVs) within each category: 47 MODs (PPV, 47.0%) for stroke, 31 MODs (PPV, 25.8%) for abdominal pain, 11 MODs (PPV, 11.0%) for ED returns, 23 MODs (PPV, 23.0%) for hospital returns, 18 MODs (PPV, 18.0%) for symptom-disease dyads, and 55 MODs (PPV, 52.4%) for test results. Patients with MODs were slightly older than those without (mean [SD] age, 65.6 [14.5] vs 61.2 [15.3] years; P < .001). Reviewer agreement was favorable (range, 72%-100%). In 108 of 130 MODs (83.1%; excluding MODs related to the test result e-trigger), the most common diagnostic process breakdown involved the patient-clinician encounter. In 185 total MODs, 20 patients experienced severe harm (10.8%), and 54 patients experienced moderate harm (29.2%). Conclusions and Relevance In this retrospective medical record review study, rules-based e-triggers were useful for post hoc detection of MODs in ED visits. Interventions to target ED work system factors are urgently needed to support patient-clinician encounters and minimize harm from diagnostic errors.
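The positive predictive values in this abstract are the MOD count divided by the number of reviewed records for each e-trigger. The abstract does not state the per-trigger number of reviewed records, so the sketch below infers them from the reported counts and PPVs; those inferred denominators are an assumption, checked only against the stated total of 625 reviewed records. All names are illustrative.

```python
# (MOD count, reported PPV in %) per e-trigger, as quoted in the abstract.
reported = {
    "stroke": (47, 47.0),
    "abdominal pain": (31, 25.8),
    "ED returns": (11, 11.0),
    "hospital returns": (23, 23.0),
    "symptom-disease dyads": (18, 18.0),
    "test results": (55, 52.4),
}


def infer_reviewed(mods, ppv_percent):
    """Nearest integer number of reviewed records implied by count and PPV.

    This is an inference, not a figure reported in the abstract.
    """
    return round(mods / (ppv_percent / 100))


reviewed = {name: infer_reviewed(m, p) for name, (m, p) in reported.items()}
total_reviewed = sum(reviewed.values())
total_mods = sum(m for m, _ in reported.values())

# Recompute each PPV from the inferred denominator to confirm consistency.
recomputed_ppv = {
    name: round(100 * m / reviewed[name], 1) for name, (m, _) in reported.items()
}

# Harm and process-breakdown percentages quoted in the abstract.
mods_excluding_test_results = total_mods - reported["test results"][0]
encounter_breakdown_pct = round(100 * 108 / mods_excluding_test_results, 1)
severe_harm_pct = round(100 * 20 / total_mods, 1)
moderate_harm_pct = round(100 * 54 / total_mods, 1)

print(reviewed, total_reviewed, total_mods)
```

Under these inferred denominators, the totals agree with the 625 reviewed records and 185 MODs reported in the abstract.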
Affiliation(s)
- Viralkumar Vaghani
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas
- Ashish Gupta
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas
- Usman Mir
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas
- Li Wei
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas
- Daniel R. Murphy
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas
- Umair Mushtaq
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas
- Dean F. Sittig
- Department of Clinical and Health Informatics, McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston
- Andrew J. Zimolzak
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas
- Hardeep Singh
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas
12
Hautz WE, Marcin T, Hautz SC, Schauber SK, Krummrey G, Müller M, Sauter TC, Lambrigger C, Schwappach D, Nendaz M, Lindner G, Bosbach S, Griesshammer I, Schönberg P, Plüss E, Romann V, Ravioli S, Werthmüller N, Kölbener F, Exadaktylos AK, Singh H, Zwaan L. Diagnoses supported by a computerised diagnostic decision support system versus conventional diagnoses in emergency patients (DDX-BRO): a multicentre, multiple-period, double-blind, cluster-randomised, crossover superiority trial. Lancet Digit Health 2025; 7:e136-e144. [PMID: 39890244 DOI: 10.1016/s2589-7500(24)00250-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/02/2024] [Revised: 10/22/2024] [Accepted: 11/12/2024] [Indexed: 02/03/2025]
Abstract
BACKGROUND Diagnostic error is a frequent and clinically relevant health-care problem. Whether computerised diagnostic decision support systems (CDDSSs) improve diagnoses is controversial, and prospective randomised trials investigating their effectiveness in routine clinical practice are scarce. We hypothesised that diagnoses made with a CDDSS in the emergency department setting would be superior to unsupported diagnoses. METHODS This multicentre, multiple-period, double-blind, cluster-randomised, crossover superiority trial was done in four emergency departments in Switzerland. Eligible patients were adults (aged ≥18 years) presenting with abdominal pain, fever of unknown origin, syncope, or non-specific symptoms. Emergency departments were randomly assigned (1:1) to one of two predefined sequences of six alternating periods of intervention or control. Patients presenting during an intervention period were diagnosed with the aid of a CDDSS, whereas patients presenting during a control period were diagnosed without a CDDSS (usual care). Patients and personnel assessing outcomes were masked to group allocation; treating physicians were not. The primary binary outcome (false or true) was a composite score indicating a risk of reduced diagnostic quality, which was deemed to be present if any of the following occurred within 14 days: unscheduled medical care, a change in diagnosis, an unexpected intensive care unit admission within 24 h if initially admitted to hospital, or death. We assessed superiority of supported versus unsupported diagnoses in all consenting patients using a generalised linear mixed effects model. All participants who received any study treatment (including control) and completed the study were included in the safety analysis. This trial is registered with ClinicalTrials.gov (NCT05346523) and is closed to accrual. 
FINDINGS Between June 9, 2022, and June 23, 2023, 15 845 patients were screened and 1204 (591 [49·1%] female and 613 [50·9%] male) were included in the primary efficacy analysis. The median age of participants was 53 years (IQR 34-69). Diagnostic quality risk was observed in 100 (18%) of 559 patients with CDDSS-supported diagnoses and 119 (18%) of 645 with unsupported diagnoses (adjusted odds ratio 0·96 [95% CI 0·71-1·3]). 94 (7·8%) patients suffered a serious adverse event, none related to the study. INTERPRETATION Use of a CDDSS did not reduce the occurrence of diagnostic quality risk compared with the usual diagnostic process in adults presenting to emergency departments. Future research should aim to identify specific contexts in which CDDSSs are effective and how existing CDDSSs can be adapted to improve patient outcomes. FUNDING Swiss National Science Foundation and University Hospital Bern.
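As a rough plausibility check on the findings above, a crude (unadjusted) odds ratio can be computed from the two event counts. This sketch is not the trial's analysis: the reported 0·96 is an adjusted estimate from a generalised linear mixed effects model, so the crude value is only expected to be in the same region. Names are illustrative.

```python
# Patients with the composite "diagnostic quality risk" outcome, per arm,
# as quoted in the abstract.
supported_events, supported_n = 100, 559      # CDDSS-supported diagnoses
unsupported_events, unsupported_n = 119, 645  # unsupported diagnoses (usual care)


def odds(events, n):
    """Odds of the outcome: events divided by non-events."""
    return events / (n - events)


def proportion_pct(events, n):
    """Outcome proportion as a percentage, one decimal place."""
    return round(100 * events / n, 1)


# Crude odds ratio, ignoring clustering, period effects, and covariates.
crude_or = odds(supported_events, supported_n) / odds(unsupported_events, unsupported_n)

print(proportion_pct(supported_events, supported_n),
      proportion_pct(unsupported_events, unsupported_n),
      round(crude_or, 2))
```

Both arms round to the 18% quoted in the abstract, and the two arm sizes sum to the 1204 patients in the primary efficacy analysis.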
Affiliation(s)
- Wolf E Hautz
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Thimo Marcin
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Stefanie C Hautz
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Stefan K Schauber
- Center for Educational Measurement and Faculty of Medicine, University of Oslo, Oslo, Norway
- Gert Krummrey
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland; Bern University of Applied Sciences, Biel, Switzerland
- Martin Müller
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Thomas C Sauter
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Cornelia Lambrigger
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- David Schwappach
- Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
- Mathieu Nendaz
- Department of Medicine, University of Geneva, Geneva, Switzerland
- Gregor Lindner
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland; Department of Emergency Medicine, Kepler Universitätsklinikum Linz, Johannes Kepler University, Linz, Austria
- Ines Griesshammer
- Department of Internal and Emergency Medicine, Bürgerspital Solothurn, Solothurn, Switzerland
- Emanuel Plüss
- Department of Internal and Emergency Medicine, Bürgerspital Solothurn, Solothurn, Switzerland
- Svenja Ravioli
- Department of Emergency Medicine, Kepler Universitätsklinikum Linz, Johannes Kepler University, Linz, Austria; Department of Emergency Medicine, King's College Hospital NHS Foundation Trust, Denmark Hill, London, UK
- Nadine Werthmüller
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Fabian Kölbener
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Aristomenis K Exadaktylos
- Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Hardeep Singh
- Center for Innovations in Quality, Effectiveness and Safety (IQuESt), Michael E DeBakey VA Medical Center, Houston, TX, USA; Department of Medicine, Baylor College of Medicine, Houston, TX, USA
- Laura Zwaan
- Institute of Medical Education Research Rotterdam (iMERR), Erasmus Medical Center, Rotterdam, Netherlands
13
Singh R, Kim JY, Glassy EF, Dash RC, Brodsky V, Seheult J, de Baca ME, Gu Q, Hoekstra S, Pritt BS. Introduction to Generative Artificial Intelligence: Contextualizing the Future. Arch Pathol Lab Med 2025; 149:112-122. [PMID: 39631430 DOI: 10.5858/arpa.2024-0221-ra] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/05/2024] [Indexed: 12/07/2024]
Abstract
CONTEXT.— Generative artificial intelligence (GAI) is a promising new technology with the potential to transform communication and workflows in health care and pathology. Although new technologies offer advantages, they also come with risks that users, particularly early adopters, must recognize. Given the fast pace of GAI developments, pathologists may find it challenging to stay current with the terminology, technical underpinnings, and latest advancements. Building this knowledge base will enable pathologists to grasp the potential risks and impacts that GAI may have on the future practice of pathology. OBJECTIVE.— To present key elements of GAI development, evaluation, and implementation in a way that is accessible to pathologists and relevant to laboratory applications. DATA SOURCES.— Information was gathered from recent studies and reviews from PubMed and arXiv. CONCLUSIONS.— GAI offers many potential benefits for practicing pathologists. However, the use of GAI in clinical practice requires rigorous oversight and continuous refinement to fully realize its potential and mitigate inherent risks. The performance of GAI is highly dependent on the quality and diversity of the training and fine-tuning data, which can also propagate biases if not carefully managed. Ethical concerns, particularly regarding patient privacy and autonomy, must be addressed to ensure responsible use. By harnessing these emergent technologies, pathologists will be well placed to continue forward as leaders in diagnostic medicine.
Affiliation(s)
- Rajendra Singh
- Department of Pathology, Summit Health, Woodland Park, New Jersey
- Ji Yeon Kim
- Department of Pathology, Kaiser Permanente, Los Angeles, California
- Eric F Glassy
- Affiliated Pathologists Medical Group, Rancho Dominguez, California
- Rajesh C Dash
- Department of Pathology, Duke Health, Durham, North Carolina
- Victor Brodsky
- Department of Pathology and Immunology, Washington University, St Louis, Missouri
- Jansen Seheult
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota
- M E de Baca
- Sysmex America, Lincolnshire, Illinois
- Qiangqiang Gu
- Department of Neurology, Neurosurgery, and Critical Care, Mayo Clinic, Jacksonville, Florida
- Shannon Hoekstra
- Information Services, College of American Pathologists, Northfield, Illinois
- Bobbi S Pritt
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota
14
Weissman GE, Zwaan L, Bell SK. Diagnostic scope: the AI can't see what the mind doesn't know. Diagnosis (Berl) 2024:dx-2024-0151. [PMID: 39624993 DOI: 10.1515/dx-2024-0151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Accepted: 11/20/2024] [Indexed: 01/27/2025]
Abstract
BACKGROUND Diagnostic scope is the range of diagnoses found in a clinical setting. Although the diagnostic scope is an essential feature of training and evaluating artificial intelligence (AI) systems to promote diagnostic excellence, its impact on AI systems and the diagnostic process remains under-explored. CONTENT We define the concept of diagnostic scope, discuss its nuanced role in building safe and effective AI-based diagnostic decision support systems, review current challenges to measurement and use, and highlight knowledge gaps for future research. SUMMARY The diagnostic scope parallels the differential diagnosis although the latter is at the level of an encounter and the former is at the level of a clinical setting. Therefore, diagnostic scope will vary by local characteristics including geography, population, and resources. The true, observed, and considered scope in each setting may also diverge, both posing challenges for clinicians, patients, and AI developers, while also highlighting opportunities to improve safety. Further work is needed to systematically define and measure diagnostic scope in terms that are accurate, equitable, and meaningful at the bedside. AI tools tailored to a particular setting, such as a primary care clinic or intensive care unit, will each require specifying and measuring the appropriate diagnostic scope. OUTLOOK AI tools will promote diagnostic excellence if they are aligned with patient and clinician needs and trained on an accurately measured diagnostic scope. A careful understanding and rigorous evaluation of the diagnostic scope in each clinical setting will promote optimal care through human-AI collaborations in the diagnostic process.
Affiliation(s)
- Gary E Weissman
- Palliative and Advanced Illness Research (PAIR) Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Pulmonary, Allergy, and Critical Care Division, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Division of Informatics, Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, USA
- Laura Zwaan
- Institute of Medical Education Research, Erasmus Medical Center, Rotterdam, The Netherlands
- Sigall K Bell
- Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
15
Chima S, Hunter B, Martinez-Gutierrez J, Lumsden N, Nelson C, Manski-Nankervis JA, Emery J. Adoption, acceptance, and use of a decision support tool to promote timely investigations for cancer in primary care. Fam Pract 2024; 41:1048-1057. [PMID: 39425610 PMCID: PMC11642683 DOI: 10.1093/fampra/cmae046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2024] Open
Abstract
BACKGROUND The complexities of diagnosing cancer in general practice have driven the development of quality improvement (QI) interventions, including clinical decision support (CDS) and auditing tools. Future Health Today (FHT) is a novel QI tool, consisting of CDS at the point-of-care, practice population-level auditing, recall, and the monitoring of QI activities. OBJECTIVES Explore the acceptability and usability of the FHT cancer module, which flags patients with abnormal test results that may be indicative of undiagnosed cancer. METHODS Interviews were conducted with general practitioners (GPs) and general practice nurses (GPNs), from practices participating in a randomized trial evaluating the appropriate follow-up of patients. Clinical Performance Feedback Intervention Theory (CP-FIT) was used to analyse and interpret the data. RESULTS The majority of practices reported not using the auditing and QI components of the tool, only the CDS, which was delivered at the point-of-care. The tool was used primarily by GPs; GPNs did not perceive the clinical recommendations to be within their role. For the CDS, facilitators for use included a good workflow fit, ease of use, low time cost, importance, and perceived knowledge gain. Barriers for use of the CDS included accuracy, competing priorities, and the patient population. CONCLUSIONS The CDS aligned with the clinical workflow of GPs, was considered non-disruptive to the consultation and easy to implement into usual care. By applying the CP-FIT theory, we were able to demonstrate the key drivers for GPs using the tool, and what limited the use by GPNs.
Affiliation(s)
- Sophie Chima
- Department of General Practice and Primary Care, University of Melbourne, 780 Elizabeth St, Melbourne, 3010, Australia
- Centre for Cancer Research, University of Melbourne, 305 Grattan St, Melbourne, 3010, Australia
- Barbara Hunter
- Centre for Cancer Research, University of Melbourne, 305 Grattan St, Melbourne, 3010, Australia
- Javiera Martinez-Gutierrez
- Department of General Practice and Primary Care, University of Melbourne, 780 Elizabeth St, Melbourne, 3010, Australia
- Centre for Cancer Research, University of Melbourne, 305 Grattan St, Melbourne, 3010, Australia
- Department of Family Medicine, Pontificia Universidad Católica de Chile, Vicuña Mackenna 4686, Santiago, Chile
- Natalie Lumsden
- Centre for Cancer Research, University of Melbourne, 305 Grattan St, Melbourne, 3010, Australia
- Craig Nelson
- Department of Medicine, Western Health, University of Melbourne, 176 Furlong Road, Melbourne, 3021, Australia
- Jo-Anne Manski-Nankervis
- Department of General Practice and Primary Care, University of Melbourne, 780 Elizabeth St, Melbourne, 3010, Australia
- Department of Primary Care and Family Medicine, LKC Medicine, Nanyang Technological University, 11 Mandalay Road, Singapore, 308232, Singapore
- Jon Emery
- Department of General Practice and Primary Care, University of Melbourne, 780 Elizabeth St, Melbourne, 3010, Australia
- Centre for Cancer Research, University of Melbourne, 305 Grattan St, Melbourne, 3010, Australia
16
Tsilimingras D, Schnipper J, Zhang L, Levy P, Korzeniewski S, Paxton J. Adverse Events in Patients Transitioning From the Emergency Department to the Inpatient Setting. J Patient Saf 2024; 20:564-570. [PMID: 39324989 DOI: 10.1097/pts.0000000000001284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/27/2024]
Abstract
OBJECTIVES The objective of this study was to determine the incidence and types of adverse events (AEs), including preventable and ameliorable AEs, in patients transitioning from the emergency department (ED) to the inpatient setting. A second objective was to examine the risk factors for patients with AEs. METHODS This was a prospective cohort study of patients at risk for AEs in 2 urban academic hospitals from August 2020 to January 2022. Eighty-one eligible patients who were being admitted to any internal medicine or hospitalist service were recruited from the ED of these hospitals by a trained nurse. The nurse conducted a structured interview during admission and referred possible AEs for adjudication. Two blinded trained physicians using a previously established methodology adjudicated AEs. RESULTS Over 22% of 81 patients experienced AEs from the ED to the inpatient setting. The most common AEs were adverse drug events (42%), followed by management (38%), and diagnostic errors (21%). Of these AEs, 75% were considered preventable. Patients who stayed in the ED longer were more likely to experience an AE (adjusted odds ratio = 1.99, 95% confidence interval = 1.19-3.32, P = 0.01). CONCLUSIONS AEs were common for patients transitioning from the ED to the inpatient setting. Further research is needed to understand the underlying causes of AEs that occur when patients transition from the ED to the inpatient setting. Understanding the contribution of factors such as length of stay in the ED will significantly help efforts to develop targeted interventions to improve this crucial transition of care.
Affiliation(s)
- Dennis Tsilimingras
- Department of Family Medicine & Public Health Sciences, Wayne State University School of Medicine, Detroit, Michigan
- Jeffrey Schnipper
- Division of General Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Liying Zhang
- Department of Family Medicine & Public Health Sciences, Wayne State University School of Medicine, Detroit, Michigan
- Phillip Levy
- Department of Emergency Medicine, Wayne State University School of Medicine, Detroit, Michigan
- Steven Korzeniewski
- Department of Family Medicine & Public Health Sciences, Wayne State University School of Medicine, Detroit, Michigan
- James Paxton
- Department of Emergency Medicine, Wayne State University School of Medicine, Detroit, Michigan
17
Hill MA, Coppinger T, Sedig K, Gallagher WJ, Baker KM, Haskell H, Miller KE, Smith KM. "What Else Could It Be?" A Scoping Review of Questions for Patients to Ask Throughout the Diagnostic Process. J Patient Saf 2024; 20:529-534. [PMID: 39259002 PMCID: PMC11803640 DOI: 10.1097/pts.0000000000001273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2024]
Abstract
BACKGROUND Diagnostic errors are a global patient safety challenge. Over 75% of diagnostic errors in ambulatory care result from breakdowns in patient-clinician communication. Encouraging patients to speak up and ask questions has been recommended as one strategy to mitigate these failures. OBJECTIVES The goal of the scoping review was to identify, summarize, and thematically map questions patients are recommended to ask during ambulatory encounters along the diagnostic process. This is the first step in a larger study to co-design a patient-facing question prompt list for patients to use throughout the diagnostic process. METHODS Medline and Google Scholar were searched to identify question lists in the peer-reviewed literature. Organizational websites and a search engine were searched to identify question lists in the gray literature. Articles and resources were screened for eligibility and data were abstracted. Interrater reliability (K = 0.875) was achieved. RESULTS A total of 5509 questions from 235 resources met inclusion criteria. Most questions (n = 4243, 77.02%) were found in the gray literature. Question lists included an average of 23.44 questions. Questions were most commonly coded along the diagnostic process categories of treatment (2434 questions from 250 resources), communication of the diagnosis (1160 questions, 204 resources), and outcomes (766 questions, 172 resources). CONCLUSIONS Despite recommendations for patients to ask questions, most question prompt lists focus on later stages of the diagnostic process such as communication of the diagnosis, treatment, and outcomes. Further research is needed to identify and prioritize diagnostic-related questions from the patient perspective and to develop simple, usable guidance on question-asking to improve patient safety across the diagnostic continuum.
Affiliation(s)
- Mary A. Hill
  - University of Toronto, Institute of Health Policy, Management & Evaluation, Toronto, Canada
  - Michael Garron Hospital, Toronto East Health Network, Toronto, Canada
- Tess Coppinger
  - Michael Garron Hospital, Toronto East Health Network, Toronto, Canada
- Kimia Sedig
  - Michael Garron Hospital, Toronto East Health Network, Toronto, Canada
- Kelley M. Baker
  - National Center for Human Factors in Healthcare, MedStar Health, Washington, District of Columbia
- Helen Haskell
  - Mothers Against Medical Error, Columbia, South Carolina
- Kristen E. Miller
  - Georgetown University School of Medicine, Washington, District of Columbia
  - National Center for Human Factors in Healthcare, MedStar Health, Washington, District of Columbia
- Kelly M. Smith
  - University of Toronto, Institute of Health Policy, Management & Evaluation, Toronto, Canada
  - Michael Garron Hospital, Toronto East Health Network, Toronto, Canada
18
Rose C, Chen JH. Learning from the EHR to implement AI in healthcare. NPJ Digit Med 2024; 7:330. [PMID: 39567723 PMCID: PMC11579417 DOI: 10.1038/s41746-024-01340-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Accepted: 11/11/2024] [Indexed: 11/22/2024] Open
Affiliation(s)
- Christian Rose
  - Department of Emergency Medicine, Stanford University School of Medicine, Stanford, USA
- Jonathan H Chen
  - Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
19
Zhu J, Yang F, Wang Y, Wang Z, Xiao Y, Wang L, Sun L. Accuracy of Machine Learning in Discriminating Kawasaki Disease and Other Febrile Illnesses: Systematic Review and Meta-Analysis. J Med Internet Res 2024; 26:e57641. [PMID: 39556821 PMCID: PMC11612596 DOI: 10.2196/57641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 07/25/2024] [Accepted: 09/20/2024] [Indexed: 11/20/2024] Open
Abstract
BACKGROUND Kawasaki disease (KD) is an acute pediatric vasculitis that can lead to coronary artery aneurysms and severe cardiovascular complications, often presenting with obvious fever in the early stages. In current clinical practice, distinguishing KD from other febrile illnesses remains a significant challenge. In recent years, some researchers have explored the potential of machine learning (ML) methods for the differential diagnosis of KD versus other febrile illnesses, as well as for predicting coronary artery lesions (CALs) in people with KD. However, there is still a lack of systematic evidence to validate their effectiveness. Therefore, we have conducted the first systematic review and meta-analysis to evaluate the accuracy of ML in differentiating KD from other febrile illnesses and in predicting CALs in people with KD, so as to provide evidence-based support for the application of ML in the diagnosis and treatment of KD. OBJECTIVE This study aimed to summarize the accuracy of ML in differentiating KD from other febrile illnesses and predicting CALs in people with KD. METHODS PubMed, Cochrane Library, Embase, and Web of Science were systematically searched until September 26, 2023. The risk of bias in the included original studies was appraised using the Prediction Model Risk of Bias Assessment Tool (PROBAST). Stata (version 15.0; StataCorp) was used for the statistical analysis. RESULTS A total of 29 studies were incorporated. Of them, 20 used ML to differentiate KD from other febrile illnesses. These studies involved a total of 103,882 participants, including 12,541 people with KD. In the validation set, the pooled concordance index, sensitivity, and specificity were 0.898 (95% CI 0.874-0.922), 0.91 (95% CI 0.83-0.95), and 0.86 (95% CI 0.80-0.90), respectively. Meanwhile, 9 studies used ML for early prediction of the risk of CALs in children with KD. These studies involved a total of 6503 people with KD, of whom 986 had CALs. 
The pooled concordance index in the validation set was 0.787 (95% CI 0.738-0.835). CONCLUSIONS The diagnostic and predictive factors used in the studies we included were primarily derived from common clinical data. The ML models constructed based on these clinical data demonstrated promising effectiveness in differentiating KD from other febrile illnesses and in predicting coronary artery lesions. Therefore, in future research, we can explore the use of ML methods to identify more efficient predictors and develop tools that can be applied on a broader scale for the differentiation of KD and the prediction of CALs.
Affiliation(s)
- Jinpu Zhu
  - College of Chinese Medicine, Changchun University of Chinese Medicine, Changchun, China
- Fushuang Yang
  - Center of Children's Clinic, The Affiliated Hospital to Changchun University of Chinese Medicine, Changchun, China
- Yang Wang
  - Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
- Zhongtian Wang
  - College of Chinese Medicine, Changchun University of Chinese Medicine, Changchun, China
- Yao Xiao
  - College of Chinese Medicine, Changchun University of Chinese Medicine, Changchun, China
- Lie Wang
  - Center of Children's Clinic, The Affiliated Hospital to Changchun University of Chinese Medicine, Changchun, China
- Liping Sun
  - Center of Children's Clinic, The Affiliated Hospital to Changchun University of Chinese Medicine, Changchun, China
20
Cho J, Han JY, Cho A, Yoo S, Lee HY, Kim H. Enhancing Clinical History Taking Through the Implementation of a Streamlined Electronic Questionnaire System at a Pediatric Headache Clinic: Development and Evaluation Study. JMIR Med Inform 2024; 12:e54415. [PMID: 39622694 PMCID: PMC11611800 DOI: 10.2196/54415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 10/11/2024] [Accepted: 10/14/2024] [Indexed: 12/06/2024] Open
Abstract
Background Accurate history taking is essential for diagnosis, treatment, and patient care, yet miscommunications and time constraints often lead to incomplete information. Consequently, there has been a pressing need to establish a system whereby the questionnaire is duly completed before the medical appointment, entered into the electronic health record (EHR), and stored in a structured format within a database. Objective This study aimed to develop and evaluate a streamlined electronic questionnaire system, BEST-Survey (Bundang Hospital Electronic System for Total Care-Survey), integrated with the EHR, to enhance history taking and data management for patients with pediatric headaches. Methods An electronic questionnaire system was developed at Seoul National University Bundang Hospital, allowing patients to complete previsit questionnaires on a tablet PC. The information is automatically integrated into the EHR and stored in a structured database for further analysis. A retrospective analysis compared clinical information acquired from patients aged <18 years visiting the pediatric neurology outpatient clinic for headaches, before and after implementing the BEST-Survey system. The study included 365 patients before and 452 patients after system implementation. Answer rates and positive rates of key headache characteristics were compared between the 2 groups to evaluate the system's clinical utility. Results Implementation of the BEST-Survey system significantly increased the mean data acquisition rate from 54.6% to 99.3% (P<.001). Essential clinical features such as onset, location, duration, severity, nature, and frequency were obtained in over 98.7% (>446/452) of patients after implementation, compared with 53.7% (196/365) to 85.2% (311/365) before. The electronic system facilitated comprehensive data collection, enabling detailed analysis of headache characteristics in the patient population. Most patients (280/452, 61.9%) reported headache onset less than 1 year prior, with the temporal region being the most common pain location (261/703, 37.1%). Over half (232/452, 51.3%) experienced headaches lasting less than 2 hours, with nausea and vomiting as the most commonly associated symptoms (231/1036, 22.3%). Conclusions The BEST-Survey system markedly improved the completeness and accuracy of essential history items for patients with pediatric headaches. The system also streamlined data extraction and analysis for clinical and research purposes. While the electronic questionnaire cannot replace physician-led history taking, it serves as a valuable adjunctive tool to enhance patient care.
Affiliation(s)
- Jaeso Cho
  - Department of Pediatrics, Seoul National University Bundang Hospital, 166 Gumi-ro, Bundang-gu, Seongnam, 03080, Republic of Korea, 82 317877297
- Ji Yeon Han
  - Department of Pediatrics, Seoul National University Bundang Hospital, 166 Gumi-ro, Bundang-gu, Seongnam, 03080, Republic of Korea, 82 317877297
  - Department of Pediatrics, Inha University Hospital, Incheon, Republic of Korea
- Anna Cho
  - Department of Pediatrics, Seoul National University Bundang Hospital, 166 Gumi-ro, Bundang-gu, Seongnam, 03080, Republic of Korea, 82 317877297
- Sooyoung Yoo
  - Healthcare ICT Research Center, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
- Ho-Young Lee
  - Department of Nuclear Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Hunmin Kim
  - Department of Pediatrics, Seoul National University Bundang Hospital, 166 Gumi-ro, Bundang-gu, Seongnam, 03080, Republic of Korea, 82 317877297
  - Department of Pediatrics, Seoul National University College of Medicine, Seoul, Republic of Korea
21
Goh E, Gallo R, Hom J, Strong E, Weng Y, Kerman H, Cool JA, Kanjee Z, Parsons AS, Ahuja N, Horvitz E, Yang D, Milstein A, Olson APJ, Rodman A, Chen JH. Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial. JAMA Netw Open 2024; 7:e2440969. [PMID: 39466245 PMCID: PMC11519755 DOI: 10.1001/jamanetworkopen.2024.40969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Accepted: 08/02/2024] [Indexed: 10/29/2024] Open
Abstract
Importance Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves physician diagnostic reasoning. Objective To assess the effect of an LLM on physicians' diagnostic reasoning compared with conventional resources. Design, Setting, and Participants A single-blind randomized clinical trial was conducted from November 29 to December 29, 2023. Using remote video conferencing and in-person participation across multiple academic medical institutions, physicians with training in family medicine, internal medicine, or emergency medicine were recruited. Intervention Participants were randomized to either access the LLM in addition to conventional diagnostic resources or conventional resources only, stratified by career stage. Participants were allocated 60 minutes to review up to 6 clinical vignettes. Main Outcomes and Measures The primary outcome was performance on a standardized rubric of diagnostic performance based on differential diagnosis accuracy, appropriateness of supporting and opposing factors, and next diagnostic evaluation steps, validated and graded via blinded expert consensus. Secondary outcomes included time spent per case (in seconds) and final diagnosis accuracy. All analyses followed the intention-to-treat principle. A secondary exploratory analysis evaluated the standalone performance of the LLM by comparing the primary outcomes between the LLM alone group and the conventional resource group. Results Fifty physicians (26 attendings, 24 residents; median years in practice, 3 [IQR, 2-8]) participated virtually as well as at 1 in-person site. The median diagnostic reasoning score per case was 76% (IQR, 66%-87%) for the LLM group and 74% (IQR, 63%-84%) for the conventional resources-only group, with an adjusted difference of 2 percentage points (95% CI, -4 to 8 percentage points; P = .60). 
The median time spent per case for the LLM group was 519 (IQR, 371-668) seconds, compared with 565 (IQR, 456-788) seconds for the conventional resources group, with a time difference of -82 (95% CI, -195 to 31; P = .20) seconds. The LLM alone scored 16 percentage points (95% CI, 2-30 percentage points; P = .03) higher than the conventional resources group. Conclusions and Relevance In this trial, the availability of an LLM to physicians as a diagnostic aid did not significantly improve clinical reasoning compared with conventional resources. The LLM alone demonstrated higher performance than both physician groups, indicating the need for technology and workforce development to realize the potential of physician-artificial intelligence collaboration in clinical practice. Trial Registration ClinicalTrials.gov Identifier: NCT06157944.
Affiliation(s)
- Ethan Goh
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, California
- Robert Gallo
  - Center for Innovation to Implementation, VA Palo Alto Health Care System, Palo Alto, California
- Jason Hom
  - Department of Hospital Medicine, Stanford University School of Medicine, Stanford, California
- Eric Strong
  - Department of Hospital Medicine, Stanford University School of Medicine, Stanford, California
- Yingjie Weng
  - Quantitative Sciences Unit, Stanford University School of Medicine, Stanford, California
- Hannah Kerman
  - Department of Hospital Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
  - Department of Hospital Medicine, Harvard Medical School, Boston, Massachusetts
- Joséphine A. Cool
  - Department of Hospital Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
  - Department of Hospital Medicine, Harvard Medical School, Boston, Massachusetts
- Zahir Kanjee
  - Department of Hospital Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
  - Department of Hospital Medicine, Harvard Medical School, Boston, Massachusetts
- Andrew S. Parsons
  - Department of Hospital Medicine, School of Medicine, University of Virginia, Charlottesville
- Neera Ahuja
  - Department of Hospital Medicine, Stanford University School of Medicine, Stanford, California
- Eric Horvitz
  - Microsoft Corp, Redmond, Washington
  - Stanford Institute for Human-Centered Artificial Intelligence, Stanford, California
- Daniel Yang
  - Department of Hospital Medicine, Kaiser Permanente, Oakland, California
- Arnold Milstein
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, California
- Andrew P. J. Olson
  - Department of Hospital Medicine, University of Minnesota Medical School, Minneapolis
- Adam Rodman
  - Department of Hospital Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
  - Department of Hospital Medicine, Harvard Medical School, Boston, Massachusetts
- Jonathan H. Chen
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, California
  - Division of Hospital Medicine, Stanford University, Stanford, California
22
Ranji SR. Large Language Models-Misdiagnosing Diagnostic Excellence? JAMA Netw Open 2024; 7:e2440901. [PMID: 39466249 DOI: 10.1001/jamanetworkopen.2024.40901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2024] Open
Affiliation(s)
- Sumant R Ranji
  - Division of Hospital Medicine, Department of Medicine, San Francisco General Hospital, San Francisco, California
  - Division of Clinical Informatics and Digital Transformation, University of California, San Francisco
23
Rasooly IR, Marshall TL, Cifra CL, Catchpole K, Kuzma NC, Brady PW, Melton K, Khan A, Chien AT, Lipstein EA, Landrigan CP, Walsh KE. Developing methods to identify resilience and improve communication about diagnosis in pediatric primary care. Front Med (Lausanne) 2024; 11:1414892. [PMID: 39403279 PMCID: PMC11472325 DOI: 10.3389/fmed.2024.1414892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2024] [Accepted: 09/13/2024] [Indexed: 03/16/2025] Open
Abstract
Communication underlies every stage of the diagnostic process. The Dialog Study aims to characterize the pediatric diagnostic journey, focusing on communication as a source of resilience, in order to ultimately develop and test the efficacy of a structured patient-centered communication intervention in improving outpatient diagnostic safety. In this manuscript, we will describe protocols, data collection instruments, methods, analytic approaches, and theoretical frameworks to be used to characterize the patient journey in the Dialog Study. Our approach to characterization of the patient journey will attend to patient and structural factors, like race and racism, and language and language access, before developing interventions. Our mixed-methods approach is informed by the Systems Engineering Initiative for Patient Safety (SEIPS) 3.0 framework (which describes the sociotechnical system underpinning diagnoses within the broader context of multiple interactions with different care settings over time) and the Safety II framework (which seeks to understand successful and unsuccessful adaptations to ongoing changes in demand and capacity within the healthcare system). We will assess the validity of different methods to detect diagnostic errors along the diagnostic journey. In doing so, we will emphasize the importance of viewing the diagnostic process as the product of communications situated in systems-of-work that are constantly adapting to everyday challenges.
Affiliation(s)
- Irit R. Rasooly
  - Clinical Futures: A Center of Emphasis within the CHOP Research Institute, Children’s Hospital of Philadelphia, Philadelphia, PA, United States
  - Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, United States
- Trisha L. Marshall
  - Division of Hospital Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States
  - Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
- Christina L. Cifra
  - Division of Medical Critical Care, Department of Pediatrics, Boston Children’s Hospital, Boston, MA, United States
  - Department of Pediatrics, Harvard Medical School, Boston, MA, United States
- Ken Catchpole
  - Department of Anesthesia and Perioperative Medicine, Medical University of South Carolina, Charleston, SC, United States
- Nicholas C. Kuzma
  - Drexel University College of Medicine, Philadelphia, PA, United States
  - Department of Pediatrics, St. Christopher’s Hospital for Children, Philadelphia, PA, United States
- Patrick W. Brady
  - Division of Hospital Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States
  - Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
  - James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States
- Katherine Melton
  - Division of General Pediatrics, Department of Pediatrics, Boston Children's Hospital, Boston, MA, United States
- Alisa Khan
  - Department of Pediatrics, Harvard Medical School, Boston, MA, United States
  - Division of General Pediatrics, Department of Pediatrics, Boston Children's Hospital, Boston, MA, United States
- Alyna T. Chien
  - Department of Pediatrics, Harvard Medical School, Boston, MA, United States
  - Division of General Pediatrics, Department of Pediatrics, Boston Children's Hospital, Boston, MA, United States
- Ellen A. Lipstein
  - Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, OH, United States
  - James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States
- Christopher P. Landrigan
  - Division of General Pediatrics, Department of Pediatrics, Boston Children's Hospital, Boston, MA, United States
  - Division of Sleep and Circadian Disorders, Departments of Medicine and Neurology, Brigham and Women’s Hospital, Boston, MA, United States
  - Department of Pediatrics, Department of Medicine, and Division of Sleep Medicine, Harvard Medical School, Boston, MA, United States
- Kathleen E. Walsh
  - Department of Pediatrics, Harvard Medical School, Boston, MA, United States
  - Division of General Pediatrics, Department of Pediatrics, Boston Children's Hospital, Boston, MA, United States
24
Wittmann H, Prediger S, Harendza S. "Do you smoke?" - content and linguistic analysis of students' substance histories in simulated patient interviews. GMS JOURNAL FOR MEDICAL EDUCATION 2024; 41:Doc43. [PMID: 39415815 PMCID: PMC11474643 DOI: 10.3205/zma001698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 06/05/2024] [Accepted: 07/04/2024] [Indexed: 10/19/2024]
Abstract
Background The use of tobacco, alcohol and other drugs has considerable health consequences. Substance histories are often taken only incompletely in everyday clinical practice. When learning to take a medical history in medical school, one of the learning objectives is to inquire about consumption behavior. The aim of this exploratory study was therefore to examine the content and language of substance histories taken by medical students. Methods From a simulation training of a first working day in hospital, 91 video recordings of medical histories were available, which advanced medical students had conducted with six patients with different consumption behavior. These interviews were transcribed verbatim and analyzed using content-structuring qualitative content analysis according to Kuckartz. For all substances, the reasons for the questions and the depth of the respective substance use were categorized and errors in the questions were examined. In addition, a linguistic analysis of the verbal ways in which the substances were inquired about was carried out. Results The students most frequently asked about smoking (73.3%). In only 15.4% of the interviews were all substances asked about; in none were all substances asked about completely. A total of 112 protocol questions and 21 occasion-related questions were identified. Logical errors and double questions were found. Most of the questions were asked in a factual manner. However, questions in the categories "evasive" and "stigmatizing" were also found. Conclusion The content-related and linguistic deficits of medical students in the collection of substance histories identified in this study should be addressed in communication courses at an early stage of undergraduate medical studies.
Affiliation(s)
- Hilko Wittmann
  - University Medical Center Hamburg-Eppendorf, III. Medical Clinic, Hamburg, Germany
- Sarah Prediger
  - University Medical Center Hamburg-Eppendorf, III. Medical Clinic, Hamburg, Germany
- Sigrid Harendza
  - University Medical Center Hamburg-Eppendorf, III. Medical Clinic, Hamburg, Germany
25
Dahm MR, Chien LJ, Morris J, Lutze L, Scanlan S, Crock C. Addressing diagnostic uncertainty and excellence in emergency care-from multicountry policy analysis to communication practice in Australian emergency departments: a multimethod study protocol. BMJ Open 2024; 14:e085335. [PMID: 39277199 PMCID: PMC11404230 DOI: 10.1136/bmjopen-2024-085335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 08/30/2024] [Indexed: 09/17/2024] Open
Abstract
INTRODUCTION Communication failings may compromise the diagnostic process and pose a risk to quality of care and patient safety. With a focus on emergency care settings, this project aims to examine the critical role and impact of communication in the diagnostic process, including in diagnosis-related health and research policy, and diagnostic patient-clinician interactions in emergency departments (EDs). METHODS AND ANALYSIS This project uses a qualitatively driven multimethod design integrating findings from two research studies to gain a comprehensive understanding of the impact of context and communication on diagnostic excellence from diverse perspectives. Study 1 will map the diagnostic policy and practice landscape in Australia, New Zealand and the USA through qualitative expert interviews and policy analysis. Study 2 will investigate the communication of uncertainty in diagnostic interactions through a qualitative ethnography of two metropolitan Australian ED sites incorporating observations, field notes, video-recorded interactions, semistructured interviews and written medical documentation, including linguistic analysis of recorded diagnostic interactions and written documentation. This study will also feature a description of clinician, patient and carer perspectives on, and involvement in, interpersonal diagnostic interactions and will provide crucial new insights into the impact of communicating diagnostic uncertainty for these groups. Project-spanning patient and stakeholder involvement strategies will build research capacity among healthcare consumers via educational workshops, engage with community stakeholders in analysis and build consensus among stakeholders. ETHICS AND DISSEMINATION The project has received ethical approvals from the Human Research Ethics Committee at ACT Health, Northern Sydney Local Health District and the Australian National University. 
Findings will be disseminated to academic peers, clinicians and healthcare consumers, health policy-makers and the general public, using local and international academic and consumer channels (journals, evidence briefs and conferences) and outreach activities (workshops and seminars).
Affiliation(s)
- Maria R Dahm
  - Institute for Communication in Health Care, Australian National University, Canberra, Australian Capital Territory, Australia
- Laura J Chien
  - Institute for Communication in Health Care, Australian National University, Canberra, Australian Capital Territory, Australia
- Jen Morris
  - Institute for Communication in Health Care, Australian National University, Canberra, Australian Capital Territory, Australia
- Lucy Lutze
  - Hornsby and Ku-ring-gai Hospital, Hornsby, New South Wales, Australia
- Sam Scanlan
  - Canberra Health Services, Canberra, Australian Capital Territory, Australia
- Carmel Crock
  - The Royal Victorian Eye and Ear Hospital, East Melbourne, Victoria, Australia
26
Swinckels L, Bennis FC, Ziesemer KA, Scheerman JFM, Bijwaard H, de Keijzer A, Bruers JJ. The Use of Deep Learning and Machine Learning on Longitudinal Electronic Health Records for the Early Detection and Prevention of Diseases: Scoping Review. J Med Internet Res 2024; 26:e48320. [PMID: 39163096 PMCID: PMC11372333 DOI: 10.2196/48320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 09/29/2023] [Accepted: 04/29/2024] [Indexed: 08/21/2024] Open
Abstract
BACKGROUND Electronic health records (EHRs) contain patients' health information over time, including possible early indicators of disease. However, the increasing amount of data hinders clinicians from using them. There is accumulating evidence suggesting that machine learning (ML) and deep learning (DL) can assist clinicians in analyzing these large-scale EHRs, as algorithms thrive on high volumes of data. Although ML has become well developed, studies mainly focus on engineering and rarely report medical outcomes. OBJECTIVE This scoping review aimed to examine the evidence on how the use of ML on longitudinal EHRs can support the early detection and prevention of disease. The medical insights and clinical benefits that have been generated were investigated by reviewing applications in a variety of diseases. METHODS This study was conducted according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. A literature search was performed in 2022 in collaboration with a medical information specialist in the following databases: PubMed, Embase, Web of Science Core Collection (Clarivate Analytics), and IEEE Xplore Digital Library and computer science bibliography. Studies were eligible when longitudinal EHRs were used that aimed for the early detection of disease via ML in a prevention context. Studies with a technical focus or using imaging or hospital admission data were beyond the scope of this review. Study screening and selection and data extraction were performed independently by 2 researchers. RESULTS In total, 20 studies were included, mainly published between 2018 and 2022. They showed that a variety of diseases could be detected or predicted, particularly diabetes; kidney diseases; diseases of the circulatory system; and mental, behavioral, and neurodevelopmental disorders. Demographics, symptoms, procedures, laboratory test results, diagnoses, medications, and BMI were frequently used EHR data in basic recurrent neural network or long short-term memory techniques. By developing and comparing ML and DL models, medical insights such as a high diagnostic performance, an earlier detection, the most important predictors, and additional health indicators were obtained. A clinical benefit that has been evaluated positively was preliminary screening. If these models are applied in practice, patients might also benefit from personalized health care and prevention, with practical benefits such as workload reduction and policy insights. CONCLUSIONS Longitudinal EHRs proved to be helpful for support in health care. Current ML models on EHRs can support the detection of diseases in terms of accuracy and offer preliminary screening benefits. Regarding the prevention of diseases, ML and specifically DL models can accurately predict or detect diseases earlier than current clinical diagnoses. Adding personally responsible factors allows targeted prevention interventions. While ML models based on textual EHRs are still in the developmental stage, they have high potential to support clinicians and the health care system and improve patient outcomes.
Affiliation(s)
- Laura Swinckels
  - Department of Oral Public Health, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit, Amsterdam, Netherlands
  - Department Oral Hygiene, Cluster Health, Sports and Welfare, Inholland University of Applied Sciences, Amsterdam, Netherlands
  - Medical Technology Research Group, Cluster Health, Sport and Welfare, Inholland University of Applied Sciences, Haarlem, Netherlands
  - Data Driven Smart Society Research Group, Faculty of Engineering, Design & Computing, Inholland University of Applied Sciences, Alkmaar, Netherlands
- Frank C Bennis
  - Quantitative Data Analytics Group, Department of Computer Science, Vrije Universiteit, Amsterdam, Netherlands
  - Department of Pediatrics, Emma Neuroscience Group, Emma Children's Hospital, Amsterdam UMC, Amsterdam, Netherlands
  - Amsterdam Reproduction and Development Research Institute, Amsterdam, Netherlands
- Kirsten A Ziesemer
  - Medical Library, University Library, Vrije Universiteit, Amsterdam, Netherlands
- Janneke F M Scheerman
  - Department Oral Hygiene, Cluster Health, Sports and Welfare, Inholland University of Applied Sciences, Amsterdam, Netherlands
  - Medical Technology Research Group, Cluster Health, Sport and Welfare, Inholland University of Applied Sciences, Haarlem, Netherlands
- Harmen Bijwaard
  - Medical Technology Research Group, Cluster Health, Sport and Welfare, Inholland University of Applied Sciences, Haarlem, Netherlands
- Ander de Keijzer
  - Data Driven Smart Society Research Group, Faculty of Engineering, Design & Computing, Inholland University of Applied Sciences, Alkmaar, Netherlands
  - Applied Responsible Artificial Intelligence, Avans University of Applied Sciences, Breda, Netherlands
- Josef Jan Bruers
  - Department of Oral Public Health, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit, Amsterdam, Netherlands
  - Royal Dutch Dental Association (KNMT), Utrecht, Netherlands
27
|
Brasen CL, Andersen ES, Madsen JB, Hastrup J, Christensen H, Andersen DP, Lind PM, Mogensen N, Madsen PH, Christensen AF, Madsen JS, Ejlersen E, Brandslund I. Machine learning in diagnostic support in medical emergency departments. Sci Rep 2024; 14:17889. [PMID: 39095565 PMCID: PMC11297196 DOI: 10.1038/s41598-024-66837-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 07/04/2024] [Indexed: 08/04/2024] Open
Abstract
Diagnosing patients in the medical emergency department is complex, and this complexity is expected to increase in many countries due to an ageing population. In this study we investigate the feasibility of training machine learning algorithms to assist physicians handling the complex situation in medical emergency departments. This is expected to reduce diagnostic errors and improve patient logistics and outcomes. We included a total of 9,190 consecutive patient admissions diagnosed and treated in two hospitals in this cohort study. Patients had a biochemical workup, including blood and urine analyses on clinical decision, totaling 260 analyses. After adding nurse-registered data, we trained 19 machine learning algorithms on a random 80% sample of the patients and validated the results on the remaining 20%. We trained algorithms for 19 different patient outcomes, including the main outcomes of death within 7 days (area under the curve (AUC) 91.4%), death within 30 days (AUC 91.3%), and safe discharge (AUC 87.3%). The various algorithms obtained areas under the receiver operating characteristic curve in the range of 71.8-96.3% in the holdout cohort (68.3-98.2% in the training cohort). Performing this list of biochemical analyses at admission also reduced the number of subsequent venipunctures within 24 h of patient admittance by 22%. We have shown that it is possible to develop a list of machine learning algorithms with high AUC for use in medical emergency departments. Moreover, the study showed that it is possible to reduce the number of venipunctures in this cohort.
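The evaluation design in this abstract (a random 80% training sample, with AUC reported on the remaining 20%) can be illustrated with a minimal sketch. Nothing below comes from the study itself: the helper names `split_80_20` and `roc_auc`, the seed, and the toy labels and scores are all assumptions, and the AUC is computed with the standard pairwise-ranking definition rather than whatever software the authors used.

```python
import random


def split_80_20(n_rows, seed=0):
    """Shuffle row indices and return (train, holdout) with an 80/20 split.

    Hypothetical helper; the seed and the integer cut are illustrative choices.
    """
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    cut = n_rows * 4 // 5  # integer arithmetic avoids float rounding
    return idx[:cut], idx[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# 9,190 admissions as in the abstract; the split sizes follow from 80/20.
train_idx, holdout_idx = split_80_20(9190)

# Toy holdout predictions (invented), only to show the AUC calculation.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(len(train_idx), len(holdout_idx), toy_auc)
```

With 9,190 rows the split yields 7,352 training and 1,838 holdout admissions. The pairwise definition is quadratic in the number of cases, which is acceptable for a sketch but not for large cohorts.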
Affiliation(s)
- Claus Lohman Brasen
  - Department of Biochemistry and Immunology, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
  - Faculty of Health Sciences, Department of Regional Health Research, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
- Eline Sandvig Andersen
  - Department of Biochemistry and Immunology, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
  - Faculty of Health Sciences, Department of Regional Health Research, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
- Jeppe Buur Madsen
  - Department of Biochemistry and Immunology, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
- Jens Hastrup
  - Department of Biochemistry and Immunology, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
- Henry Christensen
  - Department of Biochemistry and Immunology, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
- Dorte Patuel Andersen
  - Department of Emergency, Kolding Hospital, Lillebaelt Hospital, University Hospital of Southern Denmark, Sygehusvej 24, 6000, Kolding, Denmark
- Pia Margrethe Lind
  - Department of Biochemistry and Immunology, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
- Nina Mogensen
  - Department of Biochemistry and Immunology, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
- Poul Henning Madsen
  - Department of Medicine, Kolding Hospital, Lillebaelt Hospital, University Hospital of Southern Denmark, Sygehusvej 24, 6000, Kolding, Denmark
  - Emergency, Acute Care and Trauma Centre, Odense University Hospital, J. B. Winsløws Vej 4, 5000, Odense, Denmark
- Anne Friesgaard Christensen
  - Department of Medicine, Kolding Hospital, Lillebaelt Hospital, University Hospital of Southern Denmark, Sygehusvej 24, 6000, Kolding, Denmark
- Jonna Skov Madsen
  - Department of Biochemistry and Immunology, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
  - Faculty of Health Sciences, Department of Regional Health Research, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
- Ejler Ejlersen
  - Department of Medicine, Vejle Hospital, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
- Ivan Brandslund
  - Department of Biochemistry and Immunology, Lillebaelt Hospital, University Hospital of Southern Denmark, Beriderbakken 4, 7100, Vejle, Denmark
  - Faculty of Health Sciences, Department of Regional Health Research, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark

28
Jay R, Davenport C, Patel R. Clinical reasoning-the essentials for teaching medical students, trainees and non-medical healthcare professionals. Br J Hosp Med (Lond) 2024; 85:1-8. [PMID: 39078902 DOI: 10.12968/hmed.2024.0052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2025]
Abstract
Clinical reasoning is fundamental for effective clinical practice. Traditional consultation models for teaching clinical reasoning or conventional approaches for teaching students how to make a diagnosis or management plan that rely on learning through observation only, are increasingly recognised as insufficient. There are also many challenges to supporting learners in developing clinical reasoning over time as well as across different clinical presentations and contexts. These challenges are compounded by the differences in how experts and novices make sense of clinical information, and the different cognitive processes each use when processing and communicating this information using precise medical language. Diagnostic errors may be due to cognitive biases but also, in a majority of cases, due to a lack of clinical knowledge. Therefore, effective educational strategies to develop clinical reasoning include identifying learners' knowledge gaps, using worked examples to prevent cognitive overload, promoting the use of key features and practising the construction of accurate problem representations. Deliberate reflection on diagnostic justification is also recommended, and overall, contributes to a growing number of evidence-based and theory-driven educational interventions for reducing diagnostic errors and improving patient care.
Affiliation(s)
- Robert Jay
  - Lincoln Medical School, University of Lincoln, Lincoln, UK
  - Faculty of Health, Social Care and Medicine, Edge Hill University, Ormskirk, UK
- Clare Davenport
  - Institute of Applied Health Research, University of Birmingham, Birmingham, UK
- Rakesh Patel
  - Barts and the London Faculty of Medicine and Dentistry, Queen Mary University London, London, UK

29
Kämmer JE, Hautz WE, Krummrey G, Sauter TC, Penders D, Birrenbach T, Bienefeld N. Effects of interacting with a large language model compared with a human coach on the clinical diagnostic process and outcomes among fourth-year medical students: study protocol for a prospective, randomised experiment using patient vignettes. BMJ Open 2024; 14:e087469. [PMID: 39025818 PMCID: PMC11261684 DOI: 10.1136/bmjopen-2024-087469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Accepted: 07/02/2024] [Indexed: 07/20/2024] Open
Abstract
INTRODUCTION Versatile large language models (LLMs) have the potential to augment diagnostic decision-making by assisting diagnosticians, thanks to their ability to engage in open-ended, natural conversations and their comprehensive knowledge access. Yet the novelty of LLMs in diagnostic decision-making introduces uncertainties regarding their impact. Clinicians unfamiliar with the use of LLMs in their professional context may rely on general attitudes towards LLMs more broadly, potentially hindering thoughtful use and critical evaluation of their input, leading to either over-reliance and lack of critical thinking or an unwillingness to use LLMs as diagnostic aids. To address these concerns, this study examines the influence on the diagnostic process and outcomes of interacting with an LLM compared with a human coach, and of prior training vs no training for interacting with either of these 'coaches'. Our findings aim to illuminate the potential benefits and risks of employing artificial intelligence (AI) in diagnostic decision-making. METHODS AND ANALYSIS We are conducting a prospective, randomised experiment with N=158 fourth-year medical students from Charité Medical School, Berlin, Germany. Participants are asked to diagnose patient vignettes after being assigned to either a human coach or ChatGPT and after either training or no training (both between-subject factors). We are specifically collecting data on the effects of using either of these 'coaches' and of additional training on information search, number of hypotheses entertained, diagnostic accuracy and confidence. Statistical methods will include linear mixed effects models. Exploratory analyses of the interaction patterns and attitudes towards AI will also generate more generalisable knowledge about the role of AI in medicine. ETHICS AND DISSEMINATION The Bern Cantonal Ethics Committee considered the study exempt from full ethical review (BASEC No: Req-2023-01396). 
All methods will be conducted in accordance with relevant guidelines and regulations. Participation is voluntary and informed consent will be obtained. Results will be published in peer-reviewed scientific medical journals. Authorship will be determined according to the International Committee of Medical Journal Editors guidelines.
Affiliation(s)
- Juliane E Kämmer
  - Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Wolf E Hautz
  - Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Gert Krummrey
  - Institute for Medical Informatics (I4MI), Bern University of Applied Sciences, Bern, Switzerland
- Thomas C Sauter
  - Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Dorothea Penders
  - Department of Anesthesiology and Operative Intensive Care Medicine CCM & CVK, Charité Universitätsmedizin Berlin, Berlin, Germany
  - Lernzentrum (Skills Lab), Charité Universitätsmedizin Berlin, Berlin, Germany
- Tanja Birrenbach
  - Department of Emergency Medicine, Inselspital University Hospital Bern, University of Bern, Bern, Switzerland
- Nadine Bienefeld
  - Department of Management, Technology, and Economics, ETH Zurich, Zurich, Switzerland

30
Fortin K, Wood JN, Udell SM, Christian CW. Emergency Department Triage Chief Complaints Among Children Evaluated for Physical Abuse Concerns. Pediatr Emerg Care 2024; 40:527-531. [PMID: 38713852 DOI: 10.1097/pec.0000000000003191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/09/2024]
Abstract
OBJECTIVES The aims of this study were to describe chief complaints provided at emergency department triage for young children ultimately diagnosed with injuries concerning for physical abuse and to compare chief complaints by hospital child protection team assessment (abuse most likely, accident most likely, undetermined) among children younger than 2 years who were the subject of a report to child protective services. METHODS This is a retrospective review of children evaluated by the child protection team at an urban children's hospital over a 5-year period. Children younger than 2 years who were the subject of a report to child protective services for suspected physical abuse were included. Chief complaints noted in emergency department triage notes were categorized as follows: 1, medical sign or symptom; 2, accidental trauma incident; 3, identified injury; 4, concern for abuse; or 5, multiple unrelated complaints. Child protection team assessments were categorized as follows: 1, abuse most likely; 2, accident most likely; or 3, undetermined. We used descriptive statistics and tests of association (χ², Fisher exact, Kruskal-Wallis). RESULTS Median age of the 422 children included was 4.9 months. Child protection team assessment was abuse most likely in 44%, accident most likely in 23%, and undetermined in 34%. Chief complaints in the overall sample were 39% medical, 29% trauma incident, 16% injury, 10% abuse concern, and 6% multiple unrelated. When the abuse most likely and accident most likely groups were compared, medical chief complaints were more common in the former (47% vs 19%, P < 0.001), whereas trauma incident chief complaints were more common in the latter (19% vs 64%, P < 0.001). The most common medical complaints in the abuse most likely group were altered mental status, abnormal limb use, swelling, pain, apnea, and vomiting.
CONCLUSION Many children found to have injuries concerning for abuse (47%) present without mention of trauma, injury, or abuse concern as part of the chief complaint. Our findings suggest important topics to include in training physicians about recognition of abuse.

31
Hägglund M, Kharko A, Bärkås A, Blease C, Cajander Å, DesRoches C, Fagerlund AJ, Hagström J, Huvila I, Hörhammer I, Kane B, Klein GO, Kristiansen E, Moll J, Muli I, Rexhepi H, Riggare S, Ross P, Scandurra I, Simola S, Soone H, Wang B, Ghorbanian Zolbin M, Åhlfeldt RM, Kujala S, Johansen MA. A Nordic Perspective on Patient Online Record Access and the European Health Data Space. J Med Internet Res 2024; 26:e49084. [PMID: 38935430 PMCID: PMC11240068 DOI: 10.2196/49084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 10/31/2023] [Accepted: 04/25/2024] [Indexed: 06/28/2024] Open
Abstract
The Nordic countries are, together with the United States, forerunners in online record access (ORA), which has now become widespread. The importance of accessible and structured health data has also been highlighted by policy makers internationally. To ensure the full realization of ORA's potential in the short and long term, there is a pressing need to study ORA from a cross-disciplinary, clinical, humanistic, and social sciences perspective that looks beyond strictly technical aspects. In this viewpoint paper, we explore the policy changes in the European Health Data Space (EHDS) proposal to advance ORA across the European Union, informed by our research in a Nordic-led project that carries out the first-of-its-kind, large-scale international investigation of patients' ORA: NORDeHEALTH (Nordic eHealth for Patients: Benchmarking and Developing for the Future). We argue that the EHDS proposal will pave the way for patients to access and control third-party access to their electronic health records. In our analysis of the proposal, we have identified five key principles for ORA: (1) the right to access, (2) proxy access, (3) patient input of their own data, (4) error and omission rectification, and (5) access control. ORA implementation today is fragmented throughout Europe, and the EHDS proposal aims to ensure all European citizens have equal online access to their health data. However, we argue that in order to implement the EHDS, we need more research evidence on the key ORA principles we have identified in our analysis. Results from the NORDeHEALTH project provide some of that evidence, but we have also identified important knowledge gaps that still need further exploration.
Affiliation(s)
- Maria Hägglund
  - Participatory eHealth and Health Data Research Group, Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden
  - Medtech Science & Innovation Centre, Uppsala University Hospital, Uppsala, Sweden
- Anna Kharko
  - Participatory eHealth and Health Data Research Group, Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden
  - School of Psychology, Faculty of Health, University of Plymouth, Plymouth, United Kingdom
- Annika Bärkås
  - Participatory eHealth and Health Data Research Group, Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden
- Charlotte Blease
  - Participatory eHealth and Health Data Research Group, Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden
  - Division of General Medicine, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, United States
- Åsa Cajander
  - Department of Information Technology, Uppsala University, Uppsala, Sweden
- Catherine DesRoches
  - Division of General Medicine, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, United States
- Josefin Hagström
  - Participatory eHealth and Health Data Research Group, Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden
- Isto Huvila
  - Department of ALM, Uppsala University, Uppsala, Sweden
- Iiris Hörhammer
  - Department of Computer Science, Aalto University, Espoo, Finland
- Bridget Kane
  - Participatory eHealth and Health Data Research Group, Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden
  - Business School, Karlstad University, Karlstad, Sweden
- Gunnar O Klein
  - Centre for Empirical Research on Information Systems, School of Business, Örebro University, Örebro, Sweden
- Eli Kristiansen
  - Norwegian Centre for E-Health Research, University Hospital of North Norway, Tromsø, Norway
- Jonas Moll
  - Centre for Empirical Research on Information Systems, School of Business, Örebro University, Örebro, Sweden
- Irene Muli
  - Participatory eHealth and Health Data Research Group, Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden
- Hanife Rexhepi
  - School of Informatics, University of Skövde, Skövde, Sweden
- Sara Riggare
  - Participatory eHealth and Health Data Research Group, Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden
- Peeter Ross
  - E-Medicine Centre, Department of Health Technologies, Tallinn University of Technology, Tallinn, Estonia
  - Research Department, East Tallinn Central Hospital, Tallinn, Estonia
- Isabella Scandurra
  - Centre for Empirical Research on Information Systems, School of Business, Örebro University, Örebro, Sweden
- Saija Simola
  - Department of Computer Science, Aalto University, Espoo, Finland
- Hedvig Soone
  - E-Medicine Centre, Department of Health Technologies, Tallinn University of Technology, Tallinn, Estonia
- Bo Wang
  - Norwegian Centre for E-Health Research, University Hospital of North Norway, Tromsø, Norway
- Sari Kujala
  - Department of Computer Science, Aalto University, Espoo, Finland
- Monika Alise Johansen
  - Norwegian Centre for E-Health Research, University Hospital of North Norway, Tromsø, Norway

32
Marshan A, Almutairi AN, Ioannou A, Bell D, Monaghan A, Arzoky M. MedT5SQL: a transformers-based large language model for text-to-SQL conversion in the healthcare domain. Front Big Data 2024; 7:1371680. [PMID: 38988646 PMCID: PMC11233734 DOI: 10.3389/fdata.2024.1371680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 06/10/2024] [Indexed: 07/12/2024] Open
Abstract
Introduction In response to the increasing prevalence of electronic medical records (EMRs) stored in databases, healthcare staff are encountering difficulties retrieving these records due to their limited technical expertise in database operations. As these records are crucial for delivering appropriate medical care, there is a need for an accessible method for healthcare staff to access EMRs. Methods To address this, natural language processing (NLP) for Text-to-SQL has emerged as a solution, enabling non-technical users to generate SQL queries using natural language text. This research assesses existing work on Text-to-SQL conversion and proposes the MedT5SQL model, specifically designed for EMR retrieval. The proposed model utilizes the Text-to-Text Transfer Transformer (T5) model, a Large Language Model (LLM) commonly used in various text-based NLP tasks. The model is fine-tuned on the MIMICSQL dataset, the first Text-to-SQL dataset for the healthcare domain. Performance evaluation involves benchmarking the MedT5SQL model on two optimizers, varying numbers of training epochs, and two datasets, MIMICSQL and WikiSQL. Results For the MIMICSQL dataset, the model demonstrates considerable effectiveness in generating question-SQL pairs, achieving accuracies of 80.63%, 98.937%, and 90% for the exact match accuracy metric, approximate string-matching, and manual evaluation, respectively. When tested on the WikiSQL dataset, the model demonstrates efficiency in generating SQL queries, with an accuracy of 44.2% and 94.26% for approximate string-matching. Discussion Results indicate improved performance with increased training epochs. This work highlights the potential of a fine-tuned T5 model to convert medical questions written in natural language to Structured Query Language (SQL) in the healthcare domain, providing a foundation for future research in this area.
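The gap between exact match accuracy and approximate string-matching reported in this abstract can be made concrete with a small sketch. The abstract does not say how either score was computed, so everything here is an assumption: the `normalize`, `exact_match`, and `similarity` helpers are hypothetical, the similarity measure is Python's `difflib.SequenceMatcher` ratio, and the table and column names in the toy queries are invented.

```python
from difflib import SequenceMatcher


def normalize(sql):
    """Lowercase and collapse whitespace so formatting differences do not
    count as errors. Simplification: this also lowercases string literals."""
    return " ".join(sql.lower().split())


def exact_match(pred, gold):
    """True only when the normalized queries are identical."""
    return normalize(pred) == normalize(gold)


def similarity(pred, gold):
    """Character-level similarity in [0, 1] from difflib (assumed metric)."""
    return SequenceMatcher(None, normalize(pred), normalize(gold)).ratio()


# Invented (predicted, reference) pairs; schema names are hypothetical.
pairs = [
    ("SELECT  name FROM patients WHERE age > 60",
     "select name from patients where age > 60"),
    ("SELECT name FROM patients WHERE age >= 60",
     "SELECT name FROM patients WHERE age > 60"),
]

exact_accuracy = sum(exact_match(p, g) for p, g in pairs) / len(pairs)
mean_similarity = sum(similarity(p, g) for p, g in pairs) / len(pairs)
print(exact_accuracy, round(mean_similarity, 3))
```

The second pair differs by a single operator, so it fails exact match while still scoring close to 1 on string similarity. That pattern is one plausible reason an approximate score can sit well above an exact match score, although a near-identical query may still return different rows.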
Affiliation(s)
- Alaa Marshan
  - School of Computer Science and Electronic Engineering, University of Surrey, Guildford, United Kingdom
- Athina Ioannou
  - Surrey Business School, University of Surrey, Guildford, United Kingdom
- David Bell
  - Department of Computer Science, Brunel University London, London, United Kingdom
- Asmat Monaghan
  - School of Business and Management, Royal Holloway, University of London, London, United Kingdom
- Mahir Arzoky
  - Department of Computer Science, Brunel University London, London, United Kingdom

33
Stankovic I, Zivanic A, Vranic I, Neskovic AN. Correlations and discrepancies between cardiac ultrasound, clinical diagnosis and the autopsy findings in early deceased patients with suspected cardiovascular emergencies. Int J Cardiovasc Imaging 2024; 40:1353-1361. [PMID: 38652394 DOI: 10.1007/s10554-024-03107-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 04/03/2024] [Indexed: 04/25/2024]
Abstract
Cardiac ultrasound (CUS), either focused cardiac ultrasound (FoCUS) or emergency echocardiography, is frequently used in cardiovascular (CV) emergencies. We assessed correlations and discrepancies between CUS, clinical diagnosis and the autopsy findings in early deceased patients with suspected CV emergencies. We retrospectively analysed clinical and autopsy data of 131 consecutive patients who died within 24 h of hospital admission. The type of CUS and its findings were analysed in relation to the clinical and autopsy diagnoses. CUS was performed in 58% of patients: FoCUS in 83%, emergency echocardiography in 12%, and both types of CUS in 5% of cases. CUS was performed more frequently in patients without a history of CV disease (64 vs. 40%, p = 0.08) and when the time between admission and death was longer (6 vs. 2 h, p = 0.021). In 7% of patients, CUS was inconclusive. In 10% of patients, the ante-mortem cause of death could not be determined, while discrepancies between the clinical and post-mortem diagnosis were found in 26% of cases. In the multivariate logistic regression model, only conclusive CUS [odds ratio (OR) 2.76, 95% confidence interval (CI) 1.30-7.39, p = 0.044] and chest pain at presentation (OR 30.19, 95% CI 5.65-161.22, p < 0.001) were independently associated with congruent clinical and autopsy diagnosis. In a tertiary university hospital, FoCUS was used more frequently than emergency echocardiography in critically ill patients with suspected cardiac emergencies. Chest pain at presentation and a conclusive CUS were associated with concordant clinical and autopsy diagnoses.
Affiliation(s)
- Ivan Stankovic
  - Department of Cardiology, Clinical Hospital Centre Zemun, Vukova 9, Belgrade, 11080, Serbia
  - Faculty of Medicine, University of Belgrade, Belgrade, Serbia
- Aleksandra Zivanic
  - Department of Cardiology, Clinical Hospital Centre Zemun, Vukova 9, Belgrade, 11080, Serbia
- Ivona Vranic
  - Department of Cardiology, Clinical Hospital Centre Zemun, Vukova 9, Belgrade, 11080, Serbia
- Aleksandar N Neskovic
  - Department of Cardiology, Clinical Hospital Centre Zemun, Vukova 9, Belgrade, 11080, Serbia
  - Faculty of Medicine, University of Belgrade, Belgrade, Serbia

34
Kotwal S, Howell M, Zwaan L, Wright SM. Exploring Clinical Lessons Learned by Experienced Hospitalists from Diagnostic Errors and Successes. J Gen Intern Med 2024; 39:1386-1392. [PMID: 38277023 PMCID: PMC11169201 DOI: 10.1007/s11606-024-08625-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 01/09/2024] [Indexed: 01/27/2024]
Abstract
BACKGROUND Diagnostic errors cause significant patient harm. The clinician's ultimate goal is to achieve diagnostic excellence in order to serve patients safely. This can be accomplished by learning from both errors and successes in patient care. However, the extent to which clinicians grow and navigate diagnostic errors and successes in patient care is poorly understood. Clinically experienced hospitalists, who have cared for numerous acutely ill patients, should have great insights from their successes and mistakes to inform others striving for excellence in patient care. OBJECTIVE To identify and characterize clinical lessons learned by experienced hospitalists from diagnostic errors and successes. DESIGN A semi-structured interview guide was used to collect qualitative data from hospitalists at five independently administered hospitals in the Mid-Atlantic area from February to June 2022. PARTICIPANTS 12 academic and 12 community-based hospitalists with ≥ 5 years of clinical experience. APPROACH A constructivist qualitative approach was used and "reflexive thematic analysis" of interview transcripts was conducted to identify themes and patterns of meaning across the dataset. RESULTS Five themes were generated from the data based on clinical lessons learned by hospitalists from diagnostic errors and successes. The ideas included appreciating excellence in clinical reasoning as a core skill, connecting with patients and other members of the health care team to be able to tap into their insights, reflecting on the diagnostic process, committing to growth, and prioritizing self-care. CONCLUSIONS The study identifies key lessons learned from the errors and successes encountered in patient care by clinically experienced hospitalists. These findings may prove helpful for individuals and groups that are authentically committed to moving along the continuum from diagnostic competence towards excellence.
Affiliation(s)
- Susrutha Kotwal
  - Department of Medicine, Division of Hospital Medicine, Johns Hopkins Bayview Medical Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Mason Howell
  - Department of Biosciences, Rice University, Houston, TX, USA
- Laura Zwaan
  - Erasmus Medical Center, Institute of Medical Education Research Rotterdam, Rotterdam, The Netherlands
- Scott M Wright
  - Department of Medicine, Division of Hospital Medicine, Johns Hopkins Bayview Medical Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Medicine, Division of General Internal Medicine, Johns Hopkins Bayview Medical Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA

35
Harada Y, Sakamoto T, Sugimoto S, Shimizu T. Longitudinal Changes in Diagnostic Accuracy of a Differential Diagnosis List Developed by an AI-Based Symptom Checker: Retrospective Observational Study. JMIR Form Res 2024; 8:e53985. [PMID: 38758588 PMCID: PMC11143391 DOI: 10.2196/53985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 03/23/2024] [Accepted: 04/24/2024] [Indexed: 05/18/2024] Open
Abstract
BACKGROUND Artificial intelligence (AI) symptom checker models should be trained using real-world patient data to improve their diagnostic accuracy. Given that AI-based symptom checkers are currently used in clinical practice, their performance should improve over time. However, longitudinal evaluations of the diagnostic accuracy of these symptom checkers are limited. OBJECTIVE This study aimed to assess the longitudinal changes in the accuracy of differential diagnosis lists created by an AI-based symptom checker used in the real world. METHODS This was a single-center, retrospective, observational study. Patients who visited an outpatient clinic without an appointment between May 1, 2019, and April 30, 2022, and who were admitted to a community hospital in Japan within 30 days of their index visit were considered eligible. We only included patients who underwent an AI-based symptom checkup at the index visit, and the diagnosis was finally confirmed during follow-up. Final diagnoses were categorized as common or uncommon, and all cases were categorized as typical or atypical. The primary outcome measure was the accuracy of the differential diagnosis list created by the AI-based symptom checker, defined as the final diagnosis in a list of 10 differential diagnoses created by the symptom checker. To assess the change in the symptom checker's diagnostic accuracy over 3 years, we used a chi-square test to compare the primary outcome over 3 periods: from May 1, 2019, to April 30, 2020 (first year); from May 1, 2020, to April 30, 2021 (second year); and from May 1, 2021, to April 30, 2022 (third year). RESULTS A total of 381 patients were included. Common diseases comprised 257 (67.5%) cases, and typical presentations were observed in 298 (78.2%) cases. 
Overall, the accuracy of the differential diagnosis list created by the AI-based symptom checker was 45.1% (172/381), which did not differ across the 3 years (first year: 97/219, 44.3%; second year: 32/72, 44.4%; and third year: 43/90, 47.7%; P=.85). The accuracy of the differential diagnosis list created by the symptom checker was low in those with uncommon diseases (30/124, 24.2%) and atypical presentations (12/83, 14.5%). In the multivariate logistic regression model, common disease (P<.001; odds ratio 4.13, 95% CI 2.50-6.98) and typical presentation (P<.001; odds ratio 6.92, 95% CI 3.62-14.2) were significantly associated with the accuracy of the differential diagnosis list created by the symptom checker. CONCLUSIONS A 3-year longitudinal survey of the diagnostic accuracy of differential diagnosis lists developed by an AI-based symptom checker, which has been implemented in real-world clinical practice settings, showed no improvement over time. Uncommon diseases and atypical presentations were independently associated with a lower diagnostic accuracy. In the future, symptom checkers should be trained to recognize uncommon conditions.
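The year-by-year comparison in this abstract can be retraced from the reported counts alone. The following is a minimal sketch, not the authors' analysis code: it assumes a Pearson chi-square test on a 2 x 3 table without continuity correction, and the function and variable names are illustrative.

```python
import math

# Correct-list counts and yearly totals as reported in the abstract above.
correct = [97, 32, 43]
totals = [219, 72, 90]


def chi_square_2xk(hits, ns):
    """Pearson chi-square statistic for a 2 x k table of hits vs. misses."""
    misses = [n - h for h, n in zip(hits, ns)]
    grand_total = sum(ns)
    stat = 0.0
    for row in (hits, misses):
        row_total = sum(row)
        for observed, n in zip(row, ns):
            # Expected count under the hypothesis of equal accuracy per year.
            expected = row_total * n / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


chi2 = chi_square_2xk(correct, totals)
dof = len(totals) - 1  # (2 - 1) * (3 - 1) = 2
# For exactly 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.2f}")
```

Under these assumptions the sketch gives a P value of about .85, in line with the figure quoted in the abstract. The closed-form tail only holds for two degrees of freedom, so a table with a different number of periods would need a general chi-square distribution function.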
Affiliation(s)
- Yukinori Harada
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Shimotsuga, Japan
  - Department of General Medicine, Nagano Chuo Hospital, Nagano, Japan
- Tetsu Sakamoto
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Shimotsuga, Japan
- Shu Sugimoto
  - Department of Medicine (Neurology and Rheumatology), Shinshu University School of Medicine, Matsumoto, Japan
- Taro Shimizu
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Shimotsuga, Japan
36
Harada Y, Otaka Y, Katsukura S, Shimizu T. Effect of contextual factors on the prevalence of diagnostic errors among patients managed by physicians of the same specialty: a single-centre retrospective observational study. BMJ Qual Saf 2024; 33:386-394. [PMID: 36690471 DOI: 10.1136/bmjqs-2022-015436] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/08/2022] [Accepted: 01/13/2023] [Indexed: 01/24/2023]
Abstract
BACKGROUND There has been growing recognition that contextual factors influence the physician's cognitive processes. However, given that cognitive processes may depend on the physicians' specialties, the effects of contextual factors on diagnostic errors reported in previous studies could be confounded by differences between physicians. OBJECTIVE This study aimed to clarify whether contextual factors such as location and consultation type affect diagnostic accuracy. METHODS We reviewed the medical records of 1992 consecutive outpatients seen by physicians from the Department of Diagnostic and Generalist Medicine in a university hospital between 1 January and 31 December 2019. Diagnostic processes were assessed using the Revised Safer Dx Instrument. Patients were categorised into three groups according to contextual factors (location and consultation type): (1) referred patients with scheduled visits to the outpatient department; (2) patients with urgent visits to the outpatient department; and (3) patients with emergency visits to the emergency room. The effect of the contextual factors on the prevalence of diagnostic errors was investigated using logistic regression analysis. RESULTS Diagnostic errors were observed in 12 of 534 referred patients with scheduled visits to the outpatient department (2.2%), 3 of 599 patients with urgent visits to the outpatient department (0.5%) and 13 of 859 patients with emergency visits to the emergency room (1.5%). Multivariable logistic regression analysis showed a significantly higher prevalence of diagnostic errors in referred patients with scheduled visits to the outpatient department than in patients with urgent visits to the outpatient department (OR 4.08, p=0.03), but no difference between patients with emergency visits to the emergency room and patients with urgent visits to the outpatient department.
CONCLUSION Contextual factors such as consultation type may affect diagnostic errors; however, since the differences in the prevalence of diagnostic errors were small, the effect of contextual factors on diagnostic accuracy may be small in physicians working in different care settings.
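As a rough cross-check on the scheduled-versus-urgent comparison, an unadjusted odds ratio can be computed directly from the counts given in the abstract. This is only a sketch with illustrative names: the reported OR of 4.08 comes from a multivariable model whose covariates are not listed here, so the crude value is expected to differ from it.

```python
# Diagnostic errors and group sizes as reported in the abstract above.
scheduled_errors, scheduled_total = 12, 534  # referred, scheduled outpatient visits
urgent_errors, urgent_total = 3, 599         # urgent outpatient visits


def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A relative to group B."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


crude_or = crude_odds_ratio(scheduled_errors, scheduled_total,
                            urgent_errors, urgent_total)
print(f"crude OR = {crude_or:.2f}")
```

The crude estimate comes out near 4.6, the same direction and order of magnitude as the adjusted figure. With only 3 events in the reference group, either estimate carries wide uncertainty.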
Affiliation(s)
- Yukinori Harada
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Mibu, Tochigi, Japan
- Yumi Otaka
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Mibu, Tochigi, Japan
- Shinichi Katsukura
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Mibu, Tochigi, Japan
- Taro Shimizu
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Mibu, Tochigi, Japan
37
Goh E, Gallo R, Hom J, Strong E, Weng Y, Kerman H, Cool J, Kanjee Z, Parsons AS, Ahuja N, Horvitz E, Yang D, Milstein A, Olson APJ, Rodman A, Chen JH. Influence of a Large Language Model on Diagnostic Reasoning: A Randomized Clinical Vignette Study. medRxiv 2024:2024.03.12.24303785. [PMID: 38559045 PMCID: PMC10980135 DOI: 10.1101/2024.03.12.24303785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Importance Diagnostic errors are common and cause significant morbidity. Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves diagnostic reasoning. Objective To assess the impact of the GPT-4 LLM on physicians' diagnostic reasoning compared to conventional resources. Design Multi-center, randomized clinical vignette study. Setting The study was conducted using remote video conferencing with physicians across the country and in-person participation across multiple academic medical institutions. Participants Resident and attending physicians with training in family medicine, internal medicine, or emergency medicine. Interventions Participants were randomized to access GPT-4 in addition to conventional diagnostic resources or to just conventional resources. They were allocated 60 minutes to review up to six clinical vignettes adapted from established diagnostic reasoning exams. Main Outcomes and Measures The primary outcome was diagnostic performance based on differential diagnosis accuracy, appropriateness of supporting and opposing factors, and next diagnostic evaluation steps. Secondary outcomes included time spent per case and final diagnosis. Results 50 physicians (26 attendings, 24 residents) participated, with an average of 5.2 cases completed per participant. The median diagnostic reasoning score per case was 76.3 percent (IQR 65.8 to 86.8) for the GPT-4 group and 73.7 percent (IQR 63.2 to 84.2) for the conventional resources group, with an adjusted difference of 1.6 percentage points (95% CI -4.4 to 7.6; p=0.60). The median time spent on cases for the GPT-4 group was 519 seconds (IQR 371 to 668 seconds), compared to 565 seconds (IQR 456 to 788 seconds) for the conventional resources group, with a time difference of -82 seconds (95% CI -195 to 31; p=0.20). 
GPT-4 alone scored 15.5 percentage points (95% CI 1.5 to 29, p=0.03) higher than the conventional resources group. Conclusions and Relevance In a clinical vignette-based study, the availability of GPT-4 to physicians as a diagnostic aid did not significantly improve clinical reasoning compared to conventional resources, although it may improve components of clinical reasoning such as efficiency. GPT-4 alone demonstrated higher performance than both physician groups, suggesting opportunities for further improvement in physician-AI collaboration in clinical practice.
Affiliation(s)
- Ethan Goh
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA
- Robert Gallo
  - Center for Innovation to Implementation, VA Palo Alto Health Care System, PA, CA
- Jason Hom
  - Stanford University School of Medicine, Stanford, CA
- Eric Strong
  - Stanford University School of Medicine, Stanford, CA
- Yingjie Weng
  - Quantitative Sciences Unit, Stanford University School of Medicine, Stanford, CA
- Hannah Kerman
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Josephine Cool
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Zahir Kanjee
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Neera Ahuja
  - Stanford University School of Medicine, Stanford, CA
- Eric Horvitz
  - Microsoft, Redmond, WA
  - Stanford HAI, Stanford, CA
- Arnold Milstein
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA
- Adam Rodman
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Jonathan H Chen
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA
  - Division of Hospital Medicine, Stanford University, Stanford, CA
38
Butler MJ, Chiuzan C, Ahn H, Gao M, D’Angelo S, Yeh J, Davidson K. Before and after COVID-19: Changes in symptoms and diagnoses in 13,033 adults. PLoS One 2024; 19:e0286371. [PMID: 38457409 PMCID: PMC10923490 DOI: 10.1371/journal.pone.0286371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 05/15/2023] [Indexed: 03/10/2024] Open
Abstract
BACKGROUND Most patients with COVID-19 report experiencing one or more symptoms after acute infection subsides, known as post-acute sequelae of SARS-CoV-2 infection (PASC). Though research has examined PASC after acute COVID-19, few studies have examined PASC over a longer follow-up duration or accounted for rates of symptoms and diagnoses before COVID-19 infection, and included those not actively seeking treatment for PASC. To determine what symptoms and diagnoses are occurring at higher rates after acute COVID-19 infection from a more inclusive sample, we extracted electronic hospital records (EHR) data from 13,033 adults with previously known diagnoses and symptoms. METHODS The sample was comprised of patients who had a positive PCR test for SARS-CoV-2 between March 1, 2020, and December 31, 2020, and follow-up was conducted through November 29, 2021. All patients in the sample had medical appointments ≥4 weeks before and ≥4 weeks after their positive PCR test. At these appointments, all ICD-10 codes recorded in the EHR were classified into 21 categories based on the literature and expert review. Conditional logistic regression models were used to quantify the odds of these symptoms and diagnostic categories following COVID-19 infection relative to visits occurring before infection. The sample was comprised of 28.0% adults over 65 and was 57.0% female. After the positive PCR test, the most recorded diagnoses and symptoms were dyspnea and respiratory failure, myositis, musculoskeletal pain/stiffness, anxiety, and depression. RESULTS Results from regression analyses showed increased odds of diagnosis for 15 of the 21 categories following positive PCR. Relative to pre-COVID, the diagnoses and symptoms with the greatest odds after a positive PCR test were loss of smell or taste [OR (95% CI) = 6.20 (3.18-12.09)], pulmonary fibrosis [3.50 (1.59-7.68)], and dyspnea/respiratory failure [2.14 (1.92-2.40)]. 
Stratification of these analyses by age, gender, race, and ethnicity showed similar results. CONCLUSION The increased symptoms and diagnoses detected in the current study match prior analyses of PASC diagnosis and treatment-seeking patients. The current research expands upon the literature by showing that these symptoms are more frequently detected following acute COVID-19 than before COVID-19. Further, our analyses provide a broad snapshot of the population as we were able to describe PASC among all patients who tested positive for COVID-19.
Affiliation(s)
- Mark J. Butler
  - Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, New York, NY, United States of America
- Codruta Chiuzan
  - Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, New York, NY, United States of America
- Heejoon Ahn
  - Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, New York, NY, United States of America
- Michael Gao
  - Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, New York, NY, United States of America
- Stefani D’Angelo
  - Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, New York, NY, United States of America
- Jackson Yeh
  - Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, New York, NY, United States of America
- Karina Davidson
  - Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, New York, NY, United States of America
  - Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Northwell Health, Hempstead, NY, United States of America
39
Damico Smith C, Nanda N, Bonnet K, Schlundt D, Anderson C, Fernandes-Taylor S, Gelbard A, Francis DO. Navigating Pathways to Diagnosis in Idiopathic Subglottic Stenosis: A Qualitative Study. Laryngoscope 2024; 134:815-824. [PMID: 37740907 DOI: 10.1002/lary.31023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 07/28/2023] [Accepted: 08/09/2023] [Indexed: 09/25/2023]
Abstract
OBJECTIVE Idiopathic subglottic stenosis (iSGS) is a rare disease, and time to diagnosis is often prolonged. In the United States, some estimate it takes an average of 9 years for patients with similar rare diseases to be diagnosed. Patient experience during this period is termed the diagnostic odyssey. The aim of this study is to use qualitative methods grounded in behavioral-ecological conceptual frameworks to identify drivers of diagnostic odyssey length that can help inform efforts to improve health care for iSGS patients. METHODS Qualitative study using semi-structured interviews. Participants were recruited from those enrolled in a large, prospective multicenter trial. We used directed content analysis to analyze the interviews with iSGS patients, focusing on their pathways to diagnosis. RESULTS Overall, 30 patients with iSGS underwent semi-structured interviews. The patient-reported median time to diagnosis was 21 months. On average, the participants visited four different health care providers. Specialists were most likely to make an appropriate referral to otolaryngology that ended in diagnosis. However, when primary care providers referred to otolaryngology, patients experienced a shorter diagnostic odyssey. The most important behavioral-ecological factors in accelerating diagnosis were strong social support for the patient and providers' willingness to refer. CONCLUSION Several factors affected time to diagnosis for iSGS patients. Patient social capital was a catalyst in decreasing time to diagnosis. Patient-reported medical paternalism and gatekeeping that limited specialty care referrals extended diagnostic odysseys. Additional research is needed to understand the effect of patient-provider and provider-provider relationships on time to diagnosis for patients with iSGS. LEVEL OF EVIDENCE 4 Laryngoscope, 134:815-824, 2024.
Affiliation(s)
- Cara Damico Smith
  - Department of Surgery, University of Wisconsin-Madison, Madison, Wisconsin, U.S.A
- Nainika Nanda
  - Division of Otolaryngology, University of Wisconsin-Madison, Madison, Wisconsin, U.S.A
- Kemberlee Bonnet
  - Department of Psychology, Vanderbilt University, Nashville, Tennessee, U.S.A
- David Schlundt
  - Department of Psychology, Vanderbilt University, Nashville, Tennessee, U.S.A
- Alexander Gelbard
  - Department of Otolaryngology-Head & Neck Surgery, Vanderbilt University
- David O Francis
  - Division of Otolaryngology, University of Wisconsin-Madison, Madison, Wisconsin, U.S.A
40
Dalal AK, Schnipper JL, Raffel K, Ranji S, Lee T, Auerbach A. Identifying and classifying diagnostic errors in acute care across hospitals: Early lessons from the Utility of Predictive Systems in Diagnostic Errors (UPSIDE) study. J Hosp Med 2024; 19:140-145. [PMID: 37211760 DOI: 10.1002/jhm.13136] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/17/2022] [Revised: 04/20/2023] [Accepted: 05/02/2023] [Indexed: 05/23/2023]
Affiliation(s)
- Anuj K Dalal
  - Hospital Medicine Unit, Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Jeffrey L Schnipper
  - Hospital Medicine Unit, Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Katie Raffel
  - Division of Hospital Medicine, University of Colorado Anschutz Medical Campus, Denver, Colorado, USA
- Sumant Ranji
  - Division of Hospital Medicine, University of California San Francisco, San Francisco, California, USA
- Andrew Auerbach
  - Division of Hospital Medicine, University of California San Francisco, San Francisco, California, USA
41
Marang-van de Mheen PJ, Thomas EJ, Graber ML. How safe is the diagnostic process in healthcare? BMJ Qual Saf 2024; 33:82-85. [PMID: 37793802 DOI: 10.1136/bmjqs-2023-016496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/12/2023] [Indexed: 10/06/2023]
Affiliation(s)
- Perla J Marang-van de Mheen
  - Safety & Security Science, Delft University of Technology, Faculty of Technology, Policy & Management, Delft, The Netherlands
  - Centre for Safety in Healthcare, Delft University of Technology, Delft, The Netherlands
- Eric J Thomas
  - Internal Medicine, University of Texas John P and Katherine G McGovern Medical School, Houston, Texas, USA
  - The UTHealth-Memorial Hermann Center for Healthcare Quality and Safety, UTHealth, Houston, Texas, USA
42
Newman-Toker DE, Nassery N, Schaffer AC, Yu-Moe CW, Clemens GD, Wang Z, Zhu Y, Saber Tehrani AS, Fanai M, Hassoon A, Siegal D. Burden of serious harms from diagnostic error in the USA. BMJ Qual Saf 2024; 33:109-120. [PMID: 37460118 PMCID: PMC10792094 DOI: 10.1136/bmjqs-2021-014130] [Citation(s) in RCA: 49] [Impact Index Per Article: 49.0] [Received: 08/18/2021] [Accepted: 06/24/2023] [Indexed: 08/10/2023]
Abstract
BACKGROUND Diagnostic errors cause substantial preventable harms worldwide, but rigorous estimates for total burden are lacking. We previously estimated diagnostic error and serious harm rates for key dangerous diseases in major disease categories and validated plausible ranges using clinical experts. OBJECTIVE We sought to estimate the annual US burden of serious misdiagnosis-related harms (permanent morbidity, mortality) by combining prior results with rigorous estimates of disease incidence. METHODS Cross-sectional analysis of US-based nationally representative observational data. We estimated annual incident vascular events and infections from 21.5 million (M) sampled US hospital discharges (2012-2014). Annual new cancers were taken from US-based registries (2014). Years were selected for coding consistency with prior literature. Disease-specific incidences for 15 major vascular events, infections and cancers ('Big Three' categories) were multiplied by literature-based rates to derive diagnostic errors and serious harms. We calculated uncertainty estimates using Monte Carlo simulations. Validity checks included sensitivity analyses and comparison with prior published estimates. RESULTS Annual US incidence was 6.0 M vascular events, 6.2 M infections and 1.5 M cancers. Per 'Big Three' dangerous disease case, weighted mean error and serious harm rates were 11.1% and 4.4%, respectively. Extrapolating to all diseases (including non-'Big Three' dangerous disease categories), we estimated total serious harms annually in the USA to be 795 000 (plausible range 598 000-1 023 000). Sensitivity analyses using more conservative assumptions estimated 549 000 serious harms. Results were compatible with setting-specific serious harm estimates from inpatient, emergency department and ambulatory care. The 15 dangerous diseases accounted for 50.7% of total serious harms and the top 5 (stroke, sepsis, pneumonia, venous thromboembolism and lung cancer) accounted for 38.7%. 
CONCLUSION An estimated 795 000 Americans become permanently disabled or die annually across care settings because dangerous diseases are misdiagnosed. Just 15 diseases account for about half of all serious harms, so the problem may be more tractable than previously imagined.
Affiliation(s)
- David E Newman-Toker
  - Department of Neurology, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
  - Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Najlla Nassery
  - Department of Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
- Adam C Schaffer
  - Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
  - Department of Patient Safety, The Risk Management Foundation of the Harvard Medical Institutions Inc, Boston, Massachusetts, USA
- Chihwen Winnie Yu-Moe
  - Department of Patient Safety, The Risk Management Foundation of the Harvard Medical Institutions Inc, Boston, Massachusetts, USA
- Gwendolyn D Clemens
  - Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Zheyu Wang
  - Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
  - Department of Oncology, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
- Yuxin Zhu
  - Department of Neurology, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
  - Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Ali S Saber Tehrani
  - Department of Neurology, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
- Mehdi Fanai
  - Department of Neurology, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
- Ahmed Hassoon
  - Department of Neurology, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
  - Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Dana Siegal
  - Candello, The Risk Management Foundation of the Harvard Medical Institutions Inc, Boston, Massachusetts, USA
  - Department of Risk Management & Analytics, Coverys, Boston, Massachusetts, USA
43
Dipaola F, Gatti M, Menè R, Shiffer D, Giaj Levra A, Solbiati M, Villa P, Costantino G, Furlan R. A Hybrid Model for 30-Day Syncope Prognosis Prediction in the Emergency Department. J Pers Med 2023; 14:4. [PMID: 38276219 PMCID: PMC10817569 DOI: 10.3390/jpm14010004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 12/06/2023] [Accepted: 12/11/2023] [Indexed: 01/27/2024] Open
Abstract
Syncope is a challenging problem in the emergency department (ED) as the available risk prediction tools have suboptimal predictive performances. Predictive models based on machine learning (ML) are promising tools whose application in the context of syncope remains underexplored. The aim of the present study was to develop and compare the performance of ML-based models in predicting the risk of clinically significant outcomes in patients presenting to the ED for syncope. We enrolled 266 consecutive patients (age 73, IQR 58-83; 52% males) admitted for syncope at three tertiary centers. We collected demographic and clinical information as well as the occurrence of clinically significant outcomes at a 30-day telephone follow-up. We implemented an XGBoost model based on the best-performing candidate predictors. Subsequently, we integrated the XGBoost predictors with knowledge-based rules. The obtained hybrid model outperformed the XGBoost model (AUC = 0.81 vs. 0.73, p < 0.001) with acceptable calibration. In conclusion, we developed an ML-based model characterized by a commendable capability to predict adverse events within 30 days post-syncope evaluation in the ED. This model relies solely on clinical data routinely collected during a patient's initial syncope evaluation, thus obviating the need for laboratory tests or syncope-experienced clinical judgment.
Affiliation(s)
- Franca Dipaola
  - Internal Medicine, Syncope Unit, IRCCS Humanitas Research Hospital, 20089 Milan, Italy
- Roberto Menè
  - Department of Medicine and Surgery, University of Milano-Bicocca, 20100 Milan, Italy
- Dana Shiffer
  - Emergency Department, IRCCS Humanitas Research Hospital, 20089 Milan, Italy
  - Department of Biomedical Sciences, Humanitas University, 20072 Milan, Italy
- Monica Solbiati
  - Emergency Department, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, Università Degli Studi Di Milano, 20100 Milan, Italy
- Paolo Villa
  - Emergency Medicine Unit, Luigi Sacco Hospital, ASST Fatebenefratelli Sacco, 20100 Milan, Italy
- Giorgio Costantino
  - Emergency Department, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, Università Degli Studi Di Milano, 20100 Milan, Italy
- Raffaello Furlan
  - Internal Medicine, Syncope Unit, IRCCS Humanitas Research Hospital, 20089 Milan, Italy
  - Department of Biomedical Sciences, Humanitas University, 20072 Milan, Italy
44
Frey J, Braun LT, Handgriff L, Kendziora B, Fischer MR, Reincke M, Zwaan L, Schmidmaier R. Insights into diagnostic errors in endocrinology: a prospective, case-based, international study. BMC Med Educ 2023; 23:934. [PMID: 38066602 PMCID: PMC10709946 DOI: 10.1186/s12909-023-04927-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 12/03/2023] [Indexed: 12/18/2023]
Abstract
BACKGROUND Diagnostic errors in internal medicine are common. While cognitive errors have previously been identified to be the most common contributor to errors, very little is known about errors in specific fields of internal medicine such as endocrinology. This prospective, multicenter study focused on better understanding the causes of diagnostic errors made by general practitioners and internal specialists in the area of endocrinology. METHODS From August 2019 until January 2020, 24 physicians completed five endocrine cases on an online platform that simulated the diagnostic process. After each case, the participants had to state and explain why they chose their assumed diagnosis. The data gathering process as well as the participants' explanations were quantitatively and qualitatively analyzed to determine the causes of the errors. The diagnostic processes in correctly and incorrectly solved cases were compared. RESULTS Seven different causes of diagnostic error were identified, the most frequent being misidentification (mistaking one diagnosis for a related one or for more frequent and similar diseases) in 23% of the cases. Other causes were faulty context generation (21%) and premature closure (17%). The diagnostic confidence did not differ between correctly and incorrectly solved cases (median 8 out of 10, p = 0.24). However, in incorrectly solved cases, physicians spent less time on the technical findings (such as lab results, imaging) (median 250 s in correctly solved cases versus 199 s in incorrectly solved cases, p < 0.049). CONCLUSIONS The causes for errors in endocrine case scenarios are similar to the causes in other fields of internal medicine. Spending more time on technical findings might prevent misdiagnoses in everyday clinical practice.
Affiliation(s)
- Jessica Frey
  - Medizinische Klinik und Poliklinik IV, University Hospital, Ludwig-Maximilians-University Munich, Ziemssenstr. 5, 80336, Munich, Germany
- Leah T Braun
  - Medizinische Klinik und Poliklinik IV, University Hospital, Ludwig-Maximilians-University Munich, Ziemssenstr. 5, 80336, Munich, Germany
- Laura Handgriff
  - Medizinische Klinik und Poliklinik IV, University Hospital, Ludwig-Maximilians-University Munich, Ziemssenstr. 5, 80336, Munich, Germany
- Benjamin Kendziora
  - Department of Dermatology and Allergology, University Hospital, LMU Munich, Munich, Germany
- Martin R Fischer
  - Institute of Medical Education, University Hospital, LMU Munich, Munich, Germany
- Martin Reincke
  - Medizinische Klinik und Poliklinik IV, University Hospital, Ludwig-Maximilians-University Munich, Ziemssenstr. 5, 80336, Munich, Germany
- Laura Zwaan
  - Erasmus MC iMERR (Institute of Medical Education Research Rotterdam), Rotterdam, Netherlands
- Ralf Schmidmaier
  - Medizinische Klinik und Poliklinik IV, University Hospital, Ludwig-Maximilians-University Munich, Ziemssenstr. 5, 80336, Munich, Germany
  - Institute of Medical Education, University Hospital, LMU Munich, Munich, Germany
45
Gips JR, Stein AA, Luckin J, Garibaldi BT. Internal medicine intern performance on the gastrointestinal physical exam. Diagnosis (Berl) 2023; 10:412-416. [PMID: 37475198 DOI: 10.1515/dx-2023-0051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Accepted: 06/27/2023] [Indexed: 07/22/2023]
Abstract
OBJECTIVES The gastrointestinal (GI) physical exam provides critical information about underlying disease states. However, since assessment of physical examination skills is rarely conducted as part of internal medicine residency training, little is known about resident performance on the GI physical exam. METHODS During a clinical skills assessment that took place between November 2019 and February 2020, internal medicine interns examined the same patient with chronic liver disease while being observed by faculty preceptors. We compared the exam maneuvers performed with those expected by the faculty evaluators. We noted which maneuvers were performed incorrectly, whether physical exam technique correlated with identification of physical exam findings, and if performance on the physical exam was associated with building an appropriate differential diagnosis. This four-hour assessment was required for internal medicine interns within two different residency programs in the Baltimore area. RESULTS More than half of the 29 participating interns (n=17, 58.6 %) received a "needs improvement" score on their physical exam technique. Technique was highly correlated with identifying the correct physical signs (r=0.88, p<0.0001). The most commonly excluded maneuvers were assessing for splenomegaly and hepatomegaly. The most commonly missed findings were splenomegaly and hepatomegaly. Most interns included chronic liver disease as part of their differential diagnosis even if they received "needs improvement" scores on physical exam technique or identifying physical signs. CONCLUSIONS Internal medicine interns would benefit from learning an organized approach to the gastrointestinal exam. This would likely lead to increased identification of important gastrointestinal findings.
|
46
|
Harada Y, Watari T, Nagano H, Suzuki T, Kunitomo K, Miyagami T, Aita T, Ishizuka K, Maebashi M, Harada T, Sakamoto T, Tomiyama S, Shimizu T. Diagnostic errors in uncommon conditions: a systematic review of case reports of diagnostic errors. Diagnosis (Berl) 2023; 10:329-336. [PMID: 37561056 DOI: 10.1515/dx-2023-0030] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/16/2023] [Accepted: 06/21/2023] [Indexed: 08/11/2023]
Abstract
OBJECTIVES To assess the usefulness of case reports as sources for research on diagnostic errors in uncommon diseases and atypical presentations. CONTENT We reviewed 563 case reports of diagnostic error. The commonality of the final diagnoses was classified based on the description in the articles, Orphanet, or epidemiological data in available references; the typicality of presentation was classified based on the description in the articles and the judgment of the physician researchers. Diagnosis Error Evaluation and Research (DEER), Reliable Diagnosis Challenges (RDC), and Generic Diagnostic Pitfalls (GDP) taxonomies were used to assess the factors contributing to diagnostic errors. SUMMARY AND OUTLOOK Excluding three cases for which commonality could not be classified, 560 cases were classified into four categories: typical presentations of common diseases (60, 10.7 %), atypical presentations of common diseases (35, 6.2 %), typical presentations of uncommon diseases (276, 49.3 %), and atypical presentations of uncommon diseases (189, 33.8 %). The most important DEER taxonomy was "Failure/delay in considering the diagnosis" in all four categories, whereas the most important RDC and GDP taxonomies varied by category. Case reports can be a useful data source for research on diagnostic errors in uncommon diseases with or without atypical presentations.
Affiliation(s)
- Yukinori Harada
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Shimotsuga-Gun, Japan
- Takashi Watari
  - General Medicine Center, Shimane University Hospital, Izumo, Japan
- Hiroyuki Nagano
  - Department of Healthcare Economics and Quality Management, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Kotaro Kunitomo
  - National Hospital Organisation Kumamoto Medical Center, Kumamoto, Japan
- Tetsuro Aita
  - Department of General Internal Medicine, Fukushima Medical University, Fukushima, Japan
- Kosuke Ishizuka
  - Department of General Medicine, Yokohama City University School of Medicine, Yokohama, Japan
- Taku Harada
  - Division of General Medicine, Nerima Hikarigaoka Hospital, Nerima-Ku, Tokyo
- Tetsu Sakamoto
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Shimotsuga-Gun, Japan
- Shusaku Tomiyama
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Shimotsuga-Gun, Japan
- Taro Shimizu
  - Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Shimotsuga-Gun, Japan
|
47
|
Jala S, Fry M, Elliott R. Cognitive bias during clinical decision-making and its influence on patient outcomes in the emergency department: A scoping review. J Clin Nurs 2023; 32:7076-7085. [PMID: 37605250 DOI: 10.1111/jocn.16845] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/14/2023] [Revised: 06/16/2023] [Accepted: 07/31/2023] [Indexed: 08/23/2023]
Abstract
BACKGROUND An integral part of clinical practice is decision-making. Yet there is widespread acceptance that there is evidence of cognitive bias within clinical practice among nurses and physicians. However, how cognitive bias in emergency nurses' and physicians' decision-making influences patient outcomes remains unclear. AIM The aim of this review was to systematically synthesise research exploring emergency nurses' and physicians' cognitive bias in decision-making and its influence on patient outcomes. METHODS This scoping review was guided by the PRISMA Extension for Scoping Reviews. The databases searched included CINAHL, MEDLINE, Web of Science and PubMed. No date limits were applied. The Patterns, Advances, Gaps, Evidence for practice and Research recommendation (PAGER) framework was used to guide the discussion. RESULTS The review included 18 articles, consisting of 10 primary studies (nine quantitative and one qualitative) and eight literature reviews. Of the 18 articles, nine investigated physicians, five examined nurses, and four examined both physicians and nurses, with sample sizes ranging from 13 to 3547. Six primary studies were cross-sectional, of which five used hypothetical scenarios and one used a real-world assessment. Three were experimental studies. Twenty-nine cognitive biases were identified, with implicit bias (n = 12) most frequently explored, followed by outcome bias (n = 4). Results were inconclusive regarding the influence of biases on treatment decisions and patient outcomes. Four key themes were identified: (i) cognitive biases among emergency clinicians; (ii) measurement of cognitive bias; (iii) influence of cognitive bias on clinical decision-making; and (iv) association between emergency clinicians' cognitive bias and patient outcomes.
CONCLUSIONS This review identified that cognitive biases were present among emergency nurses and physicians during clinical decision-making, but it remains unclear how cognitive bias influences patient outcomes. Further research examining emergency clinicians' cognitive bias is required. RELEVANCE TO CLINICAL PRACTICE Awareness among emergency clinicians of their own cognitive biases may result in the provision of equitable care. NO PATIENT OR PUBLIC CONTRIBUTION IN THIS REVIEW We intend to disseminate the results through publication in peer-reviewed journals and conference presentations.
Affiliation(s)
- Sheila Jala
  - Faculty of Health, School of Nursing and Midwifery, University of Technology Sydney, Sydney, New South Wales, Australia
  - Neurology Department, Royal North Shore Hospital, St Leonards, New South Wales, Australia
- Margaret Fry
  - Faculty of Health, School of Nursing and Midwifery, University of Technology Sydney, Sydney, New South Wales, Australia
- Rosalind Elliott
  - Faculty of Health, School of Nursing and Midwifery, University of Technology Sydney, Sydney, New South Wales, Australia
  - Nursing and Midwifery Research Centre, Nursing and Midwifery Directorate, Northern Sydney Local Health District, Royal North Shore Hospital, St Leonards, New South Wales, Australia
  - Department of Intensive Care Medicine, Royal North Shore Hospital, St Leonards, New South Wales, Australia
|
48
|
Gupta AB, Greene MT, Fowler KE, Chopra VI. Associations Between Hospitalist Shift Busyness, Diagnostic Confidence, and Resource Utilization: A Pilot Study. J Patient Saf 2023; 19:447-452. [PMID: 37729642 PMCID: PMC10516505 DOI: 10.1097/pts.0000000000001157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
OBJECTIVES Hospitalized patients are at risk for diagnostic errors. Hospitalists caring for these patients are often multitasking when overseeing patient care. We aimed to measure hospitalist workload and understand its influences on diagnostic performance in a real-world clinical setting. METHODS We conducted a single-center, prospective, pilot observational study of hospitalists admitting new patients to the hospital. Hospitalists completed an abridged Mindful Attention Awareness Tool and a survey about diagnostic confidence at shift completion. Data on differential diagnoses and resource utilization (e.g., laboratory and imaging tests ordered, and consultations) were collected from the medical record. The number of admissions and paging volume per shift were used as separate proxies for shift busyness. Data were analyzed using linear mixed effects models (continuous outcomes) or mixed effects logistic regression (dichotomous outcomes). RESULTS Of the 53 hospitalists approached, 47 (89%) agreed to participate; complete data were available for 37 unique hospitalists who admitted 160 unique patients. Increases in admissions (odds ratio, 1.99; 95% confidence interval [CI], 1.04 to 3.82; P = 0.04) and pages (odds ratio, 1.11; 95% CI, 1.02 to 1.21; P = 0.01) were associated with increased odds of hospitalists finding it "difficult to focus on what is happening in the present." An increase in pages was associated with a decrease in the number of listed differential diagnoses (coefficient, -0.02; 95% CI, -0.04 to -0.003; P = 0.02). CONCLUSIONS Evaluation of hospitalist busyness and its associations with factors that may influence diagnosis in a real-world environment was feasible and demonstrated important implications for physician focus and differential diagnosis.
|
49
|
McKoane A, Sherman DK. Diagnostic uncertainty in patients, parents, and physicians: a compensatory control theory perspective. Health Psychol Rev 2023; 17:439-455. [PMID: 35672909 DOI: 10.1080/17437199.2022.2086899] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/04/2020] [Accepted: 06/02/2022] [Indexed: 11/04/2022]
Abstract
Medical diagnoses offer a structure by which psychological uncertainty can be attenuated, allowing patients to diminish psychological threats and focus on health prognosis. Yet when no diagnosis can be made, patients may experience diagnostic uncertainty - perceiving the medical field as unable to provide an accurate explanation of the cause of their health problems. This review examines the psychological threat that diagnostic uncertainty imposes on individuals' need for control and understanding, and the resulting consequences experienced by patients, parents of pediatric patients, and physicians. Using compensatory control theory as a framework, we propose a taxonomy of behaviors that people may adopt in order to regain control in the face of diagnostic uncertainty and to reaffirm that the world is not random and chaotic. To manage diagnostic uncertainty, people may bolster their personal agency, affiliate with external systems they see as acting in their interest, affirm clear connections between behaviors and outcomes, and affirm nonspecific epistemic structure. Diagnostic uncertainty is approached from the perspectives of patients, parents of pediatric patients, and physicians, demonstrating how each group responds in order to maintain a sense that the world has structure and is not random. Discussion centers on moderators, limitations, and implications for clinical practice.
Affiliation(s)
- Ashley McKoane
  - Psychological & Brain Sciences, University of California, Santa Barbara, CA, USA
- David K Sherman
  - Psychological & Brain Sciences, University of California, Santa Barbara, CA, USA
|
50
|
Ladell MM, Shafer G, Ziniel SI, Grubenhoff JA. Comparative Perspectives on Diagnostic Error Discussions Between Inpatient and Outpatient Pediatric Providers. Am J Med Qual 2023; 38:245-254. [PMID: 37678302 PMCID: PMC10484186 DOI: 10.1097/jmq.0000000000000148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/09/2023]
Abstract
Diagnostic error remains understudied and underaddressed despite causing significant morbidity and mortality. One barrier to addressing this issue remains provider discomfort. Survey studies have shown significantly more discomfort among providers in discussing diagnostic error compared with other forms of error. Whether the comfort in discussing diagnostic error differs depending on practice setting has not been previously studied. The objective of this study was to assess differences in provider willingness to discuss diagnostic error in the inpatient versus outpatient setting. A multicenter survey was sent out to 3881 providers between May and June 2018. This survey was designed to assess comfort level in discussing diagnostic error and to examine barriers to discussing it. Forty-three percent versus 22% of inpatient versus outpatient providers (P = 0.004) were comfortable discussing short-term diagnostic error publicly. Similarly, 76% versus 60% of inpatient versus outpatient providers (P = 0.010) were comfortable discussing short-term diagnostic error privately. A higher percentage of inpatient (64%) compared with outpatient providers (46%) (P = 0.043) were comfortable discussing long-term diagnostic error privately. Forty percent versus 24% of inpatient versus outpatient providers (P = 0.018) were comfortable discussing long-term error publicly. There was no difference in the barriers cited by practice setting. Inpatient providers are more comfortable discussing diagnostic error than their outpatient counterparts. More study is needed to determine the etiology of this discrepancy and to develop strategies to increase outpatient provider comfort.
Affiliation(s)
- Meagan M. Ladell
  - Department of Pediatrics (Section of Emergency Medicine), Children’s Wisconsin and Medical College of Wisconsin, Milwaukee, WI
- Grant Shafer
  - Department of Pediatrics (Section of Neonatology), Children’s Hospital of Orange County and University of California Irvine, Orange, CA
- Sonja I. Ziniel
  - Department of Pediatrics, University of Colorado School of Medicine and Children’s Hospital Colorado, Aurora, CO
- Joseph A. Grubenhoff
  - Department of Pediatrics (Section of Emergency Medicine), University of Colorado School of Medicine and Children’s Hospital Colorado, Aurora, CO
|