Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sibbald M, Monteiro S, Sherbino J, LoGiudice A, Friedman C, Norman G. Should electronic differential diagnosis support be used early or late in the diagnostic process? A multicentre experimental study of Isabel. BMJ Qual Saf 2021;31:426-433. [PMID: 34611040 PMCID: PMC9132870 DOI: 10.1136/bmjqs-2021-013493] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 04/09/2021] [Accepted: 09/09/2021] [Indexed: 12/17/2022]

Cited by Other Article(s)

Rutledge GW. Diagnostic accuracy of GPT-4 on common clinical scenarios and challenging cases. Learn Health Syst 2024;8:e10438. [PMID: 39036534 PMCID: PMC11257049 DOI: 10.1002/lrh2.10438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 05/16/2024] [Accepted: 05/19/2024] [Indexed: 07/23/2024] Open

Schmidt HG, Norman GR, Mamede S, Magzoub M. The influence of context on diagnostic reasoning: A narrative synthesis of experimental findings. J Eval Clin Pract 2024. [PMID: 38818694 DOI: 10.1111/jep.14023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 05/03/2024] [Accepted: 05/13/2024] [Indexed: 06/01/2024]

Abstract

AIMS AND OBJECTIVES

Contextual information which is implicitly available to physicians during clinical encounters has been shown to influence diagnostic reasoning. To better understand the psychological mechanisms underlying the influence of context on diagnostic accuracy, we conducted a review of experimental research on this topic.

METHOD

We searched Web of Science, PubMed, and Scopus for relevant articles and looked for additional records by reading the references and approaching experts. We limited the review to true experiments involving physicians in which the outcome variable was the accuracy of the diagnosis.

RESULTS

The 43 studies reviewed examined two categories of contextual variables: (a) case-intrinsic contextual information and (b) case-extrinsic contextual information. Case-intrinsic information includes implicit misleading diagnostic suggestions in the disease history of the patient, or emotional volatility of the patient. Case-extrinsic or situational information includes a similar (but different) case seen previously, perceived case difficulty, or external digital diagnostic support. Time pressure and interruptions are other extrinsic influences that may affect the accuracy of a diagnosis but have produced conflicting findings.

CONCLUSION

We propose two tentative hypotheses explaining the role of context in diagnostic accuracy. According to the negative-affect hypothesis, diagnostic errors emerge when the physician's attention shifts from the relevant clinical findings to the (irrelevant) source of negative affect (for instance patient aggression) raised in a clinical encounter. The early-diagnosis-primacy hypothesis attributes errors to the extraordinary influence of the initial hypothesis that comes to the physician's mind on the subsequent collecting and interpretation of case information. Future research should test these mechanisms explicitly. Possible alternative mechanisms such as premature closure or increased production of (irrelevant) rival diagnoses in response to context deserve further scrutiny. Implications for medical education and practice are discussed.

Harada Y, Sakamoto T, Sugimoto S, Shimizu T. Longitudinal Changes in Diagnostic Accuracy of a Differential Diagnosis List Developed by an AI-Based Symptom Checker: Retrospective Observational Study. JMIR Form Res 2024;8:e53985. [PMID: 38758588 PMCID: PMC11143391 DOI: 10.2196/53985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 03/23/2024] [Accepted: 04/24/2024] [Indexed: 05/18/2024] Open

Abstract

BACKGROUND

Artificial intelligence (AI) symptom checker models should be trained using real-world patient data to improve their diagnostic accuracy. Given that AI-based symptom checkers are currently used in clinical practice, their performance should improve over time. However, longitudinal evaluations of the diagnostic accuracy of these symptom checkers are limited.

OBJECTIVE

This study aimed to assess the longitudinal changes in the accuracy of differential diagnosis lists created by an AI-based symptom checker used in the real world.

METHODS

This was a single-center, retrospective, observational study. Patients who visited an outpatient clinic without an appointment between May 1, 2019, and April 30, 2022, and who were admitted to a community hospital in Japan within 30 days of their index visit were considered eligible. We only included patients who underwent an AI-based symptom checkup at the index visit, and the diagnosis was finally confirmed during follow-up. Final diagnoses were categorized as common or uncommon, and all cases were categorized as typical or atypical. The primary outcome measure was the accuracy of the differential diagnosis list created by the AI-based symptom checker, defined as the final diagnosis in a list of 10 differential diagnoses created by the symptom checker. To assess the change in the symptom checker's diagnostic accuracy over 3 years, we used a chi-square test to compare the primary outcome over 3 periods: from May 1, 2019, to April 30, 2020 (first year); from May 1, 2020, to April 30, 2021 (second year); and from May 1, 2021, to April 30, 2022 (third year).

RESULTS

A total of 381 patients were included. Common diseases comprised 257 (67.5%) cases, and typical presentations were observed in 298 (78.2%) cases. Overall, the accuracy of the differential diagnosis list created by the AI-based symptom checker was 172 (45.1%), which did not differ across the 3 years (first year: 97/219, 44.3%; second year: 32/72, 44.4%; and third year: 43/90, 47.7%; P=.85). The accuracy of the differential diagnosis list created by the symptom checker was low in those with uncommon diseases (30/124, 24.2%) and atypical presentations (12/83, 14.5%). In the multivariate logistic regression model, common disease (P<.001; odds ratio 4.13, 95% CI 2.50-6.98) and typical presentation (P<.001; odds ratio 6.92, 95% CI 3.62-14.2) were significantly associated with the accuracy of the differential diagnosis list created by the symptom checker.
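The year-by-year comparison above can be re-derived from the reported counts. The following is a minimal sketch, assuming a plain Pearson chi-square test of homogeneity on the 3 × 2 table of listed versus not-listed final diagnoses; the function name `chi_square_2xk` is hypothetical, and the closed-form p-value holds only for 2 degrees of freedom (three groups).

```python
import math

def chi_square_2xk(successes, totals):
    """Pearson chi-square test of homogeneity for k groups with a binary outcome.

    Returns (statistic, p_value). The p-value uses the closed-form survival
    function exp(-x / 2), which is valid only for 2 degrees of freedom.
    """
    k = len(totals)
    if k - 1 != 2:
        raise ValueError("closed-form p-value is valid only for 3 groups (df = 2)")
    pooled = sum(successes) / sum(totals)  # overall proportion of correct lists
    stat = 0.0
    for s, n in zip(successes, totals):
        expected_hit = n * pooled
        expected_miss = n * (1 - pooled)
        stat += (s - expected_hit) ** 2 / expected_hit
        stat += ((n - s) - expected_miss) ** 2 / expected_miss
    return stat, math.exp(-stat / 2)

# Counts reported in the abstract: 97/219, 32/72, and 43/90 correct lists.
stat, p = chi_square_2xk([97, 32, 43], [219, 72, 90])
print(f"chi-square = {stat:.3f}, P = {p:.2f}")
```

Under these assumptions the P value should land close to the reported P=.85; the abstract does not say whether a continuity correction or a different test variant was applied.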

CONCLUSIONS

A 3-year longitudinal survey of the diagnostic accuracy of differential diagnosis lists developed by an AI-based symptom checker, which has been implemented in real-world clinical practice settings, showed no improvement over time. Uncommon diseases and atypical presentations were independently associated with a lower diagnostic accuracy. In the future, symptom checkers should be trained to recognize uncommon conditions.

Bridges JM. Computerized diagnostic decision support systems - a comparative performance study of Isabel Pro vs. ChatGPT4. Diagnosis (Berl) 2024;0:dx-2024-0033. [PMID: 38709491 DOI: 10.1515/dx-2024-0033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 04/22/2024] [Indexed: 05/07/2024]

Michelson KA, Rees CA, Florin TA, Bachur RG. Emergency Department Volume and Delayed Diagnosis of Serious Pediatric Conditions. JAMA Pediatr 2024;178:362-368. [PMID: 38345811 PMCID: PMC10862268 DOI: 10.1001/jamapediatrics.2023.6672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 12/14/2023] [Indexed: 02/15/2024]

Abstract

Importance

Diagnostic delays are common in the emergency department (ED) and may predispose to worse outcomes.

Objective

To evaluate the association of annual pediatric volume in the ED with delayed diagnosis.

Design, Setting, and Participants

This retrospective cohort study included all children younger than 18 years treated at 954 EDs in 8 states with a first-time diagnosis of any of 23 acute, serious conditions: bacterial meningitis, compartment syndrome, complicated pneumonia, craniospinal abscess, deep neck infection, ectopic pregnancy, encephalitis, intussusception, Kawasaki disease, mastoiditis, myocarditis, necrotizing fasciitis, nontraumatic intracranial hemorrhage, orbital cellulitis, osteomyelitis, ovarian torsion, pulmonary embolism, pyloric stenosis, septic arthritis, sinus venous thrombosis, slipped capital femoral epiphysis, stroke, or testicular torsion. Patients were identified using the Healthcare Cost and Utilization Project State ED and Inpatient Databases. Data were collected from January 2015 to December 2019, and data were analyzed from July to December 2023.

Exposure

Annual volume of children at the first ED visited.

Main Outcomes and Measures

Possible delayed diagnosis, defined as a patient with an ED discharge within 7 days prior to diagnosis. A secondary outcome was condition-specific complications. Rates of possible delayed diagnosis and complications were determined. The association of volume with delayed diagnosis across conditions was evaluated using conditional logistic regression matching on condition, age, and medical complexity. Condition-specific volume-delay associations were tested using hierarchical logistic models with log volume as the exposure, adjusting for age, sex, payer, medical complexity, and hospital urbanicity. The association of delayed diagnosis with complications by condition was then examined using logistic regressions.

Results

Of 58 998 included children, 37 211 (63.1%) were male, and the mean (SD) age was 7.1 (5.8) years. A total of 6709 (11.4%) had a complex chronic condition. Delayed diagnosis occurred in 9296 (15.8%; 95% CI, 15.5-16.1). Each 2-fold increase in annual pediatric volume was associated with a 26.7% (95% CI, 22.5-30.7) decrease in possible delayed diagnosis. For 21 of 23 conditions (all except ectopic pregnancy and sinus venous thrombosis), there were decreased rates of possible delayed diagnosis with increasing ED volume. Condition-specific complications were 11.2% (95% CI, 3.1-20.0) more likely among patients with a possible delayed diagnosis compared with those without.

Conclusions and Relevance

EDs with fewer pediatric encounters had more possible delayed diagnoses across 23 serious conditions. Tools to support timely diagnosis in low-volume EDs are needed.

Zampatti S, Peconi C, Megalizzi D, Calvino G, Trastulli G, Cascella R, Strafella C, Caltagirone C, Giardina E. Innovations in Medicine: Exploring ChatGPT's Impact on Rare Disorder Management. Genes (Basel) 2024;15:421. [PMID: 38674356 PMCID: PMC11050022 DOI: 10.3390/genes15040421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 03/25/2024] [Accepted: 03/26/2024] [Indexed: 04/28/2024] Open

Hu Z, Wang M, Zheng S, Xu X, Zhang Z, Ge Q, Li J, Yao Y. Clinical Decision Support Requirements for Ventricular Tachycardia Diagnosis Within the Frameworks of Knowledge and Practice: Survey Study. JMIR Hum Factors 2024;11:e55802. [PMID: 38530337 PMCID: PMC11005434 DOI: 10.2196/55802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2023] [Revised: 02/15/2024] [Accepted: 03/02/2024] [Indexed: 03/27/2024] Open

Abstract

BACKGROUND

Ventricular tachycardia (VT) diagnosis is challenging due to the similarity between VT and some forms of supraventricular tachycardia, complexity of clinical manifestations, heterogeneity of underlying diseases, and potential for life-threatening hemodynamic instability. Clinical decision support systems (CDSSs) have emerged as promising tools to augment the diagnostic capabilities of cardiologists. However, a requirements analysis is acknowledged to be vital for the success of a CDSS, especially for complex clinical tasks such as VT diagnosis.

OBJECTIVE

The aims of this study were to analyze the requirements for a VT diagnosis CDSS within the frameworks of knowledge and practice and to determine the clinical decision support (CDS) needs.

METHODS

Our multidisciplinary team first conducted semistructured interviews with seven cardiologists related to the clinical challenges of VT and expected decision support. A questionnaire was designed by the multidisciplinary team based on the results of interviews. The questionnaire was divided into four sections: demographic information, knowledge assessment, practice assessment, and CDS needs. The practice section consisted of two simulated cases for a total score of 10 marks. Online questionnaires were disseminated to registered cardiologists across China from December 2022 to February 2023. The scores for the practice section were summarized as continuous variables, using the mean, median, and range. The knowledge and CDS needs sections were assessed using a 4-point Likert scale without a neutral option. Kruskal-Wallis tests were performed to investigate the relationship between scores and practice years or specialty.

RESULTS

Of the 687 cardiologists who completed the questionnaire, 567 responses were eligible for further analysis. The results of the knowledge assessment showed that 383 cardiologists (68%) lacked knowledge in diagnostic evaluation. The overall average score of the practice assessment was 6.11 (SD 0.55); the etiological diagnosis section had the highest overall scores (mean 6.74, SD 1.75), whereas the diagnostic evaluation section had the lowest scores (mean 5.78, SD 1.19). A majority of cardiologists (344/567, 60.7%) reported the need for a CDSS. There was a significant difference in practice competency scores between general cardiologists and arrhythmia specialists (P=.02).

CONCLUSIONS

There was a notable deficiency in the knowledge and practice of VT among Chinese cardiologists. Specific knowledge and practice support requirements were identified, which provide a foundation for further development and optimization of a CDSS. Moreover, it is important to consider clinicians' specialization levels and years of practice for effective and personalized support.

Sibbald M, Zwaan L, Yilmaz Y, Lal S. Incorporating artificial intelligence in medical diagnosis: A case for an invisible and (un)disruptive approach. J Eval Clin Pract 2024;30:3-8. [PMID: 35761764 DOI: 10.1111/jep.13730] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/10/2022] [Accepted: 06/13/2022] [Indexed: 12/30/2022]

Pacheco K, Ji J, Barbosa K, Lemay K, Fortier JH, Garber GE. Medico-legal risk of infectious disease physicians in Canada: A retrospective review. Journal of the Association of Medical Microbiology and Infectious Disease Canada = Journal officiel de l'Association pour la microbiologie médicale et l'infectiologie Canada 2024;8:319-327. [PMID: 38250623 PMCID: PMC10797760 DOI: 10.3138/jammi-2023-0022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 09/15/2023] [Accepted: 09/18/2023] [Indexed: 01/23/2024]

Ito N, Kadomatsu S, Fujisawa M, Fukaguchi K, Ishizawa R, Kanda N, Kasugai D, Nakajima M, Goto T, Tsugawa Y. The Accuracy and Potential Racial and Ethnic Biases of GPT-4 in the Diagnosis and Triage of Health Conditions: Evaluation Study. JMIR Medical Education 2023;9:e47532. [PMID: 37917120 PMCID: PMC10654908 DOI: 10.2196/47532] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/23/2023] [Revised: 07/07/2023] [Accepted: 09/05/2023] [Indexed: 11/03/2023]

Abstract

BACKGROUND

Whether GPT-4, the conversational artificial intelligence, can accurately diagnose and triage health conditions and whether it presents racial and ethnic biases in its decisions remain unclear.

OBJECTIVE

We aim to assess the accuracy of GPT-4 in the diagnosis and triage of health conditions and whether its performance varies by patient race and ethnicity.

METHODS

We compared the performance of GPT-4 and physicians, using 45 typical clinical vignettes, each with a correct diagnosis and triage level, in February and March 2023. For each of the 45 clinical vignettes, GPT-4 and 3 board-certified physicians provided the most likely primary diagnosis and triage level (emergency, nonemergency, or self-care). Independent reviewers evaluated the diagnoses as "correct" or "incorrect." Physician diagnosis was defined as the consensus of the 3 physicians. We evaluated whether the performance of GPT-4 varies by patient race and ethnicity, by adding the information on patient race and ethnicity to the clinical vignettes.

RESULTS

The accuracy of diagnosis was comparable between GPT-4 and physicians (the percentage of correct diagnosis was 97.8% (44/45; 95% CI 88.2%-99.9%) for GPT-4 and 91.1% (41/45; 95% CI 78.8%-97.5%) for physicians; P=.38). GPT-4 provided appropriate reasoning for 97.8% (44/45) of the vignettes. The appropriateness of triage was comparable between GPT-4 and physicians (GPT-4: 30/45, 66.7%; 95% CI 51.0%-80.0%; physicians: 30/45, 66.7%; 95% CI 51.0%-80.0%; P=.99). The performance of GPT-4 in diagnosing health conditions did not vary among different races and ethnicities (Black, White, Asian, and Hispanic), with an accuracy of 100% (95% CI 78.2%-100%). P values, compared to the GPT-4 output without incorporating race and ethnicity information, were all .99. The accuracy of triage was not significantly different even if patients' race and ethnicity information was added. The accuracy of triage was 62.2% (95% CI 46.5%-76.2%; P=.50) for Black patients; 66.7% (95% CI 51.0%-80.0%; P=.99) for White patients; 66.7% (95% CI 51.0%-80.0%; P=.99) for Asian patients, and 62.2% (95% CI 46.5%-76.2%; P=.69) for Hispanic patients. P values were calculated by comparing the outputs with and without conditioning on race and ethnicity.
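The confidence intervals quoted above are consistent with exact binomial intervals. As a minimal sketch, assuming the Clopper-Pearson method (the abstract does not name the method used), the bounds can be found by bisection on the binomial distribution using only the standard library; `binom_cdf`, `bisect_root`, and `clopper_pearson` are hypothetical helper names.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def bisect_root(f, lo, hi, tol=1e-10):
    """Root of a monotone function f on [lo, hi], where f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion x / n."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2, 0.0, 1.0)
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else bisect_root(
        lambda p: binom_cdf(x, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper

# GPT-4 diagnostic accuracy reported in the abstract: 44 of 45 vignettes.
lo, hi = clopper_pearson(44, 45)
print(f"44/45: 95% CI {lo:.1%} to {hi:.1%}")
```

Under this assumption the interval for 44/45 should come out close to the reported 88.2%-99.9%; the same function can be applied to the other counts in the abstract (for example 41/45 or 30/45).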

CONCLUSIONS

GPT-4's ability to diagnose and triage typical clinical vignettes was comparable to that of board-certified physicians. The performance of GPT-4 did not vary by patient race and ethnicity. These findings should be informative for health systems looking to introduce conversational artificial intelligence to improve the efficiency of patient diagnosis and triage.

Ing EB, Balas M, Nassrallah G, DeAngelis D, Nijhawan N. The Isabel Differential Diagnosis Generator for Orbital Diagnosis. Ophthalmic Plast Reconstr Surg 2023;39:461-464. [PMID: 36928323 DOI: 10.1097/iop.0000000000002364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]

Harada Y, Tomiyama S, Sakamoto T, Sugimoto S, Kawamura R, Yokose M, Hayashi A, Shimizu T. Effects of Combinational Use of Additional Differential Diagnostic Generators on the Diagnostic Accuracy of the Differential Diagnosis List Developed by an Artificial Intelligence-Driven Automated History-Taking System: Pilot Cross-Sectional Study. JMIR Form Res 2023;7:e49034. [PMID: 37531164 PMCID: PMC10433017 DOI: 10.2196/49034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 06/23/2023] [Accepted: 07/19/2023] [Indexed: 08/03/2023] Open

Abstract

BACKGROUND

Low diagnostic accuracy is a major concern in automated medical history-taking systems with differential diagnosis (DDx) generators. One possible solution is to extend the concept of collective intelligence to DDx generators: judgment is more accurate when an integrated diagnosis list from multiple people is accepted than when a diagnosis list from a single person is accepted.

OBJECTIVE

The purpose of this study is to assess whether the combined use of several DDx generators improves the diagnostic accuracy of DDx lists.

METHODS

We used medical history data and the top 10 DDx lists (index DDx lists) generated by an artificial intelligence (AI)-driven automated medical history-taking system from 103 patients with confirmed diagnoses. Two research physicians independently created the other top 10 DDx lists (second and third DDx lists) per case by inputting key information into the other 2 DDx generators based on the medical history generated by the automated medical history-taking system without reading the index lists generated by the automated medical history-taking system. We used the McNemar test to assess the improvement in diagnostic accuracy from the index DDx lists to the three types of combined DDx lists: (1) simply combining DDx lists from the index, second, and third lists; (2) creating a new top 10 DDx list using a 1/n weighting rule; and (3) creating new lists with only shared diagnoses among DDx lists from the index, second, and third lists. We treated the data generated by 2 research physicians from the same patient as independent cases. Therefore, the number of cases included in analyses in the case using 2 additional lists was 206 (103 cases × 2 physicians' input).
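The three list-combination rules described above can be sketched in a few lines. This is an illustration only: the abstract does not define the 1/n weighting rule, so the sketch assumes each diagnosis scores 1/rank in every list it appears in and that scores are summed across lists; the function names and the example diagnoses are hypothetical.

```python
from collections import defaultdict

def combine_union(lists):
    """Rule 1: simply merge all lists, keeping first-seen order and dropping repeats."""
    seen, merged = set(), []
    for ddx in lists:
        for dx in ddx:
            if dx not in seen:
                seen.add(dx)
                merged.append(dx)
    return merged

def combine_weighted(lists, top_k=10):
    """Rule 2 (assumed form): score each diagnosis by the sum of 1/rank, keep the top_k."""
    scores = defaultdict(float)
    for ddx in lists:
        for rank, dx in enumerate(ddx, start=1):
            scores[dx] += 1.0 / rank
    # Sort by descending score; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda dx: (-scores[dx], dx))[:top_k]

def combine_shared(lists):
    """Rule 3: keep only diagnoses present in every list, in the index list's order."""
    shared = set(lists[0]).intersection(*lists[1:])
    return [dx for dx in lists[0] if dx in shared]

# Hypothetical three-item lists standing in for the top 10 lists.
index_list = ["pneumonia", "bronchitis", "heart failure"]
second_list = ["pulmonary embolism", "pneumonia", "asthma"]
third_list = ["heart failure", "pneumonia", "pulmonary embolism"]
all_lists = [index_list, second_list, third_list]

print(combine_union(all_lists))
print(combine_weighted(all_lists, top_k=3))
print(combine_shared(all_lists))
```

The sketch makes the trade-off visible: the union can only add candidate diagnoses, so it cannot lower the chance that the final diagnosis is listed, whereas the shared-only rule discards everything that any one generator missed.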

RESULTS

The diagnostic accuracy of the index lists was 46% (47/103). Diagnostic accuracy was improved by simply combining the other 2 DDx lists (133/206, 65%, P<.001), whereas the other 2 combined DDx lists did not improve the diagnostic accuracy of the DDx lists (106/206, 52%, P=.05 in the collective list with the 1/n weighting rule and 29/206, 14%, P<.001 in the only shared diagnoses among the 3 DDx lists).

CONCLUSIONS

Simply adding each of the top 10 DDx lists from additional DDx generators increased the diagnostic accuracy of the DDx list by approximately 20%, suggesting that the combinational use of DDx generators early in the diagnostic process is beneficial.

Yanagita Y, Shikino K, Ishizuka K, Uchida S, Li Y, Yokokawa D, Tsukamoto T, Noda K, Uehara T, Ikusaka M. Improving decision accuracy using a clinical decision support system for medical students during history-taking: a randomized clinical trial. BMC Medical Education 2023;23:383. [PMID: 37231512 PMCID: PMC10214648 DOI: 10.1186/s12909-023-04370-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 05/17/2023] [Indexed: 05/27/2023]

Abstract

BACKGROUND

A clinical diagnostic support system (CDSS) can support medical students and physicians in providing evidence-based care. In this study, we investigate diagnostic accuracy based on the history of present illness between groups of medical students using a CDSS, Google, and neither (control). Further, the degree of diagnostic accuracy of medical students using a CDSS is compared with that of residents using neither a CDSS nor Google.

METHODS

This study is a randomized educational trial. The participants comprised 64 medical students and 13 residents who rotated in the Department of General Medicine at Chiba University Hospital from May to December 2020. The medical students were randomly divided into the CDSS group (n = 22), Google group (n = 22), and control group (n = 20). Participants were asked to provide the three most likely diagnoses for 20 cases, mainly a history of a present illness (10 common and 10 emergent diseases). Each correct diagnosis was awarded 1 point (maximum 20 points). The mean scores of the three medical student groups were compared using a one-way analysis of variance. Furthermore, the mean scores of the CDSS, Google, and residents' (without CDSS or Google) groups were compared.

RESULTS

The mean scores of the CDSS (12.0 ± 1.3) and Google (11.9 ± 1.1) groups were significantly higher than those of the control group (9.5 ± 1.7; p = 0.02 and p = 0.03, respectively). The residents' group's mean score (14.7 ± 1.4) was higher than the mean scores of the CDSS and Google groups (p = 0.01). Regarding common disease cases, the mean scores were 7.4 ± 0.7, 7.1 ± 0.7, and 8.2 ± 0.7 for the CDSS, Google, and residents' groups, respectively. There were no significant differences in mean scores (p = 0.1).

CONCLUSIONS

Medical students who used the CDSS and Google were able to list differential diagnoses more accurately than those using neither. Furthermore, they could make the same level of differential diagnoses as residents in the context of common diseases.

TRIAL REGISTRATION

This study was retrospectively registered with the University Hospital Medical Information Network Clinical Trials Registry on 24/12/2020 (unique trial number: UMIN000042831).

Diagnostic Delays in Sepsis: Lessons Learned From a Retrospective Study of Canadian Medico-Legal Claims. Crit Care Explor 2023;5:e0841. [PMID: 36751515 PMCID: PMC9894347 DOI: 10.1097/cce.0000000000000841] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/05/2023] Open

Schmidt HG, Mamede S. Improving diagnostic decision support through deliberate reflection: a proposal. Diagnosis (Berl) 2023;10:38-42. [PMID: 36000188 DOI: 10.1515/dx-2022-0062] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/10/2022] [Accepted: 07/25/2022] [Indexed: 11/15/2022]

Kourtidis P, Nurek M, Delaney B, Kostopoulou O. Influences of early diagnostic suggestions on clinical reasoning. Cogn Res Princ Implic 2022;7:103. [PMID: 36520258 PMCID: PMC9755454 DOI: 10.1186/s41235-022-00453-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2021] [Accepted: 12/02/2022] [Indexed: 12/23/2022] Open

Abstract

Previous research has highlighted the importance of physicians' early hypotheses for their subsequent diagnostic decisions. It has also been shown that diagnostic accuracy improves when physicians are presented with a list of diagnostic suggestions to consider at the start of the clinical encounter. The psychological mechanisms underlying this improvement in accuracy are hypothesised. It is possible that the provision of diagnostic suggestions disrupts physicians' intuitive thinking and reduces their certainty in their initial diagnostic hypotheses. This may encourage them to seek more information before reaching a diagnostic conclusion, evaluate this information more objectively, and be more open to changing their initial hypotheses. Three online experiments explored the effects of early diagnostic suggestions, provided by a hypothetical decision aid, on different aspects of the diagnostic reasoning process. Family physicians assessed up to two patient scenarios with and without suggestions. We measured effects on certainty about the initial diagnosis, information search and evaluation, and frequency of diagnostic changes. We did not find a clear and consistent effect of suggestions and detected mainly non-significant trends, some in the expected direction. We also detected a potential biasing effect: when the most likely diagnosis was included in the list of suggestions (vs. not included), physicians who gave that diagnosis initially tended to request less information, evaluate it as more supportive of their diagnosis, become more certain about it, and change it less frequently when encountering new but ambiguous information; in other words, they seemed to validate rather than question their initial hypothesis. We conclude that further research using different methodologies and more realistic experimental situations is required to uncover both the beneficial and biasing effects of early diagnostic suggestions.

Scott IA. Using information technology to reduce diagnostic error: still a bridge too far? Intern Med J 2022;52:908-911. [PMID: 35718736 DOI: 10.1111/imj.15804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Accepted: 04/28/2022] [Indexed: 11/28/2022]

Sibbald M, Abdulla B, Keuhl A, Norman G, Monteiro S, Sherbino J. Electronic diagnostic support in emergency physician triage: a qualitative study (Preprint). JMIR Hum Factors 2022;9:e39234. [PMID: 36178728 PMCID: PMC9568817 DOI: 10.2196/39234] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/04/2022] [Revised: 08/05/2022] [Accepted: 08/29/2022] [Indexed: 12/05/2022] Open

Abstract

Background

Not thinking of a diagnosis is a leading cause of diagnostic error in the emergency department, resulting in delayed treatment, morbidity, and excess mortality. Electronic differential diagnostic support (EDS) results in small but significant reductions in diagnostic error. However, the uptake of EDS by clinicians is limited.

Objective

We sought to understand physician perceptions and barriers to the uptake of EDS within the emergency department triage process.

Methods

We conducted a qualitative study using a research associate to rapidly prototype an embedded EDS into the emergency department triage process. Physicians involved in the triage assessment of a busy emergency department were provided the output of an EDS based on the triage complaint by an embedded researcher to simulate an automated system that would draw from the electronic medical record. Physicians were interviewed immediately after their experience. Verbatim transcripts were analyzed by a team using open and axial coding, informed by direct content analysis.

Results

In all, 4 themes emerged from 14 interviews: (1) the quality of the EDS was inferred from the scope and prioritization of the diagnoses present in the EDS differential; (2) the trust of the EDS was linked to varied beliefs around the diagnostic process and potential for bias; (3) clinicians foresaw more benefit to EDS use for colleagues and trainees rather than themselves; and (4) clinicians felt strongly that EDS output should not be included in the patient record.

Conclusions

The adoption of an EDS into an emergency department triage process will require a system that provides diagnostic suggestions appropriate for the scope and context of the emergency department triage process, offers transparency of system design, provides affordances for clinician beliefs about the diagnostic process, and addresses clinician concerns about including EDS output in the patient record.

Martínez-García M, Hernández-Lemus E. Data Integration Challenges for Machine Learning in Precision Medicine. Front Med (Lausanne) 2022;8:784455. [PMID: 35145977 PMCID: PMC8821900 DOI: 10.3389/fmed.2021.784455] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/27/2021] [Accepted: 12/28/2021] [Indexed: 12/19/2022]

Brush JE, Sherbino J, Norman GR. Diagnostic reasoning in cardiovascular medicine. BMJ 2022;376:e064389. [PMID: 34987062 DOI: 10.1136/bmj-2021-064389] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/12/2022]

Ranji SR, Thomas EJ. Research to improve diagnosis: time to study the real world. BMJ Qual Saf 2022;31:255-258. [PMID: 34987085 DOI: 10.1136/bmjqs-2021-014071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/13/2021] [Indexed: 11/04/2022]

Kawamura R, Harada Y, Sugimoto S, Nagase Y, Katsukura S, Shimizu T. Incidence of diagnostic errors in unplanned hospitalized patients using an automated medical history-taking system with differential diagnosis generator: retrospective observational study (Preprint). JMIR Med Inform 2021;10:e35225. [PMID: 35084347 PMCID: PMC8832260 DOI: 10.2196/35225] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/26/2021] [Revised: 12/11/2021] [Accepted: 01/02/2022] [Indexed: 11/23/2022]

Abstract

Background

Automated medical history–taking systems that generate differential diagnosis lists have been suggested to contribute to improved diagnostic accuracy. However, the effect of these systems on diagnostic errors in clinical practice remains unknown.

Objective

This study aimed to assess the incidence of diagnostic errors in an outpatient department, where an artificial intelligence (AI)–driven automated medical history–taking system that generates differential diagnosis lists was implemented in clinical practice.

Methods

We conducted a retrospective observational study using data from a community hospital in Japan. We included patients aged 20 years and older who used an AI-driven, automated medical history–taking system that generates differential diagnosis lists in the outpatient department of internal medicine for whom the index visit was between July 1, 2019, and June 30, 2020, followed by unplanned hospitalization within 14 days. The primary endpoint was the incidence of diagnostic errors, which were detected using the Revised Safer Dx Instrument by at least two independent reviewers. To evaluate the effect of differential diagnosis lists from the AI system on the incidence of diagnostic errors, we compared the incidence of these errors between a group where the AI system generated the final diagnosis in the differential diagnosis list and a group where the AI system did not generate the final diagnosis in the list; the Fisher exact test was used for comparison between these groups. For cases with confirmed diagnostic errors, further review was conducted to identify the contributing factors of these errors via discussion among three reviewers, using the Safer Dx Process Breakdown Supplement as a reference.

Results

A total of 146 patients were analyzed. A final diagnosis was confirmed for 138 patients and appeared in the differential diagnosis list from the AI system for 69 patients. Diagnostic errors occurred in 16 out of 146 patients (11.0%, 95% CI 6.4%-17.2%). Although the difference was not statistically significant, the incidence of diagnostic errors was lower in cases where the final diagnosis was included in the differential diagnosis list from the AI system than in cases where the final diagnosis was not included in the list (7.2% vs 15.9%, P=.18).

Conclusions

The incidence of diagnostic errors among patients in the outpatient department of internal medicine who used an automated medical history–taking system that generates differential diagnosis lists seemed to be lower than the previously reported incidence of diagnostic errors. This result suggests that the implementation of an automated medical history–taking system that generates differential diagnosis lists could be beneficial for diagnostic safety in the outpatient department of internal medicine.
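
The statistics reported in the Results above can be approximately re-derived with a short sketch. Several inputs here are assumptions rather than figures stated in the abstract: the group sizes (69 patients with the final diagnosis in the list and 138 - 69 = 69 without), the error counts of 5 and 11 (back-calculated from 7.2% and 15.9%, and summing to the reported 16), and the use of a Clopper-Pearson interval for the 95% CI (the abstract does not name the method). The function names are illustrative only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalised integer weight; integers avoid floating-point ties.
        return comb(row1, k) * comb(row2, col1 - k)

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(w for w in map(weight, range(k_min, k_max + 1)) if w <= observed)
    return extreme / comb(row1 + row2, col1)


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval, solved by bisection."""

    def bisect(is_below_root):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if is_below_root(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= x) rises to alpha / 2.
    lower = 0.0 if x == 0 else bisect(lambda p: 1 - binom_cdf(x - 1, n, p) < alpha / 2)
    # Upper bound: the p at which P(X <= x) falls to alpha / 2.
    upper = 1.0 if x == n else bisect(lambda p: binom_cdf(x, n, p) > alpha / 2)
    return lower, upper


# Assumed counts, back-calculated from the reported percentages.
errors_in_list, n_in_list = 5, 69        # 5/69 rounds to 7.2%
errors_not_in_list, n_not_in_list = 11, 69  # 11/69 rounds to 15.9%

p_value = fisher_exact_two_sided(
    errors_in_list, n_in_list - errors_in_list,
    errors_not_in_list, n_not_in_list - errors_not_in_list,
)
ci_low, ci_high = clopper_pearson(16, 146)

print(f"Fisher exact two-sided P = {p_value:.3f}")
print(f"16/146 = {16 / 146:.1%}, 95% CI {ci_low:.1%} to {ci_high:.1%}")
```

Under these assumed counts, the P value should land close to the reported P=.18. Any remaining difference in the interval would suggest the authors used a different CI method.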

Graber ML. Reaching 95%: decision support tools are the surest way to improve diagnosis now. BMJ Qual Saf 2021;31:415-418. [PMID: 34642227 DOI: 10.1136/bmjqs-2021-014033] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 09/25/2021] [Indexed: 11/04/2022]