1
Payton EM, Graber ML, Bachiashvili V, Mehta T, Dissanayake PI, Berner ES. Impact of clinical note format on diagnostic accuracy and efficiency. Health Inf Manag J 2024; 53:183-188. PMID: 37129041. DOI: 10.1177/18333583231151979.
Abstract
BACKGROUND Clinician notes are structured in a variety of ways. This research pilot tested an innovative study design and explored the impact of note format on diagnostic accuracy and documentation review time. OBJECTIVE To compare two clinical documentation formats (narrative format vs. list of findings) with respect to clinician diagnostic accuracy and documentation review time. METHOD Participants diagnosed written clinical cases, half in narrative format and half in list format. Diagnostic accuracy (defined as including the correct case diagnosis among the top three diagnoses) and time spent processing each case scenario were measured for each format. Generalised linear mixed regression models and bias-corrected bootstrap percentile confidence intervals for mean paired differences were used to analyse the primary research questions. RESULTS The odds of correctly diagnosing list-format notes were 26% greater than for narrative notes; however, the evidence that this difference is significant was insufficient (75% CI 0.8-1.99). On average, list-format notes required 85.6 more seconds to process and arrive at a diagnosis than narrative notes (95% CI -162.3, -2.77). Among cases where participants included the correct diagnosis, list-format notes required on average 94.17 more seconds than narrative notes (75% CI -195.9, -8.83). CONCLUSION This study offers note-format considerations for those interested in improving clinical documentation and suggests directions for future research. Balancing the priority of clinician preference against the value of structured data may be necessary. IMPLICATIONS This study provides a method and suggestive results for further investigation into the usability of electronic documentation formats.
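The bias-corrected bootstrap percentile interval named in this study's Methods can be illustrated with a minimal, stdlib-only sketch (the timing data below are made up for illustration; this is not the authors' code or their actual analysis):

```python
import random
from statistics import NormalDist, mean

def bc_bootstrap_ci(diffs, level=0.75, n_boot=10_000, seed=42):
    """Bias-corrected (BC) bootstrap percentile CI for a mean paired difference."""
    rng = random.Random(seed)
    observed = mean(diffs)
    boots = sorted(
        mean(rng.choices(diffs, k=len(diffs))) for _ in range(n_boot)
    )
    nd = NormalDist()
    # Bias-correction factor z0: how far the observed mean sits from the
    # median of the bootstrap distribution (clamped to avoid inv_cdf(0/1)).
    prop_below = sum(b < observed for b in boots) / n_boot
    z0 = nd.inv_cdf(min(max(prop_below, 1e-9), 1 - 1e-9))
    alpha = 1 - level
    # Adjusted percentile positions per the BC formula.
    lo_q = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    hi_q = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    lo = boots[min(int(lo_q * n_boot), n_boot - 1)]
    hi = boots[min(int(hi_q * n_boot), n_boot - 1)]
    return lo, hi

# Hypothetical per-participant differences in review time (seconds),
# narrative minus list format:
diffs = [-120, -40, 15, -90, -60, 30, -150, -10, -75, -55]
lo, hi = bc_bootstrap_ci(diffs, level=0.75)
print(f"75% BC bootstrap CI for mean difference: ({lo:.1f}, {hi:.1f})")
```

A negative interval here would mirror the paper's sign convention, where narrative-minus-list differences are negative because list-format notes took longer to process.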
Affiliation(s)
- Evita M Payton
  - University of Alabama at Birmingham, Birmingham, AL, USA
- Mark L Graber
  - Society to Improve Diagnosis in Medicine, Alpharetta, MD, USA
- Tapan Mehta
  - University of Alabama at Birmingham, Birmingham, AL, USA
- Eta S Berner
  - University of Alabama at Birmingham, Birmingham, AL, USA
2
Wieben AM, Alreshidi BG, Douthit BJ, Sileo M, Vyas P, Steege L, Gilmore-Bykovskyi A. Nurses' perceptions of the design, implementation, and adoption of machine learning clinical decision support: A descriptive qualitative study. J Nurs Scholarsh 2024. PMID: 38898636. DOI: 10.1111/jnu.13001.
Abstract
INTRODUCTION The purpose of this study was to explore nurses' perspectives on Machine Learning Clinical Decision Support (ML CDS) design, development, implementation, and adoption. DESIGN Qualitative descriptive study. METHODS Nurses (n = 17) participated in semi-structured interviews. Data were transcribed, coded, and analyzed using thematic analysis methods as described by Braun and Clarke. RESULTS Four major themes and 14 sub-themes highlight nurses' perspectives on autonomy in decision-making, the influence of prior experience in shaping their preferences for novel CDS tools, the need for clarity about why ML CDS is useful for improving practice and outcomes, and their desire to have nursing integrated into the design and implementation of these tools. CONCLUSION This study provided insights into nurses' perceptions of the utility and usability of ML CDS; the influence of previous experiences with technology and CDS; the change management strategies needed at the time of ML CDS implementation; the importance of nurse-perceived engagement in the development process; nurses' information needs at the time of ML CDS deployment; and the perceived impact of ML CDS on nurses' decision-making autonomy. CLINICAL RELEVANCE This study contributes to the body of knowledge about the use of AI and machine learning (ML) in nursing practice. By generating insights drawn from nurses' perspectives, these findings can inform the successful design and adoption of ML CDS.
Affiliation(s)
- Ann M Wieben
  - University of Wisconsin-Madison School of Nursing, Madison, Wisconsin, USA
- Bader G Alreshidi
  - Department of Medical Surgical Nursing, University of Hail College of Nursing, Hail, Saudi Arabia
- Brian J Douthit
  - United States Department of Veterans Affairs, Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, USA
- Marisa Sileo
  - Boston Children's Hospital, Boston, Massachusetts, USA
- Linsey Steege
  - University of Wisconsin-Madison School of Nursing, Madison, Wisconsin, USA
- Andrea Gilmore-Bykovskyi
  - BerbeeWalsh Department of Emergency Medicine, University of Wisconsin-Madison School of Medicine & Public Health, Madison, Wisconsin, USA
3
Molinet B, Marro S, Cabrio E, Villata S. Explanatory argumentation in natural language for correct and incorrect medical diagnoses. J Biomed Semantics 2024; 15:8. PMID: 38816758. PMCID: PMC11138001. DOI: 10.1186/s13326-024-00306-1.
Abstract
BACKGROUND A huge amount of research in Artificial Intelligence is carried out nowadays to propose automated ways of analysing medical data with the aim of supporting doctors in delivering medical diagnoses. However, a main issue of these approaches is the lack of transparency and interpretability of the achieved results, making it hard to employ such methods for educational purposes. It is therefore necessary to develop new frameworks to enhance explainability in these solutions. RESULTS In this paper, we present a novel full pipeline to automatically generate natural language explanations for medical diagnoses. The proposed solution starts from a clinical case description associated with a list of correct and incorrect diagnoses and, through the extraction of the relevant symptoms and findings, enriches the information contained in the description with verified medical knowledge from an ontology. Finally, the system returns a pattern-based explanation in natural language which elucidates why the correct (incorrect) diagnosis is the correct (incorrect) one. The contribution of the paper is twofold: first, we propose two novel linguistic resources for the medical domain (i.e., a dataset of 314 clinical cases annotated with medical entities from UMLS, and a database of biological boundaries for common findings); second, a full Information Extraction pipeline that extracts symptoms and findings from the clinical cases and matches them to the terms of a medical ontology and to the biological boundaries. An extensive evaluation shows that our method outperforms comparable approaches. CONCLUSIONS Our goal is to offer an AI-assisted educational support framework to train clinical residents to formulate sound and exhaustive explanations of their diagnoses for patients.
Affiliation(s)
- Benjamin Molinet
  - Université Côte d'Azur, CNRS, Inria, I3S, Rte des Lucioles, Sophia Antipolis, 06900, Alpes-Maritimes, France
- Santiago Marro
  - Université Côte d'Azur, CNRS, Inria, I3S, Rte des Lucioles, Sophia Antipolis, 06900, Alpes-Maritimes, France
- Elena Cabrio
  - Université Côte d'Azur, CNRS, Inria, I3S, Rte des Lucioles, Sophia Antipolis, 06900, Alpes-Maritimes, France
- Serena Villata
  - Université Côte d'Azur, CNRS, Inria, I3S, Rte des Lucioles, Sophia Antipolis, 06900, Alpes-Maritimes, France
4
Goh E, Gallo R, Hom J, Strong E, Weng Y, Kerman H, Cool J, Kanjee Z, Parsons AS, Ahuja N, Horvitz E, Yang D, Milstein A, Olson APJ, Rodman A, Chen JH. Influence of a Large Language Model on Diagnostic Reasoning: A Randomized Clinical Vignette Study. medRxiv (preprint) 2024:2024.03.12.24303785. PMID: 38559045. PMCID: PMC10980135. DOI: 10.1101/2024.03.12.24303785.
Abstract
Importance Diagnostic errors are common and cause significant morbidity. Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves diagnostic reasoning. Objective To assess the impact of the GPT-4 LLM on physicians' diagnostic reasoning compared to conventional resources. Design Multi-center, randomized clinical vignette study. Setting The study was conducted via remote video conferencing with physicians across the country and in person at multiple academic medical institutions. Participants Resident and attending physicians with training in family medicine, internal medicine, or emergency medicine. Interventions Participants were randomized to access GPT-4 in addition to conventional diagnostic resources or to conventional resources alone. They were allocated 60 minutes to review up to six clinical vignettes adapted from established diagnostic reasoning exams. Main Outcomes and Measures The primary outcome was diagnostic performance based on differential diagnosis accuracy, appropriateness of supporting and opposing factors, and next diagnostic evaluation steps. Secondary outcomes included time spent per case and final diagnosis. Results 50 physicians (26 attendings, 24 residents) participated, with an average of 5.2 cases completed per participant. The median diagnostic reasoning score per case was 76.3 percent (IQR 65.8 to 86.8) for the GPT-4 group and 73.7 percent (IQR 63.2 to 84.2) for the conventional resources group, with an adjusted difference of 1.6 percentage points (95% CI -4.4 to 7.6; p=0.60). The median time spent on cases was 519 seconds (IQR 371 to 668) for the GPT-4 group, compared to 565 seconds (IQR 456 to 788) for the conventional resources group, a difference of -82 seconds (95% CI -195 to 31; p=0.20). GPT-4 alone scored 15.5 percentage points (95% CI 1.5 to 29; p=0.03) higher than the conventional resources group. Conclusions and Relevance In this clinical vignette-based study, the availability of GPT-4 to physicians as a diagnostic aid did not significantly improve clinical reasoning compared to conventional resources, although it may improve components of clinical reasoning such as efficiency. GPT-4 alone demonstrated higher performance than both physician groups, suggesting opportunities for further improvement in physician-AI collaboration in clinical practice.
Affiliation(s)
- Ethan Goh
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA
- Robert Gallo
  - Center for Innovation to Implementation, VA Palo Alto Health Care System, Palo Alto, CA
- Jason Hom
  - Stanford University School of Medicine, Stanford, CA
- Eric Strong
  - Stanford University School of Medicine, Stanford, CA
- Yingjie Weng
  - Quantitative Sciences Unit, Stanford University School of Medicine, Stanford, CA
- Hannah Kerman
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Josephine Cool
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Zahir Kanjee
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Neera Ahuja
  - Stanford University School of Medicine, Stanford, CA
- Eric Horvitz
  - Microsoft, Redmond, WA
  - Stanford HAI, Stanford, CA
- Arnold Milstein
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA
- Adam Rodman
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Jonathan H Chen
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA
  - Division of Hospital Medicine, Stanford University, Stanford, CA
5
Bakken S, Cimino JJ, Feldman S, Lorenzi NM. Celebrating Eta Berner and her influence on biomedical and health informatics. J Am Med Inform Assoc 2024; 31:549-551. PMID: 38366906. PMCID: PMC10873777. DOI: 10.1093/jamia/ocae011.
Affiliation(s)
- Suzanne Bakken
  - School of Nursing, Columbia University, New York, NY 10032, United States
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
  - Data Science Institute, Columbia University, New York, NY 10027, United States
- James J Cimino
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
  - Department of Biomedical Informatics and Data Science, Heersink School of Medicine, University of Alabama, Birmingham, AL 35233, United States
- Sue Feldman
  - Department of Health Services Administration, School of Health Professions, University of Alabama, Birmingham, AL 35233, United States
  - Department of Medical Education, Heersink School of Medicine, University of Alabama, Birmingham, AL 35233, United States
- Nancy M Lorenzi
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, United States
6
Benjamin MM, Rabbat MG. Artificial Intelligence in Transcatheter Aortic Valve Replacement: Its Current Role and Ongoing Challenges. Diagnostics (Basel) 2024; 14:261. PMID: 38337777. PMCID: PMC10855497. DOI: 10.3390/diagnostics14030261.
Abstract
Transcatheter aortic valve replacement (TAVR) has emerged as a viable alternative to surgical aortic valve replacement, as accumulating clinical evidence has demonstrated its safety and efficacy. TAVR indications have expanded beyond high-risk or inoperable patients to include intermediate- and low-risk patients with severe aortic stenosis. Artificial intelligence (AI) is revolutionizing the field of cardiology, aiding in the interpretation of medical imaging and in developing risk models for at-risk individuals and those with cardiac disease. This article explores the growing role of AI in TAVR procedures and assesses its potential impact, with particular focus on its ability to improve patient selection, procedural planning, and post-implantation monitoring, and thereby to optimize patient outcomes. In addition, current challenges and future directions in AI implementation are highlighted.
Affiliation(s)
- Mina M. Benjamin
  - Division of Cardiovascular Medicine, SSM Saint Louis University Hospital, Saint Louis University, Saint Louis, MO 63104, USA
- Mark G. Rabbat
  - Department of Cardiovascular Medicine, Loyola University Medical Center, Maywood, IL 60153, USA
  - Department of Cardiology, Edward Hines Jr. VA Hospital, Hines, IL 60141, USA
7
Zhang H, Ogasawara K. Grad-CAM-Based Explainable Artificial Intelligence Related to Medical Text Processing. Bioengineering (Basel) 2023; 10:1070. PMID: 37760173. PMCID: PMC10525184. DOI: 10.3390/bioengineering10091070.
Abstract
The opacity of deep learning makes its application challenging in the medical field. There is therefore a need for explainable artificial intelligence (XAI) in medicine, so that models and their results can be explained in a manner that humans can understand. This study transfers a high-accuracy computer-vision model to medical text tasks and uses the explanatory visualization method known as gradient-weighted class activation mapping (Grad-CAM) to generate heat maps, so that the basis for the model's decisions can be presented intuitively. The system comprises four modules: pre-processing, word embedding, classifier, and visualization. We used Word2Vec and BERT to compare word embeddings, and ResNet and one-dimensional convolutional neural networks (1D CNNs) to compare classifiers. Finally, a Bi-LSTM was used to perform text classification for direct comparison. With 25 epochs, the model that used pre-trained ResNet on the formalized text presented the best performance (recall of 90.9%, precision of 91.1%, and a weighted F1 score of 90.2%). This study processes medical texts with ResNet through Grad-CAM-based explainable artificial intelligence and obtains a high-accuracy classification effect; at the same time, the Grad-CAM visualization intuitively shows the words to which the model attends when making predictions.
Affiliation(s)
- Katsuhiko Ogasawara
  - Graduate School of Health Science, Hokkaido University, N12-W5, Kitaku, Sapporo 060-0812, Japan
8
Kafke SD, Kuhlmey A, Schuster J, Blüher S, Czimmeck C, Zoellick JC, Grosse P. Can clinical decision support systems be an asset in medical education? An experimental approach. BMC Med Educ 2023; 23:570. PMID: 37568144. PMCID: PMC10416486. DOI: 10.1186/s12909-023-04568-8.
Abstract
BACKGROUND Diagnostic accuracy is one of the major cornerstones of appropriate and successful medical decision-making. Clinical decision support systems (CDSSs) have recently been used to facilitate physicians' diagnostic considerations. To date, however, little is known about the potential assets of CDSSs for medical students in an educational setting. The purpose of our study was to explore the usefulness of CDSSs for medical students by assessing their diagnostic performance and the influence of such software on students' trust in their own diagnostic abilities. METHODS Based on paper cases, students had to diagnose two different patients, using a CDSS and conventional methods (e.g. textbooks), respectively. Both patients had a common disease; in one setting the clinical presentation was typical (tonsillitis), whereas in the other (pulmonary embolism) the patient presented atypically. We used a 2x2x2 between- and within-subjects cluster-randomised controlled trial to assess diagnostic accuracy in medical students, also varying the order of the resources used (CDSS first or second). RESULTS Medical students in their 4th and 5th year performed equally well using conventional methods or the CDSS across the two cases (t(164) = 1.30; p = 0.197). Diagnostic accuracy and trust in the correct diagnosis were higher in the typical presentation condition than in the atypical presentation condition (t(85) = 19.97; p < .0001 and t(150) = 7.67; p < .0001). These results refute our main hypothesis that students diagnose more accurately using conventional methods than the CDSS. CONCLUSIONS Medical students in their 4th and 5th year performed equally well in diagnosing two cases of common diseases with typical or atypical clinical presentations using conventional methods or a CDSS. Students were proficient in diagnosing a common disease with a typical presentation but underestimated their own factual knowledge in this scenario. Students were also aware of their own diagnostic limitations when presented with a challenging case with an atypical presentation, for which the use of a CDSS seemingly provided no additional insights.
Affiliation(s)
- Sean D Kafke
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Adelheid Kuhlmey
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Johanna Schuster
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Stefan Blüher
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Constanze Czimmeck
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Jan C Zoellick
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Pascal Grosse
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
9
Bansal M. Clinical Evaluation of 'Computer-Aided Diagnosis In Neuro-Otology (CADINO)' in Terms of Usefulness, Functionality and Effectiveness. Indian J Otolaryngol Head Neck Surg 2022; 74:4434-4440. PMID: 36742689. PMCID: PMC9895670. DOI: 10.1007/s12070-022-03092-3.
Abstract
Computer-based medical diagnosis expert systems are considered both accurate and educationally helpful in most cases. Dizziness and vertigo are among the most common complaints, yet ENT surgeons and neuro-otologists are often unavailable in peripheral areas. Computer-Aided Diagnosis In Neuro-Otology (CADINO) can therefore be of immense value for underserved dizzy patients in remote and rural areas. The study aimed to document the strengths, weaknesses and capabilities of CADINO in terms of accuracy, educational usefulness, functionality and effectiveness. Design Hospital-based observational study of a diagnostic tool. Settings Otorhinolaryngology Department of a tertiary care medical college hospital. This prospective study covered 70 patients, 24 simulated cases and 6 case reports from journals, and also included clinicians' feedback before and after consultation. Eleven ENT residents and 14 ENT surgeons (8 teachers and 6 consultants) participated. The overall diagnostic accuracy of CADINO was 86%. In patients, CADINO's accuracy was approximately 84%, similar to that of faculty/consultants (80%) but significantly better than that of residents (57%). Most clinicians (84%) rated the CADINO consultation as educationally helpful and useful for patient management. CADINO was found to be effective and convenient, as it could be operated in the OPD while evaluating dizzy patients, and it provided accurate diagnostic suggestions. It was found to improve patient safety and quality of care by enhancing clinicians' knowledge and cognitive skills.
Affiliation(s)
- Mohan Bansal
  - Department of Otorhinolaryngology Head and Neck Surgery, Parul Institute of Medical Sciences and Research, Parul University, Limda, Waghodia, Vadodara, Gujarat, India
10
Painter A, Hayhoe B, Riboli-Sasco E, El-Osta A. Online Symptom Checkers: Recommendations for a Vignette-Based Clinical Evaluation Standard. J Med Internet Res 2022; 24:e37408. DOI: 10.2196/37408.
Abstract
The use of patient-facing online symptom checkers (OSCs) has expanded in recent years, but their accuracy, safety, and impact on patient behaviors and health care systems remain unclear. The lack of a standardized process of clinical evaluation has resulted in significant variation in approaches to OSC validation and evaluation. The aim of this paper is to characterize a set of congruent requirements for a standardized vignette-based clinical evaluation process for OSCs. Discrepancies in the findings of comparative studies to date suggest that different steps in OSC evaluation methodology can significantly influence outcomes. A standardized process with a clear specification for vignette-based clinical evaluation is urgently needed to guide developers and facilitate the objective comparison of OSCs. We propose 15 requirements for an OSC evaluation standard. A third-party evaluation process and protocols for prospective real-world evidence studies should also be prioritized to quality-assure OSC assessment.
11
Fritz P, Kleinhans A, Raoufi R, Sediqi A, Schmid N, Schricker S, Schanz M, Fritz-Kuisle C, Dalquen P, Firooz H, Stauch G, Alscher MD. Evaluation of medical decision support systems (DDX generators) using real medical cases of varying complexity and origin. BMC Med Inform Decis Mak 2022; 22:254. PMID: 36153527. PMCID: PMC9509605. DOI: 10.1186/s12911-022-01988-2.
Abstract
Background
Clinical decision support systems (CDSSs) are increasingly used in medicine, but their utility in daily medical practice is difficult to evaluate. One variant of CDSS is a generator of differential diagnoses (DDx generator). We performed a feasibility study on three different, publicly available data sets of medical cases to identify how often two different DDx generators provide helpful information for a given case report (either a useful list of differential diagnoses or recognition of the expert diagnosis, where available).
Methods
The data sets used were n = 105 real-life cases from a web-based telemedicine forum in Afghanistan (Afghan data set; AD) and n = 124 cases discussed in a web-based medical forum (Coliquio data set; CD); both websites are restricted to medical professionals. The third data set consisted of 50 special case reports published in the New England Journal of Medicine (NEJM). After keyword extraction, data were entered into two different DDx generators (IsabelHealth (IH) and Memem7 (M7)) to examine differences in target-diagnosis recognition and physician-rated usefulness between the DDx generators.
Results
Both DDx generators detected the target diagnosis equally successfully (all cases: M7, 83/170 (49%); IH, 90/170 (53%); NEJM: M7, 28/50 (56%); IH, 34/50 (68%); differences n.s.). Differences occurred in AD, where detection of the expert diagnosis was less successful with IH than with M7 (29.7% vs. 54.1%, p = 0.003). In contrast, in CD, IH performed significantly better than M7 (73.9% vs. 32.6%, p = 0.021). The two systems identified the target diagnosis congruently in only 46/170 (27.1%) of cases. However, a qualitative analysis of the DDx results revealed that using the two systems in parallel yielded useful complementary information.
Conclusion
Both DDx systems, IsabelHealth and Memem7, provided substantial help in producing a useful list of differential diagnoses or identifying the target diagnosis, whether in standard cases or in complicated and rare cases. Our pilot study highlights the need for real-world medical test cases of differing complexity levels and types, as there are significant differences between DDx generators once one moves away from traditional case reports. Combining the results of different DDx generators appears to be a viable approach for future review and use of such systems.
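The paper's suggestion of combining DDx generator outputs can be sketched with a small, self-contained example (the diagnosis lists and case data below are hypothetical, not the study's data): take the union of the two differential lists and check whether the target diagnosis is covered.

```python
def target_detected(ddx_list, target):
    """Case-insensitive check whether the target diagnosis appears in a DDx list."""
    return target.lower() in (d.lower() for d in ddx_list)

def detection_rates(cases):
    """Per-system and combined (union) target-detection rates over a case set.

    `cases` is a list of (target, ddx_a, ddx_b) tuples.
    """
    hits_a = hits_b = hits_union = 0
    for target, ddx_a, ddx_b in cases:
        a = target_detected(ddx_a, target)
        b = target_detected(ddx_b, target)
        hits_a += a
        hits_b += b
        hits_union += a or b  # parallel use: a hit from either system counts
    n = len(cases)
    return hits_a / n, hits_b / n, hits_union / n

# Hypothetical cases: (expert diagnosis, system A output, system B output)
cases = [
    ("Pulmonary embolism", ["Pneumonia", "Pulmonary embolism"], ["Pleuritis"]),
    ("Appendicitis", ["Gastroenteritis"], ["Appendicitis", "Ovarian torsion"]),
    ("Tonsillitis", ["Tonsillitis"], ["Tonsillitis", "Pharyngitis"]),
    ("Sarcoidosis", ["Tuberculosis"], ["Lymphoma"]),
]
rate_a, rate_b, rate_union = detection_rates(cases)
print(rate_a, rate_b, rate_union)  # 0.5 0.5 0.75
```

Because the two systems agreed on the target diagnosis in only 27.1% of the study's cases, the union rate can exceed either system's individual rate, which is the intuition behind running the generators in parallel.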
12
Schmieding ML, Kopka M, Schmidt K, Schulz-Niethammer S, Balzer F, Feufel MA. Triage Accuracy of Symptom Checker Apps: 5-Year Follow-up Evaluation. J Med Internet Res 2022; 24:e31810. PMID: 35536633. PMCID: PMC9131144. DOI: 10.2196/31810.
Abstract
BACKGROUND Symptom checkers are digital tools assisting laypersons in self-assessing the urgency and potential causes of their medical complaints. They are widely used but face concerns from both patients and health care professionals, especially regarding their accuracy. A 2015 landmark study substantiated these concerns using case vignettes to demonstrate that symptom checkers commonly err in their triage assessment. OBJECTIVE This study aims to revisit the landmark index study to investigate whether and how symptom checkers' capabilities have evolved since 2015 and how they currently compare with laypersons' stand-alone triage appraisal. METHODS In early 2020, we searched for smartphone and web-based applications providing triage advice. We evaluated these apps on the same 45 case vignettes as the index study. Using descriptive statistics, we compared our findings with those of the index study and with publicly available data on laypersons' triage capability. RESULTS We retrieved 22 symptom checkers providing triage advice. The median triage accuracy in 2020 (55.8%, IQR 15.1%) was close to that in 2015 (59.1%, IQR 15.5%). The apps in 2020 were less risk averse (odds 1.11:1, the ratio of overtriage errors to undertriage errors) than those in 2015 (odds 2.82:1), missing >40% of emergencies. Few apps outperformed laypersons in either deciding whether emergency care was required or whether self-care was sufficient. No apps outperformed the laypersons on both decisions. CONCLUSIONS Triage performance of symptom checkers has, on average, not improved over the course of 5 years. It decreased in 2 use cases (advice on when emergency care is required and when no health care is needed for the moment). However, triage capability varies widely within the sample of symptom checkers. Whether it is beneficial to seek advice from symptom checkers depends on the app chosen and on the specific question to be answered. Future research should develop resources (eg, case vignette repositories) to audit the capabilities of symptom checkers continuously and independently and provide guidance on when and to whom they should be recommended.
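The study's two headline metrics, triage accuracy and the overtriage:undertriage odds, can be computed from vignette results with a short sketch (the three-tier scheme and the result data below are illustrative assumptions, not the study's actual tiers or data):

```python
# Ordinal triage levels, least to most urgent (a common three-tier scheme;
# actual symptom-checker schemes vary).
LEVELS = {"self-care": 0, "non-emergency care": 1, "emergency care": 2}

def triage_metrics(pairs):
    """Accuracy and overtriage:undertriage odds from (gold, advice) triage pairs."""
    correct = over = under = 0
    for gold, advice in pairs:
        g, a = LEVELS[gold], LEVELS[advice]
        if a == g:
            correct += 1
        elif a > g:
            over += 1   # app more risk-averse than the vignette's gold standard
        else:
            under += 1  # app missed urgency (the dangerous direction)
    n = len(pairs)
    odds = over / under if under else float("inf")
    return correct / n, odds

# Hypothetical vignette results: (gold-standard tier, app advice)
results = [
    ("emergency care", "emergency care"),
    ("emergency care", "non-emergency care"),      # undertriage
    ("non-emergency care", "emergency care"),      # overtriage
    ("self-care", "self-care"),
    ("self-care", "non-emergency care"),           # overtriage
    ("non-emergency care", "non-emergency care"),
]
accuracy, odds = triage_metrics(results)
print(f"accuracy={accuracy:.2f}, overtriage:undertriage odds={odds:.2f}:1")
```

Under this convention, the paper's reported drop from 2.82:1 to 1.11:1 means errors shifted from the cautious direction (overtriage) toward the dangerous one (undertriage).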
Affiliation(s)
- Malte L Schmieding
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Marvin Kopka
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Cognitive Psychology and Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Konrad Schmidt
  - Institute of General Practice and Family Medicine, Jena University Hospital, Jena, Germany
  - Institute of General Practice and Family Medicine, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Sven Schulz-Niethammer
  - Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Felix Balzer
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Markus A Feufel
  - Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
13
|
Johnson AE, Brewer LC, Echols MR, Mazimba S, Shah RU, Breathett K. Utilizing Artificial Intelligence to Enhance Health Equity Among Patients with Heart Failure. Heart Fail Clin 2022; 18:259-273. [PMID: 35341539 PMCID: PMC8988237 DOI: 10.1016/j.hfc.2021.11.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
Patients with heart failure (HF) are heterogeneous, with various intrapersonal and interpersonal characteristics contributing to clinical outcomes. Bias, structural racism, and social determinants of health have been implicated in unequal treatment of patients with HF. Through several methodologies, artificial intelligence (AI) can provide models in HF prediction, prognostication, and provision of care, which may help prevent unequal outcomes. This review highlights AI as a strategy to address racial inequalities in HF; discusses key AI definitions within a health equity context; describes the current uses of AI in HF, along with the strengths and harms of using AI; and offers recommendations for future directions.
Affiliation(s)
- Amber E Johnson
  - University of Pittsburgh School of Medicine, Heart and Vascular Institute, Veterans Affairs Pittsburgh Health System, 200 Lothrop Street, Pittsburgh, PA 15213, USA
- LaPrincess C Brewer
  - Division of Preventive Cardiology, Department of Cardiovascular Medicine, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA
- Melvin R Echols
  - Division of Cardiovascular Medicine, Morehouse School of Medicine, 720 Westview Drive, Atlanta, GA 30310, USA
- Sula Mazimba
  - Division of Cardiovascular Medicine, Advanced Heart Failure and Transplant Center, University of Virginia, 2nd Floor, 1221 Lee Street, Charlottesville, VA 22903, USA
- Rashmee U Shah
  - Division of Cardiovascular Medicine, University of Utah, 30 N 1900 E, Cardiology, 4A100, Salt Lake City, UT 84132, USA
- Khadijah Breathett
  - Division of Cardiovascular Medicine, Sarver Heart Center, University of Arizona, 1501 North Campbell Avenue, PO Box 245046, Tucson, AZ 85724, USA

14
Ginghina O, Hudita A, Zamfir M, Spanu A, Mardare M, Bondoc I, Buburuzan L, Georgescu SE, Costache M, Negrei C, Nitipir C, Galateanu B. Liquid Biopsy and Artificial Intelligence as Tools to Detect Signatures of Colorectal Malignancies: A Modern Approach in Patient's Stratification. Front Oncol 2022; 12:856575. [PMID: 35356214 PMCID: PMC8959149 DOI: 10.3389/fonc.2022.856575] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2022] [Accepted: 02/16/2022] [Indexed: 01/19/2023] Open
Abstract
Colorectal cancer (CRC) is the second most frequently diagnosed type of cancer and a major worldwide public health concern. Despite the global efforts in the development of modern therapeutic strategies, CRC prognosis is strongly correlated with the stage of the disease at diagnosis. Early detection of CRC has a huge impact in decreasing mortality, while pre-lesion detection significantly reduces the incidence of the pathology. Even though the management of CRC patients is based on robust diagnostic methods such as serum tumor markers analysis, colonoscopy, histopathological analysis of tumor tissue, and imaging methods (computed tomography or magnetic resonance), these strategies still have many limitations and do not fully satisfy clinical needs due to their lack of sensitivity and/or specificity. Therefore, improvements of the current practice would substantially impact the management of CRC patients. In this view, liquid biopsy is a promising approach that could help clinicians screen for disease, stratify patients to the best treatment, and monitor treatment response and resistance mechanisms in the tumor in a regular and minimally invasive manner. Liquid biopsies allow the detection and analysis of different tumor-derived circulating markers such as cell-free nucleic acids (cfNA), circulating tumor cells (CTCs), and extracellular vesicles (EVs) in the bloodstream. The major advantage of this approach is its ability to trace and monitor the molecular profile of the patient's tumor and to predict personalized treatment in real-time. On the other hand, the prospective use of artificial intelligence (AI) in medicine holds great promise in oncology, for the diagnosis, treatment, and prognosis prediction of disease. AI has two main branches in the medical field: (i) a virtual branch that includes medical imaging, clinical assisted diagnosis, and treatment, as well as drug research, and (ii) a physical branch that includes surgical robots. This review summarizes findings relevant to liquid biopsy and AI in CRC for better management and stratification of CRC patients.
Affiliation(s)
- Octav Ginghina
  - Department II, University of Medicine and Pharmacy “Carol Davila” Bucharest, Bucharest, Romania
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Ariana Hudita
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest, Romania
- Marius Zamfir
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Andrada Spanu
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Mara Mardare
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Irina Bondoc
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Sergiu Emil Georgescu
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest, Romania
- Marieta Costache
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest, Romania
- Carolina Negrei
  - Department of Toxicology, University of Medicine and Pharmacy “Carol Davila” Bucharest, Bucharest, Romania
- Cornelia Nitipir
  - Department II, University of Medicine and Pharmacy “Carol Davila” Bucharest, Bucharest, Romania
  - Department of Oncology, Elias University Emergency Hospital, Bucharest, Romania
- Bianca Galateanu
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest, Romania

15
Brush JE, Hajduk AM, Greene EJ, Dreyer RP, Krumholz HM, Chaudhry SI. Sex Differences in Symptom Phenotypes Among Older Patients with Acute Myocardial Infarction. Am J Med 2022; 135:342-349. [PMID: 34715061 PMCID: PMC8901454 DOI: 10.1016/j.amjmed.2021.09.022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/19/2021] [Revised: 09/21/2021] [Accepted: 09/28/2021] [Indexed: 01/05/2023]
Abstract
BACKGROUND Clinicians make a medical diagnosis by recognizing diagnostic possibilities, often using memories of prior examples. These memories, called "exemplars," reflect specific symptom combinations in individual patients, yet most clinical studies report how symptoms aggregate in populations. We studied how symptoms of acute myocardial infarction combine in individuals as symptom phenotypes and how symptom phenotypes are distributed in women and men. METHODS In this analysis of the SILVER-AMI Study, we studied 3041 patients (1346 women and 1645 men) 75 years of age or older with acute myocardial infarction. Each patient had a standardized in-person interview during the acute myocardial infarction admission to document the presenting symptoms, which enabled a thorough examination of symptom combinations in individuals. Specific symptom combinations defined symptom phenotypes, and distributions of symptom phenotypes were compared in women and men using Monte Carlo permutation testing and repeated subsampling. RESULTS There were 1469 unique symptom phenotypes in the entire SILVER-AMI cohort of patients with acute myocardial infarction. There were 831 unique symptom phenotypes in women, as compared with 819 in men, which was highly significant, given the larger number of men than women in the study (P < .0001). Women had significantly more symptom phenotypes than men in almost all acute myocardial infarction subgroups. CONCLUSIONS Older patients with acute myocardial infarction have enormous variation in symptom phenotypes. Women reported more symptoms and had significantly more symptom phenotypes than men. Appreciation of the diversity of symptom phenotypes may help clinicians recognize the less common phenotypes that occur more often in women.
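The core idea above, treating each patient's symptom combination as one "phenotype" and comparing unique-phenotype counts across groups of unequal size by repeated subsampling, can be sketched briefly. This is an illustration only: the function names (`unique_phenotypes`, `mean_subsampled_count`) and the symptom data are invented, not the SILVER-AMI analysis code.

```python
import random

def unique_phenotypes(patients):
    """Count distinct symptom combinations ("phenotypes") in a group.

    patients: list of symptom collections, one per patient.
    """
    return len({frozenset(p) for p in patients})

def mean_subsampled_count(patients, n, reps=200, seed=0):
    """Mean unique-phenotype count over random subsamples of size n,
    a simple way to compare groups of unequal size."""
    rng = random.Random(seed)
    return sum(unique_phenotypes(rng.sample(patients, n)) for _ in range(reps)) / reps

# Hypothetical symptom data, not from the study:
women = [{"chest pain", "dyspnea"}, {"dyspnea"}, {"chest pain", "dyspnea"}, {"nausea", "fatigue"}]
men = [{"chest pain"}, {"chest pain"}, {"chest pain", "dyspnea"}]
print(unique_phenotypes(women), unique_phenotypes(men))  # 3 2
```

Subsampling both groups to the same size before counting is what lets a raw comparison like 831 vs 819 phenotypes be interpreted despite the cohort containing more men than women.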
Affiliation(s)
- John E Brush
  - Sentara Healthcare and Eastern Virginia Medical School, Norfolk
- Alexandra M Hajduk
  - Section of Geriatrics, Department of Internal Medicine, Yale School of Medicine, New Haven, Conn
- Erich J Greene
  - Department of Health Policy and Management and Department of Biostatistics, Yale School of Medicine, New Haven, Conn
- Rachel P Dreyer
  - Section of Cardiovascular Medicine, Department of Internal Medicine and Department of Emergency Medicine, Yale School of Medicine, New Haven, Conn; Yale School of Public Health; Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, New Haven, Conn
- Harlan M Krumholz
  - Department of Health Policy and Management and Department of Biostatistics, Yale School of Medicine, New Haven, Conn; Section of Cardiovascular Medicine, Department of Internal Medicine and Department of Emergency Medicine, Yale School of Medicine, New Haven, Conn; Yale School of Public Health; Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, New Haven, Conn
- Sarwat I Chaudhry
  - Section of General Internal Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, Conn

16
Abstract
Research in cognitive psychology shows that expert clinicians make a medical diagnosis through a two-step process of hypothesis generation and hypothesis testing. Experts generate a list of possible diagnoses quickly and intuitively, drawing on previous experience. Experts remember specific examples of various disease categories as exemplars, which enables rapid access to diagnostic possibilities and gives them an intuitive sense of the base rates of various diagnoses. After generating diagnostic hypotheses, clinicians then test the hypotheses and subjectively estimate the probability of each diagnostic possibility by using a heuristic called anchoring and adjusting. Although both novices and experts use this two-step diagnostic process, experts distinguish themselves as better diagnosticians through their ability to mobilize experiential knowledge in a manner that is content specific. Experience is clearly the best teacher, but some educational strategies have been shown to modestly improve diagnostic accuracy. Increased knowledge about the cognitive psychology of the diagnostic process and the pitfalls inherent in the process may inform clinical teachers and help learners and clinicians to improve the accuracy of diagnostic reasoning. This article reviews the literature on the cognitive psychology of diagnostic reasoning in the context of cardiovascular disease.
Affiliation(s)
- John E Brush
  - Sentara Health Research Center, Norfolk, VA, USA
  - Eastern Virginia Medical School, Norfolk, VA, USA
- Jonathan Sherbino
  - McMaster Education Research, Innovation and Theory (MERIT) Program, McMaster University, Hamilton, ON, Canada
  - Department of Medicine, McMaster University, Hamilton, ON, Canada
- Geoffrey R Norman
  - McMaster Education Research, Innovation and Theory (MERIT) Program, McMaster University, Hamilton, ON, Canada

17
Yang G, Ye Q, Xia J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond. Inf Fusion 2022; 77:29-52. [PMID: 34980946 PMCID: PMC8459787 DOI: 10.1016/j.inffus.2021.07.016] [Citation(s) in RCA: 140] [Impact Index Per Article: 70.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Revised: 05/25/2021] [Accepted: 07/25/2021] [Indexed: 05/04/2023]
Abstract
Explainable Artificial Intelligence (XAI) is an emerging research topic of machine learning aimed at unboxing how AI systems' black-box choices are made. This research field inspects the measures and models involved in decision-making and seeks solutions to explain them explicitly. Many machine learning algorithms cannot show how and why a decision has been made; this is particularly true of the most popular deep neural network approaches currently in use. Consequently, our confidence in AI systems can be hindered by the lack of explainability in these black-box models. XAI is becoming more and more crucial for deep learning powered applications, especially for medical and healthcare studies, even though in general these deep neural networks can return an arresting dividend in performance. The insufficient explainability and transparency of most existing AI systems may be one of the major reasons that successful implementation and integration of AI tools into routine clinical practice remain uncommon. In this study, we first surveyed the current progress of XAI and in particular its advances in healthcare applications. We then introduced our solutions for XAI leveraging multi-modal and multi-centre data fusion and subsequently validated them in two showcases following real clinical scenarios. Comprehensive quantitative and qualitative analyses demonstrate the efficacy of our proposed XAI solutions, from which we can envisage successful applications in a broader range of clinical questions.
Affiliation(s)
- Guang Yang
  - National Heart and Lung Institute, Imperial College London, London, UK
  - Royal Brompton Hospital, London, UK
  - Imperial Institute of Advanced Technology, Hangzhou, China
- Qinghao Ye
  - Hangzhou Ocean’s Smart Boya Co., Ltd, China
  - University of California, San Diego, La Jolla, CA, USA
- Jun Xia
  - Radiology Department, Shenzhen Second People’s Hospital, Shenzhen, China

18
Ben-Shabat N, Sloma A, Weizman T, Kiderman D, Amital H. Diagnostic Performance of a New Artificial-Intelligence Driven Diagnostic Support Tool: Board-Exams Clinical Vignette Study. JMIR Med Inform 2021; 9:e32507. [PMID: 34672262 PMCID: PMC8672291 DOI: 10.2196/32507] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2021] [Revised: 10/20/2021] [Accepted: 10/20/2021] [Indexed: 01/01/2023] Open
Abstract
Background Diagnostic decision support systems (DDSS) are computer programs aimed to improve health care by supporting clinicians in the process of diagnostic decision-making. Previous studies on DDSS demonstrated their ability to enhance clinicians’ diagnostic skills, prevent diagnostic errors, and reduce hospitalization costs. Despite the potential benefits, their utilization in clinical practice is limited, emphasizing the need for new and improved products. Objective The aim of this study was to conduct a preliminary analysis of the diagnostic performance of “Kahun,” a new artificial intelligence-driven diagnostic tool. Methods Diagnostic performance was evaluated based on the program’s ability to “solve” clinical cases from the United States Medical Licensing Examination Step 2 Clinical Skills board exam simulations that were drawn from the case banks of 3 leading preparation companies. Each case included 3 expected differential diagnoses. The cases were entered into the Kahun platform by 3 blinded junior physicians. For each case, the presence and the rank of the correct diagnoses within the generated differential diagnoses list were recorded. Each diagnostic performance was measured in two ways: first, as diagnostic sensitivity, and second, as case-specific success rates that represent diagnostic comprehensiveness. Results The study included 91 clinical cases with 78 different chief complaints and a mean number of 38 (SD 8) findings for each case. The total number of expected diagnoses was 272, of which 174 were different (some appeared more than once). Of the 272 expected diagnoses, 231 (87.5%; 95% CI 76-99) were suggested within the top 20 listed diagnoses, 209 (76.8%; 95% CI 66-87) within the top 10, and 168 (61.8%; 95% CI 52-71) within the top 5. The median rank of correct diagnoses was 3 (IQR 2-6). Of the 91 cases, all 3 expected diagnoses were suggested within the top 20 listed diagnoses in 62 (68%; 95% CI 59-78), within the top 10 in 44 (48%; 95% CI 38-59), and within the top 5 in 24 (26%; 95% CI 17-35). In 87 of the 91 cases (96%; 95% CI 91-100), at least 2 of the 3 expected diagnoses were suggested within the top 20 listed diagnoses; in 78 (86%; 95% CI 79-93), within the top 10; and in 61 (67%; 95% CI 57-77), within the top 5. Conclusions The diagnostic support tool evaluated in this study demonstrated good diagnostic accuracy and comprehensiveness; it also had the ability to manage a wide range of clinical findings.
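The two performance measures named in this abstract, diagnostic sensitivity (share of all expected diagnoses appearing in the top-k of the generated list) and case-specific success rates (share of cases with at least m of their expected diagnoses in the top-k), can be sketched as below. The function names and the case data are invented for illustration and are not the study's evaluation code.

```python
# Sketch of top-k diagnostic sensitivity and case-level success rates
# for a differential-diagnosis generator; data are hypothetical.
def sensitivity(cases, k):
    """Fraction of all expected diagnoses found in each case's top-k list."""
    expected = sum(len(c["expected"]) for c in cases)
    hits = sum(sum(d in c["generated"][:k] for d in c["expected"]) for c in cases)
    return hits / expected

def case_success_rate(cases, k, min_hits):
    """Fraction of cases with at least min_hits expected diagnoses in the top-k."""
    ok = sum(
        sum(d in c["generated"][:k] for d in c["expected"]) >= min_hits
        for c in cases
    )
    return ok / len(cases)

cases = [
    {"expected": ["A", "B", "C"], "generated": ["A", "X", "B", "Y", "C"]},
    {"expected": ["D", "E", "F"], "generated": ["D", "E", "X", "Y", "Z"]},
]
print(sensitivity(cases, 5))           # ~0.833 (5 of 6 expected diagnoses found)
print(case_success_rate(cases, 5, 2))  # 1.0 (both cases have >= 2 hits)
```

The distinction matters: sensitivity pools all 272 expected diagnoses, while the case-level rates ask how comprehensively each of the 91 individual cases was covered.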
Affiliation(s)
- Niv Ben-Shabat
  - Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
  - Department of Medicine 'B', Sheba Medical Center, Sheba Road 2, Ramat Gan, Israel
  - Kahun Medical Ltd, Tel-Aviv, Israel
- Arial Sloma
  - Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
  - Kahun Medical Ltd, Tel-Aviv, Israel
- Tomer Weizman
  - The Ruth and Bruce Rappaport Faculty of Medicine, Technion Israel Institute of Technology, Haifa, Israel
  - Kahun Medical Ltd, Tel-Aviv, Israel
- David Kiderman
  - Hadassah Faculty of Medicine, The Hebrew University, Jerusalem, Israel
- Howard Amital
  - Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
  - Department of Medicine 'B', Sheba Medical Center, Sheba Road 2, Ramat Gan, Israel
  - Siaal Research Center for Family Medicine and Primary Care, Faculty of Health Sciences, Ben Gurion University of the Negev, Beer-Sheva, Israel

19
Sibbald M, Monteiro S, Sherbino J, LoGiudice A, Friedman C, Norman G. Should electronic differential diagnosis support be used early or late in the diagnostic process? A multicentre experimental study of Isabel. BMJ Qual Saf 2021; 31:426-433. [PMID: 34611040 PMCID: PMC9132870 DOI: 10.1136/bmjqs-2021-013493] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2021] [Accepted: 09/09/2021] [Indexed: 12/17/2022]
Abstract
Background Diagnostic errors unfortunately remain common. Electronic differential diagnostic support (EDS) systems may help, but it is unclear when and how they ought to be integrated into the diagnostic process. Objective To explore how much EDS improves diagnostic accuracy, and whether EDS should be used early or late in the diagnostic process. Setting 6 Canadian medical schools. A volunteer sample of 67 medical students, 62 residents in internal medicine or emergency medicine, and 61 practising internists or emergency medicine physicians were recruited in May through June 2020. Intervention Participants were randomised to make use of EDS either early (after the chief complaint) or late (after the complete history and physical is available) in the diagnostic process while solving each of 16 written cases. For each case, we measured the number of diagnoses proposed in the differential diagnosis and how often the correct diagnosis was present within the differential. Results EDS increased the number of diagnostic hypotheses by 2.32 (95% CI 2.10 to 2.49) when used early in the process and 0.89 (95% CI 0.69 to 1.10) when used late in the process (both p<0.001). Both early and late use of EDS increased the likelihood of the correct diagnosis being present in the differential (7% and 8%, respectively, both p<0.001). Whereas early use increased the number of diagnostic hypotheses (most notably for students and residents), late use increased the likelihood of the correct diagnosis being present in the differential regardless of one’s experience level. Conclusions and relevance EDS increased the number of diagnostic hypotheses and the likelihood of the correct diagnosis appearing in the differential, and these effects persisted irrespective of whether EDS was used early or late in the diagnostic process.
Affiliation(s)
- Matt Sibbald
  - Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Sandra Monteiro
  - Department of Health Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
- Jonathan Sherbino
  - Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Geoffrey Norman
  - Department of Health Evidence and Impact, McMaster University, Hamilton, Ontario, Canada

20
Vinny PW, Takkar A, Lal V, Padma MV, Sylaja PN, Narasimhan L, Dwivedi SN, Nair PP, Iype T, Gupta A, Vishnu VY. Mobile application as a complementary tool for differential diagnosis in Neuro-ophthalmology: A multicenter cross-sectional study. Indian J Ophthalmol 2021; 69:1491-1497. [PMID: 34011726 PMCID: PMC8302325 DOI: 10.4103/ijo.ijo_1929_20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023] Open
Abstract
Purpose: Drawing differential diagnoses from a Neuro-ophthalmology clinical scenario is a difficult task for a neurology trainee. The authors conducted a study to determine if a mobile application specialized in suggesting differential diagnoses from clinical scenarios can complement the clinical reasoning of a neurologist in training. Methods: A cross-sectional multicenter study was conducted to compare the accuracy of neurology residents versus a mobile medical app (Neurology Dx) in drawing a comprehensive list of differential diagnoses from Neuro-ophthalmology clinical vignettes. The differentials generated by residents and the App were compared with the gold standard differential diagnoses adjudicated by experts. The prespecified primary outcome was the proportion of correctly identified high likely gold standard differential diagnoses by residents and App. Results: Neurology residents (n = 100) attempted 1500 Neuro-ophthalmology clinical vignettes. The frequency of correctly identified high likely differential diagnoses was 19.42% for residents versus 53.71% for the App (P < 0.0001). The first listed differential diagnosis by the residents matched the first differential diagnosis adjudicated by experts (gold standard differential diagnosis) with a frequency of 26.5%, versus 28.3% for the App, whereas the combined output of residents and App scored a frequency of 41.2% in identifying the first gold standard differential correctly. The residents correctly identified the first three and first five gold standard differential diagnoses with a frequency of 17.83% and 19.2%, respectively, as against 22.26% and 30.39% (P < 0.0001) for the App. Conclusion: A rule-based app in Neuro-ophthalmology has the potential to complement a neurology resident in drawing a comprehensive list of differential diagnoses.
Affiliation(s)
- Aastha Takkar
  - Neurology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Vivek Lal
  - Neurology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- P N Sylaja
  - Neurology, Sree Chitra Tirunal Institute of Medical Sciences and Technology, Thiruvananthapuram, Kerala, India
- Sada Nand Dwivedi
  - Biostatistics, All India Institute of Medical Sciences, New Delhi, India
- Pradeep P Nair
  - Neurology, Jawaharlal Nehru Institute of Postgraduate Medical Education and Research, Puducherry, India
- Thomas Iype
  - Neurology, Government Medical College Trivandrum, Kerala, India
- Anu Gupta
  - Neurology, Govind Ballabh Pant Institute of Postgraduate Medical Education and Research, New Delhi, India

21
Schmieding ML, Mörgeli R, Schmieding MAL, Feufel MA, Balzer F. Benchmarking Triage Capability of Symptom Checkers Against That of Medical Laypersons: Survey Study. J Med Internet Res 2021; 23:e24475. [PMID: 33688845 PMCID: PMC7991983 DOI: 10.2196/24475] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2020] [Revised: 10/22/2020] [Accepted: 01/18/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Symptom checkers (SCs) are tools developed to provide clinical decision support to laypersons. Apart from suggesting probable diagnoses, they commonly advise when users should seek care (triage advice). SCs have become increasingly popular despite prior studies rating their performance as mediocre. To date, it is unclear whether SCs can triage better than those who might choose to use them. OBJECTIVE This study aims to compare triage accuracy between SCs and their potential users (ie, laypersons). METHODS On Amazon Mechanical Turk, we recruited 91 adults from the United States who had no professional medical background. In a web-based survey, the participants evaluated 45 fictitious clinical case vignettes. Data for 15 SCs that had processed the same vignettes were obtained from a previous study. As main outcome measures, we assessed the accuracy of the triage assessments made by participants and SCs for each of the three triage levels (ie, emergency care, nonemergency care, self-care) and overall, the proportion of participants outperforming each SC in terms of accuracy, and the risk aversion of participants and SCs by comparing the proportion of cases that were overtriaged. RESULTS The mean overall triage accuracy was similar for participants (60.9%, SD 6.8%; 95% CI 59.5%-62.3%) and SCs (58%, SD 12.8%). Most participants outperformed all but 5 SCs. On average, SCs more reliably detected emergencies (80.6%, SD 17.9%) than laypersons did (67.5%, SD 16.4%; 95% CI 64.1%-70.8%). Although both SCs and participants struggled with cases requiring self-care (the least urgent triage category), SCs more often wrongly classified these cases as emergencies (43/174, 24.7%) compared with laypersons (56/1365, 4.10%). CONCLUSIONS Most SCs had no greater triage capability than an average layperson, although the triage accuracy of the five best SCs was superior to the accuracy of most participants. SCs might improve early detection of emergencies but might also needlessly increase resource utilization in health care. Laypersons sometimes require support in deciding when to rely on self-care, but it is in that very situation where SCs perform the worst. Further research is needed to determine how to best combine the strengths of humans and SCs.
Affiliation(s)
- Malte L Schmieding
  - Department of Anesthesiology and Operative Intensive Care, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Rudolf Mörgeli
  - Department of Anesthesiology and Operative Intensive Care, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Maike A L Schmieding
  - Department of Biology, Chemistry, and Pharmacy, Institute of Pharmacy, Freie Universität Berlin, Berlin, Germany
- Markus A Feufel
  - Department of Psychology and Ergonomics (IPA), Division of Ergonomics, Technische Universität Berlin, Berlin, Germany
- Felix Balzer
  - Department of Anesthesiology and Operative Intensive Care, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany

22
Jones OT, Calanzani N, Saji S, Duffy SW, Emery J, Hamilton W, Singh H, de Wit NJ, Walter FM. Artificial Intelligence Techniques That May Be Applied to Primary Care Data to Facilitate Earlier Diagnosis of Cancer: Systematic Review. J Med Internet Res 2021; 23:e23483. [PMID: 33656443 PMCID: PMC7970165 DOI: 10.2196/23483] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2020] [Revised: 11/05/2020] [Accepted: 11/30/2020] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND More than 17 million people worldwide, including 360,000 people in the United Kingdom, were diagnosed with cancer in 2018. Cancer prognosis and disease burden are highly dependent on the disease stage at diagnosis. Most people diagnosed with cancer first present in primary care settings, where improved assessment of the (often vague) presenting symptoms of cancer could lead to earlier detection and improved outcomes for patients. There is accumulating evidence that artificial intelligence (AI) can assist clinicians in making better clinical decisions in some areas of health care. OBJECTIVE This study aimed to systematically review AI techniques that may facilitate earlier diagnosis of cancer and could be applied to primary care electronic health record (EHR) data. The quality of the evidence, the phase of development the AI techniques have reached, the gaps that exist in the evidence, and the potential for use in primary care were evaluated. METHODS We searched MEDLINE, Embase, SCOPUS, and Web of Science databases from January 01, 2000, to June 11, 2019, and included all studies providing evidence for the accuracy or effectiveness of applying AI techniques for the early detection of cancer, which may be applicable to primary care EHRs. We included all study designs in all settings and languages. These searches were extended through a scoping review of AI-based commercial technologies. The main outcomes assessed were measures of diagnostic accuracy for cancer. RESULTS We identified 10,456 studies; 16 studies met the inclusion criteria, representing the data of 3,862,910 patients. A total of 13 studies described the initial development and testing of AI algorithms, and 3 studies described the validation of an AI algorithm in independent data sets. One study was based on prospectively collected data; only 3 studies were based on primary care data. We found no data on implementation barriers or cost-effectiveness. Risk of bias assessment highlighted a wide range of study quality. The additional scoping review of commercial AI technologies identified 21 technologies, only 1 meeting our inclusion criteria. Meta-analysis was not undertaken because of the heterogeneity of AI modalities, data set characteristics, and outcome measures. CONCLUSIONS AI techniques have been applied to EHR-type data to facilitate early diagnosis of cancer, but their use in primary care settings is still at an early stage of maturity. Further evidence is needed on their performance using primary care data, implementation barriers, and cost-effectiveness before widespread adoption into routine primary care clinical practice can be recommended.
Affiliation(s)
- Owain T Jones
- Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, Cambridge, United Kingdom
- Natalia Calanzani
- Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, Cambridge, United Kingdom
- Smiji Saji
- Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, Cambridge, United Kingdom
- Stephen W Duffy
- Wolfson Institute for Preventive Medicine, Queen Mary University of London, London, United Kingdom
- Jon Emery
- Centre for Cancer Research and Department of General Practice, University of Melbourne, Victoria, Australia
- Willie Hamilton
- College of Medicine and Health, University of Exeter, Exeter, United Kingdom
- Hardeep Singh
- Center for Innovations in Quality, Effectiveness and Safety, Michael E DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, TX, United States
- Niek J de Wit
- Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht, Netherlands
- Fiona M Walter
- Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, Cambridge, United Kingdom
23
Gilbert S, Mehl A, Baluch A, Cawley C, Challiner J, Fraser H, Millen E, Montazeri M, Multmeier J, Pick F, Richter C, Türk E, Upadhyay S, Virani V, Vona N, Wicks P, Novorol C. How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs. BMJ Open 2020; 10:e040269. [PMID: 33328258 PMCID: PMC7745523 DOI: 10.1136/bmjopen-2020-040269] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Open
Abstract
OBJECTIVES To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. DESIGN Vignettes study. SETTING 200 primary care vignettes. INTERVENTION/COMPARATOR For eight apps and seven general practitioners (GPs): breadth of coverage and condition-suggestion and urgency advice accuracy measured against the vignettes' gold standard. PRIMARY OUTCOME MEASURES (1) Proportion of conditions 'covered' by an app, that is, not excluded because the user was too young/old or pregnant, or not modelled; (2) proportion of vignettes with the correct primary diagnosis among the top 3 conditions suggested; (3) proportion of 'safe' urgency advice (ie, at gold standard level, more conservative, or no more than one level less conservative). RESULTS Condition-suggestion coverage was highly variable, with some apps not offering a suggestion for many users: in alphabetical order, Ada: 99.0%; Babylon: 51.5%; Buoy: 88.5%; K Health: 74.5%; Mediktor: 80.5%; Symptomate: 61.5%; Your.MD: 64.5%; WebMD: 93.0%. Top-3 suggestion accuracy was GPs (average): 82.1%±5.2%; Ada: 70.5%; Babylon: 32.0%; Buoy: 43.0%; K Health: 36.0%; Mediktor: 36.0%; Symptomate: 27.5%; WebMD: 35.5%; Your.MD: 23.5%. Some apps excluded certain user demographics or conditions, and their performance was generally greater with the exclusion of the corresponding vignettes. For safe urgency advice, the tested GPs had an average of 97.0%±2.5%. For the vignettes with advice provided, only three apps had safety performance within 1 SD of the GPs: Ada: 97.0%; Babylon: 95.1%; Symptomate: 97.8%. One app had a safety performance within 2 SDs of the GPs: Your.MD: 92.6%. Three apps had a safety performance outside 2 SDs of the GPs: Buoy: 80.0% (p<0.001); K Health: 81.3% (p<0.001); Mediktor: 87.3% (p=0.0013). CONCLUSIONS The utility of digital symptom assessment apps relies on coverage, accuracy and safety. 
While no digital tool outperformed GPs, some came close, and the nature of iterative improvements to software offers scalable improvements to care.
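The study's three headline measures (coverage, top-3 suggestion accuracy, and 'safe' urgency advice) are proportions over the vignette set. A minimal sketch of how such measures might be computed; the vignette records and field names below are hypothetical, not taken from the study:

```python
# Illustrative computation of the three vignette-study metrics:
# coverage, top-3 suggestion accuracy, and proportion of "safe" advice.
# Data and field names are made up for illustration.

URGENCY_LEVELS = ["self_care", "routine_gp", "urgent_gp", "emergency"]

def evaluate(vignettes):
    # Coverage: the app produced any suggestion at all for this user.
    covered = [v for v in vignettes if v["suggestions"] is not None]
    coverage = len(covered) / len(vignettes)

    # Top-3 accuracy: gold-standard diagnosis among the first 3 suggestions,
    # counted over all vignettes (uncovered vignettes count as misses).
    top3_hits = sum(v["gold_diagnosis"] in v["suggestions"][:3] for v in covered)
    top3_accuracy = top3_hits / len(vignettes)

    # "Safe" advice: at gold level, more conservative, or at most one level less.
    def is_safe(v):
        given = URGENCY_LEVELS.index(v["advice"])
        gold = URGENCY_LEVELS.index(v["gold_urgency"])
        return given >= gold - 1

    with_advice = [v for v in covered if v.get("advice")]
    safety = sum(is_safe(v) for v in with_advice) / len(with_advice)
    return coverage, top3_accuracy, safety

vignettes = [
    {"suggestions": ["migraine", "tension headache", "sinusitis"],
     "gold_diagnosis": "migraine", "advice": "routine_gp", "gold_urgency": "routine_gp"},
    {"suggestions": None, "gold_diagnosis": "appendicitis",
     "advice": None, "gold_urgency": "emergency"},  # user/condition not covered
]
coverage, top3, safety = evaluate(vignettes)
```

Note how the denominators differ by design: uncovered vignettes drag down coverage and top-3 accuracy, while safety is computed only over vignettes where advice was actually given, mirroring the study's outcome definitions.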
Affiliation(s)
- Hamish Fraser
- Brown Center for Biomedical Informatics, Brown University, Rhode Island, USA
24
Khemasuwan D, Sorensen JS, Colt HG. Artificial intelligence in pulmonary medicine: computer vision, predictive model and COVID-19. Eur Respir Rev 2020; 29:29/157/200181. [PMID: 33004526 PMCID: PMC7537944 DOI: 10.1183/16000617.0181-2020] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2020] [Accepted: 08/20/2020] [Indexed: 12/21/2022] Open
Abstract
Artificial intelligence (AI) is transforming healthcare delivery. The digital revolution in medicine and healthcare information is prompting a staggering growth of data intertwined with elements from many digital sources such as genomics, medical imaging and electronic health records. Such massive growth has sparked the development of an increasing number of AI-based applications that can be deployed in clinical practice. Pulmonary specialists who are familiar with the principles of AI and its applications will be empowered and prepared to seize future practice and research opportunities. The goal of this review is to provide pulmonary specialists and other readers with information pertinent to the use of AI in pulmonary medicine. First, we describe the concept of AI and some of the requisites of machine learning and deep learning. Next, we review some of the literature relevant to the use of computer vision in medical imaging, predictive modelling with machine learning, and the use of AI for battling the novel severe acute respiratory syndrome-coronavirus-2 pandemic. We close our review with a discussion of limitations and challenges pertaining to the further incorporation of AI into clinical pulmonary practice. Artificial intelligence (AI) is changing the landscape in medicine. AI-based applications will empower pulmonary specialists to seize modern practice and research opportunities. Data-driven precision medicine is already here. https://bit.ly/324tl2m
Affiliation(s)
- Danai Khemasuwan
- Division of Pulmonary and Critical Care Medicine, Virginia Commonwealth University, Richmond, VA, USA
- Henri G Colt
- Division of Pulmonary and Critical Care Medicine, University of California Irvine, Irvine, CA, USA
25
Mathur P, Srivastava S, Xu X, Mehta JL. Artificial Intelligence, Machine Learning, and Cardiovascular Disease. CLINICAL MEDICINE INSIGHTS-CARDIOLOGY 2020; 14:1179546820927404. [PMID: 32952403 PMCID: PMC7485162 DOI: 10.1177/1179546820927404] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/21/2019] [Accepted: 04/23/2020] [Indexed: 12/11/2022]
Abstract
Artificial intelligence (AI)-based applications have found widespread use in many fields of science, technology, and medicine. The use of the enhanced computing power of machines in clinical medicine and diagnostics has been under exploration since the 1960s. More recently, advances in computing and in algorithms enabling machine learning, especially deep learning networks that mimic the function of the human brain, have renewed interest in applying them in clinical medicine. In cardiovascular medicine, AI-based systems have found new applications in cardiovascular imaging, cardiovascular risk prediction, and the identification of newer drug targets. This article aims to describe different AI applications, including machine learning and deep learning, and their uses in cardiovascular medicine. AI-based applications have enhanced our understanding of different phenotypes of heart failure and congenital heart disease. These applications have led to newer treatment strategies for different types of cardiovascular disease, newer approaches to cardiovascular drug therapy, and postmarketing surveillance of prescription drugs. However, there are several challenges in the clinical use of AI-based applications and the interpretation of their results, including data privacy, poorly selected or outdated data, selection bias, and the unintentional continuance of historical biases and stereotypes in the data, which can lead to erroneous conclusions. Still, AI is a transformative technology with immense potential in health care.
Affiliation(s)
- Pankaj Mathur
- Department of Internal Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Shweta Srivastava
- Department of Radiology, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Xiaowei Xu
- Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR, USA
- Jawahar L Mehta
- Division of Cardiology, Department of Internal Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, USA
26
Lin Y, Li Y, Lu K, Ma C, Zhao P, Gao D, Fan Z, Cheng Z, Wang Z, Yu S. Long-distance disorder-disorder relation extraction with bootstrapped noisy data. J Biomed Inform 2020; 109:103529. [PMID: 32771539 DOI: 10.1016/j.jbi.2020.103529] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2020] [Revised: 06/04/2020] [Accepted: 08/04/2020] [Indexed: 11/18/2022]
Abstract
OBJECTIVE Artificial intelligence in healthcare increasingly relies on relations in knowledge graphs for algorithm development. However, many important relations are not well covered in existing knowledge graphs. We aim to develop a novel long-distance relation extraction algorithm that leverages the article section structure and is trained with bootstrapped noisy data to identify important relations for diagnosis, including may cause, may be caused by, and differential diagnosis. METHODS Known relations were extracted from semistructured web pages and a relational database and were paired with sentences containing corresponding medical concepts to form training data. The sentence form was extended to allow one concept to be in the title. An attention mechanism was applied to reduce the effect of noisily labeled sentences. Section structure embedding was added to provide additional context for relation expressions. Graph information was further incorporated into the model to differentiate the target relations whose expressions were often similar and interwoven. RESULTS The extended sentence form allowed 1.75 times as many relations and 2.17 times as many sentences to be found compared to the conventional form. The various components of the proposed model all added to the accuracy. Overall, the positive sample accuracy of the proposed model was 9 percentage points higher than baseline deep learning models and 13 percentage points higher than naïve Bayes and support vector machines. CONCLUSION Our bootstrap data preparation method and the extended sentence form could form a large training dataset to enable algorithm development and data mining efforts. Section structure embedding and graph information significantly increased prediction accuracy.
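The bootstrapped data preparation described above pairs known relations with sentences that mention both concepts, and the extended sentence form additionally allows one concept to come from the article title. A toy sketch of that pairing step, with illustrative data and function names (not from the paper):

```python
# Toy sketch of distant-supervision pairing: each known (head, relation, tail)
# triple labels any sentence mentioning both concepts. In the extended
# sentence form, the head concept may instead appear in the article title.

def mentions(text, concept):
    # Case-insensitive concept lookup; real systems would use concept
    # normalization rather than substring matching.
    return concept.lower() in text.lower()

def build_training_pairs(known_relations, articles):
    pairs = []
    for head, relation, tail in known_relations:
        for title, sentences in articles:
            for sent in sentences:
                if mentions(sent, head) and mentions(sent, tail):
                    pairs.append((sent, relation))                   # conventional form
                elif mentions(title, head) and mentions(sent, tail):
                    pairs.append((title + " | " + sent, relation))   # extended form
    return pairs

known = [("influenza", "may cause", "pneumonia")]
articles = [("Influenza overview",
             ["Severe influenza can progress to pneumonia.",
              "Secondary bacterial pneumonia is a feared complication."])]
pairs = build_training_pairs(known, articles)
```

The second sentence is captured only through the extended form (its head concept appears in the title, not the sentence), which is how the paper's extension finds 2.17 times as many training sentences; such noisily labeled pairs are what the attention mechanism then down-weights.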
Affiliation(s)
- Yucong Lin
- Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China
- Yang Li
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
- Keming Lu
- Department of Automation, Tsinghua University, Beijing, China
- Cheng Ma
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
- Peng Zhao
- Department of Industrial Engineering, Tsinghua University, Beijing, China
- Daiqi Gao
- Department of Industrial Engineering, Tsinghua University, Beijing, China
- Zihao Fan
- School of Information, University of California, Berkeley, CA, USA
- Zijie Cheng
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Zheyu Wang
- Department of Automation, Tsinghua University, Beijing, China
- Sheng Yu
- Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China; Institute for Data Science, Tsinghua University, Beijing, China
27
Nateqi J, Lin S, Krobath H, Gruarin S, Lutz T, Dvorak T, Gruschina A, Ortner R. [From symptom to diagnosis - symptom checkers re-evaluated: Are symptom checkers finally sufficient and accurate to use? An update from the ENT perspective]. HNO 2019; 67:334-342. [PMID: 30993374 DOI: 10.1007/s00106-019-0666-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
BACKGROUND Every seventh diagnosis is a misdiagnosis. Each year, 1.5 million lives could be saved worldwide with the correct diagnosis. Physicians have to consider over 20,000 diseases. A study from Harvard University published in 2015 tested 19 symptom checkers and found them to be insufficient, with only 29-71% accuracy in diagnosis. OBJECTIVE The current study investigates the diagnostic accuracy of new symptom checkers from an ENT perspective. MATERIALS AND METHODS The authors update the above-named diagnostic accuracy comparison by (1) including the five new symptom checkers Symptoma, Ada, FindZebra, Mediktor, and Babylon; and (2) normalizing the results of the previously tested symptom checkers so that each tool's diagnostic accuracy is based on the same set of patient vignettes. The winner is then compared to the two symptom checkers with the most scientific evidence, namely Isabel and FindZebra, on the basis of an ENT-specific test with patient vignettes sourced from the British Medical Journal. RESULTS Most of the new symptom checkers demonstrated diagnostic accuracy rates within the previously established range, with the exception of Symptoma, which scored the right diagnosis in 82.2% of cases at the top of the list (+38 percentage points), and in 100% of cases in both the top 3 (+29 percentage points) and the top 10 (+16 percentage points), thus raising the bar in this field. The cross-validation with ENT cases resulted in a diagnostic accuracy of 64.3% vs. 21.4% vs. 26.2% (top 1), 92.9% vs. 40.5% vs. 42.9% (top 3), and 100% vs. 61.9% vs. 54.8% (top 10) for Symptoma vs. Isabel vs. FindZebra, respectively. CONCLUSIONS Symptoma is the first and only viable solution in this market. Large-scale studies should be conducted to further validate these results, as well as to assess the actual practical performance of the symptom checkers and their ability to diagnose rare diseases.
Affiliation(s)
- J Nateqi
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- S Lin
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- H Krobath
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- S Gruarin
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- T Lutz
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- T Dvorak
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- A Gruschina
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- R Ortner
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
28
Wadhwa RR, Park DY, Natowicz MR. The accuracy of computer-based diagnostic tools for the identification of concurrent genetic disorders. Am J Med Genet A 2018; 176:2704-2709. [PMID: 30475443 DOI: 10.1002/ajmg.a.40651] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2018] [Revised: 08/09/2018] [Accepted: 09/08/2018] [Indexed: 11/11/2022]
Abstract
The increasing use of next-generation sequencing, especially clinical exome sequencing, has revealed that individuals having two coexisting genetic conditions are not uncommon. This pilot study evaluates the efficacy of two methodologically distinct computational tools for generating differential diagnoses, FindZebra and SimulConsult, in identifying multiple genetic conditions in a single patient. For each of 15 monogenic disorders, clinical query terms were generated that, when entered into these bioinformatics tools, placed the condition within the top 10 differential diagnoses. The terms for over 125 pairings of these conditions were then entered into each tool, and the resulting list of diagnoses was evaluated to determine how often both diagnoses of a pair were represented in that list. Neither tool was successful in identifying both members of a pair of conditions in greater than 40% of test cases. Disorder detection sensitivity was not homogeneous within a tool, with each tool favoring the identification of a subset of genetic conditions. In view of recent exome sequencing data showing an unexpectedly high prevalence of coexistent monogenic conditions, the results from this pilot study highlight a need for computational tools designed to generate differential diagnoses that consider the possibility of coexisting conditions.
Affiliation(s)
- Raoul R Wadhwa
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio
- Deborah Y Park
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio
- Marvin R Natowicz
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio; Pathology and Laboratory Medicine, Genomic Medicine, Neurological and Pediatrics Institutes, Cleveland Clinic, Cleveland, Ohio
29
Yu KH, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nat Biomed Eng 2018; 2:719-731. [PMID: 31015651 DOI: 10.1038/s41551-018-0305-z] [Citation(s) in RCA: 910] [Impact Index Per Article: 151.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2017] [Accepted: 09/05/2018] [Indexed: 02/07/2023]
Abstract
Artificial intelligence (AI) is gradually changing medical practice. With recent progress in digitized data acquisition, machine learning and computing infrastructure, AI applications are expanding into areas that were previously thought to be only the province of human experts. In this Review Article, we outline recent breakthroughs in AI technologies and their biomedical applications, identify the challenges for further progress in medical AI systems, and summarize the economic, legal and social implications of AI in healthcare.
Affiliation(s)
- Kun-Hsing Yu
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Andrew L Beam
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Isaac S Kohane
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Boston Children's Hospital, Boston, MA, USA
30
Abstract
Diagnostic error may be the largest unaddressed patient safety concern in the United States, responsible for an estimated 40,000-80,000 deaths annually. With the electronic health record (EHR) now in near universal use, the goal of this narrative review is to synthesize evidence and opinion regarding the impact of the EHR and health care information technology (health IT) on the diagnostic process and its outcomes. We consider the many ways in which the EHR and health IT facilitate diagnosis and improve the diagnostic process, and conversely the major ways in which it is problematic, including the unintended consequences that contribute to diagnostic error and sometimes patient deaths. We conclude with a summary of suggestions for improving the safety and safe use of these resources for diagnosis in the future.
Affiliation(s)
- Colene Byrne
- RTI International, Research Triangle Park, NC, USA
31
Sims MH, Hodges Shaw M, Gilbertson S, Storch J, Halterman MW. Legal and ethical issues surrounding the use of crowdsourcing among healthcare providers. Health Informatics J 2018; 25:1618-1630. [PMID: 30192688 DOI: 10.1177/1460458218796599] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
As the pace of medical discovery widens the knowledge-to-practice gap, technologies that enable peer-to-peer crowdsourcing have become increasingly common. Crowdsourcing has the potential to help medical providers collaborate to solve patient-specific problems in real time. We recently conducted the first trial of a mobile, medical crowdsourcing application among healthcare providers in a university hospital setting. In addition to acknowledging the benefits, our participants also raised concerns regarding the potential negative consequences of this emerging technology. In this commentary, we consider the legal and ethical implications of the major findings identified in our previous trial including compliance with the Health Insurance Portability and Accountability Act, patient protections, healthcare provider liability, data collection, data retention, distracted doctoring, and multi-directional anonymous posting. We believe the commentary and recommendations raised here will provide a frame of reference for individual providers, provider groups, and institutions to explore the salient legal and ethical issues before they implement these systems into their workflow.
Affiliation(s)
- Seth Gilbertson
- University at Buffalo, The State University of New York, USA
32
Jeganathan J, Knio Z, Amador Y, Hai T, Khamooshian A, Matyal R, Khabbaz KR, Mahmood F. Artificial intelligence in mitral valve analysis. Ann Card Anaesth 2017; 20:129-134. [PMID: 28393769 PMCID: PMC5408514 DOI: 10.4103/aca.aca_243_16] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
Abstract
Background: Echocardiographic analysis of the mitral valve (MV) has become essential for the diagnosis and management of patients with MV disease. Currently, the various software packages used for MV analysis require manual input and are prone to interobserver variability in the measurements. Aim: The aim of this study is to determine the interobserver variability of an automated software package that uses artificial intelligence for MV analysis. Settings and Design: Retrospective analysis of intraoperative three-dimensional transesophageal echocardiography data acquired from four patients with normal MVs undergoing coronary artery bypass graft surgery in a tertiary hospital. Materials and Methods: Echocardiographic data were analyzed using the eSie Valve Software (Siemens Healthcare, Mountain View, CA, USA). Three examiners analyzed three end-systolic (ES) frames from each of the four patients. A total of 36 ES frames were analyzed and included in the study. Statistical Analysis: A multiple mixed-effects ANOVA model was constructed to determine whether the examiner, the patient, and the loop had a significant effect on the average value of each parameter. A Bonferroni correction was used to correct for multiple comparisons, and P = 0.0083 was considered significant. Results: Examiners did not have an effect on any of the six parameters tested. Patient and loop had an effect on the average parameter value for each of the six parameters, as expected (P < 0.0083 for both). Conclusion: Automated analysis produced results with good reproducibility while requiring only minimal user intervention.
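The P = 0.0083 threshold quoted in the abstract is the standard Bonferroni correction of a 0.05 family-wise alpha for the six parameters tested. A minimal sketch (the example p-values are hypothetical):

```python
# Bonferroni correction: divide the family-wise alpha by the number of
# simultaneous comparisons (here, six mitral-valve parameters).
alpha = 0.05
n_comparisons = 6
threshold = alpha / n_comparisons  # 0.05 / 6, matching the abstract's 0.0083

# Hypothetical per-parameter p-values; only those below the corrected
# threshold would be declared significant.
example_p_values = (0.002, 0.03, 0.0075)
significant = [p for p in example_p_values if p < threshold]
```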
Affiliation(s)
- Jelliffe Jeganathan
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Ziyad Knio
- Department of Surgery, Division of Cardiothoracic Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Yannis Amador
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; Department of Anesthesia, Hospital México, University of Costa Rica, San José, Costa Rica
- Ting Hai
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; Department of Anesthesiology, Peking University People's Hospital, Beijing, China
- Arash Khamooshian
- Department of Cardio-Thoracic Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Robina Matyal
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Kamal R Khabbaz
- Department of Surgery, Division of Cardiothoracic Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Feroze Mahmood
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
33
Cahan A, Cimino JJ. A Learning Health Care System Using Computer-Aided Diagnosis. J Med Internet Res 2017; 19:e54. [PMID: 28274905 PMCID: PMC5362695 DOI: 10.2196/jmir.6663] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2016] [Revised: 01/04/2017] [Accepted: 02/12/2017] [Indexed: 11/13/2022] Open
Abstract
Physicians intuitively apply pattern recognition when evaluating a patient. Rational diagnosis making requires that clinical patterns be put in the context of disease prior probability, yet physicians often exhibit flawed probabilistic reasoning. Difficulties in making a diagnosis are reflected in the high rates of deadly and costly diagnostic errors. Introduced 6 decades ago, computerized diagnosis support systems are still not widely used by internists. These systems cannot efficiently recognize patterns and are unable to consider the base rate of potential diagnoses. We review the limitations of current computer-aided diagnosis support systems. We then portray future diagnosis support systems and provide a conceptual framework for their development. We argue for capturing physician knowledge using a novel knowledge representation model of the clinical picture. This model (based on structured patient presentation patterns) holds not only symptoms and signs but also their temporal and semantic interrelations. We call for the collection of crowdsourced, automatically deidentified, structured patient patterns as means to support distributed knowledge accumulation and maintenance. In this approach, each structured patient pattern adds to a self-growing and -maintaining knowledge base, sharing the experience of physicians worldwide. Besides supporting diagnosis by relating the symptoms and signs with the final diagnosis recorded, the collective pattern map can also provide disease base-rate estimates and real-time surveillance for early detection of outbreaks. We explain how health care in resource-limited settings can benefit from using this approach and how it can be applied to provide feedback-rich medical education for both students and practitioners.
Affiliation(s)
- Amos Cahan
- IBM TJ Watson Research Center, Yorktown Heights, NY, United States
- James J Cimino
- Informatics Institute, University of Alabama at Birmingham, Birmingham, AL, United States
34
Segal MM, Athreya B, Son MBF, Tirosh I, Hausmann JS, Ang EYN, Zurakowski D, Feldman LK, Sundel RP. Evidence-based decision support for pediatric rheumatology reduces diagnostic errors. Pediatr Rheumatol Online J 2016; 14:67. [PMID: 27964737 PMCID: PMC5155385 DOI: 10.1186/s12969-016-0127-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/08/2016] [Accepted: 12/01/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The number of trained specialists world-wide is insufficient to serve all children with pediatric rheumatologic disorders, even in the countries with robust medical resources. We evaluated the potential of diagnostic decision support software (DDSS) to alleviate this shortage by assessing the ability of such software to improve the diagnostic accuracy of non-specialists. METHODS Using vignettes of actual clinical cases, clinician testers generated a differential diagnosis before and after using diagnostic decision support software. The evaluation used the SimulConsult® DDSS tool, based on Bayesian pattern matching with temporal onset of each finding in each disease. The tool covered 5405 diseases (averaging 22 findings per disease). Rheumatology content in the database was developed using both primary references and textbooks. The frequency, timing, age of onset and age of disappearance of findings, as well as their incidence, treatability, and heritability were taken into account in order to guide diagnostic decision making. These capabilities allowed key information such as pertinent negatives and evolution over time to be used in the computations. Efficacy was measured by comparing whether the correct condition was included in the differential diagnosis generated by clinicians before using the software ("unaided"), versus after use of the DDSS ("aided"). RESULTS The 26 clinicians demonstrated a significant reduction in diagnostic errors following introduction of the software, from 28% errors while unaided to 15% using decision support (p < 0.0001). Improvement was greatest for emergency medicine physicians (p = 0.013) and clinicians in practice for less than 10 years (p = 0.012). 
This error reduction occurred despite the fact that testers employed an "open book" approach to generate their initial lists of potential diagnoses, spending an average of 8.6 min using printed and electronic sources of medical information before using the diagnostic software. CONCLUSIONS These findings suggest that decision support can reduce diagnostic errors and improve use of relevant information by generalists. Such assistance could potentially help relieve the shortage of experts in pediatric rheumatology and similarly underserved specialties by improving generalists' ability to evaluate and diagnose patients presenting with musculoskeletal complaints. TRIAL REGISTRATION ClinicalTrials.gov ID: NCT02205086.
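The unaided-versus-aided comparison is paired (each tester diagnoses the same cases before and after using the software). The abstract does not name the exact test used, but an exact McNemar test on the discordant cases is a standard analysis for such paired error rates; the discordant counts below are made up for illustration:

```python
from math import comb

# Exact McNemar test: only discordant pairs matter (correct one way,
# wrong the other). Under the null hypothesis, each discordant pair is
# equally likely in either direction, so the count in one direction is
# Binomial(n = b + c, p = 0.5). Two-sided exact p-value:
def mcnemar_exact(b, c):
    n, k = b + c, min(b, c)
    # Double the smaller tail probability, capped at 1.
    return min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n)

# Hypothetical discordant counts: 30 cases corrected only with decision
# support, 6 cases missed only with it.
p_value = mcnemar_exact(30, 6)
```

With such an imbalance in discordant counts, the exact p-value is far below conventional thresholds; balanced counts (e.g. 1 vs. 1) give no evidence of a difference.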
Affiliation(s)
- Balu Athreya, DuPont Hospital for Children, Wilmington, DE and Thomas Jefferson University, Philadelphia, PA, USA
- Mary Beth F. Son, Boston Children’s Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA
- Irit Tirosh, Edmond and Lily Safra Children’s Hospital, Tel-Hashomer, Ramat-Gan, Israel and Tel Aviv University, Tel Aviv, Israel
- Jonathan S. Hausmann, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- David Zurakowski, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Robert P. Sundel, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA

35
Middleton B, Sittig DF, Wright A. Clinical Decision Support: a 25 Year Retrospective and a 25 Year Vision. Yearb Med Inform 2016; Suppl 1:S103-16. [PMID: 27488402 DOI: 10.15265/iys-2016-s034] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023] Open
Abstract
OBJECTIVE The objective of this review is to summarize the state of the art of clinical decision support (CDS) circa 1990, review progress in the 25 year interval from that time, and provide a vision of what CDS might look like 25 years hence, or circa 2040. METHOD Informal review of the medical literature with iterative review and discussion among the authors to arrive at six axes (data, knowledge, inference, architecture and technology, implementation and integration, and users) to frame the review and discussion of selected barriers and facilitators to the effective use of CDS. RESULTS In each of the six axes, significant progress has been made. Key advances in structuring and encoding standardized data with an increased availability of data, development of knowledge bases for CDS, and improvement of capabilities to share knowledge artifacts, explosion of methods analyzing and inferring from clinical data, evolution of information technologies and architectures to facilitate the broad application of CDS, improvement of methods to implement CDS and integrate CDS into the clinical workflow, and increasing sophistication of the end-user have all played a role in improving the effective use of CDS in healthcare delivery. CONCLUSION CDS has evolved dramatically over the past 25 years and will likely evolve just as dramatically or more so over the next 25 years. Increasingly, the clinical encounter between a clinician and a patient will be supported by a wide variety of cognitive aids to support diagnosis, treatment, care-coordination, surveillance and prevention, and health maintenance or wellness.
Affiliation(s)
- B Middleton
- Blackford Middleton, Cell: +1 617 335 7098

36
Abstract
OBJECTIVES To describe the state of Electronic Health Records (EHRs) in 1992, their evolution by 2015, and where EHRs are expected to be in 25 years; further, to discuss the expectations for EHRs in 1992 and explore which of them were realized and what events accelerated or disrupted/derailed how EHRs evolved. METHODS Literature search based on "Electronic Health Record", "Medical Record", and "Medical Chart" using Medline, Google, Wikipedia Medical, and Cochrane Libraries resulted in an initial review of 2,356 abstracts and other information in papers and books. Additional papers and books were identified through the review of references cited in the initial review. RESULTS By 1992, hardware had become more affordable, powerful, and compact, and the use of personal computers, local area networks, and the Internet provided faster and easier access to medical information. EHRs were initially developed and used at academic medical facilities, but most have since been replaced by large vendor EHRs. While EHR use has increased and clinicians are being prepared to practice in an EHR-mediated world, technical issues have been overshadowed by procedural, professional, social, political, and especially ethical issues, as well as the need for compliance with standards and information security. Enormous advancements have taken place, but many of the early expectations for EHRs have not been realized, and current EHRs still do not meet the needs of today's rapidly changing healthcare environment. CONCLUSION The current use of EHRs initiated by new technology would have been hard to foresee. Current and new EHR technology will help to provide international standards for interoperable applications that use health, social, economic, behavioral, and environmental data to communicate, interpret, and act intelligently upon complex healthcare information to foster precision medicine and a learning health system.
Affiliation(s)
- R S Evans
- R. Scott Evans, MS, PhD, FACMI, Department of Medical Informatics, LDS Hospital, 8th Ave & C Street, Salt Lake City, Utah 84143, USA, Tel: +1 801 408-3029, Fax: +1 801 408-5802

37
Grigull L, Lechner W, Petri S, Kollewe K, Dengler R, Mehmecke S, Schumacher U, Lücke T, Schneider-Gold C, Köhler C, Güttsches AK, Kortum X, Klawonn F. Diagnostic support for selected neuromuscular diseases using answer-pattern recognition and data mining techniques: a proof of concept multicenter prospective trial. BMC Med Inform Decis Mak 2016; 16:31. [PMID: 26957320 PMCID: PMC4782522 DOI: 10.1186/s12911-016-0268-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2015] [Accepted: 02/26/2016] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND Diagnosis of neuromuscular diseases in primary care is often challenging. Rare diseases such as Pompe disease are easily overlooked by the general practitioner. We therefore aimed to develop a diagnostic support tool using patient-oriented questions and combined data mining algorithms recognizing answer patterns in individuals with selected neuromuscular diseases. A multicenter prospective study for the proof of concept was conducted thereafter. METHODS First, 16 interviews with patients were conducted focusing on their pre-diagnostic observations and experiences. From these interviews, we developed a questionnaire with 46 items. Then, patients with diagnosed neuromuscular diseases as well as patients without such a disease answered the questionnaire to establish a database for data mining. For proof of concept, initially only six diagnoses were chosen (myotonic dystrophy and myotonia (MdMy), Pompe disease (MP), amyotrophic lateral sclerosis (ALS), polyneuropathy (PNP), spinal muscular atrophy (SMA), other neuromuscular diseases, and no neuromuscular disease (NND)). A prospective study was performed to validate the automated malleable system, which included six different classification methods combined in a fusion algorithm proposing a final diagnosis. Finally, new diagnoses were incorporated into the system. RESULTS In total, questionnaires from 210 individuals were used to train the system. Cross-validation achieved 89.5% correct diagnoses. The sensitivity of the system was 93-97% for individuals with MP, with MdMy, and without neuromuscular diseases, but only 69% in SMA and 81% in ALS patients. In the prospective trial, 57/64 (89%) diagnoses were predicted correctly by the computerized system. All questions, or rather all answers, increased the diagnostic accuracy of the system, with the best results reached by the fusion of different classifier methods. Receiver operating characteristic (ROC) and p-value analyses confirmed the results.
CONCLUSION A questionnaire-based diagnostic support tool using data mining methods exhibited good results in predicting selected neuromuscular diseases. Due to the variety of neuromuscular diseases, additional studies are required to measure beneficial effects in the clinical setting.
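The fusion step the abstract describes (several classification methods combined into one proposed diagnosis) can be sketched as a simple majority vote. The three rule-based classifiers below are hypothetical stand-ins; the published system fused six trained data-mining methods over 46 questionnaire items.

```python
from collections import Counter

# Hypothetical stand-ins for trained classifiers: each maps a vector of
# questionnaire answers (one value in [0, 1] per item) to a diagnosis label.
def classifier_a(x):
    return "ALS" if x[0] > 0.5 else "PNP"

def classifier_b(x):
    return "ALS" if x[1] > 0.5 else "SMA"

def classifier_c(x):
    return "ALS" if sum(x) > 1.0 else "PNP"

def fuse(classifiers, x):
    """Combine the individual predictions by majority vote into a final diagnosis."""
    votes = Counter(clf(x) for clf in classifiers)
    label, _ = votes.most_common(1)[0]
    return label

answers = [0.8, 0.7, 0.1]  # toy answer pattern from one questionnaire
diagnosis = fuse([classifier_a, classifier_b, classifier_c], answers)
```

In the published work the fusion outperformed any single classifier, which is the usual motivation for combining heterogeneous methods rather than picking one.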
Affiliation(s)
- Lorenz Grigull, Department of Pediatric Hematology and Oncology, Hannover Medical School, Carl-Neuberg Str. 1, D-30623 Hannover, Germany
- Werner Lechner, Improved Medical Diagnostics, IMD GmbH, Hannover, Germany
- Susanne Petri, Department of Neurology, Hannover Medical School, Hannover, Germany
- Katja Kollewe, Department of Neurology, Hannover Medical School, Hannover, Germany
- Reinhard Dengler, Department of Neurology, Hannover Medical School, Hannover, Germany
- Sandra Mehmecke, Department of Neurology, Hannover Medical School, Hannover, Germany
- Thomas Lücke, Klinik für Kinder- und Jugendmedizin im St. Josef Hospital, Ruhr-Universität Bochum, Bochum, Germany
- Christiane Schneider-Gold, Department of Neurology, Heimer-Institute at the BG University-Hospital Bergmannsheil GmbH, Ruhr-Universität Bochum, Bochum, Germany
- Cornelia Köhler, Klinik für Kinder- und Jugendmedizin im St. Josef Hospital, Ruhr-Universität Bochum, Bochum, Germany
- Anne-Katrin Güttsches, Department of Neurology, Heimer-Institute at the BG University-Hospital Bergmannsheil GmbH, Ruhr-Universität Bochum, Bochum, Germany
- Xiaowei Kortum, Ostfalia University of Applied Sciences, Wolfenbuettel, Germany
- Frank Klawonn, Ostfalia University of Applied Sciences, Wolfenbuettel, Germany; Helmholtz Centre for Infection Research, Biostatistics Group, Braunschweig, Germany

38
Riches N, Panagioti M, Alam R, Cheraghi-Sohi S, Campbell S, Esmail A, Bower P. The Effectiveness of Electronic Differential Diagnoses (DDX) Generators: A Systematic Review and Meta-Analysis. PLoS One 2016; 11:e0148991. [PMID: 26954234 PMCID: PMC4782994 DOI: 10.1371/journal.pone.0148991] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2015] [Accepted: 01/25/2016] [Indexed: 01/10/2023] Open
Abstract
Background Diagnostic errors are costly and they can contribute to adverse patient outcomes, including avoidable deaths. Differential diagnosis (DDX) generators are electronic tools that may facilitate the diagnostic process. Methods and Findings We conducted a systematic review and meta-analysis to investigate the efficacy and utility of DDX generators. We undertook a comprehensive search of the literature including 16 databases from inception to May 2015 and specialist patient safety databases. We also searched the reference lists of included studies. Article screening, selection and data extraction were independently conducted by 2 reviewers. 36 articles met the eligibility criteria and the pooled accurate diagnosis retrieval rate of DDX tools was high with high heterogeneity (pooled rate = 0.70, 95% CI = 0.63 to 0.77; I² = 97%, p < 0.0001). DDX generators did not demonstrate improved diagnostic retrieval compared to clinicians but small improvements were seen in the before and after studies where clinicians had the opportunity to revisit their diagnoses following DDX generator consultation. Clinical utility data generally indicated high levels of user satisfaction and significant reductions in time required to use the newer web-based tools. Lengthy differential lists and their low relevance were areas of concern and have the potential to increase diagnostic uncertainty. Data on the number of investigations ordered and on cost-effectiveness remain inconclusive. Conclusions DDX generators have the potential to improve diagnostic practice among clinicians. However, the high levels of heterogeneity, the variable quality of the reported data and the minimal benefits observed for complex cases suggest caution. Further research needs to be undertaken in routine clinical settings with greater consideration of enablers and barriers which are likely to impact on DDX use before their use in routine clinical practice can be recommended.
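The pooled rate and heterogeneity figures quoted above come from standard meta-analytic machinery; a minimal fixed-effect sketch is below. The study counts are invented, and the published review used more sophisticated pooling (random-effects models are typical when I² is as high as 97%).

```python
# Toy per-study accurate-retrieval counts: (successes, total). Invented numbers,
# not the review's data.
studies = [(70, 100), (55, 80), (90, 120), (40, 50)]

def pooled_rate(studies):
    """Fixed-effect inverse-variance pooling of proportions, plus Cochran's Q
    and the I^2 heterogeneity statistic (share of variation beyond chance)."""
    ests = [(s / n, (s / n) * (1 - s / n) / n) for s, n in studies]  # (p, variance)
    weights = [1.0 / var for _, var in ests]
    pooled = sum(w * p for (p, _), w in zip(ests, weights)) / sum(weights)
    q = sum(w * (p - pooled) ** 2 for (p, _), w in zip(ests, weights))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, i2

pooled, i2 = pooled_rate(studies)
```

With these toy counts the studies are mutually consistent, so I² comes out near zero; the review's I² of 97% indicates the opposite situation, where nearly all between-study variation exceeds sampling error.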
Affiliation(s)
- Nicholas Riches, NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre (Greater Manchester PSTRC), Williamson Building, The University of Manchester, Manchester, United Kingdom
- Maria Panagioti, NIHR School for Primary Care Research, Centre for Primary Care, Institute of Population Health, University of Manchester, Manchester, United Kingdom
- Rahul Alam, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom
- Sudeh Cheraghi-Sohi, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom
- Stephen Campbell, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom
- Aneez Esmail, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom
- Peter Bower, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom

39

40
Wright A, Maloney FL, Wien M, Samal L, Emani S, Zuccotti G. Assessing information system readiness for mitigating malpractice risk through simulation: results of a multi-site study. J Am Med Inform Assoc 2015; 22:1020-8. [PMID: 26017230 DOI: 10.1093/jamia/ocv041] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2015] [Accepted: 04/08/2015] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE To develop and test an instrument for assessing a healthcare organization's ability to mitigate malpractice risk through clinical decision support (CDS). MATERIALS AND METHODS Based on a previously collected malpractice data set, we identified common types of CDS and the number and cost of malpractice cases that might have been prevented through this CDS. We then designed clinical vignettes and questions that test an organization's CDS capabilities through simulation. Seven healthcare organizations completed the simulation. RESULTS All seven organizations successfully completed the self-assessment. The proportion of potentially preventable indemnity loss for which CDS was available ranged from 16.5% to 73.2%. DISCUSSION There is a wide range in organizational ability to mitigate malpractice risk through CDS, with many organizations' electronic health records only being able to prevent a small portion of malpractice events seen in a real-world dataset. CONCLUSION The simulation approach to assessing malpractice risk mitigation through CDS was effective. Organizations should consider using malpractice claims experience to facilitate prioritizing CDS development.
Affiliation(s)
- Adam Wright, Partners HealthCare, Boston, MA, USA; Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Francine L Maloney, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Matthew Wien, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Lipika Samal, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Gianna Zuccotti, Partners HealthCare, Boston, MA, USA; Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; CRICO/Risk Management Foundation, Cambridge, MA, USA

41
Dhiman GJ, Amber KT, Goodman KW. Comparative outcome studies of clinical decision support software: limitations to the practice of evidence-based system acquisition. J Am Med Inform Assoc 2015; 22:e13-20. [PMID: 25665704 PMCID: PMC7659211 DOI: 10.1093/jamia/ocu033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2014] [Revised: 11/21/2014] [Accepted: 11/24/2014] [Indexed: 11/14/2022] Open
Abstract
Clinical decision support systems (CDSSs) assist clinicians with patient diagnosis and treatment. However, inadequate attention has been paid to the process of selecting and buying systems. The diversity of CDSSs, coupled with research obstacles, marketplace limitations, and legal impediments, has thwarted comparative outcome studies and reduced the availability of reliable information and advice for purchasers. We review these limitations and recommend comparative studies conducted in phases and focused on limited outcomes of safety, efficacy, and implementation in varied clinical settings. Additionally, we recommend the increased availability of guidance tools to assist purchasers with evidence-based purchases. Transparency is necessary in purchasers' reporting of system defects and in vendors' disclosure of marketing conflicts of interest to support methodologically sound studies. Taken together, these measures can foster the evolution of evidence-based tools that, in turn, will enable and empower system purchasers to make wise choices and improve the care of patients.
Affiliation(s)
- Kyle T Amber, University of Miami Miller School of Medicine, Miami, FL, USA
- Kenneth W Goodman, Bioethics Program, University of Miami Miller School of Medicine, Miami, FL, USA

42
Segal MM, Williams MS, Gropman AL, Torres AR, Forsyth R, Connolly AM, El-Hattab AW, Perlman SJ, Samanta D, Parikh S, Pavlakis SG, Feldman LK, Betensky RA, Gospe SM. Evidence-based decision support for neurological diagnosis reduces errors and unnecessary workup. J Child Neurol 2014; 29:487-92. [PMID: 23576414 DOI: 10.1177/0883073813483365] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Using vignettes of real cases and the SimulConsult diagnostic decision support software, neurologists listed a differential diagnosis and workup before and after using the decision support. Using the software, there was a significant reduction in error, up to 75% for diagnosis and 56% for workup. This error reduction occurred despite the baseline being one in which testers were allowed to use narrative resources and Web searching. A key factor that improved performance was taking enough time (>2 minutes) to enter clinical findings into the software accurately. Under these conditions and for instances in which the diagnoses changed based on using the software, diagnostic accuracy improved in 96% of instances. There was a 6% decrease in the number of workup items accompanied by a 34% increase in relevance. The authors conclude that decision support for a neurological diagnosis can reduce errors and save on unnecessary testing.
43
Electronic Health Records and Patient Safety. Patient Saf Surg 2014. [DOI: 10.1007/978-1-4471-4369-7_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open
44
Williams CN, Bratton SL, Hirshberg EL. Computerized decision support in adult and pediatric critical care. World J Crit Care Med 2013; 2:21-8. [PMID: 24701413 PMCID: PMC3953873 DOI: 10.5492/wjccm.v2.i4.21] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/14/2013] [Revised: 08/02/2013] [Accepted: 08/20/2013] [Indexed: 02/06/2023] Open
Abstract
Computerized decision support (CDS) is the most advanced form of clinical decision support available and has evolved with innovative technologies to provide meaningful assistance to medical professionals. Critical care clinicians are in unique environments where vast amounts of data are collected on individual patients, and where expedient and accurate decisions are paramount to the delivery of quality healthcare. Many CDS tools are in use today among adult and pediatric intensive care units as diagnostic aids, safety alerts, computerized protocols, and automated recommendations for management. Some CDS tools have significantly decreased adverse events and reduced costs when carefully implemented and properly operated. CDS tools integrated into electronic health records are also valuable to researchers, providing rapid identification of eligible patients, streamlining data-gathering and analysis, and providing cohorts for study of rare and chronic diseases through data-warehousing. Although the need for human judgment in the daily care of critically ill patients has limited the study and realization of meaningful improvements in overall patient outcomes, CDS tools continue to evolve and integrate into the daily workflow of clinicians, and will likely provide advancements over time. Through novel technologies, CDS tools have vast potential for progression and will significantly impact the field of critical care and clinical research in the future.
45
Braithwaite RS, Scotch M. Using value of information to guide evaluation of decision supports for differential diagnosis: is it time for a new look? BMC Med Inform Decis Mak 2013; 13:105. [PMID: 24020989 PMCID: PMC3846909 DOI: 10.1186/1472-6947-13-105] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2013] [Accepted: 09/06/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Decision support systems for differential diagnosis have traditionally been evaluated on the basis of how sensitively and specifically they identify the correct diagnosis established by expert clinicians. DISCUSSION This article questions whether evaluation criteria pertaining to identifying the correct diagnosis are the most appropriate or useful. Instead, it advocates evaluating decision support systems for differential diagnosis on the criterion of maximizing the value of information. SUMMARY This approach quantitatively and systematically integrates several important clinical management priorities, including avoiding serious diagnostic errors of omission and avoiding harmful or expensive tests.
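The value-of-information criterion the authors advocate can be made concrete with a toy two-action decision problem: a test is worth running only to the extent that observing its result changes the expected utility of the best action. The prior, the utilities, and the test characteristics below are all hypothetical.

```python
# Toy two-action decision problem; all numbers are hypothetical.
p_disease = 0.3                        # prior probability of the serious disease
UTIL = {("treat", True): 0.9, ("treat", False): 0.7,   # utility of (action, disease?)
        ("wait",  True): 0.1, ("wait",  False): 1.0}
sens, spec = 0.95, 0.85                # test sensitivity and specificity

def expected_utility(p):
    """Utility of the best action under the current belief p = P(disease)."""
    return max(p * UTIL[(a, True)] + (1 - p) * UTIL[(a, False)]
               for a in ("treat", "wait"))

def value_of_information(p, sens, spec):
    """Expected utility after observing the test result, minus acting on the prior."""
    p_pos = p * sens + (1 - p) * (1 - spec)          # P(test positive)
    p_given_pos = p * sens / p_pos                   # posterior if positive
    p_given_neg = p * (1 - sens) / (1 - p_pos)       # posterior if negative
    eu_with_test = (p_pos * expected_utility(p_given_pos)
                    + (1 - p_pos) * expected_utility(p_given_neg))
    return eu_with_test - expected_utility(p)

voi = value_of_information(p_disease, sens, spec)
```

Under this framing a test can have zero value even when informative, namely when no result would change the chosen action, which is the abstract's point about prioritizing management consequences over diagnostic labeling.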
Affiliation(s)
- R Scott Braithwaite, Department of Population Health, New York University School of Medicine, 550 First Avenue, VZ30 6th floor, 615, New York, NY 10016, USA
- Matthew Scotch, Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ, USA

46
Cognition and decision in biomedical artificial intelligence: From symbolic representation to emergence. AI & SOCIETY 2013. [DOI: 10.1007/bf01210601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
47
Papier A. Decision support in dermatology and medicine: history and recent developments. ACTA ACUST UNITED AC 2013; 31:153-9. [PMID: 22929351 DOI: 10.1016/j.sder.2012.06.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2012] [Revised: 06/06/2012] [Accepted: 06/19/2012] [Indexed: 11/29/2022]
Abstract
This article is focused on diagnostic decision support tools and will provide a brief history of clinical decision support (CDS), examine the components of CDS and its associated terminology, and discuss recent developments in the use and application of CDS systems, particularly in the field of dermatology. For this article, we use CDS to mean an interactive system allowing input of patient-specific information and providing customized medical knowledge-based results via automated reasoning, for example, a set of rules and/or an underlying logic, and associations.
Affiliation(s)
- Art Papier, Dermatology and Medical Informatics, University of Rochester College of Medicine, Rochester, NY 14642, USA

48
Bogich TL, Funk S, Malcolm TR, Chhun N, Epstein JH, Chmura AA, Kilpatrick AM, Brownstein JS, Hutchison OC, Doyle-Capitman C, Deaville R, Morse SS, Cunningham AA, Daszak P. Using network theory to identify the causes of disease outbreaks of unknown origin. J R Soc Interface 2013; 10:20120904. [PMID: 23389893 DOI: 10.1098/rsif.2012.0904] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
The identification of undiagnosed disease outbreaks is critical for mobilizing efforts to prevent widespread transmission of novel virulent pathogens. Recent developments in online surveillance systems allow for the rapid communication of the earliest reports of emerging infectious diseases and tracking of their spread. The efficacy of these programs, however, is inhibited by the anecdotal nature of informal reporting and uncertainty of pathogen identity in the early stages of emergence. We developed theory to connect disease outbreaks of known aetiology in a network using an array of properties including symptoms, seasonality and case-fatality ratio. We tested the method with 125 reports of outbreaks of 10 known infectious diseases causing encephalitis in South Asia, and showed that different diseases frequently form distinct clusters within the networks. The approach correctly identified unknown disease outbreaks with an average sensitivity of 76 per cent and specificity of 88 per cent. Outbreaks of some diseases, such as Nipah virus encephalitis, were well identified (sensitivity = 100%, positive predictive values = 80%), whereas others (e.g. Chandipura encephalitis) were more difficult to distinguish. These results suggest that unknown outbreaks in resource-poor settings could be evaluated in real time, potentially leading to more rapid responses and reducing the risk of an outbreak becoming a pandemic.
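A minimal sketch of the network idea: link outbreak reports by the similarity of their reported properties and label an unknown outbreak from its strongest-weighted neighbours. The feature sets and the Jaccard-similarity weighting below are illustrative assumptions, not the paper's actual data or distance measure.

```python
# Toy feature sets for known outbreak reports: (properties, confirmed aetiology).
# Invented for illustration only.
known = {
    "nipah_1": ({"encephalitis", "respiratory", "winter", "high_cfr"}, "Nipah"),
    "nipah_2": ({"encephalitis", "respiratory", "high_cfr"}, "Nipah"),
    "je_1":    ({"encephalitis", "monsoon", "pediatric"}, "JE"),
    "je_2":    ({"encephalitis", "monsoon", "pediatric", "rural"}, "JE"),
}

def jaccard(a, b):
    """Edge weight between two reports = overlap of their property sets."""
    return len(a & b) / len(a | b)

def identify(unknown_features, known):
    """Weighted vote over network edges from the unknown report to known outbreaks."""
    votes = {}
    for features, label in known.values():
        votes[label] = votes.get(label, 0.0) + jaccard(unknown_features, features)
    return max(votes, key=votes.get)

label = identify({"encephalitis", "respiratory", "high_cfr", "winter"}, known)
```

Because same-aetiology reports share many properties, they form tight clusters in the similarity network, which is why an unknown report's strongest connections tend to point at its true cause.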
Affiliation(s)
- Tiffany L Bogich, EcoHealth Alliance, 460 West 34th Street, 17th Floor, New York, NY 10001, USA

49
Belle A, Kon MA, Najarian K. Biomedical informatics for computer-aided decision support systems: a survey. ScientificWorldJournal 2013; 2013:769639. [PMID: 23431259 PMCID: PMC3575619 DOI: 10.1155/2013/769639] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2012] [Accepted: 01/09/2013] [Indexed: 11/18/2022] Open
Abstract
The volumes of current patient data as well as their complexity make clinical decision making more challenging than ever for physicians and other caregivers. This situation calls for the use of biomedical informatics methods to process data and form recommendations and/or predictions to assist such decision makers. The design, implementation, and use of biomedical informatics systems in the form of computer-aided decision support have become essential and widely used over the last two decades. This paper provides a brief review of such systems, their application protocols and methodologies, and the future challenges and directions they suggest.
Affiliation(s)
- Ashwin Belle, Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Mark A. Kon, Department of Mathematics and Statistics, Boston University, Boston, MA 02215, USA
- Kayvan Najarian, Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA

50
Yuan MJ, Finley GM, Long J, Mills C, Johnson RK. Evaluation of user interface and workflow design of a bedside nursing clinical decision support system. Interact J Med Res 2013; 2:e4. [PMID: 23612350 PMCID: PMC3628119 DOI: 10.2196/ijmr.2402] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2012] [Revised: 12/10/2012] [Accepted: 12/29/2012] [Indexed: 11/20/2022] Open
Abstract
Background Clinical decision support systems (CDSS) are important tools to improve health care outcomes and reduce preventable medical adverse events. However, the effectiveness and success of CDSS depend on their implementation context and usability in complex health care settings. As a result, usability design and validation, especially in real world clinical settings, are crucial aspects of successful CDSS implementations. Objective Our objective was to develop a novel CDSS to help frontline nurses better manage critical symptom changes in hospitalized patients, hence reducing preventable failure to rescue cases. A robust user interface and implementation strategy that fit into existing workflows was key for the success of the CDSS. Methods Guided by a formal usability evaluation framework, UFuRT (user, function, representation, and task analysis), we developed a high-level specification of the product that captures key usability requirements and is flexible to implement. We interviewed users of the proposed CDSS to identify requirements, listed functions, and operations the system must perform. We then designed visual and workflow representations of the product to perform the operations.
The user interface and workflow design were evaluated via heuristic and end user performance evaluation. The heuristic evaluation was done after the first prototype, and its results were incorporated into the product before the end user evaluation was conducted. First, we recruited 4 evaluators with strong domain expertise to study the initial prototype. Heuristic violations were coded and rated for severity. Second, after development of the system, we assembled a panel of nurses, consisting of 3 licensed vocational nurses and 7 registered nurses, to evaluate the user interface and workflow via simulated use cases. We recorded whether each session was successfully completed and its completion time. Each nurse was asked to use the National Aeronautics and Space Administration (NASA) Task Load Index to self-evaluate the amount of cognitive and physical burden associated with using the device. Results A total of 83 heuristic violations were identified in the studies. The distribution of the heuristic violations and their average severity are reported. The nurse evaluators successfully completed all 30 sessions of the performance evaluations. All nurses were able to use the device after a single training session. On average, the nurses took 111 seconds (SD 30 seconds) to complete the simulated task. The NASA Task Load Index results indicated that the work overhead on the nurses was low. In fact, most of the burden measures were consistent with zero. The only potentially significant burden was temporal demand, which was consistent with the primary use case of the tool. Conclusions The evaluation has shown that our design was functional and met the requirements demanded by the nurses’ tight schedules and heavy workloads. The user interface embedded in the tool provided compelling utility to the nurse with minimal distraction.