1
Payton EM, Graber ML, Bachiashvili V, Mehta T, Dissanayake PI, Berner ES. Impact of clinical note format on diagnostic accuracy and efficiency. Health Inf Manag J 2024; 53:183-188. PMID: 37129041. DOI: 10.1177/18333583231151979.
Abstract
BACKGROUND Clinician notes are structured in a variety of ways. This research pilot tested an innovative study design and explored the impact of note format on diagnostic accuracy and documentation review time. OBJECTIVE To compare two clinical documentation formats (narrative format vs. list of findings) with respect to clinician diagnostic accuracy and documentation review time. METHOD Participants diagnosed written clinical cases, half in narrative format and half in list format. Diagnostic accuracy (defined as including the correct case diagnosis among the top three diagnoses) and time spent processing each case scenario were measured for each format. Generalised linear mixed regression models and bias-corrected bootstrap percentile confidence intervals for mean paired differences were used to analyse the primary research questions. RESULTS The odds of correctly diagnosing list-format notes were 26% greater than for narrative notes; however, the evidence that this difference is significant was insufficient (75% CI 0.8-1.99). On average, list-format notes required 85.6 more seconds to process and arrive at a diagnosis than narrative notes (95% CI -162.3, -2.77). Among cases where participants included the correct diagnosis, list-format notes required on average 94.17 more seconds than narrative notes (75% CI -195.9, -8.83). CONCLUSION This study offers note-format considerations for those interested in improving clinical documentation and suggests directions for future research. Balancing the priority of clinician preference against the value of structured data may be necessary. IMPLICATIONS This study provides a method and suggestive results for further investigation into the usability of electronic documentation formats.
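The bias-corrected bootstrap percentile interval named in this study's Methods can be illustrated with a minimal, stdlib-only sketch (the timing data below are made up for illustration; this is not the authors' code or their actual analysis):

```python
import random
from statistics import NormalDist, mean

def bc_bootstrap_ci(diffs, level=0.75, n_boot=10_000, seed=42):
    """Bias-corrected (BC) bootstrap percentile CI for a mean paired difference."""
    rng = random.Random(seed)
    observed = mean(diffs)
    boots = sorted(
        mean(rng.choices(diffs, k=len(diffs))) for _ in range(n_boot)
    )
    nd = NormalDist()
    # Bias-correction factor z0: how far the observed mean sits from the
    # median of the bootstrap distribution (clamped to avoid inv_cdf(0/1)).
    prop_below = sum(b < observed for b in boots) / n_boot
    z0 = nd.inv_cdf(min(max(prop_below, 1e-9), 1 - 1e-9))
    alpha = 1 - level
    # Adjusted percentile positions per the BC formula.
    lo_q = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    hi_q = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    lo = boots[min(int(lo_q * n_boot), n_boot - 1)]
    hi = boots[min(int(hi_q * n_boot), n_boot - 1)]
    return lo, hi

# Hypothetical per-participant differences in review time (seconds),
# narrative minus list format:
diffs = [-120, -40, 15, -90, -60, 30, -150, -10, -75, -55]
lo, hi = bc_bootstrap_ci(diffs, level=0.75)
print(f"75% BC bootstrap CI for mean difference: ({lo:.1f}, {hi:.1f})")
```

A negative interval here would mirror the paper's sign convention, where narrative-minus-list differences are negative because list-format notes took longer to process.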
Affiliation(s)
- Evita M Payton
  - University of Alabama at Birmingham, Birmingham, AL, USA
- Mark L Graber
  - Society to Improve Diagnosis in Medicine, Alpharetta, MD, USA
- Tapan Mehta
  - University of Alabama at Birmingham, Birmingham, AL, USA
- Eta S Berner
  - University of Alabama at Birmingham, Birmingham, AL, USA
2
Wieben AM, Alreshidi BG, Douthit BJ, Sileo M, Vyas P, Steege L, Gilmore-Bykovskyi A. Nurses' perceptions of the design, implementation, and adoption of machine learning clinical decision support: A descriptive qualitative study. J Nurs Scholarsh 2024. PMID: 38898636. DOI: 10.1111/jnu.13001.
Abstract
INTRODUCTION The purpose of this study was to explore nurses' perspectives on Machine Learning Clinical Decision Support (ML CDS) design, development, implementation, and adoption. DESIGN Qualitative descriptive study. METHODS Nurses (n = 17) participated in semi-structured interviews. Data were transcribed, coded, and analyzed using thematic analysis methods as described by Braun and Clarke. RESULTS Four major themes and 14 sub-themes highlight nurses' perspectives on autonomy in decision-making, the influence of prior experience in shaping their preferences for novel CDS tools, the need for clarity about why ML CDS is useful for improving practice and outcomes, and their desire to have nursing integrated into the design and implementation of these tools. CONCLUSION This study provided insights into nurses' perceptions of the utility and usability of ML CDS; the influence of previous experiences with technology and CDS; the change management strategies needed at the time of ML CDS implementation; the importance of nurse-perceived engagement in the development process; nurses' information needs at the time of ML CDS deployment; and the perceived impact of ML CDS on nurses' decision-making autonomy. CLINICAL RELEVANCE This study contributes to the body of knowledge about the use of AI and machine learning (ML) in nursing practice. By generating insights drawn from nurses' perspectives, these findings can inform the successful design and adoption of ML CDS.
Affiliation(s)
- Ann M Wieben
  - University of Wisconsin-Madison School of Nursing, Madison, Wisconsin, USA
- Bader G Alreshidi
  - Department of Medical Surgical Nursing, University of Hail College of Nursing, Hail, Saudi Arabia
- Brian J Douthit
  - United States Department of Veterans Affairs, Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, USA
- Marisa Sileo
  - Boston Children's Hospital, Boston, Massachusetts, USA
- Linsey Steege
  - University of Wisconsin-Madison School of Nursing, Madison, Wisconsin, USA
- Andrea Gilmore-Bykovskyi
  - BerbeeWalsh Department of Emergency Medicine, University of Wisconsin-Madison School of Medicine & Public Health, Madison, Wisconsin, USA
3
Molinet B, Marro S, Cabrio E, Villata S. Explanatory argumentation in natural language for correct and incorrect medical diagnoses. J Biomed Semantics 2024; 15:8. PMID: 38816758. PMCID: PMC11138001. DOI: 10.1186/s13326-024-00306-1.
Abstract
BACKGROUND A huge amount of research in Artificial Intelligence is carried out nowadays to propose automated ways of analysing medical data with the aim of supporting doctors in delivering medical diagnoses. However, a main issue of these approaches is the lack of transparency and interpretability of the achieved results, making it hard to employ such methods for educational purposes. It is therefore necessary to develop new frameworks to enhance explainability in these solutions. RESULTS In this paper, we present a novel full pipeline to automatically generate natural language explanations for medical diagnoses. The proposed solution starts from a clinical case description associated with a list of correct and incorrect diagnoses and, through the extraction of the relevant symptoms and findings, enriches the information contained in the description with verified medical knowledge from an ontology. Finally, the system returns a pattern-based explanation in natural language which elucidates why the correct (incorrect) diagnosis is the correct (incorrect) one. The contribution of the paper is twofold: first, we propose two novel linguistic resources for the medical domain (i.e., a dataset of 314 clinical cases annotated with medical entities from UMLS, and a database of biological boundaries for common findings); second, a full Information Extraction pipeline that extracts symptoms and findings from the clinical cases and matches them to the terms of a medical ontology and to the biological boundaries. An extensive evaluation shows that our method outperforms comparable approaches. CONCLUSIONS Our goal is to offer an AI-assisted educational support framework to train clinical residents to formulate sound and exhaustive explanations of their diagnoses for patients.
Affiliation(s)
- Benjamin Molinet
  - Université Côte d'Azur, CNRS, Inria, I3S, Rte des Lucioles, Sophia Antipolis, 06900, Alpes-Maritimes, France
- Santiago Marro
  - Université Côte d'Azur, CNRS, Inria, I3S, Rte des Lucioles, Sophia Antipolis, 06900, Alpes-Maritimes, France
- Elena Cabrio
  - Université Côte d'Azur, CNRS, Inria, I3S, Rte des Lucioles, Sophia Antipolis, 06900, Alpes-Maritimes, France
- Serena Villata
  - Université Côte d'Azur, CNRS, Inria, I3S, Rte des Lucioles, Sophia Antipolis, 06900, Alpes-Maritimes, France
4
Goh E, Gallo R, Hom J, Strong E, Weng Y, Kerman H, Cool J, Kanjee Z, Parsons AS, Ahuja N, Horvitz E, Yang D, Milstein A, Olson APJ, Rodman A, Chen JH. Influence of a Large Language Model on Diagnostic Reasoning: A Randomized Clinical Vignette Study. medRxiv (preprint) 2024:2024.03.12.24303785. PMID: 38559045. PMCID: PMC10980135. DOI: 10.1101/2024.03.12.24303785.
Abstract
Importance Diagnostic errors are common and cause significant morbidity. Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves diagnostic reasoning. Objective To assess the impact of the GPT-4 LLM on physicians' diagnostic reasoning compared to conventional resources. Design Multi-center, randomized clinical vignette study. Setting The study was conducted via remote video conferencing with physicians across the country and in person at multiple academic medical institutions. Participants Resident and attending physicians with training in family medicine, internal medicine, or emergency medicine. Interventions Participants were randomized to access GPT-4 in addition to conventional diagnostic resources or to conventional resources alone. They were allocated 60 minutes to review up to six clinical vignettes adapted from established diagnostic reasoning exams. Main Outcomes and Measures The primary outcome was diagnostic performance based on differential diagnosis accuracy, appropriateness of supporting and opposing factors, and next diagnostic evaluation steps. Secondary outcomes included time spent per case and final diagnosis. Results 50 physicians (26 attendings, 24 residents) participated, with an average of 5.2 cases completed per participant. The median diagnostic reasoning score per case was 76.3 percent (IQR 65.8 to 86.8) for the GPT-4 group and 73.7 percent (IQR 63.2 to 84.2) for the conventional resources group, with an adjusted difference of 1.6 percentage points (95% CI -4.4 to 7.6; p=0.60). The median time spent on cases was 519 seconds (IQR 371 to 668) for the GPT-4 group, compared to 565 seconds (IQR 456 to 788) for the conventional resources group, a difference of -82 seconds (95% CI -195 to 31; p=0.20). GPT-4 alone scored 15.5 percentage points (95% CI 1.5 to 29; p=0.03) higher than the conventional resources group. Conclusions and Relevance In this clinical vignette-based study, the availability of GPT-4 to physicians as a diagnostic aid did not significantly improve clinical reasoning compared to conventional resources, although it may improve components of clinical reasoning such as efficiency. GPT-4 alone demonstrated higher performance than both physician groups, suggesting opportunities for further improvement in physician-AI collaboration in clinical practice.
Affiliation(s)
- Ethan Goh
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA
- Robert Gallo
  - Center for Innovation to Implementation, VA Palo Alto Health Care System, Palo Alto, CA
- Jason Hom
  - Stanford University School of Medicine, Stanford, CA
- Eric Strong
  - Stanford University School of Medicine, Stanford, CA
- Yingjie Weng
  - Quantitative Sciences Unit, Stanford University School of Medicine, Stanford, CA
- Hannah Kerman
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Josephine Cool
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Zahir Kanjee
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Neera Ahuja
  - Stanford University School of Medicine, Stanford, CA
- Eric Horvitz
  - Microsoft, Redmond, WA
  - Stanford HAI, Stanford, CA
- Arnold Milstein
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA
- Adam Rodman
  - Beth Israel Deaconess Medical Center, Boston, MA
  - Harvard Medical School, Boston, MA
- Jonathan H Chen
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA
  - Stanford Clinical Excellence Research Center, Stanford University, Stanford, CA
  - Division of Hospital Medicine, Stanford University, Stanford, CA
5
Bakken S, Cimino JJ, Feldman S, Lorenzi NM. Celebrating Eta Berner and her influence on biomedical and health informatics. J Am Med Inform Assoc 2024; 31:549-551. PMID: 38366906. PMCID: PMC10873777. DOI: 10.1093/jamia/ocae011.
Affiliation(s)
- Suzanne Bakken
  - School of Nursing, Columbia University, New York, NY 10032, United States
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
  - Data Science Institute, Columbia University, New York, NY 10027, United States
- James J Cimino
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
  - Department of Biomedical Informatics and Data Science, Heersink School of Medicine, University of Alabama, Birmingham, AL 35233, United States
- Sue Feldman
  - Department of Health Services Administration, School of Health Professions, University of Alabama, Birmingham, AL 35233, United States
  - Department of Medical Education, Heersink School of Medicine, University of Alabama, Birmingham, AL 35233, United States
- Nancy M Lorenzi
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, United States
6
Benjamin MM, Rabbat MG. Artificial Intelligence in Transcatheter Aortic Valve Replacement: Its Current Role and Ongoing Challenges. Diagnostics (Basel) 2024; 14:261. PMID: 38337777. PMCID: PMC10855497. DOI: 10.3390/diagnostics14030261.
Abstract
Transcatheter aortic valve replacement (TAVR) has emerged as a viable alternative to surgical aortic valve replacement, as accumulating clinical evidence has demonstrated its safety and efficacy. TAVR indications have expanded beyond high-risk or inoperable patients to include intermediate- and low-risk patients with severe aortic stenosis. Artificial intelligence (AI) is revolutionizing the field of cardiology, aiding in the interpretation of medical imaging and in developing risk models for at-risk individuals and those with cardiac disease. This article explores the growing role of AI in TAVR procedures and assesses its potential impact, with particular focus on its ability to improve patient selection, procedural planning, and post-implantation monitoring, and thereby to optimize patient outcomes. In addition, current challenges and future directions in AI implementation are highlighted.
Affiliation(s)
- Mina M. Benjamin
  - Division of Cardiovascular Medicine, SSM Saint Louis University Hospital, Saint Louis University, Saint Louis, MO 63104, USA
- Mark G. Rabbat
  - Department of Cardiovascular Medicine, Loyola University Medical Center, Maywood, IL 60153, USA
  - Department of Cardiology, Edward Hines Jr. VA Hospital, Hines, IL 60141, USA
7
Zhang H, Ogasawara K. Grad-CAM-Based Explainable Artificial Intelligence Related to Medical Text Processing. Bioengineering (Basel) 2023; 10:1070. PMID: 37760173. PMCID: PMC10525184. DOI: 10.3390/bioengineering10091070.
Abstract
The opacity of deep learning makes its application challenging in the medical field. There is therefore a need for explainable artificial intelligence (XAI) in medicine, so that models and their results can be explained in a manner that humans can understand. This study transfers a high-accuracy computer-vision model to medical text tasks and uses the explanatory visualization method known as gradient-weighted class activation mapping (Grad-CAM) to generate heat maps, so that the basis for the model's decisions can be presented intuitively. The system comprises four modules: pre-processing, word embedding, classifier, and visualization. We used Word2Vec and BERT to compare word embeddings, and ResNet and one-dimensional convolutional neural networks (1D CNNs) to compare classifiers. Finally, a Bi-LSTM was used to perform text classification for direct comparison. With 25 epochs, the model that used pre-trained ResNet on the formalized text presented the best performance (recall of 90.9%, precision of 91.1%, and a weighted F1 score of 90.2%). This study processes medical texts with ResNet through Grad-CAM-based explainable artificial intelligence and obtains a high-accuracy classification effect; at the same time, the Grad-CAM visualization intuitively shows the words to which the model attends when making predictions.
Affiliation(s)
- Katsuhiko Ogasawara
  - Graduate School of Health Science, Hokkaido University, N12-W5, Kitaku, Sapporo 060-0812, Japan
8
Kafke SD, Kuhlmey A, Schuster J, Blüher S, Czimmeck C, Zoellick JC, Grosse P. Can clinical decision support systems be an asset in medical education? An experimental approach. BMC Med Educ 2023; 23:570. PMID: 37568144. PMCID: PMC10416486. DOI: 10.1186/s12909-023-04568-8.
Abstract
BACKGROUND Diagnostic accuracy is one of the major cornerstones of appropriate and successful medical decision-making. Clinical decision support systems (CDSSs) have recently been used to facilitate physicians' diagnostic considerations. To date, however, little is known about the potential assets of CDSSs for medical students in an educational setting. The purpose of our study was to explore the usefulness of CDSSs for medical students by assessing their diagnostic performance and the influence of such software on students' trust in their own diagnostic abilities. METHODS Based on paper cases, students had to diagnose two different patients, using a CDSS and conventional methods (e.g. textbooks), respectively. Both patients had a common disease; in one setting the clinical presentation was typical (tonsillitis), whereas in the other (pulmonary embolism) the patient presented atypically. We used a 2x2x2 between- and within-subjects cluster-randomised controlled trial to assess diagnostic accuracy in medical students, also varying the order of the resources used (CDSS first or second). RESULTS Medical students in their 4th and 5th year performed equally well using conventional methods or the CDSS across the two cases (t(164) = 1.30; p = 0.197). Diagnostic accuracy and trust in the correct diagnosis were higher in the typical presentation condition than in the atypical presentation condition (t(85) = 19.97; p < .0001 and t(150) = 7.67; p < .0001). These results refute our main hypothesis that students diagnose more accurately using conventional methods than the CDSS. CONCLUSIONS Medical students in their 4th and 5th year performed equally well in diagnosing two cases of common diseases with typical or atypical clinical presentations using conventional methods or a CDSS. Students were proficient in diagnosing a common disease with a typical presentation but underestimated their own factual knowledge in this scenario. Students were also aware of their own diagnostic limitations when presented with a challenging case with an atypical presentation, for which the use of a CDSS seemingly provided no additional insights.
Affiliation(s)
- Sean D Kafke
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Adelheid Kuhlmey
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Johanna Schuster
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Stefan Blüher
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Constanze Czimmeck
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Jan C Zoellick
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Pascal Grosse
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
9
Bansal M. Clinical Evaluation of 'Computer-Aided Diagnosis In Neuro-Otology (CADINO)' in Terms of Usefulness, Functionality and Effectiveness. Indian J Otolaryngol Head Neck Surg 2022; 74:4434-4440. PMID: 36742689. PMCID: PMC9895670. DOI: 10.1007/s12070-022-03092-3.
Abstract
Computer-based medical diagnosis expert systems are considered both accurate and educationally helpful in most cases. Dizziness and vertigo are among the most common complaints, yet ENT surgeons and neuro-otologists are often unavailable in peripheral areas. Computer-Aided Diagnosis In Neuro-Otology (CADINO) can therefore be of immense value for underserved dizzy patients in remote and rural areas. The study aimed to document the strengths, weaknesses and capabilities of CADINO in terms of accuracy, educational usefulness, functionality and effectiveness. Design Hospital-based observational study of a diagnostic tool. Settings Otorhinolaryngology Department of a tertiary care medical college hospital. This prospective study covered 70 patients, 24 simulated cases and 6 case reports from journals, and also included clinicians' feedback before and after consultation. Eleven ENT residents and 14 ENT surgeons (8 teachers and 6 consultants) participated. The overall diagnostic accuracy of CADINO was 86%. In patients, CADINO's accuracy was approximately 84%, similar to that of faculty/consultants (80%) but significantly better than that of residents (57%). Most clinicians (84%) rated the CADINO consultation as educationally helpful and useful for patient management. CADINO was found to be effective and convenient, as it could be operated in the OPD while evaluating dizzy patients, and it provided accurate diagnostic suggestions. It was found to improve patient safety and quality of care by enhancing clinicians' knowledge and cognitive skills.
Affiliation(s)
- Mohan Bansal
  - Department of Otorhinolaryngology Head and Neck Surgery, Parul Institute of Medical Sciences and Research, Parul University, Limda, Waghodia, Vadodara, Gujarat, India
10
Painter A, Hayhoe B, Riboli-Sasco E, El-Osta A. Online Symptom Checkers: Recommendations for a Vignette-Based Clinical Evaluation Standard. J Med Internet Res 2022; 24:e37408. DOI: 10.2196/37408.
Abstract
The use of patient-facing online symptom checkers (OSCs) has expanded in recent years, but their accuracy, safety, and impact on patient behaviors and health care systems remain unclear. The lack of a standardized process of clinical evaluation has resulted in significant variation in approaches to OSC validation and evaluation. The aim of this paper is to characterize a set of congruent requirements for a standardized vignette-based clinical evaluation process for OSCs. Discrepancies in the findings of comparative studies to date suggest that different steps in OSC evaluation methodology can significantly influence outcomes. A standardized process with a clear specification for vignette-based clinical evaluation is urgently needed to guide developers and facilitate the objective comparison of OSCs. We propose 15 requirements for an OSC evaluation standard. A third-party evaluation process and protocols for prospective real-world evidence studies should also be prioritized to quality-assure OSC assessment.
11
Fritz P, Kleinhans A, Raoufi R, Sediqi A, Schmid N, Schricker S, Schanz M, Fritz-Kuisle C, Dalquen P, Firooz H, Stauch G, Alscher MD. Evaluation of medical decision support systems (DDX generators) using real medical cases of varying complexity and origin. BMC Med Inform Decis Mak 2022; 22:254. PMID: 36153527. PMCID: PMC9509605. DOI: 10.1186/s12911-022-01988-2.
Abstract
Background
Clinical decision support systems (CDSSs) are increasingly used in medicine, but their utility in daily medical practice is difficult to evaluate. One variant of CDSS is a generator of differential diagnoses (DDx generator). We performed a feasibility study on three different, publicly available data sets of medical cases to identify how often two different DDx generators provide helpful information for a given case report (either a useful list of differential diagnoses or recognition of the expert diagnosis, where available).
Methods
The data sets used were n = 105 real-life cases from a web-based telemedicine forum in Afghanistan (Afghan data set; AD) and n = 124 cases discussed in a web-based medical forum (Coliquio data set; CD); both websites are restricted to medical professionals. The third data set consisted of 50 special case reports published in the New England Journal of Medicine (NEJM). After keyword extraction, data were entered into two different DDx generators (IsabelHealth (IH) and Memem7 (M7)) to examine differences in target-diagnosis recognition and physician-rated usefulness between the DDx generators.
Results
Both DDx generators detected the target diagnosis equally successfully (all cases: M7, 83/170 (49%); IH, 90/170 (53%); NEJM: M7, 28/50 (56%); IH, 34/50 (68%); differences n.s.). Differences occurred in AD, where detection of the expert diagnosis was less successful with IH than with M7 (29.7% vs. 54.1%, p = 0.003). In contrast, in CD, IH performed significantly better than M7 (73.9% vs. 32.6%, p = 0.021). The two systems identified the target diagnosis congruently in only 46/170 (27.1%) of cases. However, a qualitative analysis of the DDx results revealed that using the two systems in parallel yielded useful complementary information.
Conclusion
Both DDx systems, IsabelHealth and Memem7, provided substantial help in producing a useful list of differential diagnoses or identifying the target diagnosis, whether in standard cases or in complicated and rare cases. Our pilot study highlights the need for real-world medical test cases of differing complexity levels and types, as there are significant differences between DDx generators once one moves away from traditional case reports. Combining the results of different DDx generators appears to be a viable approach for future review and use of such systems.
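The paper's suggestion of combining DDx generator outputs can be sketched with a small, self-contained example (the diagnosis lists and case data below are hypothetical, not the study's data): take the union of the two differential lists and check whether the target diagnosis is covered.

```python
def target_detected(ddx_list, target):
    """Case-insensitive check whether the target diagnosis appears in a DDx list."""
    return target.lower() in (d.lower() for d in ddx_list)

def detection_rates(cases):
    """Per-system and combined (union) target-detection rates over a case set.

    `cases` is a list of (target, ddx_a, ddx_b) tuples.
    """
    hits_a = hits_b = hits_union = 0
    for target, ddx_a, ddx_b in cases:
        a = target_detected(ddx_a, target)
        b = target_detected(ddx_b, target)
        hits_a += a
        hits_b += b
        hits_union += a or b  # parallel use: a hit from either system counts
    n = len(cases)
    return hits_a / n, hits_b / n, hits_union / n

# Hypothetical cases: (expert diagnosis, system A output, system B output)
cases = [
    ("Pulmonary embolism", ["Pneumonia", "Pulmonary embolism"], ["Pleuritis"]),
    ("Appendicitis", ["Gastroenteritis"], ["Appendicitis", "Ovarian torsion"]),
    ("Tonsillitis", ["Tonsillitis"], ["Tonsillitis", "Pharyngitis"]),
    ("Sarcoidosis", ["Tuberculosis"], ["Lymphoma"]),
]
rate_a, rate_b, rate_union = detection_rates(cases)
print(rate_a, rate_b, rate_union)  # 0.5 0.5 0.75
```

Because the two systems agreed on the target diagnosis in only 27.1% of the study's cases, the union rate can exceed either system's individual rate, which is the intuition behind running the generators in parallel.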
12
Schmieding ML, Kopka M, Schmidt K, Schulz-Niethammer S, Balzer F, Feufel MA. Triage Accuracy of Symptom Checker Apps: 5-Year Follow-up Evaluation. J Med Internet Res 2022; 24:e31810. PMID: 35536633. PMCID: PMC9131144. DOI: 10.2196/31810.
Abstract
BACKGROUND Symptom checkers are digital tools assisting laypersons in self-assessing the urgency and potential causes of their medical complaints. They are widely used but face concerns from both patients and health care professionals, especially regarding their accuracy. A 2015 landmark study substantiated these concerns using case vignettes to demonstrate that symptom checkers commonly err in their triage assessment. OBJECTIVE This study aims to revisit the landmark index study to investigate whether and how symptom checkers' capabilities have evolved since 2015 and how they currently compare with laypersons' stand-alone triage appraisal. METHODS In early 2020, we searched for smartphone and web-based applications providing triage advice. We evaluated these apps on the same 45 case vignettes as the index study. Using descriptive statistics, we compared our findings with those of the index study and with publicly available data on laypersons' triage capability. RESULTS We retrieved 22 symptom checkers providing triage advice. The median triage accuracy in 2020 (55.8%, IQR 15.1%) was close to that in 2015 (59.1%, IQR 15.5%). The apps in 2020 were less risk averse (odds 1.11:1, the ratio of overtriage errors to undertriage errors) than those in 2015 (odds 2.82:1), missing >40% of emergencies. Few apps outperformed laypersons in either deciding whether emergency care was required or whether self-care was sufficient. No apps outperformed the laypersons on both decisions. CONCLUSIONS Triage performance of symptom checkers has, on average, not improved over the course of 5 years. It decreased in 2 use cases (advice on when emergency care is required and when no health care is needed for the moment). However, triage capability varies widely within the sample of symptom checkers. Whether it is beneficial to seek advice from symptom checkers depends on the app chosen and on the specific question to be answered. Future research should develop resources (eg, case vignette repositories) to audit the capabilities of symptom checkers continuously and independently and provide guidance on when and to whom they should be recommended.
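The study's two headline metrics, triage accuracy and the overtriage:undertriage odds, can be computed from vignette results with a short sketch (the three-tier scheme and the result data below are illustrative assumptions, not the study's actual tiers or data):

```python
# Ordinal triage levels, least to most urgent (a common three-tier scheme;
# actual symptom-checker schemes vary).
LEVELS = {"self-care": 0, "non-emergency care": 1, "emergency care": 2}

def triage_metrics(pairs):
    """Accuracy and overtriage:undertriage odds from (gold, advice) triage pairs."""
    correct = over = under = 0
    for gold, advice in pairs:
        g, a = LEVELS[gold], LEVELS[advice]
        if a == g:
            correct += 1
        elif a > g:
            over += 1   # app more risk-averse than the vignette's gold standard
        else:
            under += 1  # app missed urgency (the dangerous direction)
    n = len(pairs)
    odds = over / under if under else float("inf")
    return correct / n, odds

# Hypothetical vignette results: (gold-standard tier, app advice)
results = [
    ("emergency care", "emergency care"),
    ("emergency care", "non-emergency care"),      # undertriage
    ("non-emergency care", "emergency care"),      # overtriage
    ("self-care", "self-care"),
    ("self-care", "non-emergency care"),           # overtriage
    ("non-emergency care", "non-emergency care"),
]
accuracy, odds = triage_metrics(results)
print(f"accuracy={accuracy:.2f}, overtriage:undertriage odds={odds:.2f}:1")
```

Under this convention, the paper's reported drop from 2.82:1 to 1.11:1 means errors shifted from the cautious direction (overtriage) toward the dangerous one (undertriage).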
Affiliation(s)
- Malte L Schmieding
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Marvin Kopka
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Cognitive Psychology and Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Konrad Schmidt
  - Institute of General Practice and Family Medicine, Jena University Hospital, Jena, Germany
  - Institute of General Practice and Family Medicine, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Sven Schulz-Niethammer
  - Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Felix Balzer
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Markus A Feufel
  - Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
13
|
Johnson AE, Brewer LC, Echols MR, Mazimba S, Shah RU, Breathett K. Utilizing Artificial Intelligence to Enhance Health Equity Among Patients with Heart Failure. Heart Fail Clin 2022; 18:259-273. [PMID: 35341539 PMCID: PMC8988237 DOI: 10.1016/j.hfc.2021.11.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
Patients with heart failure (HF) are heterogeneous, with various intrapersonal and interpersonal characteristics contributing to clinical outcomes. Bias, structural racism, and social determinants of health have been implicated in unequal treatment of patients with HF. Through several methodologies, artificial intelligence (AI) can provide models in HF prediction, prognostication, and provision of care, which may help prevent unequal outcomes. This review highlights AI as a strategy to address racial inequalities in HF; discusses key AI definitions within a health equity context; describes the current uses of AI in HF, along with the strengths and harms of using AI; and offers recommendations for future directions.
Affiliation(s)
- Amber E Johnson
  - University of Pittsburgh School of Medicine, Heart and Vascular Institute, Veterans Affairs Pittsburgh Health System, 200 Lothrop Street, Pittsburgh, PA 15213, USA
- LaPrincess C Brewer
  - Division of Preventive Cardiology, Department of Cardiovascular Medicine, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA
- Melvin R Echols
  - Division of Cardiovascular Medicine, Morehouse School of Medicine, 720 Westview Drive, Atlanta, GA 30310, USA
- Sula Mazimba
  - Division of Cardiovascular Medicine, Advanced Heart Failure and Transplant Center, University of Virginia, 2nd Floor, 1221 Lee Street, Charlottesville, VA 22903, USA
- Rashmee U Shah
  - Division of Cardiovascular Medicine, University of Utah, 30 N 1900 E, Cardiology, 4A100, Salt Lake City, UT 84132, USA
- Khadijah Breathett
  - Division of Cardiovascular Medicine, Sarver Heart Center, University of Arizona, 1501 North Campbell Avenue, PO Box 245046, Tucson, AZ 85724, USA

14
Ginghina O, Hudita A, Zamfir M, Spanu A, Mardare M, Bondoc I, Buburuzan L, Georgescu SE, Costache M, Negrei C, Nitipir C, Galateanu B. Liquid Biopsy and Artificial Intelligence as Tools to Detect Signatures of Colorectal Malignancies: A Modern Approach in Patient's Stratification. Front Oncol 2022; 12:856575. [PMID: 35356214 PMCID: PMC8959149 DOI: 10.3389/fonc.2022.856575] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2022] [Accepted: 02/16/2022] [Indexed: 01/19/2023] Open
Abstract
Colorectal cancer (CRC) is the second most frequently diagnosed type of cancer and a major worldwide public health concern. Despite the global efforts in the development of modern therapeutic strategies, CRC prognosis is strongly correlated with the stage of the disease at diagnosis. Early detection of CRC has a huge impact in decreasing mortality, while pre-lesion detection significantly reduces the incidence of the pathology. Even though the management of CRC patients is based on robust diagnostic methods such as serum tumor markers analysis, colonoscopy, histopathological analysis of tumor tissue, and imaging methods (computed tomography or magnetic resonance), these strategies still have many limitations and do not fully satisfy clinical needs due to their lack of sensitivity and/or specificity. Therefore, improvements of the current practice would substantially impact the management of CRC patients. In this view, liquid biopsy is a promising approach that could help clinicians screen for disease, stratify patients to the best treatment, and monitor treatment response and resistance mechanisms in the tumor in a regular and minimally invasive manner. Liquid biopsies allow the detection and analysis of different tumor-derived circulating markers such as cell-free nucleic acids (cfNA), circulating tumor cells (CTCs), and extracellular vesicles (EVs) in the bloodstream. The major advantage of this approach is its ability to trace and monitor the molecular profile of the patient's tumor and to predict personalized treatment in real-time. On the other hand, the prospective use of artificial intelligence (AI) in medicine holds great promise in oncology, for the diagnosis, treatment, and prognosis prediction of disease. AI has two main branches in the medical field: (i) a virtual branch that includes medical imaging, clinical assisted diagnosis, and treatment, as well as drug research, and (ii) a physical branch that includes surgical robots. This review summarizes findings relevant to liquid biopsy and AI in CRC for better management and stratification of CRC patients.
Affiliation(s)
- Octav Ginghina
  - Department II, University of Medicine and Pharmacy “Carol Davila” Bucharest, Bucharest, Romania
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Ariana Hudita
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest, Romania
- Marius Zamfir
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Andrada Spanu
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Mara Mardare
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Irina Bondoc
  - Department of Surgery, “Sf. Ioan” Clinical Emergency Hospital, Bucharest, Romania
- Sergiu Emil Georgescu
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest, Romania
- Marieta Costache
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest, Romania
- Carolina Negrei
  - Department of Toxicology, University of Medicine and Pharmacy “Carol Davila” Bucharest, Bucharest, Romania
- Cornelia Nitipir
  - Department II, University of Medicine and Pharmacy “Carol Davila” Bucharest, Bucharest, Romania
  - Department of Oncology, Elias University Emergency Hospital, Bucharest, Romania
- Bianca Galateanu
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest, Romania

15
Brush JE, Hajduk AM, Greene EJ, Dreyer RP, Krumholz HM, Chaudhry SI. Sex Differences in Symptom Phenotypes Among Older Patients with Acute Myocardial Infarction. Am J Med 2022; 135:342-349. [PMID: 34715061 PMCID: PMC8901454 DOI: 10.1016/j.amjmed.2021.09.022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/19/2021] [Revised: 09/21/2021] [Accepted: 09/28/2021] [Indexed: 01/05/2023]
Abstract
BACKGROUND Clinicians make a medical diagnosis by recognizing diagnostic possibilities, often using memories of prior examples. These memories, called "exemplars," reflect specific symptom combinations in individual patients, yet most clinical studies report how symptoms aggregate in populations. We studied how symptoms of acute myocardial infarction combine in individuals as symptom phenotypes and how symptom phenotypes are distributed in women and men. METHODS In this analysis of the SILVER-AMI Study, we studied 3041 patients (1346 women and 1645 men) 75 years of age or older with acute myocardial infarction. Each patient had a standardized in-person interview during the acute myocardial infarction admission to document the presenting symptoms, which enabled a thorough examination of symptom combinations in individuals. Specific symptom combinations defined symptom phenotypes, and distributions of symptom phenotypes were compared in women and men using Monte Carlo permutation testing and repeated subsampling. RESULTS There were 1469 unique symptom phenotypes in the entire SILVER-AMI cohort of patients with acute myocardial infarction. There were 831 unique symptom phenotypes in women, as compared with 819 in men, which was highly significant, given the larger number of men than women in the study (P < .0001). Women had significantly more symptom phenotypes than men in almost all acute myocardial infarction subgroups. CONCLUSIONS Older patients with acute myocardial infarction have enormous variation in symptom phenotypes. Women reported more symptoms and had significantly more symptom phenotypes than men. Appreciation of the diversity of symptom phenotypes may help clinicians recognize the less common phenotypes that occur more often in women.
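The core idea above, treating each patient's symptom combination as one "phenotype" and comparing unique-phenotype counts across groups of unequal size by repeated subsampling, can be sketched briefly. This is an illustration only: the function names (`unique_phenotypes`, `mean_subsampled_count`) and the symptom data are invented, not the SILVER-AMI analysis code.

```python
import random

def unique_phenotypes(patients):
    """Count distinct symptom combinations ("phenotypes") in a group.

    patients: list of symptom collections, one per patient.
    """
    return len({frozenset(p) for p in patients})

def mean_subsampled_count(patients, n, reps=200, seed=0):
    """Mean unique-phenotype count over random subsamples of size n,
    a simple way to compare groups of unequal size."""
    rng = random.Random(seed)
    return sum(unique_phenotypes(rng.sample(patients, n)) for _ in range(reps)) / reps

# Hypothetical symptom data, not from the study:
women = [{"chest pain", "dyspnea"}, {"dyspnea"}, {"chest pain", "dyspnea"}, {"nausea", "fatigue"}]
men = [{"chest pain"}, {"chest pain"}, {"chest pain", "dyspnea"}]
print(unique_phenotypes(women), unique_phenotypes(men))  # 3 2
```

Subsampling both groups to the same size before counting is what lets a raw comparison like 831 vs 819 phenotypes be interpreted despite the cohort containing more men than women.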
Affiliation(s)
- John E Brush
  - Sentara Healthcare and Eastern Virginia Medical School, Norfolk
- Alexandra M Hajduk
  - Section of Geriatrics, Department of Internal Medicine, Yale School of Medicine, New Haven, Conn
- Erich J Greene
  - Department of Health Policy and Management and Department of Biostatistics, Yale School of Medicine, New Haven, Conn
- Rachel P Dreyer
  - Section of Cardiovascular Medicine, Department of Internal Medicine and Department of Emergency Medicine, Yale School of Medicine, New Haven, Conn; Yale School of Public Health; Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, New Haven, Conn
- Harlan M Krumholz
  - Department of Health Policy and Management and Department of Biostatistics, Yale School of Medicine, New Haven, Conn; Section of Cardiovascular Medicine, Department of Internal Medicine and Department of Emergency Medicine, Yale School of Medicine, New Haven, Conn; Yale School of Public Health; Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, New Haven, Conn
- Sarwat I Chaudhry
  - Section of General Internal Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, Conn

16
Abstract
Research in cognitive psychology shows that expert clinicians make a medical diagnosis through a two-step process of hypothesis generation and hypothesis testing. Experts generate a list of possible diagnoses quickly and intuitively, drawing on previous experience. Experts remember specific examples of various disease categories as exemplars, which enables rapid access to diagnostic possibilities and gives them an intuitive sense of the base rates of various diagnoses. After generating diagnostic hypotheses, clinicians then test the hypotheses and subjectively estimate the probability of each diagnostic possibility by using a heuristic called anchoring and adjusting. Although both novices and experts use this two-step diagnostic process, experts distinguish themselves as better diagnosticians through their ability to mobilize experiential knowledge in a manner that is content specific. Experience is clearly the best teacher, but some educational strategies have been shown to modestly improve diagnostic accuracy. Increased knowledge about the cognitive psychology of the diagnostic process and the pitfalls inherent in the process may inform clinical teachers and help learners and clinicians to improve the accuracy of diagnostic reasoning. This article reviews the literature on the cognitive psychology of diagnostic reasoning in the context of cardiovascular disease.
Affiliation(s)
- John E Brush
  - Sentara Health Research Center, Norfolk, VA, USA
  - Eastern Virginia Medical School, Norfolk, VA, USA
- Jonathan Sherbino
  - McMaster Education Research, Innovation and Theory (MERIT) Program, McMaster University, Hamilton, ON, Canada
  - Department of Medicine, McMaster University, Hamilton, ON, Canada
- Geoffrey R Norman
  - McMaster Education Research, Innovation and Theory (MERIT) Program, McMaster University, Hamilton, ON, Canada

17
Yang G, Ye Q, Xia J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond. Inf Fusion 2022; 77:29-52. [PMID: 34980946 PMCID: PMC8459787 DOI: 10.1016/j.inffus.2021.07.016] [Citation(s) in RCA: 140] [Impact Index Per Article: 70.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Revised: 05/25/2021] [Accepted: 07/25/2021] [Indexed: 05/04/2023]
Abstract
Explainable Artificial Intelligence (XAI) is an emerging research topic of machine learning aimed at unboxing how AI systems' black-box choices are made. This research field inspects the measures and models involved in decision-making and seeks solutions to explain them explicitly. Many machine learning algorithms cannot show how and why a decision has been made; this is particularly true of the most popular deep neural network approaches currently in use. Consequently, our confidence in AI systems can be hindered by the lack of explainability in these black-box models. XAI is becoming more and more crucial for deep learning powered applications, especially for medical and healthcare studies, even though in general these deep neural networks can return an arresting dividend in performance. The insufficient explainability and transparency of most existing AI systems may be one of the major reasons that successful implementation and integration of AI tools into routine clinical practice remain uncommon. In this study, we first surveyed the current progress of XAI and in particular its advances in healthcare applications. We then introduced our solutions for XAI leveraging multi-modal and multi-centre data fusion and subsequently validated them in two showcases following real clinical scenarios. Comprehensive quantitative and qualitative analyses demonstrate the efficacy of our proposed XAI solutions, from which we can envisage successful applications in a broader range of clinical questions.
Affiliation(s)
- Guang Yang
  - National Heart and Lung Institute, Imperial College London, London, UK
  - Royal Brompton Hospital, London, UK
  - Imperial Institute of Advanced Technology, Hangzhou, China
- Qinghao Ye
  - Hangzhou Ocean’s Smart Boya Co., Ltd, China
  - University of California, San Diego, La Jolla, CA, USA
- Jun Xia
  - Radiology Department, Shenzhen Second People’s Hospital, Shenzhen, China

18
Ben-Shabat N, Sloma A, Weizman T, Kiderman D, Amital H. Diagnostic Performance of a New Artificial-Intelligence Driven Diagnostic Support Tool: Board-Exams Clinical Vignette Study. JMIR Med Inform 2021; 9:e32507. [PMID: 34672262 PMCID: PMC8672291 DOI: 10.2196/32507] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2021] [Revised: 10/20/2021] [Accepted: 10/20/2021] [Indexed: 01/01/2023] Open
Abstract
Background Diagnostic decision support systems (DDSS) are computer programs aimed to improve health care by supporting clinicians in the process of diagnostic decision-making. Previous studies on DDSS demonstrated their ability to enhance clinicians’ diagnostic skills, prevent diagnostic errors, and reduce hospitalization costs. Despite the potential benefits, their utilization in clinical practice is limited, emphasizing the need for new and improved products. Objective The aim of this study was to conduct a preliminary analysis of the diagnostic performance of “Kahun,” a new artificial intelligence-driven diagnostic tool. Methods Diagnostic performance was evaluated based on the program’s ability to “solve” clinical cases from the United States Medical Licensing Examination Step 2 Clinical Skills board exam simulations that were drawn from the case banks of 3 leading preparation companies. Each case included 3 expected differential diagnoses. The cases were entered into the Kahun platform by 3 blinded junior physicians. For each case, the presence and the rank of the correct diagnoses within the generated differential diagnoses list were recorded. Each diagnostic performance was measured in two ways: first, as diagnostic sensitivity, and second, as case-specific success rates that represent diagnostic comprehensiveness. Results The study included 91 clinical cases with 78 different chief complaints and a mean number of 38 (SD 8) findings for each case. The total number of expected diagnoses was 272, of which 174 were different (some appeared more than once). Of the 272 expected diagnoses, 231 (87.5%; 95% CI 76-99) were suggested within the top 20 listed diagnoses, 209 (76.8%; 95% CI 66-87) within the top 10, and 168 (61.8%; 95% CI 52-71) within the top 5. The median rank of correct diagnoses was 3 (IQR 2-6). Of the 91 cases, all 3 expected diagnoses were suggested within the top 20 listed diagnoses in 62 (68%; 95% CI 59-78), within the top 10 in 44 (48%; 95% CI 38-59), and within the top 5 in 24 (26%; 95% CI 17-35). In 87 of the 91 cases (96%; 95% CI 91-100), at least 2 of the 3 expected diagnoses were suggested within the top 20 listed diagnoses; in 78 (86%; 95% CI 79-93), within the top 10; and in 61 (67%; 95% CI 57-77), within the top 5. Conclusions The diagnostic support tool evaluated in this study demonstrated good diagnostic accuracy and comprehensiveness; it also had the ability to manage a wide range of clinical findings.
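The two performance measures named in this abstract, diagnostic sensitivity (share of all expected diagnoses appearing in the top-k of the generated list) and case-specific success rates (share of cases with at least m of their expected diagnoses in the top-k), can be sketched as below. The function names and the case data are invented for illustration and are not the study's evaluation code.

```python
# Sketch of top-k diagnostic sensitivity and case-level success rates
# for a differential-diagnosis generator; data are hypothetical.
def sensitivity(cases, k):
    """Fraction of all expected diagnoses found in each case's top-k list."""
    expected = sum(len(c["expected"]) for c in cases)
    hits = sum(sum(d in c["generated"][:k] for d in c["expected"]) for c in cases)
    return hits / expected

def case_success_rate(cases, k, min_hits):
    """Fraction of cases with at least min_hits expected diagnoses in the top-k."""
    ok = sum(
        sum(d in c["generated"][:k] for d in c["expected"]) >= min_hits
        for c in cases
    )
    return ok / len(cases)

cases = [
    {"expected": ["A", "B", "C"], "generated": ["A", "X", "B", "Y", "C"]},
    {"expected": ["D", "E", "F"], "generated": ["D", "E", "X", "Y", "Z"]},
]
print(sensitivity(cases, 5))           # ~0.833 (5 of 6 expected diagnoses found)
print(case_success_rate(cases, 5, 2))  # 1.0 (both cases have >= 2 hits)
```

The distinction matters: sensitivity pools all 272 expected diagnoses, while the case-level rates ask how comprehensively each of the 91 individual cases was covered.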
Affiliation(s)
- Niv Ben-Shabat
  - Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
  - Department of Medicine 'B', Sheba Medical Center, Sheba Road 2, Ramat Gan, Israel
  - Kahun Medical Ltd, Tel-Aviv, Israel
- Arial Sloma
  - Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
  - Kahun Medical Ltd, Tel-Aviv, Israel
- Tomer Weizman
  - The Ruth and Bruce Rappaport Faculty of Medicine, Technion Israel Institute of Technology, Haifa, Israel
  - Kahun Medical Ltd, Tel-Aviv, Israel
- David Kiderman
  - Hadassah Faculty of Medicine, The Hebrew University, Jerusalem, Israel
- Howard Amital
  - Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
  - Department of Medicine 'B', Sheba Medical Center, Sheba Road 2, Ramat Gan, Israel
  - Siaal Research Center for Family Medicine and Primary Care, Faculty of Health Sciences, Ben Gurion University of the Negev, Beer-Sheva, Israel

19
Sibbald M, Monteiro S, Sherbino J, LoGiudice A, Friedman C, Norman G. Should electronic differential diagnosis support be used early or late in the diagnostic process? A multicentre experimental study of Isabel. BMJ Qual Saf 2021; 31:426-433. [PMID: 34611040 PMCID: PMC9132870 DOI: 10.1136/bmjqs-2021-013493] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2021] [Accepted: 09/09/2021] [Indexed: 12/17/2022]
Abstract
Background Diagnostic errors unfortunately remain common. Electronic differential diagnostic support (EDS) systems may help, but it is unclear when and how they ought to be integrated into the diagnostic process. Objective To explore how much EDS improves diagnostic accuracy, and whether EDS should be used early or late in the diagnostic process. Setting 6 Canadian medical schools. A volunteer sample of 67 medical students, 62 residents in internal medicine or emergency medicine, and 61 practising internists or emergency medicine physicians were recruited in May through June 2020. Intervention Participants were randomised to make use of EDS either early (after the chief complaint) or late (after the complete history and physical is available) in the diagnostic process while solving each of 16 written cases. For each case, we measured the number of diagnoses proposed in the differential diagnosis and how often the correct diagnosis was present within the differential. Results EDS increased the number of diagnostic hypotheses by 2.32 (95% CI 2.10 to 2.49) when used early in the process and 0.89 (95% CI 0.69 to 1.10) when used late in the process (both p<0.001). Both early and late use of EDS increased the likelihood of the correct diagnosis being present in the differential (7% and 8%, respectively, both p<0.001). Whereas early use increased the number of diagnostic hypotheses (most notably for students and residents), late use increased the likelihood of the correct diagnosis being present in the differential regardless of one’s experience level. Conclusions and relevance EDS increased the number of diagnostic hypotheses and the likelihood of the correct diagnosis appearing in the differential, and these effects persisted irrespective of whether EDS was used early or late in the diagnostic process.
Affiliation(s)
- Matt Sibbald
  - Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Sandra Monteiro
  - Department of Health Evidence and Impact, McMaster University, Hamilton, Ontario, Canada
- Jonathan Sherbino
  - Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Geoffrey Norman
  - Department of Health Evidence and Impact, McMaster University, Hamilton, Ontario, Canada

20
Vinny PW, Takkar A, Lal V, Padma MV, Sylaja PN, Narasimhan L, Dwivedi SN, Nair PP, Iype T, Gupta A, Vishnu VY. Mobile application as a complementary tool for differential diagnosis in Neuro-ophthalmology: A multicenter cross-sectional study. Indian J Ophthalmol 2021; 69:1491-1497. [PMID: 34011726 PMCID: PMC8302325 DOI: 10.4103/ijo.ijo_1929_20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023] Open
Abstract
Purpose: Drawing differential diagnoses from a Neuro-ophthalmology clinical scenario is a difficult task for a neurology trainee. The authors conducted a study to determine if a mobile application specialized in suggesting differential diagnoses from clinical scenarios can complement the clinical reasoning of a neurologist in training. Methods: A cross-sectional multicenter study was conducted to compare the accuracy of neurology residents versus a mobile medical app (Neurology Dx) in drawing a comprehensive list of differential diagnoses from Neuro-ophthalmology clinical vignettes. The differentials generated by residents and the App were compared with the gold standard differential diagnoses adjudicated by experts. The prespecified primary outcome was the proportion of correctly identified high likely gold standard differential diagnoses by residents and App. Results: Neurology residents (n = 100) attempted 1500 Neuro-ophthalmology clinical vignettes. The frequency of correctly identified high likely differential diagnoses was 19.42% for residents versus 53.71% for the App (P < 0.0001). The first listed differential diagnosis by the residents matched the first differential diagnosis adjudicated by experts (gold standard differential diagnosis) with a frequency of 26.5%, versus 28.3% for the App, whereas the combined output of residents and App scored a frequency of 41.2% in identifying the first gold standard differential correctly. The residents correctly identified the first three and first five gold standard differential diagnoses with a frequency of 17.83% and 19.2%, respectively, as against 22.26% and 30.39% (P < 0.0001) for the App. Conclusion: A rule-based app in Neuro-ophthalmology has the potential to complement a neurology resident in drawing a comprehensive list of differential diagnoses.
Affiliation(s)
- Aastha Takkar
  - Neurology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Vivek Lal
  - Neurology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- P N Sylaja
  - Neurology, Sree Chitra Tirunal Institute of Medical Sciences and Technology, Thiruvananthapuram, Kerala, India
- Sada Nand Dwivedi
  - Biostatistics, All India Institute of Medical Sciences, New Delhi, India
- Pradeep P Nair
  - Neurology, Jawaharlal Nehru Institute of Postgraduate Medical Education and Research, Puducherry, India
- Thomas Iype
  - Neurology, Government Medical College Trivandrum, Kerala, India
- Anu Gupta
  - Neurology, Govind Ballabh Pant Institute of Postgraduate Medical Education and Research, New Delhi, India

21
Schmieding ML, Mörgeli R, Schmieding MAL, Feufel MA, Balzer F. Benchmarking Triage Capability of Symptom Checkers Against That of Medical Laypersons: Survey Study. J Med Internet Res 2021; 23:e24475. [PMID: 33688845 PMCID: PMC7991983 DOI: 10.2196/24475] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2020] [Revised: 10/22/2020] [Accepted: 01/18/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Symptom checkers (SCs) are tools developed to provide clinical decision support to laypersons. Apart from suggesting probable diagnoses, they commonly advise when users should seek care (triage advice). SCs have become increasingly popular despite prior studies rating their performance as mediocre. To date, it is unclear whether SCs can triage better than those who might choose to use them. OBJECTIVE This study aims to compare triage accuracy between SCs and their potential users (ie, laypersons). METHODS On Amazon Mechanical Turk, we recruited 91 adults from the United States who had no professional medical background. In a web-based survey, the participants evaluated 45 fictitious clinical case vignettes. Data for 15 SCs that had processed the same vignettes were obtained from a previous study. As main outcome measures, we assessed the accuracy of the triage assessments made by participants and SCs for each of the three triage levels (ie, emergency care, nonemergency care, self-care) and overall, the proportion of participants outperforming each SC in terms of accuracy, and the risk aversion of participants and SCs by comparing the proportion of cases that were overtriaged. RESULTS The mean overall triage accuracy was similar for participants (60.9%, SD 6.8%; 95% CI 59.5%-62.3%) and SCs (58%, SD 12.8%). Most participants outperformed all but 5 SCs. On average, SCs more reliably detected emergencies (80.6%, SD 17.9%) than laypersons did (67.5%, SD 16.4%; 95% CI 64.1%-70.8%). Although both SCs and participants struggled with cases requiring self-care (the least urgent triage category), SCs more often wrongly classified these cases as emergencies (43/174, 24.7%) compared with laypersons (56/1365, 4.10%). CONCLUSIONS Most SCs had no greater triage capability than an average layperson, although the triage accuracy of the five best SCs was superior to the accuracy of most participants. SCs might improve early detection of emergencies but might also needlessly increase resource utilization in health care. Laypersons sometimes require support in deciding when to rely on self-care, but it is in that very situation where SCs perform the worst. Further research is needed to determine how to best combine the strengths of humans and SCs.
Affiliation(s)
- Malte L Schmieding
  - Department of Anesthesiology and Operative Intensive Care, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Rudolf Mörgeli
  - Department of Anesthesiology and Operative Intensive Care, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Maike A L Schmieding
  - Department of Biology, Chemistry, and Pharmacy, Institute of Pharmacy, Freie Universität Berlin, Berlin, Germany
- Markus A Feufel
  - Department of Psychology and Ergonomics (IPA), Division of Ergonomics, Technische Universität Berlin, Berlin, Germany
- Felix Balzer
  - Department of Anesthesiology and Operative Intensive Care, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany

22
Jones OT, Calanzani N, Saji S, Duffy SW, Emery J, Hamilton W, Singh H, de Wit NJ, Walter FM. Artificial Intelligence Techniques That May Be Applied to Primary Care Data to Facilitate Earlier Diagnosis of Cancer: Systematic Review. J Med Internet Res 2021; 23:e23483. [PMID: 33656443 PMCID: PMC7970165 DOI: 10.2196/23483] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2020] [Revised: 11/05/2020] [Accepted: 11/30/2020] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND More than 17 million people worldwide, including 360,000 people in the United Kingdom, were diagnosed with cancer in 2018. Cancer prognosis and disease burden are highly dependent on the disease stage at diagnosis. Most people diagnosed with cancer first present in primary care settings, where improved assessment of the (often vague) presenting symptoms of cancer could lead to earlier detection and improved outcomes for patients. There is accumulating evidence that artificial intelligence (AI) can assist clinicians in making better clinical decisions in some areas of health care. OBJECTIVE This study aimed to systematically review AI techniques that may facilitate earlier diagnosis of cancer and could be applied to primary care electronic health record (EHR) data. The quality of the evidence, the phase of development the AI techniques have reached, the gaps that exist in the evidence, and the potential for use in primary care were evaluated. METHODS We searched MEDLINE, Embase, SCOPUS, and Web of Science databases from January 01, 2000, to June 11, 2019, and included all studies providing evidence for the accuracy or effectiveness of applying AI techniques for the early detection of cancer, which may be applicable to primary care EHRs. We included all study designs in all settings and languages. These searches were extended through a scoping review of AI-based commercial technologies. The main outcomes assessed were measures of diagnostic accuracy for cancer. RESULTS We identified 10,456 studies; 16 studies met the inclusion criteria, representing the data of 3,862,910 patients. A total of 13 studies described the initial development and testing of AI algorithms, and 3 studies described the validation of an AI algorithm in independent data sets. One study was based on prospectively collected data; only 3 studies were based on primary care data. We found no data on implementation barriers or cost-effectiveness. Risk of bias assessment highlighted a wide range of study quality. The additional scoping review of commercial AI technologies identified 21 technologies, only 1 meeting our inclusion criteria. Meta-analysis was not undertaken because of the heterogeneity of AI modalities, data set characteristics, and outcome measures. CONCLUSIONS AI techniques have been applied to EHR-type data to facilitate early diagnosis of cancer, but their use in primary care settings is still at an early stage of maturity. Further evidence is needed on their performance using primary care data, implementation barriers, and cost-effectiveness before widespread adoption into routine primary care clinical practice can be recommended.
Affiliation(s)
- Owain T Jones
- Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, Cambridge, United Kingdom
- Natalia Calanzani
- Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, Cambridge, United Kingdom
- Smiji Saji
- Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, Cambridge, United Kingdom
- Stephen W Duffy
- Wolfson Institute for Preventive Medicine, Queen Mary University of London, London, United Kingdom
- Jon Emery
- Centre for Cancer Research and Department of General Practice, University of Melbourne, Victoria, Australia
- Willie Hamilton
- College of Medicine and Health, University of Exeter, Exeter, United Kingdom
- Hardeep Singh
- Center for Innovations in Quality, Effectiveness and Safety, Michael E DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, TX, United States
- Niek J de Wit
- Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht, Netherlands
- Fiona M Walter
- Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, Cambridge, United Kingdom
23
Gilbert S, Mehl A, Baluch A, Cawley C, Challiner J, Fraser H, Millen E, Montazeri M, Multmeier J, Pick F, Richter C, Türk E, Upadhyay S, Virani V, Vona N, Wicks P, Novorol C. How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs. BMJ Open 2020; 10:e040269. [PMID: 33328258 PMCID: PMC7745523 DOI: 10.1136/bmjopen-2020-040269] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Open
Abstract
OBJECTIVES To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. DESIGN Vignettes study. SETTING 200 primary care vignettes. INTERVENTION/COMPARATOR For eight apps and seven general practitioners (GPs): breadth of coverage and condition-suggestion and urgency advice accuracy measured against the vignettes' gold standard. PRIMARY OUTCOME MEASURES (1) Proportion of conditions 'covered' by an app, that is, not excluded because the user was too young/old or pregnant, or not modelled; (2) proportion of vignettes with the correct primary diagnosis among the top 3 conditions suggested; (3) proportion of 'safe' urgency advice (ie, at gold standard level, more conservative, or no more than one level less conservative). RESULTS Condition-suggestion coverage was highly variable, with some apps not offering a suggestion for many users: in alphabetical order, Ada: 99.0%; Babylon: 51.5%; Buoy: 88.5%; K Health: 74.5%; Mediktor: 80.5%; Symptomate: 61.5%; Your.MD: 64.5%; WebMD: 93.0%. Top-3 suggestion accuracy was GPs (average): 82.1%±5.2%; Ada: 70.5%; Babylon: 32.0%; Buoy: 43.0%; K Health: 36.0%; Mediktor: 36.0%; Symptomate: 27.5%; WebMD: 35.5%; Your.MD: 23.5%. Some apps excluded certain user demographics or conditions, and their performance was generally greater with the exclusion of the corresponding vignettes. For safe urgency advice, the tested GPs had an average of 97.0%±2.5%. For the vignettes with advice provided, only three apps had safety performance within 1 SD of the GPs: Ada: 97.0%; Babylon: 95.1%; Symptomate: 97.8%. One app had a safety performance within 2 SDs of the GPs: Your.MD: 92.6%. Three apps had a safety performance outside 2 SDs of the GPs: Buoy: 80.0% (p<0.001); K Health: 81.3% (p<0.001); Mediktor: 87.3% (p=0.0013). CONCLUSIONS The utility of digital symptom assessment apps relies on coverage, accuracy and safety. 
While no digital tool outperformed GPs, some came close, and the nature of iterative improvements to software offers scalable improvements to care.
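The study's three headline measures (coverage, top-3 suggestion accuracy, and 'safe' urgency advice) are proportions over the vignette set. A minimal sketch of how such measures might be computed; the vignette records and field names below are hypothetical, not taken from the study:

```python
# Illustrative computation of the three vignette-study metrics:
# coverage, top-3 suggestion accuracy, and proportion of "safe" advice.
# Data and field names are made up for illustration.

URGENCY_LEVELS = ["self_care", "routine_gp", "urgent_gp", "emergency"]

def evaluate(vignettes):
    # Coverage: the app produced any suggestion at all for this user.
    covered = [v for v in vignettes if v["suggestions"] is not None]
    coverage = len(covered) / len(vignettes)

    # Top-3 accuracy: gold-standard diagnosis among the first 3 suggestions,
    # counted over all vignettes (uncovered vignettes count as misses).
    top3_hits = sum(v["gold_diagnosis"] in v["suggestions"][:3] for v in covered)
    top3_accuracy = top3_hits / len(vignettes)

    # "Safe" advice: at gold level, more conservative, or at most one level less.
    def is_safe(v):
        given = URGENCY_LEVELS.index(v["advice"])
        gold = URGENCY_LEVELS.index(v["gold_urgency"])
        return given >= gold - 1

    with_advice = [v for v in covered if v.get("advice")]
    safety = sum(is_safe(v) for v in with_advice) / len(with_advice)
    return coverage, top3_accuracy, safety

vignettes = [
    {"suggestions": ["migraine", "tension headache", "sinusitis"],
     "gold_diagnosis": "migraine", "advice": "routine_gp", "gold_urgency": "routine_gp"},
    {"suggestions": None, "gold_diagnosis": "appendicitis",
     "advice": None, "gold_urgency": "emergency"},  # user/condition not covered
]
coverage, top3, safety = evaluate(vignettes)
```

Note how the denominators differ by design: uncovered vignettes drag down coverage and top-3 accuracy, while safety is computed only over vignettes where advice was actually given, mirroring the study's outcome definitions.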
Affiliation(s)
- Hamish Fraser
- Brown Center for Biomedical Informatics, Brown University, Rhode Island, USA
24
Khemasuwan D, Sorensen JS, Colt HG. Artificial intelligence in pulmonary medicine: computer vision, predictive model and COVID-19. Eur Respir Rev 2020; 29:29/157/200181. [PMID: 33004526 PMCID: PMC7537944 DOI: 10.1183/16000617.0181-2020] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2020] [Accepted: 08/20/2020] [Indexed: 12/21/2022] Open
Abstract
Artificial intelligence (AI) is transforming healthcare delivery. The digital revolution in medicine and healthcare information is prompting a staggering growth of data intertwined with elements from many digital sources such as genomics, medical imaging and electronic health records. Such massive growth has sparked the development of an increasing number of AI-based applications that can be deployed in clinical practice. Pulmonary specialists who are familiar with the principles of AI and its applications will be empowered and prepared to seize future practice and research opportunities. The goal of this review is to provide pulmonary specialists and other readers with information pertinent to the use of AI in pulmonary medicine. First, we describe the concept of AI and some of the requisites of machine learning and deep learning. Next, we review some of the literature relevant to the use of computer vision in medical imaging, predictive modelling with machine learning, and the use of AI for battling the novel severe acute respiratory syndrome-coronavirus-2 pandemic. We close our review with a discussion of limitations and challenges pertaining to the further incorporation of AI into clinical pulmonary practice. Artificial intelligence (AI) is changing the landscape in medicine. AI-based applications will empower pulmonary specialists to seize modern practice and research opportunities. Data-driven precision medicine is already here. https://bit.ly/324tl2m
Affiliation(s)
- Danai Khemasuwan
- Division of Pulmonary and Critical Care Medicine, Virginia Commonwealth University, Richmond, VA, USA
- Henri G Colt
- Division of Pulmonary and Critical Care Medicine, University of California Irvine, Irvine, CA, USA
25
Mathur P, Srivastava S, Xu X, Mehta JL. Artificial Intelligence, Machine Learning, and Cardiovascular Disease. CLINICAL MEDICINE INSIGHTS-CARDIOLOGY 2020; 14:1179546820927404. [PMID: 32952403 PMCID: PMC7485162 DOI: 10.1177/1179546820927404] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/21/2019] [Accepted: 04/23/2020] [Indexed: 12/11/2022]
Abstract
Artificial intelligence (AI)-based applications have found widespread use in many fields of science, technology, and medicine. The use of the enhanced computing power of machines in clinical medicine and diagnostics has been under exploration since the 1960s. More recently, advances in computing and in algorithms enabling machine learning, especially deep learning networks that mimic the function of the human brain, have renewed interest in applying them in clinical medicine. In cardiovascular medicine, AI-based systems have found new applications in cardiovascular imaging, cardiovascular risk prediction, and the identification of newer drug targets. This article aims to describe different AI applications, including machine learning and deep learning, and their uses in cardiovascular medicine. AI-based applications have enhanced our understanding of different phenotypes of heart failure and congenital heart disease. These applications have led to newer treatment strategies for different types of cardiovascular disease, newer approaches to cardiovascular drug therapy, and postmarketing surveillance of prescription drugs. However, there are several challenges in the clinical use of AI-based applications and the interpretation of their results, including data privacy, poorly selected or outdated data, selection bias, and the unintentional continuance of historical biases and stereotypes in the data, which can lead to erroneous conclusions. Still, AI is a transformative technology with immense potential in health care.
Affiliation(s)
- Pankaj Mathur
- Department of Internal Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Shweta Srivastava
- Department of Radiology, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Xiaowei Xu
- Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR, USA
- Jawahar L Mehta
- Division of Cardiology, Department of Internal Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, USA
26
Lin Y, Li Y, Lu K, Ma C, Zhao P, Gao D, Fan Z, Cheng Z, Wang Z, Yu S. Long-distance disorder-disorder relation extraction with bootstrapped noisy data. J Biomed Inform 2020; 109:103529. [PMID: 32771539 DOI: 10.1016/j.jbi.2020.103529] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2020] [Revised: 06/04/2020] [Accepted: 08/04/2020] [Indexed: 11/18/2022]
Abstract
OBJECTIVE Artificial intelligence in healthcare increasingly relies on relations in knowledge graphs for algorithm development. However, many important relations are not well covered in existing knowledge graphs. We aim to develop a novel long-distance relation extraction algorithm that leverages the article section structure and is trained with bootstrapped noisy data to identify important relations for diagnosis, including may cause, may be caused by, and differential diagnosis. METHODS Known relations were extracted from semistructured web pages and a relational database and were paired with sentences containing corresponding medical concepts to form training data. The sentence form was extended to allow one concept to be in the title. An attention mechanism was applied to reduce the effect of noisily labeled sentences. Section structure embedding was added to provide additional context for relation expressions. Graph information was further incorporated into the model to differentiate the target relations whose expressions were often similar and interwoven. RESULTS The extended sentence form allowed 1.75 times as many relations and 2.17 times as many sentences to be found compared to the conventional form. The various components of the proposed model all added to the accuracy. Overall, the positive sample accuracy of the proposed model was 9 percentage points higher than baseline deep learning models and 13 percentage points higher than naïve Bayes and support vector machines. CONCLUSION Our bootstrap data preparation method and the extended sentence form could form a large training dataset to enable algorithm development and data mining efforts. Section structure embedding and graph information significantly increased prediction accuracy.
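The bootstrapped data preparation described above pairs known relations with sentences that mention both concepts, and the extended sentence form additionally allows one concept to come from the article title. A toy sketch of that pairing step, with illustrative data and function names (not from the paper):

```python
# Toy sketch of distant-supervision pairing: each known (head, relation, tail)
# triple labels any sentence mentioning both concepts. In the extended
# sentence form, the head concept may instead appear in the article title.

def mentions(text, concept):
    # Case-insensitive concept lookup; real systems would use concept
    # normalization rather than substring matching.
    return concept.lower() in text.lower()

def build_training_pairs(known_relations, articles):
    pairs = []
    for head, relation, tail in known_relations:
        for title, sentences in articles:
            for sent in sentences:
                if mentions(sent, head) and mentions(sent, tail):
                    pairs.append((sent, relation))                   # conventional form
                elif mentions(title, head) and mentions(sent, tail):
                    pairs.append((title + " | " + sent, relation))   # extended form
    return pairs

known = [("influenza", "may cause", "pneumonia")]
articles = [("Influenza overview",
             ["Severe influenza can progress to pneumonia.",
              "Secondary bacterial pneumonia is a feared complication."])]
pairs = build_training_pairs(known, articles)
```

The second sentence is captured only through the extended form (its head concept appears in the title, not the sentence), which is how the paper's extension finds 2.17 times as many training sentences; such noisily labeled pairs are what the attention mechanism then down-weights.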
Affiliation(s)
- Yucong Lin
- Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China
- Yang Li
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
- Keming Lu
- Department of Automation, Tsinghua University, Beijing, China
- Cheng Ma
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
- Peng Zhao
- Department of Industrial Engineering, Tsinghua University, Beijing, China
- Daiqi Gao
- Department of Industrial Engineering, Tsinghua University, Beijing, China
- Zihao Fan
- School of Information, University of California, Berkeley, CA, USA
- Zijie Cheng
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Zheyu Wang
- Department of Automation, Tsinghua University, Beijing, China
- Sheng Yu
- Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China; Institute for Data Science, Tsinghua University, Beijing, China
27
Nateqi J, Lin S, Krobath H, Gruarin S, Lutz T, Dvorak T, Gruschina A, Ortner R. [From symptom to diagnosis - symptom checkers re-evaluated: Are symptom checkers finally sufficient and accurate to use? An update from the ENT perspective]. HNO 2019; 67:334-342. [PMID: 30993374 DOI: 10.1007/s00106-019-0666-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
BACKGROUND Every seventh diagnosis is a misdiagnosis. Each year, 1.5 million lives could be saved worldwide with the correct diagnosis. Physicians have to consider over 20,000 diseases. A study from Harvard University published in 2015 tested 19 symptom checkers and found them to be insufficient, with only 29-71% accuracy in diagnosis. OBJECTIVE The current study investigates the diagnostic accuracy of new symptom checkers from an ENT perspective. MATERIALS AND METHODS The authors update the above-named diagnostic accuracy comparison by (1) including the five new symptom checkers Symptoma, Ada, FindZebra, Mediktor, and Babylon; and (2) normalizing the results of the previously tested symptom checkers so that each tool's diagnostic accuracy is based on the same set of patient vignettes. The winner is then compared to the two symptom checkers with the most scientific evidence, namely Isabel and FindZebra, on the basis of an ENT-specific test with patient vignettes sourced from the British Medical Journal. RESULTS Most of the new symptom checkers demonstrated diagnostic accuracy rates within the previously established range, with the exception of Symptoma, which scored the right diagnosis in 82.2% of cases at the top of the list (+38 percentage points), and in 100% of cases in both the top 3 (+29 percentage points) and the top 10 (+16 percentage points), thus raising the bar in this field. The cross-validation with ENT cases resulted in a diagnostic accuracy of 64.3% vs. 21.4% vs. 26.2% (top 1), 92.9% vs. 40.5% vs. 42.9% (top 3), and 100% vs. 61.9% vs. 54.8% (top 10) for Symptoma vs. Isabel vs. FindZebra, respectively. CONCLUSIONS Symptoma is the first and only viable solution in this market. Large-scale studies should be conducted to further validate these results, as well as to assess the actual practical performance of the symptom checkers and their ability to diagnose rare diseases.
Affiliation(s)
- J Nateqi
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- S Lin
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- H Krobath
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- S Gruarin
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- T Lutz
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- T Dvorak
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- A Gruschina
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
- R Ortner
- Symptoma GmbH, Neuhofen 5, 4864, Attersee am Attersee, Austria
28
Wadhwa RR, Park DY, Natowicz MR. The accuracy of computer-based diagnostic tools for the identification of concurrent genetic disorders. Am J Med Genet A 2018; 176:2704-2709. [PMID: 30475443 DOI: 10.1002/ajmg.a.40651] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2018] [Revised: 08/09/2018] [Accepted: 09/08/2018] [Indexed: 11/11/2022]
Abstract
The increasing use of next-generation sequencing, especially clinical exome sequencing, has revealed that individuals having two coexisting genetic conditions are not uncommon. This pilot study evaluates the efficacy of two methodologically distinct computational tools for generating differential diagnoses, FindZebra and SimulConsult, in identifying multiple genetic conditions in a single patient. For each of 15 monogenic disorders, clinical query terms were generated that, when entered into these bioinformatics tools, placed the condition within the top 10 differential diagnoses. The terms for over 125 pairings of these conditions were then entered into each tool, and the resulting list of diagnoses was evaluated to determine how often both diagnoses of a pair were represented in that list. Neither tool was successful in identifying both members of a pair of conditions in greater than 40% of test cases. Disorder detection sensitivity was not homogeneous within a tool, with each tool favoring the identification of a subset of genetic conditions. In view of recent exome sequencing data showing an unexpectedly high prevalence of coexistent monogenic conditions, the results from this pilot study highlight a need for computational tools designed to generate differential diagnoses that consider the possibility of coexisting conditions.
Affiliation(s)
- Raoul R Wadhwa
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio
- Deborah Y Park
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio
- Marvin R Natowicz
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio; Pathology and Laboratory Medicine, Genomic Medicine, Neurological and Pediatrics Institutes, Cleveland Clinic, Cleveland, Ohio
29
Yu KH, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nat Biomed Eng 2018; 2:719-731. [PMID: 31015651 DOI: 10.1038/s41551-018-0305-z] [Citation(s) in RCA: 910] [Impact Index Per Article: 151.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2017] [Accepted: 09/05/2018] [Indexed: 02/07/2023]
Abstract
Artificial intelligence (AI) is gradually changing medical practice. With recent progress in digitized data acquisition, machine learning and computing infrastructure, AI applications are expanding into areas that were previously thought to be only the province of human experts. In this Review Article, we outline recent breakthroughs in AI technologies and their biomedical applications, identify the challenges for further progress in medical AI systems, and summarize the economic, legal and social implications of AI in healthcare.
Affiliation(s)
- Kun-Hsing Yu
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Andrew L Beam
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Isaac S Kohane
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Boston Children's Hospital, Boston, MA, USA
30
Abstract
Diagnostic error may be the largest unaddressed patient safety concern in the United States, responsible for an estimated 40,000-80,000 deaths annually. With the electronic health record (EHR) now in near universal use, the goal of this narrative review is to synthesize evidence and opinion regarding the impact of the EHR and health care information technology (health IT) on the diagnostic process and its outcomes. We consider the many ways in which the EHR and health IT facilitate diagnosis and improve the diagnostic process, and conversely the major ways in which it is problematic, including the unintended consequences that contribute to diagnostic error and sometimes patient deaths. We conclude with a summary of suggestions for improving the safety and safe use of these resources for diagnosis in the future.
Affiliation(s)
- Colene Byrne
- RTI International, Research Triangle Park, NC, USA
31
Sims MH, Hodges Shaw M, Gilbertson S, Storch J, Halterman MW. Legal and ethical issues surrounding the use of crowdsourcing among healthcare providers. Health Informatics J 2018; 25:1618-1630. [PMID: 30192688 DOI: 10.1177/1460458218796599] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
As the pace of medical discovery widens the knowledge-to-practice gap, technologies that enable peer-to-peer crowdsourcing have become increasingly common. Crowdsourcing has the potential to help medical providers collaborate to solve patient-specific problems in real time. We recently conducted the first trial of a mobile, medical crowdsourcing application among healthcare providers in a university hospital setting. In addition to acknowledging the benefits, our participants also raised concerns regarding the potential negative consequences of this emerging technology. In this commentary, we consider the legal and ethical implications of the major findings identified in our previous trial including compliance with the Health Insurance Portability and Accountability Act, patient protections, healthcare provider liability, data collection, data retention, distracted doctoring, and multi-directional anonymous posting. We believe the commentary and recommendations raised here will provide a frame of reference for individual providers, provider groups, and institutions to explore the salient legal and ethical issues before they implement these systems into their workflow.
Affiliation(s)
- Seth Gilbertson
- University at Buffalo, The State University of New York, USA
32
Jeganathan J, Knio Z, Amador Y, Hai T, Khamooshian A, Matyal R, Khabbaz KR, Mahmood F. Artificial intelligence in mitral valve analysis. Ann Card Anaesth 2017; 20:129-134. [PMID: 28393769 PMCID: PMC5408514 DOI: 10.4103/aca.aca_243_16] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
Abstract
Background: Echocardiographic analysis of the mitral valve (MV) has become essential for the diagnosis and management of patients with MV disease. Currently, the various software packages used for MV analysis require manual input and are prone to interobserver variability in the measurements. Aim: The aim of this study is to determine the interobserver variability of an automated software package that uses artificial intelligence for MV analysis. Settings and Design: Retrospective analysis of intraoperative three-dimensional transesophageal echocardiography data acquired from four patients with normal MVs undergoing coronary artery bypass graft surgery in a tertiary hospital. Materials and Methods: Echocardiographic data were analyzed using the eSie Valve Software (Siemens Healthcare, Mountain View, CA, USA). Three examiners analyzed three end-systolic (ES) frames from each of the four patients. A total of 36 ES frames were analyzed and included in the study. Statistical Analysis: A multiple mixed-effects ANOVA model was constructed to determine whether the examiner, the patient, and the loop had a significant effect on the average value of each parameter. A Bonferroni correction was used to correct for multiple comparisons, and P = 0.0083 was considered significant. Results: Examiners did not have an effect on any of the six parameters tested. Patient and loop had an effect on the average parameter value for each of the six parameters, as expected (P < 0.0083 for both). Conclusion: Automated analysis produced results with good reproducibility while requiring only minimal user intervention.
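The P = 0.0083 threshold quoted in the abstract is the standard Bonferroni correction of a 0.05 family-wise alpha for the six parameters tested. A minimal sketch (the example p-values are hypothetical):

```python
# Bonferroni correction: divide the family-wise alpha by the number of
# simultaneous comparisons (here, six mitral-valve parameters).
alpha = 0.05
n_comparisons = 6
threshold = alpha / n_comparisons  # 0.05 / 6, matching the abstract's 0.0083

# Hypothetical per-parameter p-values; only those below the corrected
# threshold would be declared significant.
example_p_values = (0.002, 0.03, 0.0075)
significant = [p for p in example_p_values if p < threshold]
```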
Affiliation(s)
- Jelliffe Jeganathan
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Ziyad Knio
- Department of Surgery, Division of Cardiothoracic Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Yannis Amador
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; Department of Anesthesia, Hospital México, University of Costa Rica, San José, Costa Rica
- Ting Hai
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; Department of Anesthesiology, Peking University People's Hospital, Beijing, China
- Arash Khamooshian
- Department of Cardio-Thoracic Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Robina Matyal
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Kamal R Khabbaz
- Department of Surgery, Division of Cardiothoracic Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Feroze Mahmood
- Department of Anesthesia, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
33
Cahan A, Cimino JJ. A Learning Health Care System Using Computer-Aided Diagnosis. J Med Internet Res 2017; 19:e54. [PMID: 28274905 PMCID: PMC5362695 DOI: 10.2196/jmir.6663] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2016] [Revised: 01/04/2017] [Accepted: 02/12/2017] [Indexed: 11/13/2022] Open
Abstract
Physicians intuitively apply pattern recognition when evaluating a patient. Rational diagnosis making requires that clinical patterns be put in the context of disease prior probability, yet physicians often exhibit flawed probabilistic reasoning. Difficulties in making a diagnosis are reflected in the high rates of deadly and costly diagnostic errors. Introduced 6 decades ago, computerized diagnosis support systems are still not widely used by internists. These systems cannot efficiently recognize patterns and are unable to consider the base rate of potential diagnoses. We review the limitations of current computer-aided diagnosis support systems. We then portray future diagnosis support systems and provide a conceptual framework for their development. We argue for capturing physician knowledge using a novel knowledge representation model of the clinical picture. This model (based on structured patient presentation patterns) holds not only symptoms and signs but also their temporal and semantic interrelations. We call for the collection of crowdsourced, automatically deidentified, structured patient patterns as means to support distributed knowledge accumulation and maintenance. In this approach, each structured patient pattern adds to a self-growing and -maintaining knowledge base, sharing the experience of physicians worldwide. Besides supporting diagnosis by relating the symptoms and signs with the final diagnosis recorded, the collective pattern map can also provide disease base-rate estimates and real-time surveillance for early detection of outbreaks. We explain how health care in resource-limited settings can benefit from using this approach and how it can be applied to provide feedback-rich medical education for both students and practitioners.
Affiliation(s)
- Amos Cahan
- IBM TJ Watson Research Center, Yorktown Heights, NY, United States
- James J Cimino
- Informatics Institute, University of Alabama at Birmingham, Birmingham, AL, United States
34
Segal MM, Athreya B, Son MBF, Tirosh I, Hausmann JS, Ang EYN, Zurakowski D, Feldman LK, Sundel RP. Evidence-based decision support for pediatric rheumatology reduces diagnostic errors. Pediatr Rheumatol Online J 2016; 14:67. [PMID: 27964737 PMCID: PMC5155385 DOI: 10.1186/s12969-016-0127-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/08/2016] [Accepted: 12/01/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The number of trained specialists world-wide is insufficient to serve all children with pediatric rheumatologic disorders, even in the countries with robust medical resources. We evaluated the potential of diagnostic decision support software (DDSS) to alleviate this shortage by assessing the ability of such software to improve the diagnostic accuracy of non-specialists. METHODS Using vignettes of actual clinical cases, clinician testers generated a differential diagnosis before and after using diagnostic decision support software. The evaluation used the SimulConsult® DDSS tool, based on Bayesian pattern matching with temporal onset of each finding in each disease. The tool covered 5405 diseases (averaging 22 findings per disease). Rheumatology content in the database was developed using both primary references and textbooks. The frequency, timing, age of onset and age of disappearance of findings, as well as their incidence, treatability, and heritability were taken into account in order to guide diagnostic decision making. These capabilities allowed key information such as pertinent negatives and evolution over time to be used in the computations. Efficacy was measured by comparing whether the correct condition was included in the differential diagnosis generated by clinicians before using the software ("unaided"), versus after use of the DDSS ("aided"). RESULTS The 26 clinicians demonstrated a significant reduction in diagnostic errors following introduction of the software, from 28% errors while unaided to 15% using decision support (p < 0.0001). Improvement was greatest for emergency medicine physicians (p = 0.013) and clinicians in practice for less than 10 years (p = 0.012). 
This error reduction occurred despite the fact that testers employed an "open book" approach to generate their initial lists of potential diagnoses, spending an average of 8.6 min using printed and electronic sources of medical information before using the diagnostic software. CONCLUSIONS These findings suggest that decision support can reduce diagnostic errors and improve use of relevant information by generalists. Such assistance could potentially help relieve the shortage of experts in pediatric rheumatology and similarly underserved specialties by improving generalists' ability to evaluate and diagnose patients presenting with musculoskeletal complaints. TRIAL REGISTRATION ClinicalTrials.gov ID: NCT02205086.
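The unaided-versus-aided comparison is paired (each tester diagnoses the same cases before and after using the software). The abstract does not name the exact test used, but an exact McNemar test on the discordant cases is a standard analysis for such paired error rates; the discordant counts below are made up for illustration:

```python
from math import comb

# Exact McNemar test: only discordant pairs matter (correct one way,
# wrong the other). Under the null hypothesis, each discordant pair is
# equally likely in either direction, so the count in one direction is
# Binomial(n = b + c, p = 0.5). Two-sided exact p-value:
def mcnemar_exact(b, c):
    n, k = b + c, min(b, c)
    # Double the smaller tail probability, capped at 1.
    return min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n)

# Hypothetical discordant counts: 30 cases corrected only with decision
# support, 6 cases missed only with it.
p_value = mcnemar_exact(30, 6)
```

With such an imbalance in discordant counts, the exact p-value is far below conventional thresholds; balanced counts (e.g. 1 vs. 1) give no evidence of a difference.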
Affiliation(s)
- Balu Athreya, DuPont Hospital for Children, Wilmington, DE and Thomas Jefferson University, Philadelphia, PA, USA
- Mary Beth F. Son, Boston Children’s Hospital and Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA
- Irit Tirosh, Edmond and Lily Safra Children’s Hospital, Tel-Hashomer, Ramat-Gan, Israel and Tel Aviv University, Tel Aviv, Israel
- Jonathan S. Hausmann, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- David Zurakowski, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Robert P. Sundel, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA

35
Middleton B, Sittig DF, Wright A. Clinical Decision Support: a 25 Year Retrospective and a 25 Year Vision. Yearb Med Inform 2016; Suppl 1:S103-16. [PMID: 27488402 DOI: 10.15265/iys-2016-s034] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023] Open
Abstract
OBJECTIVE The objective of this review is to summarize the state of the art of clinical decision support (CDS) circa 1990, review progress in the 25 year interval from that time, and provide a vision of what CDS might look like 25 years hence, or circa 2040. METHOD Informal review of the medical literature with iterative review and discussion among the authors to arrive at six axes (data, knowledge, inference, architecture and technology, implementation and integration, and users) to frame the review and discussion of selected barriers and facilitators to the effective use of CDS. RESULTS In each of the six axes, significant progress has been made. Key advances in structuring and encoding standardized data with an increased availability of data, development of knowledge bases for CDS, and improvement of capabilities to share knowledge artifacts, explosion of methods analyzing and inferring from clinical data, evolution of information technologies and architectures to facilitate the broad application of CDS, improvement of methods to implement CDS and integrate CDS into the clinical workflow, and increasing sophistication of the end-user have all played a role in improving the effective use of CDS in healthcare delivery. CONCLUSION CDS has evolved dramatically over the past 25 years and will likely evolve just as dramatically or more so over the next 25 years. Increasingly, the clinical encounter between a clinician and a patient will be supported by a wide variety of cognitive aids to support diagnosis, treatment, care-coordination, surveillance and prevention, and health maintenance or wellness.
Affiliation(s)
- B Middleton
- Blackford Middleton, Cell: +1 617 335 7098

36
Abstract
OBJECTIVES To describe the state of Electronic Health Records (EHRs) in 1992, their evolution by 2015, and where EHRs are expected to be in 25 years; further, to discuss the expectations for EHRs in 1992 and explore which of them were realized and what events accelerated or disrupted/derailed how EHRs evolved. METHODS Literature search based on "Electronic Health Record", "Medical Record", and "Medical Chart" using Medline, Google, Wikipedia Medical, and Cochrane Libraries resulted in an initial review of 2,356 abstracts and other information in papers and books. Additional papers and books were identified through the review of references cited in the initial review. RESULTS By 1992, hardware had become more affordable, powerful, and compact, and the use of personal computers, local area networks, and the Internet provided faster and easier access to medical information. EHRs were initially developed and used at academic medical facilities, but most have since been replaced by large vendor EHRs. While EHR use has increased and clinicians are being prepared to practice in an EHR-mediated world, technical issues have been overshadowed by procedural, professional, social, political, and especially ethical issues, as well as the need for compliance with standards and information security. Enormous advancements have taken place, but many of the early expectations for EHRs have not been realized, and current EHRs still do not meet the needs of today's rapidly changing healthcare environment. CONCLUSION The current use of EHRs initiated by new technology would have been hard to foresee. Current and new EHR technology will help to provide international standards for interoperable applications that use health, social, economic, behavioral, and environmental data to communicate, interpret, and act intelligently upon complex healthcare information to foster precision medicine and a learning health system.
Affiliation(s)
- R S Evans
- R. Scott Evans, MS, PhD, FACMI, Department of Medical Informatics, LDS Hospital, 8th Ave & C Street, Salt Lake City, Utah 84143, USA, Tel: +1 801 408-3029, Fax: +1 801 408-5802

37
Grigull L, Lechner W, Petri S, Kollewe K, Dengler R, Mehmecke S, Schumacher U, Lücke T, Schneider-Gold C, Köhler C, Güttsches AK, Kortum X, Klawonn F. Diagnostic support for selected neuromuscular diseases using answer-pattern recognition and data mining techniques: a proof of concept multicenter prospective trial. BMC Med Inform Decis Mak 2016; 16:31. [PMID: 26957320 PMCID: PMC4782522 DOI: 10.1186/s12911-016-0268-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2015] [Accepted: 02/26/2016] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND Diagnosis of neuromuscular diseases in primary care is often challenging. Rare diseases such as Pompe disease are easily overlooked by the general practitioner. We therefore aimed to develop a diagnostic support tool using patient-oriented questions and combined data mining algorithms recognizing answer patterns in individuals with selected neuromuscular diseases. A multicenter prospective study for the proof of concept was conducted thereafter. METHODS First, 16 interviews with patients were conducted focusing on their pre-diagnostic observations and experiences. From these interviews, we developed a questionnaire with 46 items. Then, patients with diagnosed neuromuscular diseases as well as patients without such a disease answered the questionnaire to establish a database for data mining. For proof of concept, initially only six diagnoses were chosen (myotonic dystrophy and myotonia (MdMy), Pompe disease (MP), amyotrophic lateral sclerosis (ALS), polyneuropathy (PNP), spinal muscular atrophy (SMA), other neuromuscular diseases, and no neuromuscular disease (NND)). A prospective study was performed to validate the automated malleable system, which included six different classification methods combined in a fusion algorithm proposing a final diagnosis. Finally, new diagnoses were incorporated into the system. RESULTS In total, questionnaires from 210 individuals were used to train the system. Cross-validation achieved 89.5% correct diagnoses. The sensitivity of the system was 93-97% for individuals with MP, with MdMy, and without neuromuscular diseases, but only 69% in SMA and 81% in ALS patients. In the prospective trial, 57/64 (89%) diagnoses were predicted correctly by the computerized system. All questions, or rather all answers, increased the diagnostic accuracy of the system, with the best results reached by the fusion of different classifier methods. Receiver operating characteristic (ROC) and p-value analyses confirmed the results.
CONCLUSION A questionnaire-based diagnostic support tool using data mining methods exhibited good results in predicting selected neuromuscular diseases. Due to the variety of neuromuscular diseases, additional studies are required to measure beneficial effects in the clinical setting.
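The fusion step the abstract describes (several classification methods combined into one proposed diagnosis) can be sketched as a simple majority vote. The three rule-based classifiers below are hypothetical stand-ins; the published system fused six trained data-mining methods over 46 questionnaire items.

```python
from collections import Counter

# Hypothetical stand-ins for trained classifiers: each maps a vector of
# questionnaire answers (one value in [0, 1] per item) to a diagnosis label.
def classifier_a(x):
    return "ALS" if x[0] > 0.5 else "PNP"

def classifier_b(x):
    return "ALS" if x[1] > 0.5 else "SMA"

def classifier_c(x):
    return "ALS" if sum(x) > 1.0 else "PNP"

def fuse(classifiers, x):
    """Combine the individual predictions by majority vote into a final diagnosis."""
    votes = Counter(clf(x) for clf in classifiers)
    label, _ = votes.most_common(1)[0]
    return label

answers = [0.8, 0.7, 0.1]  # toy answer pattern from one questionnaire
diagnosis = fuse([classifier_a, classifier_b, classifier_c], answers)
```

In the published work the fusion outperformed any single classifier, which is the usual motivation for combining heterogeneous methods rather than picking one.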
Affiliation(s)
- Lorenz Grigull, Department of Pediatric Hematology and Oncology, Hannover Medical School, Carl-Neuberg Str. 1, D-30623 Hannover, Germany
- Werner Lechner, Improved Medical Diagnostics, IMD GmbH, Hannover, Germany
- Susanne Petri, Department of Neurology, Hannover Medical School, Hannover, Germany
- Katja Kollewe, Department of Neurology, Hannover Medical School, Hannover, Germany
- Reinhard Dengler, Department of Neurology, Hannover Medical School, Hannover, Germany
- Sandra Mehmecke, Department of Neurology, Hannover Medical School, Hannover, Germany
- Thomas Lücke, Klinik für Kinder- und Jugendmedizin im St. Josef Hospital, Ruhr-Universität Bochum, Bochum, Germany
- Christiane Schneider-Gold, Department of Neurology, Heimer-Institute at the BG University-Hospital Bergmannsheil GmbH, Ruhr-Universität Bochum, Bochum, Germany
- Cornelia Köhler, Klinik für Kinder- und Jugendmedizin im St. Josef Hospital, Ruhr-Universität Bochum, Bochum, Germany
- Anne-Katrin Güttsches, Department of Neurology, Heimer-Institute at the BG University-Hospital Bergmannsheil GmbH, Ruhr-Universität Bochum, Bochum, Germany
- Xiaowei Kortum, Ostfalia University of Applied Sciences, Wolfenbuettel, Germany
- Frank Klawonn, Ostfalia University of Applied Sciences, Wolfenbuettel, Germany; Helmholtz Centre for Infection Research, Biostatistics Group, Braunschweig, Germany

38
Riches N, Panagioti M, Alam R, Cheraghi-Sohi S, Campbell S, Esmail A, Bower P. The Effectiveness of Electronic Differential Diagnoses (DDX) Generators: A Systematic Review and Meta-Analysis. PLoS One 2016; 11:e0148991. [PMID: 26954234 PMCID: PMC4782994 DOI: 10.1371/journal.pone.0148991] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2015] [Accepted: 01/25/2016] [Indexed: 01/10/2023] Open
Abstract
Background Diagnostic errors are costly and they can contribute to adverse patient outcomes, including avoidable deaths. Differential diagnosis (DDX) generators are electronic tools that may facilitate the diagnostic process. Methods and Findings We conducted a systematic review and meta-analysis to investigate the efficacy and utility of DDX generators. We undertook a comprehensive search of the literature including 16 databases from inception to May 2015 and specialist patient safety databases. We also searched the reference lists of included studies. Article screening, selection and data extraction were independently conducted by 2 reviewers. 36 articles met the eligibility criteria and the pooled accurate diagnosis retrieval rate of DDX tools was high with high heterogeneity (pooled rate = 0.70, 95% CI = 0.63 to 0.77; I² = 97%, p < 0.0001). DDX generators did not demonstrate improved diagnostic retrieval compared to clinicians but small improvements were seen in the before and after studies where clinicians had the opportunity to revisit their diagnoses following DDX generator consultation. Clinical utility data generally indicated high levels of user satisfaction and significant reductions in time required to use the newer web-based tools. Lengthy differential lists and their low relevance were areas of concern and have the potential to increase diagnostic uncertainty. Data on the number of investigations ordered and on cost-effectiveness remain inconclusive. Conclusions DDX generators have the potential to improve diagnostic practice among clinicians. However, the high levels of heterogeneity, the variable quality of the reported data and the minimal benefits observed for complex cases suggest caution. Further research needs to be undertaken in routine clinical settings with greater consideration of enablers and barriers which are likely to impact on DDX use before their use in routine clinical practice can be recommended.
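The pooled rate and heterogeneity figures quoted above come from standard meta-analytic machinery; a minimal fixed-effect sketch is below. The study counts are invented, and the published review used more sophisticated pooling (random-effects models are typical when I² is as high as 97%).

```python
# Toy per-study accurate-retrieval counts: (successes, total). Invented numbers,
# not the review's data.
studies = [(70, 100), (55, 80), (90, 120), (40, 50)]

def pooled_rate(studies):
    """Fixed-effect inverse-variance pooling of proportions, plus Cochran's Q
    and the I^2 heterogeneity statistic (share of variation beyond chance)."""
    ests = [(s / n, (s / n) * (1 - s / n) / n) for s, n in studies]  # (p, variance)
    weights = [1.0 / var for _, var in ests]
    pooled = sum(w * p for (p, _), w in zip(ests, weights)) / sum(weights)
    q = sum(w * (p - pooled) ** 2 for (p, _), w in zip(ests, weights))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, i2

pooled, i2 = pooled_rate(studies)
```

With these toy counts the studies are mutually consistent, so I² comes out near zero; the review's I² of 97% indicates the opposite situation, where nearly all between-study variation exceeds sampling error.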
Affiliation(s)
- Nicholas Riches, NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre (Greater Manchester PSTRC), Williamson Building, The University of Manchester, Manchester, United Kingdom
- Maria Panagioti, NIHR School for Primary Care Research, Centre for Primary Care, Institute of Population Health, University of Manchester, Manchester, United Kingdom
- Rahul Alam, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom
- Sudeh Cheraghi-Sohi, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom
- Stephen Campbell, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom
- Aneez Esmail, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom
- Peter Bower, NIHR Greater Manchester PSTRC, The University of Manchester, Manchester, United Kingdom

39

40
Wright A, Maloney FL, Wien M, Samal L, Emani S, Zuccotti G. Assessing information system readiness for mitigating malpractice risk through simulation: results of a multi-site study. J Am Med Inform Assoc 2015; 22:1020-8. [PMID: 26017230 DOI: 10.1093/jamia/ocv041] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2015] [Accepted: 04/08/2015] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE To develop and test an instrument for assessing a healthcare organization's ability to mitigate malpractice risk through clinical decision support (CDS). MATERIALS AND METHODS Based on a previously collected malpractice data set, we identified common types of CDS and the number and cost of malpractice cases that might have been prevented through this CDS. We then designed clinical vignettes and questions that test an organization's CDS capabilities through simulation. Seven healthcare organizations completed the simulation. RESULTS All seven organizations successfully completed the self-assessment. The proportion of potentially preventable indemnity loss for which CDS was available ranged from 16.5% to 73.2%. DISCUSSION There is a wide range in organizational ability to mitigate malpractice risk through CDS, with many organizations' electronic health records only being able to prevent a small portion of malpractice events seen in a real-world dataset. CONCLUSION The simulation approach to assessing malpractice risk mitigation through CDS was effective. Organizations should consider using malpractice claims experience to facilitate prioritizing CDS development.
Affiliation(s)
- Adam Wright, Partners HealthCare, Boston, MA, USA; Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Francine L Maloney, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Matthew Wien, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Lipika Samal, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Gianna Zuccotti, Partners HealthCare, Boston, MA, USA; Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; CRICO/Risk Management Foundation, Cambridge, MA, USA

41
Dhiman GJ, Amber KT, Goodman KW. Comparative outcome studies of clinical decision support software: limitations to the practice of evidence-based system acquisition. J Am Med Inform Assoc 2015; 22:e13-20. [PMID: 25665704 PMCID: PMC7659211 DOI: 10.1093/jamia/ocu033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2014] [Revised: 11/21/2014] [Accepted: 11/24/2014] [Indexed: 11/14/2022] Open
Abstract
Clinical decision support systems (CDSSs) assist clinicians with patient diagnosis and treatment. However, inadequate attention has been paid to the process of selecting and buying systems. The diversity of CDSSs, coupled with research obstacles, marketplace limitations, and legal impediments, has thwarted comparative outcome studies and reduced the availability of reliable information and advice for purchasers. We review these limitations and recommend comparative studies conducted in phases and focused on limited outcomes of safety, efficacy, and implementation in varied clinical settings. Additionally, we recommend the increased availability of guidance tools to assist purchasers with evidence-based purchases. Transparency is necessary in purchasers' reporting of system defects and in vendors' disclosure of marketing conflicts of interest to support methodologically sound studies. Taken together, these measures can foster the evolution of evidence-based tools that, in turn, will enable and empower system purchasers to make wise choices and improve the care of patients.
Affiliation(s)
- Kyle T Amber, University of Miami Miller School of Medicine, Miami, FL, USA
- Kenneth W Goodman, Bioethics Program, University of Miami Miller School of Medicine, Miami, FL, USA

42
Segal MM, Williams MS, Gropman AL, Torres AR, Forsyth R, Connolly AM, El-Hattab AW, Perlman SJ, Samanta D, Parikh S, Pavlakis SG, Feldman LK, Betensky RA, Gospe SM. Evidence-based decision support for neurological diagnosis reduces errors and unnecessary workup. J Child Neurol 2014; 29:487-92. [PMID: 23576414 DOI: 10.1177/0883073813483365] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Using vignettes of real cases and the SimulConsult diagnostic decision support software, neurologists listed a differential diagnosis and workup before and after using the decision support. Using the software, there was a significant reduction in error, up to 75% for diagnosis and 56% for workup. This error reduction occurred despite the baseline being one in which testers were allowed to use narrative resources and Web searching. A key factor that improved performance was taking enough time (>2 minutes) to enter clinical findings into the software accurately. Under these conditions and for instances in which the diagnoses changed based on using the software, diagnostic accuracy improved in 96% of instances. There was a 6% decrease in the number of workup items accompanied by a 34% increase in relevance. The authors conclude that decision support for a neurological diagnosis can reduce errors and save on unnecessary testing.
43
Electronic Health Records and Patient Safety. Patient Saf Surg 2014. [DOI: 10.1007/978-1-4471-4369-7_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open
44
Williams CN, Bratton SL, Hirshberg EL. Computerized decision support in adult and pediatric critical care. World J Crit Care Med 2013; 2:21-8. [PMID: 24701413 PMCID: PMC3953873 DOI: 10.5492/wjccm.v2.i4.21] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/14/2013] [Revised: 08/02/2013] [Accepted: 08/20/2013] [Indexed: 02/06/2023] Open
Abstract
Computerized decision support (CDS) is the most advanced form of clinical decision support available and has evolved with innovative technologies to provide meaningful assistance to medical professionals. Critical care clinicians are in unique environments where vast amounts of data are collected on individual patients, and where expedient and accurate decisions are paramount to the delivery of quality healthcare. Many CDS tools are in use today among adult and pediatric intensive care units as diagnostic aids, safety alerts, computerized protocols, and automated recommendations for management. Some CDS tools have significantly decreased adverse events and reduced costs when carefully implemented and properly operated. CDS tools integrated into electronic health records are also valuable to researchers, providing rapid identification of eligible patients, streamlining data-gathering and analysis, and providing cohorts for study of rare and chronic diseases through data-warehousing. Although the need for human judgment in the daily care of critically ill patients has limited the study and realization of meaningful improvements in overall patient outcomes, CDS tools continue to evolve and integrate into the daily workflow of clinicians, and will likely provide advancements over time. Through novel technologies, CDS tools have vast potential for progression and will significantly impact the field of critical care and clinical research in the future.
45
Braithwaite RS, Scotch M. Using value of information to guide evaluation of decision supports for differential diagnosis: is it time for a new look? BMC Med Inform Decis Mak 2013; 13:105. [PMID: 24020989 PMCID: PMC3846909 DOI: 10.1186/1472-6947-13-105] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2013] [Accepted: 09/06/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Decision support systems for differential diagnosis have traditionally been evaluated on the basis of how sensitively and specifically they identify the correct diagnosis established by expert clinicians. DISCUSSION This article questions whether evaluation criteria pertaining to identifying the correct diagnosis are the most appropriate or useful. Instead, it advocates evaluating decision support systems for differential diagnosis on the criterion of maximizing the value of information. SUMMARY This approach quantitatively and systematically integrates several important clinical management priorities, including avoiding serious diagnostic errors of omission and avoiding harmful or expensive tests.
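The value-of-information criterion the authors advocate can be made concrete with a toy two-action decision problem: a test is worth running only to the extent that observing its result changes the expected utility of the best action. The prior, the utilities, and the test characteristics below are all hypothetical.

```python
# Toy two-action decision problem; all numbers are hypothetical.
p_disease = 0.3                        # prior probability of the serious disease
UTIL = {("treat", True): 0.9, ("treat", False): 0.7,   # utility of (action, disease?)
        ("wait",  True): 0.1, ("wait",  False): 1.0}
sens, spec = 0.95, 0.85                # test sensitivity and specificity

def expected_utility(p):
    """Utility of the best action under the current belief p = P(disease)."""
    return max(p * UTIL[(a, True)] + (1 - p) * UTIL[(a, False)]
               for a in ("treat", "wait"))

def value_of_information(p, sens, spec):
    """Expected utility after observing the test result, minus acting on the prior."""
    p_pos = p * sens + (1 - p) * (1 - spec)          # P(test positive)
    p_given_pos = p * sens / p_pos                   # posterior if positive
    p_given_neg = p * (1 - sens) / (1 - p_pos)       # posterior if negative
    eu_with_test = (p_pos * expected_utility(p_given_pos)
                    + (1 - p_pos) * expected_utility(p_given_neg))
    return eu_with_test - expected_utility(p)

voi = value_of_information(p_disease, sens, spec)
```

Under this framing a test can have zero value even when informative, namely when no result would change the chosen action, which is the abstract's point about prioritizing management consequences over diagnostic labeling.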
Affiliation(s)
- R Scott Braithwaite, Department of Population Health, New York University School of Medicine, 550 First Avenue, VZ30 6th floor, 615, New York, NY 10016, USA
- Matthew Scotch, Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ, USA

46
Cognition and decision in biomedical artificial intelligence: From symbolic representation to emergence. AI & SOCIETY 2013. [DOI: 10.1007/bf01210601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
47
Papier A. Decision support in dermatology and medicine: history and recent developments. ACTA ACUST UNITED AC 2013; 31:153-9. [PMID: 22929351 DOI: 10.1016/j.sder.2012.06.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2012] [Revised: 06/06/2012] [Accepted: 06/19/2012] [Indexed: 11/29/2022]
Abstract
This article is focused on diagnostic decision support tools and will provide a brief history of clinical decision support (CDS), examine the components of CDS and its associated terminology, and discuss recent developments in the use and application of CDS systems, particularly in the field of dermatology. For this article, we use CDS to mean an interactive system allowing input of patient-specific information and providing customized medical knowledge-based results via automated reasoning, for example, a set of rules and/or an underlying logic, and associations.
Affiliation(s)
- Art Papier, Dermatology and Medical Informatics, University of Rochester College of Medicine, Rochester, NY 14642, USA

48
Bogich TL, Funk S, Malcolm TR, Chhun N, Epstein JH, Chmura AA, Kilpatrick AM, Brownstein JS, Hutchison OC, Doyle-Capitman C, Deaville R, Morse SS, Cunningham AA, Daszak P. Using network theory to identify the causes of disease outbreaks of unknown origin. J R Soc Interface 2013; 10:20120904. [PMID: 23389893 DOI: 10.1098/rsif.2012.0904] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
The identification of undiagnosed disease outbreaks is critical for mobilizing efforts to prevent widespread transmission of novel virulent pathogens. Recent developments in online surveillance systems allow for the rapid communication of the earliest reports of emerging infectious diseases and tracking of their spread. The efficacy of these programs, however, is inhibited by the anecdotal nature of informal reporting and uncertainty of pathogen identity in the early stages of emergence. We developed theory to connect disease outbreaks of known aetiology in a network using an array of properties including symptoms, seasonality and case-fatality ratio. We tested the method with 125 reports of outbreaks of 10 known infectious diseases causing encephalitis in South Asia, and showed that different diseases frequently form distinct clusters within the networks. The approach correctly identified unknown disease outbreaks with an average sensitivity of 76 per cent and specificity of 88 per cent. Outbreaks of some diseases, such as Nipah virus encephalitis, were well identified (sensitivity = 100%, positive predictive values = 80%), whereas others (e.g. Chandipura encephalitis) were more difficult to distinguish. These results suggest that unknown outbreaks in resource-poor settings could be evaluated in real time, potentially leading to more rapid responses and reducing the risk of an outbreak becoming a pandemic.
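A minimal sketch of the network idea: link outbreak reports by the similarity of their reported properties and label an unknown outbreak from its strongest-weighted neighbours. The feature sets and the Jaccard-similarity weighting below are illustrative assumptions, not the paper's actual data or distance measure.

```python
# Toy feature sets for known outbreak reports: (properties, confirmed aetiology).
# Invented for illustration only.
known = {
    "nipah_1": ({"encephalitis", "respiratory", "winter", "high_cfr"}, "Nipah"),
    "nipah_2": ({"encephalitis", "respiratory", "high_cfr"}, "Nipah"),
    "je_1":    ({"encephalitis", "monsoon", "pediatric"}, "JE"),
    "je_2":    ({"encephalitis", "monsoon", "pediatric", "rural"}, "JE"),
}

def jaccard(a, b):
    """Edge weight between two reports = overlap of their property sets."""
    return len(a & b) / len(a | b)

def identify(unknown_features, known):
    """Weighted vote over network edges from the unknown report to known outbreaks."""
    votes = {}
    for features, label in known.values():
        votes[label] = votes.get(label, 0.0) + jaccard(unknown_features, features)
    return max(votes, key=votes.get)

label = identify({"encephalitis", "respiratory", "high_cfr", "winter"}, known)
```

Because same-aetiology reports share many properties, they form tight clusters in the similarity network, which is why an unknown report's strongest connections tend to point at its true cause.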
Affiliation(s)
- Tiffany L Bogich, EcoHealth Alliance, 460 West 34th Street, 17th Floor, New York, NY 10001, USA

49
Belle A, Kon MA, Najarian K. Biomedical informatics for computer-aided decision support systems: a survey. ScientificWorldJournal 2013; 2013:769639. [PMID: 23431259 PMCID: PMC3575619 DOI: 10.1155/2013/769639] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2012] [Accepted: 01/09/2013] [Indexed: 11/18/2022] Open
Abstract
The volumes of current patient data as well as their complexity make clinical decision making more challenging than ever for physicians and other caregivers. This situation calls for the use of biomedical informatics methods to process data and form recommendations and/or predictions to assist such decision makers. The design, implementation, and use of biomedical informatics systems in the form of computer-aided decision support have become essential and widely used over the last two decades. This paper provides a brief review of such systems, their application protocols and methodologies, and the future challenges and directions they suggest.
Affiliation(s)
- Ashwin Belle, Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Mark A. Kon, Department of Mathematics and Statistics, Boston University, Boston, MA 02215, USA
- Kayvan Najarian, Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA

50
Yuan MJ, Finley GM, Long J, Mills C, Johnson RK. Evaluation of user interface and workflow design of a bedside nursing clinical decision support system. Interact J Med Res 2013; 2:e4. [PMID: 23612350 PMCID: PMC3628119 DOI: 10.2196/ijmr.2402] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2012] [Revised: 12/10/2012] [Accepted: 12/29/2012] [Indexed: 11/20/2022] Open
Abstract
Background Clinical decision support systems (CDSS) are important tools to improve health care outcomes and reduce preventable medical adverse events. However, the effectiveness and success of CDSS depend on their implementation context and usability in complex health care settings. As a result, usability design and validation, especially in real world clinical settings, are crucial aspects of successful CDSS implementations. Objective Our objective was to develop a novel CDSS to help frontline nurses better manage critical symptom changes in hospitalized patients, hence reducing preventable failure to rescue cases. A robust user interface and implementation strategy that fit into existing workflows was key for the success of the CDSS. Methods Guided by a formal usability evaluation framework, UFuRT (user, function, representation, and task analysis), we developed a high-level specification of the product that captures key usability requirements and is flexible to implement. We interviewed users of the proposed CDSS to identify requirements, listed functions, and operations the system must perform. We then designed visual and workflow representations of the product to perform the operations.
The user interface and workflow design were evaluated via heuristic and end user performance evaluation. The heuristic evaluation was done after the first prototype, and its results were incorporated into the product before the end user evaluation was conducted. First, we recruited 4 evaluators with strong domain expertise to study the initial prototype. Heuristic violations were coded and rated for severity. Second, after development of the system, we assembled a panel of nurses, consisting of 3 licensed vocational nurses and 7 registered nurses, to evaluate the user interface and workflow via simulated use cases. We recorded whether each session was successfully completed and its completion time. Each nurse was asked to use the National Aeronautics and Space Administration (NASA) Task Load Index to self-evaluate the amount of cognitive and physical burden associated with using the device. Results A total of 83 heuristic violations were identified in the studies. The distribution of the heuristic violations and their average severity are reported. The nurse evaluators successfully completed all 30 sessions of the performance evaluations. All nurses were able to use the device after a single training session. On average, the nurses took 111 seconds (SD 30 seconds) to complete the simulated task. The NASA Task Load Index results indicated that the work overhead on the nurses was low. In fact, most of the burden measures were consistent with zero. The only potentially significant burden was temporal demand, which was consistent with the primary use case of the tool. Conclusions The evaluation has shown that our design was functional and met the requirements demanded by the nurses’ tight schedules and heavy workloads. The user interface embedded in the tool provided compelling utility to the nurse with minimal distraction.