1
Hammoud M, Douglas S, Darmach M, Alawneh S, Sanyal S, Kanbour Y. Evaluating the Diagnostic Performance of Symptom Checkers: Clinical Vignette Study. JMIR AI 2024;3:e46875. PMID: 38875676; PMCID: PMC11091811; DOI: 10.2196/46875.
Abstract
BACKGROUND Medical self-diagnostic tools (or symptom checkers) are becoming an integral part of digital health and our daily lives, whereby patients are increasingly using them to identify the underlying causes of their symptoms. As such, it is essential to rigorously investigate and comprehensively report the diagnostic performance of symptom checkers using standard clinical and scientific approaches. OBJECTIVE This study aims to evaluate and report the accuracies of a few known and new symptom checkers using a standard and transparent methodology, which allows the scientific community to cross-validate and reproduce the reported results, a step much needed in health informatics. METHODS We propose a 4-stage experimentation methodology that capitalizes on the standard clinical vignette approach to evaluate 6 symptom checkers. To this end, we developed and peer-reviewed 400 vignettes, each approved by at least 5 out of 7 independent and experienced primary care physicians. To establish a frame of reference and interpret the results of symptom checkers accordingly, we further compared the best-performing symptom checker against 3 primary care physicians with an average experience of 16.6 (SD 9.42) years. To measure accuracy, we used 7 standard metrics, including M1 as a measure of a symptom checker's or a physician's ability to return a vignette's main diagnosis at the top of their differential list, F1-score as a trade-off measure between recall and precision, and Normalized Discounted Cumulative Gain (NDCG) as a measure of a differential list's ranking quality, among others. RESULTS The diagnostic accuracies of the 6 tested symptom checkers vary significantly. For instance, the ranges (ie, the differences between the best-performing and worst-performing symptom checkers) of the M1, F1-score, and NDCG results were 65.3%, 39.2%, and 74.2%, respectively.
The same was observed among the participating human physicians, whereby the M1, F1-score, and NDCG ranges were 22.8%, 15.3%, and 21.3%, respectively. When compared against each other, physicians outperformed the best-performing symptom checker by an average of 1.2% using F1-score, whereas the best-performing symptom checker outperformed physicians by averages of 10.2% and 25.1% using M1 and NDCG, respectively. CONCLUSIONS The performance variation between symptom checkers is substantial, suggesting that symptom checkers cannot be treated as a single entity. On a different note, the best-performing symptom checker was an artificial intelligence (AI)-based one, shedding light on the promise of AI in improving the diagnostic capabilities of symptom checkers, especially as AI keeps advancing exponentially.
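The three headline metrics in the abstract above can be sketched in a few lines. This is only an illustrative reading, not the paper's actual implementation: binary relevance and a log2 rank discount are assumptions, and the study may weight or truncate its differential lists differently.

```python
import math

def m1(ranked, main_dx):
    """M1: 1.0 if the vignette's main diagnosis tops the differential list, else 0.0."""
    return 1.0 if ranked and ranked[0] == main_dx else 0.0

def f1(ranked, relevant):
    """F1-score: harmonic mean of precision and recall over the returned diagnoses."""
    retrieved, rel = set(ranked), set(relevant)
    tp = len(retrieved & rel)  # true positives: correct diagnoses that were returned
    if tp == 0:
        return 0.0
    precision, recall = tp / len(retrieved), tp / len(rel)
    return 2 * precision * recall / (precision + recall)

def ndcg(ranked, relevant):
    """NDCG with binary gains, discounting each hit by log2(rank + 1) (rank is 1-based)."""
    dcg = sum(1.0 / math.log2(i + 2) for i, dx in enumerate(ranked) if dx in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), len(ranked))))
    return dcg / ideal if ideal else 0.0
```

Under this reading, a correct diagnosis ranked second instead of first leaves F1 unchanged but lowers NDCG, which is why the two metrics can rank the same tools differently.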
2
Riboli-Sasco E, El-Osta A, Alaa A, Webber I, Karki M, El Asmar ML, Purohit K, Painter A, Hayhoe B. Triage and Diagnostic Accuracy of Online Symptom Checkers: Systematic Review. J Med Internet Res 2023;25:e43803. PMID: 37266983; DOI: 10.2196/43803.
Abstract
BACKGROUND In the context of a deepening global shortage of health workers and, in particular, the COVID-19 pandemic, there is growing international interest in, and use of, online symptom checkers (OSCs). However, the evidence surrounding the triage and diagnostic accuracy of these tools remains inconclusive. OBJECTIVE This systematic review aimed to summarize the existing peer-reviewed literature evaluating the triage accuracy (directing users to appropriate services based on their presenting symptoms) and diagnostic accuracy of OSCs aimed at lay users for general health concerns. METHODS Searches were conducted in MEDLINE, Embase, CINAHL, Health Management Information Consortium (HMIC), and Web of Science, as well as the citations of the studies selected for full-text screening. We included peer-reviewed studies published in English between January 1, 2010, and February 16, 2022, with a controlled and quantitative assessment of either or both triage and diagnostic accuracy of OSCs directed at lay users. We excluded tools supporting health care professionals, as well as disease- or specialty-specific OSCs. Screening and data extraction were carried out independently by 2 reviewers for each study. We performed a descriptive narrative synthesis. RESULTS A total of 21,296 studies were identified, of which 14 (0.07%) were included. The included studies used clinical vignettes, medical records, or direct input by patients. Of the 14 studies, 6 (43%) reported on triage and diagnostic accuracy, 7 (50%) focused on triage accuracy, and 1 (7%) focused on diagnostic accuracy. These outcomes were assessed based on the diagnostic and triage recommendations attached to the vignette in the case of vignette studies or on those provided by nurses or general practitioners, including through face-to-face and telephone consultations. Both diagnostic accuracy and triage accuracy varied greatly among OSCs. 
Overall diagnostic accuracy was deemed to be low and was almost always lower than that of the comparator. Similarly, most of the studies (9/13, 69%) showed suboptimal triage accuracy overall, with a few exceptions (4/13, 31%). The main variables affecting the levels of diagnostic and triage accuracy were the severity and urgency of the condition, the use of artificial intelligence algorithms, and demographic questions. However, the impact of each variable differed across tools and studies, making it difficult to draw any solid conclusions. All included studies had at least one area with unclear risk of bias according to the revised Quality Assessment of Diagnostic Accuracy Studies-2 tool. CONCLUSIONS Although OSCs have the potential to provide accessible and accurate health advice and triage recommendations to users, more research is needed to validate their triage and diagnostic accuracy before wide-scale adoption in community and health care settings. Future studies should aim to use a common methodology and an agreed standard for evaluation to facilitate objective benchmarking and validation. TRIAL REGISTRATION PROSPERO CRD42020215210; https://tinyurl.com/3949zw83.
Affiliation(s)
- Eva Riboli-Sasco, Austen El-Osta, Aos Alaa, Iman Webber, Manisha Karki, Marie Line El Asmar, Katie Purohit, Annabelle Painter, Benedict Hayhoe - Self-Care Academic Research Unit (SCARU), Department of Primary Care and Public Health, Imperial College London, London, United Kingdom
3
Turnbull J, MacLellan J, Churruca K, Ellis LA, Prichard J, Browne D, Braithwaite J, Petter E, Chisambi M, Pope C. A multimethod study of NHS 111 online. Health and Social Care Delivery Research 2023;11:1-104. PMID: 37464813; DOI: 10.3310/ytrr9821.
Abstract
Background NHS 111 online offers 24-hour access to health assessment and triage. Objectives This study examined pathways to care, differential access and use, and workforce impacts of NHS 111 online, and compared NHS 111 online with the Healthdirect (Haymarket, Australia) virtual triage service. Design Interviews with 80 staff and stakeholders in English primary, urgent and emergency care, and 41 staff and stakeholders associated with Healthdirect; a survey of 2754 respondents, of whom 1137 (41.3%) had used NHS 111 online and 1617 (58.7%) had not. Results NHS 111 online is one of several digital health-care technologies and was neither differentiated from the NHS 111 telephone service nor well understood. There is a similar lack of awareness of Healthdirect virtual triage. NHS 111 and Healthdirect virtual triage are perceived as creating additional work for health-care staff and inappropriate demand for some health services, especially emergency care. One-third of survey respondents reported that they had not used any NHS 111 service (telephone or online). Older people and those with fewer educational qualifications are less likely to use NHS 111 online. Respondents who had used NHS 111 online reported more use of other urgent care services and made more cumulative use of services than those who had not. Users of NHS 111 online had higher levels of self-reported eHealth literacy. There were differences in reported preferences for using NHS 111 online for different symptom presentations. Conclusions Greater clarity about what the NHS 111 online service offers would allow better signposting and reduce confusion. Generic NHS 111 services are perceived as creating additional work in the primary, urgent and emergency care system. There are differences in eHealth literacy between users and those who have not used NHS 111 online, which suggests that 'digital first' policies may increase health inequalities.
Limitations This research bridged the pandemic from 2020 to 2021; therefore, findings may change as services adjust going forward. Surveys used a digital platform, so there is probably a bias towards some level of eHealth literacy, but this also means that our data may underestimate the digital divide. Future work Further investigation of access to digital services could address concerns about digital exclusion. Research comparing the affordances and cost-benefits of different triage and assessment systems for users and health-care providers is needed. Research about trust in virtual assessments may show how duplication can be reduced. Mixed-methods studies looking at outcomes, impacts on work and costs, and ways to measure eHealth literacy can inform the development of NHS 111 online, and opportunities for further international shared learning could be pursued. Study registration This study is registered at the Research Registry (UIN 5392). Funding This project was funded by the National Institute for Health and Care Research (NIHR) Health and Social Care Delivery Research Programme and will be published in full in Health and Social Care Delivery Research; Vol. 11, No. 5. See the NIHR Journals Library website for further project information.
Affiliation(s)
- Joanne Turnbull - School of Health Sciences, University of Southampton, Southampton, UK
- Jennifer MacLellan - Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- Kate Churruca - Australian Institute of Health Innovation, Macquarie University, Sydney, NSW, Australia
- Louise A Ellis - Australian Institute of Health Innovation, Macquarie University, Sydney, NSW, Australia
- Jane Prichard - School of Health Sciences, University of Southampton, Southampton, UK
- Jeffrey Braithwaite - Australian Institute of Health Innovation, Macquarie University, Sydney, NSW, Australia
- Emily Petter - NHS Hampshire, Southampton and Isle of Wight Clinical Commissioning Group, Winchester, UK
- Matthew Chisambi - Imperial College Health Partners, Chelsea and Westminster Hospital NHS Foundation Trust, London, UK
- Catherine Pope - Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
4
Pairon A, Philips H, Verhoeven V. A scoping review on the use and usefulness of online symptom checkers and triage systems: How to proceed? Front Med (Lausanne) 2023;9:1040926. PMID: 36687416; PMCID: PMC9853165; DOI: 10.3389/fmed.2022.1040926.
Abstract
Background Patients are increasingly turning to the Internet for health information. Numerous online symptom checkers and digital triage tools are currently available to the general public in an effort to meet this need, simultaneously acting as a demand management strategy to aid the overburdened health care system. The implementation of these services requires an evidence-based approach, warranting a review of the available literature on this rapidly evolving topic. Objective This scoping review aims to provide an overview of the current state of the art and identify research gaps through an analysis of the strengths and weaknesses of the presently available literature. Methods A systematic search strategy was formed and applied to six databases: the Cochrane Library, NICE, DARE, NIHR, PubMed, and Web of Science. Data extraction was performed by two researchers according to a pre-established data charting methodology, allowing for a thematic analysis of the results. Results A total of 10,250 articles were identified, and 28 publications were found eligible for inclusion. Users of these tools are often younger, female, more highly educated and technologically literate, potentially impacting the digital divide and health equity. Triage algorithms remain risk-averse, which poses challenges for their accuracy. Recent evolutions in algorithms have had varying degrees of success. Results on impact are highly variable, with potential effects on demand, accessibility of care, health literacy and syndromic surveillance. Both patients and healthcare providers are generally positive about the technology and seem amenable to the advice given, but there are still improvements to be made toward a more patient-centered approach. The significant heterogeneity across studies and triage systems remains the primary challenge for the field, limiting the transferability of findings.
Conclusion Current evidence included in this review is characterized by significant variability in study design and outcomes, highlighting the challenges for future research. An evolution toward more homogeneous methodologies, studies tailored to the intended setting, regulation and standardization of evaluations, and a patient-centered approach could benefit the field.
5
Ben-Shabat N, Sharvit G, Meimis B, Ben Joya D, Sloma A, Kiderman D, Shabat A, Tsur AM, Watad A, Amital H. Assessing data gathering of chatbot based symptom checkers - a clinical vignettes study. Int J Med Inform 2022;168:104897. PMID: 36306653; PMCID: PMC9595333; DOI: 10.1016/j.ijmedinf.2022.104897.
Abstract
BACKGROUND The burden on healthcare systems is mounting continuously owing to population growth and aging, overuse of medical services, and the recent COVID-19 pandemic. This overload is also reducing healthcare quality and outcomes. One solution gaining momentum is the integration of intelligent self-assessment tools, known as symptom checkers, into healthcare providers' systems. To the best of our knowledge, no study so far has investigated the data-gathering capabilities of these tools, which represent a crucial resource for simulating doctors' skills in medical interviews. OBJECTIVES The goal of this study was to evaluate the data-gathering function of currently available chatbot symptom checkers. METHODS We evaluated 8 symptom checkers using 28 clinical vignettes from the repository of MSD Manual case studies. The mean number of predefined pertinent findings for each case was 31.8 ± 6.8. The vignettes were entered into the platforms by 3 medical students who simulated the role of the patient. For each conversation, we obtained the number of pertinent findings retrieved and the number of questions asked. We then calculated the recall rates (pertinent findings retrieved out of all predefined pertinent findings) and efficiency rates (pertinent findings retrieved out of the number of questions asked) of data gathering and compared them between the platforms. RESULTS The overall recall rate for all symptom checkers was 0.32 (2,280/7,112; 95% CI 0.31-0.33) for all pertinent findings, 0.37 (1,110/2,992; 95% CI 0.35-0.39) for present findings, and 0.28 (1,140/4,120; 95% CI 0.26-0.29) for absent findings. Among the symptom checkers, the Kahun platform had the highest recall rate, with 0.51 (450/889; 95% CI 0.47-0.54). Out of 4,877 questions asked overall, 2,280 findings were gathered, yielding an efficiency rate of 0.46 (95% CI 0.45-0.48) across all platforms.
Kahun was the most efficient tool, at 0.74 (95% CI 0.70-0.77), without a statistically significant difference from Your.MD, at 0.69 (95% CI 0.65-0.73). CONCLUSION The data-gathering performance of currently available symptom checkers is questionable. Of the tools available, Kahun demonstrated the best overall performance.
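As a rough sanity check, the overall recall rate and its confidence interval can be recomputed from the counts reported in the abstract. The normal-approximation interval below is an assumption for illustration; the authors do not state which interval method they used.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Point estimate and normal-approximation 95% CI for a proportion.
    (The choice of interval method is an assumption, not taken from the paper.)"""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half), min(1.0, p + half)

# Overall recall rate: 2,280 pertinent findings retrieved out of 7,112 predefined.
recall, lo, hi = proportion_ci(2280, 7112)

# Efficiency rate: findings retrieved per question asked (2,280 / 4,877).
efficiency, _, _ = proportion_ci(2280, 4877)
```

With these counts the recall estimate rounds to 0.32 with an interval of roughly 0.31-0.33, matching the reported figures.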
Affiliation(s)
- Niv Ben-Shabat - Sackler Faculty of Medicine, Tel-Aviv University, Israel; Department of Medicine 'B', Sheba Medical Centre, Ramat-Gan, Israel; Zabludowicz Center for Autoimmune Diseases, Sheba Medical Centre, Ramat-Gan, Israel. Corresponding author: Department of Medicine 'B', Sheba Medical Center, Ramat Gan, 5262100, Israel
- Gal Sharvit - Sackler Faculty of Medicine, Tel-Aviv University, Israel
- Ben Meimis - Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Daniel Ben Joya - Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Ariel Sloma - Sackler Faculty of Medicine, Tel-Aviv University, Israel
- Aviv Shabat - Department of Pediatrics A, Edmond and Lily Safra Children's Hospital, Sheba Medical Center, Ramat-Gan, Israel
- Avishai M Tsur - Sackler Faculty of Medicine, Tel-Aviv University, Israel; Department of Medicine 'B', Sheba Medical Centre, Ramat-Gan, Israel; Zabludowicz Center for Autoimmune Diseases, Sheba Medical Centre, Ramat-Gan, Israel; Israel Defence Forces, Medical Corps, Tel Hashomer, Ramat Gan, Israel
- Abdulla Watad - Sackler Faculty of Medicine, Tel-Aviv University, Israel; Department of Medicine 'B', Sheba Medical Centre, Ramat-Gan, Israel; Zabludowicz Center for Autoimmune Diseases, Sheba Medical Centre, Ramat-Gan, Israel; Section of Musculoskeletal Disease, NIHR Leeds Musculoskeletal Biomedical Research Unit, Leeds Institute of Molecular Medicine, University of Leeds, Chapel Allerton Hospital, Leeds, UK
- Howard Amital - Sackler Faculty of Medicine, Tel-Aviv University, Israel; Department of Medicine 'B', Sheba Medical Centre, Ramat-Gan, Israel; Zabludowicz Center for Autoimmune Diseases, Sheba Medical Centre, Ramat-Gan, Israel
6
Milavec Kapun M, Drnovšek R, Rajkovič V, Rajkovič U. A multi-criteria decision model for assessing health and self-care ability. Central European Journal of Operations Research 2022;31:1-16. PMID: 36320642; PMCID: PMC9614758; DOI: 10.1007/s10100-022-00823-3.
Abstract
Population ageing, together with the greater prevalence of multimorbidity, adds to the need for, and complexity of, healthcare services. This makes it important to encourage and empower patients with chronic diseases to take care of themselves. An associated goal of such efforts is to significantly reduce the burden on healthcare systems and positively impact patients' health outcomes and quality of life. The paper presents a multi-criteria decision model for assessing the health and self-care of patients with chronic diseases in the home environment. The model is based on the DEX methodology and was tested on ten cases. The model assists with the timely recognition of relevant symptoms and signs in decision-making about health and self-care. It can be used to promote patients taking on an active role in caring for their health and well-being. The model could be integrated into self-care processes. It might also serve as a basis for an interprofessional approach to supporting older patients with chronic diseases living as fully and independently as possible in the environment in which they feel most comfortable.
Affiliation(s)
- Marija Milavec Kapun - Faculty of Health Sciences, University of Ljubljana, Zdravstvena pot 5, 1000 Ljubljana, Slovenia
- Rok Drnovšek - University Medical Centre Ljubljana, Zaloška cesta 2, 1000 Ljubljana, Slovenia; Faculty of Organizational Sciences, University of Maribor, Kidričeva cesta 55a, 4000 Kranj, Slovenia
- Vladislav Rajkovič - Faculty of Organizational Sciences, University of Maribor, Kidričeva cesta 55a, 4000 Kranj, Slovenia
- Uroš Rajkovič - Faculty of Organizational Sciences, University of Maribor, Kidričeva cesta 55a, 4000 Kranj, Slovenia
7
Liu VDM, Kaila M, Koskela T. User initiated symptom assessment with an electronic symptom checker: study protocol for mixed-methods validation. JMIR Res Protoc 2022. PMID: 37467041; PMCID: PMC10398552; DOI: 10.2196/41423.
Abstract
BACKGROUND The national Omaolo digital social welfare and health care service of Finland provides a symptom checker, which is a CE-marked (risk class IIa) medical device based on the Duodecim Clinical Decision Support EBMEDS software and manufactured by the government-owned DigiFinland Oy. Users of this service can perform their own triage by answering the questions in the symptom checker. On completing the symptom checker, the user receives a recommendation for action and a service assessment with appropriate guidance regarding their health problem, based on the specific symptom selected in the symptom checker. This allows users to be directed to appropriate health care services, regardless of time and place. OBJECTIVE This study describes the protocol for the mixed methods validation process of the symptom checker available in Omaolo digital services. METHODS This is a mixed methods study using quantitative and qualitative methods as part of the clinical validation process that takes place in primary health care centers in Finland. Each organization provides a space where the study and the nurse triage can be done in order to include an unscreened target population of users. The primary health care units provide walk-in services, where no prior phone call or contact is required. For the validation of the Omaolo symptom checker, case vignettes will be incorporated to supplement the triage accuracy of rare and acute cases that cannot be tested extensively in real-life settings. Vignettes are produced from a variety of clinical sources, and they test the symptom checker at different triage levels by using 1 standardized patient case example.
RESULTS This study plan underwent an ethics review by the regional permission, which was requested from each organization participating in the research, and an ethics committee statement was requested and granted from Pirkanmaa hospital district's ethics committee, which is in accordance with the University of Tampere's regulations. Of 964 clinical user-filled symptom checker assessments, 877 cases were fully completed with a triage result, and therefore, they met the requirements for clinical validation studies. The goal for sufficient data has been reached for most of the chief symptoms. Data collection was completed in September 2019, and the first feasibility and patient experience results were published by the end of 2020. Case vignettes have been identified and are to be completed before further testing the symptom checker. The analysis and reporting are estimated to be finalized in 2024. CONCLUSIONS The primary goals of this multimethod electronic symptom checker study are to assess safety and to provide crucial information regarding the accuracy and usability of the Omaolo electronic symptom checker. To our knowledge, this will be the first study to include real-life clinical cases along with case vignettes. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/41423.
8
Cotte F, Mueller T, Gilbert S, Blümke B, Multmeier J, Hirsch MC, Wicks P, Wolanski J, Tutschkow D, Schade Brittinger C, Timmermann L, Jerrentrup A. Safety of Triage Self-assessment Using a Symptom Assessment App for Walk-in Patients in the Emergency Care Setting: Observational Prospective Cross-sectional Study. JMIR Mhealth Uhealth 2022;10:e32340. PMID: 35343909; PMCID: PMC9002590; DOI: 10.2196/32340.
Abstract
Background Increasing use of emergency departments (EDs) by patients with low-urgency complaints, combined with limited availability of medical staff, results in extended waiting times and delayed care. Technological approaches could possibly increase efficiency by providing urgency advice and symptom assessments. Objective The purpose of this study is to evaluate the safety of urgency advice provided by a symptom assessment app, Ada, in an ED. Methods The study was conducted at the interdisciplinary ED of Marburg University Hospital, with data collection performed between August 2019 and March 2020. This study had a single-center, cross-sectional, prospective, observational design and included 378 patients. The app's urgency recommendation was compared with an established triage concept (the Manchester Triage System [MTS]), including patients from the lower 3 MTS categories only. For all patients who were undertriaged, an expert physician panel assessed the case to detect potential avoidable hazardous situations (AHSs). Results Of 378 participants, 344 (91%) were triaged the same or more conservatively and 34 (8.9%) were undertriaged by the app. Of these 34 undertriaged patients, 14 (3.7% of the total) were determined by the expert panel to have received safe advice, and 20 (5.3%) were considered potential AHSs. Therefore, the assessment could be considered safe in 94.7% (358/378) of the patients when compared with the MTS assessment. From the 3 lowest MTS categories, 43.4% (164/378) of patients were not considered emergency cases by the app and could have been safely treated by a general practitioner or would not have required a physician consultation at all. Conclusions The app provided urgency advice after patient self-triage with a high rate of safety and rates of undertriage and potentially hazardous triage equivalent to telephone triage by health care professionals, while still being more conservative than direct ED triage.
A large proportion of patients in the ED were not considered emergency cases, which could possibly relieve the ED burden if the app were used at home. Further research should be conducted in the at-home setting to evaluate this hypothesis. Trial Registration German Clinical Trial Registration DRKS00024909; https://www.drks.de/drks_web/navigate.do?navigationId=trial.HTML&TRIAL_ID=DRKS00024909
Affiliation(s)
- Fabienne Cotte - Charité Universitätsmedizin Berlin, Berlin, Germany; Department of Emergency Medicine, University Clinic Marburg, Philipps-University, Marburg, Germany; Ada Health GmbH, Berlin, Germany
- Tobias Mueller - Center for Unknown and Rare Diseases, UKGM GmbH, University Clinic Marburg, Philipps-University, Marburg, Germany
- Stephen Gilbert - Ada Health GmbH, Berlin, Germany; Else Kröner Fresenius Center for Digital Health, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Martin Christian Hirsch - Ada Health GmbH, Berlin, Germany; Institute of Artificial Intelligence, Philipps-University Marburg, Marburg, Germany
- Darja Tutschkow - Coordinating Center for Clinical Trials, Philipps University Marburg, Marburg, Germany
- Carmen Schade Brittinger - Coordinating Center for Clinical Trials, Philipps University Marburg, Marburg, Germany
- Lars Timmermann - Department of Neurology, University Hospital of Marburg, Marburg, Germany
- Andreas Jerrentrup - Department of Emergency Medicine, University Clinic Marburg, Philipps-University, Marburg, Germany
9
Chan F, Lai S, Pieterman M, Richardson L, Singh A, Peters J, Toy A, Piccininni C, Rouault T, Wong K, Quong JK, Wakabayashi AT, Pawelec-Brzychczy A. Performance of a new symptom checker in patient triage: Canadian cohort study. PLoS One 2021;16:e0260696. PMID: 34852016; PMCID: PMC8635379; DOI: 10.1371/journal.pone.0260696.
Abstract
BACKGROUND Computerized algorithms known as symptom checkers aim to help patients decide what to do should they have a new medical concern. However, despite widespread implementation, most studies on symptom checkers have involved simulated patients. Only limited evidence currently exists about symptom checker safety or accuracy when used by real patients. We developed a new prototype symptom checker and assessed its safety and accuracy in a prospective cohort of patients presenting to primary care and emergency departments with new medical concerns. METHODS A prospective cohort study was done to assess the prototype's performance. The cohort consisted of adult patients (≥16 years old) who presented to hospital emergency departments and family physician clinics. Primary outcomes were the safety and accuracy of triage recommendations to seek hospital care, seek primary care, or manage symptoms at home. RESULTS Data from 281 hospital patients and 300 clinic patients were collected and analyzed. Sensitivity to emergencies was 100% (10/10 encounters). Sensitivity to urgencies was 90% (73/81) and 97% (34/35) for hospital and primary care patients, respectively. The prototype was significantly more accurate than patients at triage (73% versus 58%, p<0.01). Compliance with the triage recommendations of this iteration of the symptom checker would have reduced hospital visits by 55% but caused potential harm in 2%-3% of cases from delayed care. INTERPRETATION The prototype symptom checker was superior to patients in deciding the most appropriate treatment setting for medical issues. This symptom checker could reduce a significant number of unnecessary hospital visits, with accuracy and safety outcomes comparable to existing data on telephone triage.
Affiliation(s)
- Forson Chan, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Simon Lai, University of British Columbia, Faculty of Medicine, Health Sciences Mall, Vancouver, Canada
- Marcus Pieterman, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Lisa Richardson, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Amanda Singh, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Jocelynn Peters, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Alex Toy, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Caroline Piccininni, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Taiysa Rouault, University of British Columbia, Faculty of Medicine, Health Sciences Mall, Vancouver, Canada
- Kristie Wong, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Adrienne T. Wakabayashi, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Anna Pawelec-Brzychczy, Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
10
Munsch N, Martin A, Gruarin S, Nateqi J, Abdarahmane I, Weingartner-Ortner R, Knapp B. Diagnostic Accuracy of Web-Based COVID-19 Symptom Checkers: Comparison Study. J Med Internet Res 2020; 22:e21299. PMID: 33001828; PMCID: PMC7541039; DOI: 10.2196/21299.
Abstract
Background A large number of web-based COVID-19 symptom checkers and chatbots have been developed; however, anecdotal evidence suggests that their conclusions are highly variable. To our knowledge, no study has evaluated the accuracy of COVID-19 symptom checkers in a statistically rigorous manner. Objective The aim of this study is to evaluate and compare the diagnostic accuracies of web-based COVID-19 symptom checkers. Methods We identified 10 web-based COVID-19 symptom checkers, all of which were included in the study. We evaluated the COVID-19 symptom checkers by assessing 50 COVID-19 case reports alongside 410 non–COVID-19 control cases. A bootstrapping method was used to counter the unbalanced sample sizes and obtain confidence intervals (CIs). Results are reported as sensitivity, specificity, F1 score, and Matthews correlation coefficient (MCC). Results The classification task between COVID-19–positive and COVID-19–negative for “high risk” cases among the 460 test cases yielded (sorted by F1 score): Symptoma (F1=0.92, MCC=0.85), Infermedica (F1=0.80, MCC=0.61), US Centers for Disease Control and Prevention (CDC) (F1=0.71, MCC=0.30), Babylon (F1=0.70, MCC=0.29), Cleveland Clinic (F1=0.40, MCC=0.07), Providence (F1=0.40, MCC=0.05), Apple (F1=0.29, MCC=-0.10), Docyet (F1=0.27, MCC=0.29), Ada (F1=0.24, MCC=0.27), and Your.MD (F1=0.24, MCC=0.27). For “high risk” and “medium risk” combined the performance was: Symptoma (F1=0.91, MCC=0.83), Infermedica (F1=0.80, MCC=0.61), Cleveland Clinic (F1=0.76, MCC=0.47), Providence (F1=0.75, MCC=0.45), Your.MD (F1=0.72, MCC=0.33), CDC (F1=0.71, MCC=0.30), Babylon (F1=0.70, MCC=0.29), Apple (F1=0.70, MCC=0.25), Ada (F1=0.42, MCC=0.03), and Docyet (F1=0.27, MCC=0.29). Conclusions We found that the number of correctly assessed COVID-19 and control cases varies considerably between symptom checkers, with different symptom checkers showing different strengths with respect to sensitivity and specificity.
A good balance between sensitivity and specificity was only achieved by two symptom checkers.
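The F1 score and Matthews correlation coefficient reported above can both be computed from binary confusion-matrix counts, and the paper's percentile-bootstrap CIs follow the same resampling idea. A minimal sketch; the counts below are hypothetical (they only mimic the 50-case/410-control shape of the study, not its data):

```python
import random
from math import sqrt

def f1_and_mcc(tp, fp, fn, tn):
    """F1 score and Matthews correlation coefficient from
    binary confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc

def bootstrap_f1_ci(pairs, n_boot=2000, seed=0):
    """Percentile-bootstrap 95% CI for F1: resample the
    (label, prediction) pairs with replacement, recompute F1 each time."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        tp = sum(1 for y, p in sample if y and p)
        fp = sum(1 for y, p in sample if not y and p)
        fn = sum(1 for y, p in sample if y and not p)
        tn = sum(1 for y, p in sample if not y and not p)
        scores.append(f1_and_mcc(tp, fp, fn, tn)[0])
    scores.sort()
    return scores[int(0.025 * n_boot)], scores[int(0.975 * n_boot)]

# Hypothetical checker: 45/50 cases flagged, 5/410 controls falsely flagged
f1, mcc = f1_and_mcc(tp=45, fp=5, fn=5, tn=405)
```

MCC is often preferred alongside F1 here because, with 410 controls against 50 cases, a checker that flags everything as "high risk" can still post a non-trivial F1 while its MCC collapses toward zero.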
Affiliation(s)
- Jama Nateqi, Medical Department, Symptoma, Attersee, Austria; Department of Internal Medicine, Paracelsus Medical University, Salzburg, Austria
11
Hill MG, Sim M, Mills B. The quality of diagnosis and triage advice provided by free online symptom checkers and apps in Australia. Med J Aust 2020; 212:514-519. PMID: 32391611; DOI: 10.5694/mja2.50600.
Abstract
OBJECTIVES To investigate the quality of diagnostic and triage advice provided by free website and mobile application symptom checkers (SCs) accessible in Australia. DESIGN 36 SCs providing medical diagnosis or triage advice were tested with 48 medical condition vignettes (1170 diagnosis vignette tests, 688 triage vignette tests). MAIN OUTCOME MEASURES Correct diagnosis advice (correct diagnosis listed first, or within the top three or top ten results); correct triage advice (appropriate triage category recommended). RESULTS The 27 diagnostic SCs listed the correct diagnosis first in 421 of 1170 SC vignette tests (36%; 95% CI, 31-42%), among the top three results in 606 tests (52%; 95% CI, 47-59%), and among the top ten results in 681 tests (58%; 95% CI, 53-65%). SCs using artificial intelligence algorithms listed the correct diagnosis first in 46% of tests (95% CI, 40-57%), compared with 32% (95% CI, 26-38%) for other SCs. The mean rate of first correct results for individual SCs ranged between 12% and 61%. The 19 triage SCs provided correct advice for 338 of 688 vignette tests (49%; 95% CI, 44-54%). Appropriate triage advice was more frequent for emergency care (63%; 95% CI, 52-71%) and urgent care vignette tests (56%; 95% CI, 52-75%) than for non-urgent care (30%; 95% CI, 11-39%) and self-care tests (40%; 95% CI, 26-49%). CONCLUSION The quality of diagnostic advice varied between SCs, and triage advice was generally risk-averse, often recommending more urgent care than appropriate.
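The "correct diagnosis first / top three / top ten" figures used in this study (and in similar vignette evaluations below) are top-k accuracies over ranked differential lists. A minimal sketch; the vignette data here is invented purely for illustration:

```python
def top_k_accuracy(results, k):
    """Fraction of vignettes whose correct diagnosis appears among
    the first k entries of the checker's ranked differential list."""
    hits = sum(1 for correct, differential in results if correct in differential[:k])
    return hits / len(results)

# Invented vignettes: (correct diagnosis, checker's ranked differential)
results = [
    ("otitis media", ["otitis media", "sinusitis", "mastoiditis"]),
    ("pulmonary embolism", ["pneumonia", "pulmonary embolism", "asthma"]),
    ("migraine", ["tension headache", "sinusitis", "cluster headache"]),
]
```

With these three invented vignettes, top-1 accuracy is 1/3 and top-3 accuracy is 2/3: only the first differential leads with the correct diagnosis, and the third never lists it at all.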
12
Powell J. Trust Me, I'm a Chatbot: How Artificial Intelligence in Health Care Fails the Turing Test. J Med Internet Res 2019; 21:e16222. PMID: 31661083; PMCID: PMC6914236; DOI: 10.2196/16222.
Abstract
Over the next decade, one issue which will dominate sociotechnical studies in health informatics is the extent to which the promise of artificial intelligence in health care will be realized, along with the social and ethical issues which accompany it. A useful thought experiment is the application of the Turing test to user-facing artificial intelligence systems in health care (such as chatbots or conversational agents). In this paper I argue that many medical decisions require value judgements and the doctor-patient relationship requires empathy and understanding to arrive at a shared decision, often handling large areas of uncertainty and balancing competing risks. Arguably, medicine requires wisdom more than intelligence, artificial or otherwise. Artificial intelligence therefore needs to supplement rather than replace medical professionals, and identifying the complementary positioning of artificial intelligence in medical consultation is a key challenge for the future. In health care, artificial intelligence needs to pass the implementation game, not the imitation game.
Affiliation(s)
- John Powell, Nuffield Department of Primary Care Health Sciences, Medical Sciences Division, University of Oxford, Oxford, United Kingdom
13
Chambers D, Cantrell AJ, Johnson M, Preston L, Baxter SK, Booth A, Turner J. Digital and online symptom checkers and health assessment/triage services for urgent health problems: systematic review. BMJ Open 2019; 9:e027743. PMID: 31375610; PMCID: PMC6688675; DOI: 10.1136/bmjopen-2018-027743.
Abstract
OBJECTIVES In England, the NHS111 service provides assessment and triage by telephone for urgent health problems. A digital version of this service has recently been introduced. We aimed to systematically review the evidence on digital and online symptom checkers and similar services. DESIGN Systematic review. DATA SOURCES We searched Medline, Embase, the Cochrane Library, Cumulative Index to Nursing and Allied Health Literature (CINAHL), Health Management Information Consortium, Web of Science and ACM Digital Library up to April 2018, supplemented by phrase searches for known symptom checkers and citation searching of key studies. ELIGIBILITY CRITERIA Studies of any design that evaluated a digital or online symptom checker or health assessment service for people seeking advice about an urgent health problem. DATA EXTRACTION AND SYNTHESIS Data extraction and quality assessment (using the Cochrane Collaboration version of QUADAS for diagnostic accuracy studies and the National Heart, Lung and Blood Institute tool for observational studies) were done by one reviewer with a sample checked for accuracy and consistency. We performed a narrative synthesis of the included studies structured around pre-defined research questions and key outcomes. RESULTS We included 29 publications (27 studies). Evidence on patient safety was weak. Diagnostic accuracy varied between different systems but was generally low. Algorithm-based triage tended to be more risk averse than that of health professionals. There was very limited evidence on patients' compliance with online triage advice. Study participants generally expressed high levels of satisfaction, although in mainly uncontrolled studies. Younger and more highly educated people were more likely to use these services. CONCLUSIONS The English 'digital 111' service has been implemented against a background of uncertainty around the likely impact on important outcomes. 
The health system may need to respond to short-term changes and/or shifts in demand. The popularity of online and digital services with younger and more educated people has implications for health equity. PROSPERO REGISTRATION NUMBER CRD42018093564.
Affiliation(s)
- Duncan Chambers, School of Health and Related Research, The University of Sheffield, Sheffield, UK
- Anna J Cantrell, School of Health and Related Research, The University of Sheffield, Sheffield, UK
- Maxine Johnson, School of Health and Related Research, The University of Sheffield, Sheffield, UK
- Louise Preston, School of Health and Related Research, The University of Sheffield, Sheffield, UK
- Susan K Baxter, School of Health and Related Research, The University of Sheffield, Sheffield, UK
- Andrew Booth, School of Health and Related Research, The University of Sheffield, Sheffield, UK
- Janette Turner, School of Health and Related Research, The University of Sheffield, Sheffield, UK
14
Chambers D, Cantrell A, Johnson M, Preston L, Baxter SK, Booth A, Turner J. Digital and online symptom checkers and assessment services for urgent care to inform a new digital platform: a systematic review. Health Services and Delivery Research 2019. DOI: 10.3310/hsdr07290.
Abstract
Background
Digital and online symptom checkers and assessment services are used by patients seeking guidance about health problems. NHS England is planning to introduce a digital platform (NHS111 Online) to operate alongside the NHS111 urgent-care telephone service. This review focuses on digital and online symptom checkers for urgent health problems.
Objectives
This systematic review was commissioned to provide NHS England with an independent review of previous research in this area to inform strategic decision-making and service design.
Data sources
Focused searches of seven bibliographic databases were performed and supplemented by phrase searching for names of symptom checker systems and citation searches of key included studies. The bibliographic databases searched were MEDLINE, EMBASE, The Cochrane Library, CINAHL (Cumulative Index to Nursing and Allied Health Literature), HMIC (Health Management Information Consortium), Web of Science and the Association of Computing Machinery (ACM) Digital Library, from inception up to April 2018.
Review methods
Brief inclusion criteria were (1) population – general population seeking information online or digitally to address an urgent health problem; (2) intervention – any online or digital service designed to assess symptoms, provide health advice and direct patients to appropriate services; and (3) comparator – telephone or face-to-face assessment, comparative performance in tests or simulations (studies with no comparator were included if they reported relevant outcomes). Outcomes of interest included safety, clinical effectiveness, costs or cost-effectiveness, diagnostic and triage accuracy, use of and contacts with health services, compliance with advice received, patient/carer satisfaction, and equity and inclusion. Inclusion was not restricted by study design. Screening studies for inclusion, data extraction and quality assessment were carried out by one reviewer with a sample checked for accuracy and consistency. Final decisions on study inclusion were taken by consensus of the review team. A narrative synthesis of the included studies was performed and structured around the predefined research questions and key outcomes. The overall strength of evidence for each outcome was classified as ‘stronger’, ‘weaker’, ‘conflicting’ or ‘insufficient’, based on study numbers and design.
Results
In total, 29 publications describing 27 studies were included. Studies were diverse in their design and methodology. The overall strength of the evidence was weak because it was largely based on observational studies and with a substantial component of non-peer-reviewed grey literature. There was little evidence to suggest that symptom checkers are unsafe, but studies evaluating their safety were generally short term and small scale. Diagnostic accuracy was highly variable between different systems but was generally low. Algorithm-based triage tended to be more risk averse than that of health professionals. Inconsistent evidence was found on effects on service use. There was very limited evidence on patients’ reactions to online triage advice. The studies showed that younger and more highly educated people are more likely to use these services. Study participants generally expressed high levels of satisfaction with digital and online triage services, albeit in uncontrolled studies.
Limitations
Findings from symptom checker systems for specific conditions may not be applicable to more general systems and vice versa. Studies of symptom checkers as part of electronic consultation systems in general practice were also included, which is a slightly different setting from a general ‘digital 111’ service. Most studies were screened by one reviewer.
Conclusions
Major uncertainties surround the probable impact of digital 111 services on most outcomes. It will be important to monitor and evaluate the services using all available data sources and by commissioning high-quality research.
Future work
Priorities for research include comparisons of different systems, rigorous economic evaluations and investigations of patient pathways.
Study registration
The study is registered as PROSPERO CRD42018093564.
Funding
The National Institute for Health Research Health Services and Delivery Research programme.
Affiliation(s)
- Duncan Chambers, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Anna Cantrell, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Maxine Johnson, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Louise Preston, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Susan K Baxter, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Andrew Booth, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Janette Turner, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
15
Yu SWY, Ma A, Tsang VHM, Chung LSW, Leung SC, Leung LP. Triage accuracy of online symptom checkers for Accident and Emergency Department patients. Hong Kong J Emerg Med 2019. DOI: 10.1177/1024907919842486.
Abstract
Background: Overutilisation of the Accident and Emergency Department is an increasingly serious healthcare challenge. Online symptom checkers could help alleviate this challenge by allowing patients to self-triage before visiting the Accident and Emergency Department. Objectives: This study aimed to assess the triage accuracy of online symptom checkers, which would help determine the potential roles of symptom checkers in an Accident and Emergency Department setting. Methods: A total of 100 random Accident and Emergency Department records were sampled from the Queen Mary Hospital in Hong Kong. The inclusion criteria were patients over the age of 18 attending the Queen Mary Hospital Accident and Emergency Department in 2016. Symptom checkers by Drugs.com and FamilyDoctor were selected as representative tools. One triage recommendation was generated by each symptom checker for each case record. Each symptom checker’s triage accuracy was then evaluated using three outcome measures: overall sensitivity, sensitivity for emergency cases and specificity for non-emergency cases, when compared with the triage categories assigned by the triage nurses. Results: The results showed that Drugs.com had a higher overall triage accuracy than FamilyDoctor (74% and 50%, respectively), but both checkers were insufficiently sensitive to emergency cases (70% and 45%, respectively), with low negative predictive values (43% and 24%, respectively). Conclusion: In their current states, symptom checkers are not yet suitable as alternatives to Accident and Emergency Department triage protocols due to their low overall sensitivities and negative predictive values. However, symptom checkers might serve as useful Accident and Emergency Department adjuncts in other ways, such as to provide more information prior to a patient’s arrival to streamline the triage and preparation process at the Accident and Emergency Department.
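The sensitivity, specificity and negative predictive value reported here come straight from the 2x2 table of a binary "emergency vs non-emergency" triage call. A minimal sketch; the counts are invented for illustration and are not the study's data:

```python
def triage_metrics(tp, fp, fn, tn):
    """Metrics for a binary emergency-triage decision, where 'positive'
    means the checker flagged the case as an emergency.
    tp: true emergencies flagged    fn: emergencies missed
    fp: non-emergencies flagged     tn: non-emergencies correctly cleared"""
    sensitivity = tp / (tp + fn)  # share of real emergencies flagged
    specificity = tn / (tn + fp)  # share of non-emergencies cleared
    npv = tn / (tn + fn)          # how trustworthy a "not urgent" verdict is
    return sensitivity, specificity, npv

# Invented 2x2 counts for a 100-record sample
sens, spec, npv = triage_metrics(tp=14, fp=32, fn=6, tn=48)
```

The NPV is the metric that drives this study's conclusion: with NPVs of 43% and 24%, more than half of the cases these checkers labelled "non-emergency" actually needed emergency care, which is why the authors judge them unsafe as triage replacements.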
Affiliation(s)
- Stephanie Wing Yin Yu, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR
- Andre Ma, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR
- Vivian Hiu Man Tsang, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR
- Lulu Suet Wing Chung, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR
- Siu-Chung Leung, Accident and Emergency Department, Queen Mary Hospital, Pokfulam, Hong Kong SAR
- Ling-Pong Leung, Emergency Medicine Unit, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR
16
The Role of a Decision Support System in Back Pain Diagnoses: A Pilot Study. Biomed Research International 2019; 2019:1314028. PMID: 31019964; PMCID: PMC6452564; DOI: 10.1155/2019/1314028.
Abstract
The main goal of this study was to investigate the concordance between a decision support system (DSS) and the recommendations of spinal surgeons regarding back pain. 111 patients completed the decision support system, and each was also diagnosed by a spinal surgeon. The results showed a significant, moderate association between the DSS and the diagnosis of the medical doctor. In addition, in almost 50% of the cases the treatment recommendations were concordant, and overestimation occurred more often than underestimation. The results are discussed in relation to the “symptom checker” literature and the need for further evaluation.
17
Verzantvoort NCM, Teunis T, Verheij TJM, van der Velden AW. Self-triage for acute primary care via a smartphone application: Practical, safe and efficient? PLoS One 2018; 13:e0199284. PMID: 29944708; PMCID: PMC6019095; DOI: 10.1371/journal.pone.0199284.
Abstract
Background Since the start of out-of-hours (OOH) primary care clinics, the number of patient consultations has been increasing. Triage plays an important role in patient selection for a consultation, and in providing reassurance and self-management advice. Objective We aimed to investigate whether the smartphone application “Should I see a doctor?” (in Dutch: “Moet ik naar de dokter?”) could guide patients in appropriate consultation at OOH clinics by focusing on four topics: 1) app usage, 2) user satisfaction, 3) whether the app provides the correct advice, and 4) whether users intend to follow the advice. Design and setting A prospective, cross-sectional study amongst app users in a routine primary care setting. Methods The app is a self-triage tool for acute primary care. A built-in questionnaire asked users about the app’s clarity, their satisfaction and whether they intended to follow the app’s advice (n = 4456). A convenience sample of users was phoned by a triage nurse (reference standard) to evaluate whether the app’s advice corresponded with the outcome of the triage call (n = 126). Suggestions of phoned participants were listed. Results The app was used by patients of all ages, also by parents for their children, and mostly for abdominal pain, skin disorders and cough. 58% of users received the advice to contact the clinic, 34% a self-care advice and 8% to wait-and-see. 65% of users intended to follow the app’s advice. The app was rated as ‘neutral’ to ‘very clear’ by 87%, and 89% were ‘neutral’ to ‘very satisfied’. In 81% of participants the app’s advice corresponded to the triage call outcome, with sensitivity, specificity, positive- and negative predictive values of 84%, 74%, 88% and 67%, respectively. Conclusion The app “Should I see a doctor?” could be a valuable tool to guide patients in contacting the OOH primary care clinic for acute care.
To further improve the app’s safety and efficiency, triaging multiple symptoms should be facilitated, and more information should be provided to patients receiving a wait-and-see advice.
Affiliation(s)
- Natascha C. M. Verzantvoort, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands
- Teun Teunis, Plastic, Reconstructive and Hand Surgery, University Medical Center Utrecht, Utrecht, the Netherlands
- Theo J. M. Verheij, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands
- Alike W. van der Velden, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands
18
Semigran HL, Linder JA, Gidengil C, Mehrotra A. Evaluation of symptom checkers for self diagnosis and triage: audit study. BMJ 2015; 351:h3480. PMID: 26157077; PMCID: PMC4496786; DOI: 10.1136/bmj.h3480.
Abstract
OBJECTIVE To determine the diagnostic and triage accuracy of online symptom checkers (tools that use computer algorithms to help patients with self diagnosis or self triage). DESIGN Audit study. SETTING Publicly available, free symptom checkers. PARTICIPANTS 23 symptom checkers that were in English and provided advice across a range of conditions. 45 standardized patient vignettes were compiled and equally divided into three categories of triage urgency: emergent care required (for example, pulmonary embolism), non-emergent care reasonable (for example, otitis media), and self care reasonable (for example, viral upper respiratory tract infection). MAIN OUTCOME MEASURES For symptom checkers that provided a diagnosis, our main outcomes were whether the symptom checker listed the correct diagnosis first or within the first 20 potential diagnoses (n=770 standardized patient evaluations). For symptom checkers that provided a triage recommendation, our main outcomes were whether the symptom checker correctly recommended emergent care, non-emergent care, or self care (n=532 standardized patient evaluations). RESULTS The 23 symptom checkers provided the correct diagnosis first in 34% (95% confidence interval 31% to 37%) of standardized patient evaluations, listed the correct diagnosis within the top 20 diagnoses given in 58% (55% to 62%) of standardized patient evaluations, and provided the appropriate triage advice in 57% (52% to 61%) of standardized patient evaluations. Triage performance varied by urgency of condition, with appropriate triage advice provided in 80% (95% confidence interval 75% to 86%) of emergent cases, 55% (47% to 63%) of non-emergent cases, and 33% (26% to 40%) of self care cases (P<0.001). Performance on appropriate triage advice across the 23 individual symptom checkers ranged from 33% (95% confidence interval 19% to 48%) to 78% (64% to 91%) of standardized patient evaluations. CONCLUSIONS Symptom checkers had deficits in both triage and diagnosis. 
Triage advice from symptom checkers is generally risk averse, encouraging users to seek care for conditions where self care is reasonable.
Affiliation(s)
- Hannah L Semigran, Department of Health Care Policy, Harvard Medical School, Boston, MA 02115, USA
- Jeffrey A Linder, Division of General Medicine and Primary Care, Brigham and Women's Hospital & Harvard Medical School, Boston, MA, USA
- Courtney Gidengil, Division of Infectious Diseases, Boston Children's Hospital, Boston, MA, USA; RAND Corporation, Boston, MA, USA
- Ateev Mehrotra, Department of Health Care Policy, Harvard Medical School, Boston, MA 02115, USA; Division of General Internal Medicine and Primary Care, Beth Israel Deaconess Medical Center, Boston, MA, USA