1
Chamorro-Delmo J, Lopez-Fernandez O, Villasante-Soriano P, Antonio PPD, Álvarez-García R, Porras-Segovia A, Baca-García E. A feasibility study of a Smart screening tool for people at risk of mental health issues: Response rate, and sociodemographic and clinical factors. J Affect Disord 2024;362:755-761. [PMID: 39029676; DOI: 10.1016/j.jad.2024.07.067]
Abstract
BACKGROUND Empirical data on the effectiveness of Smart Screening tools for identifying mental health problems are scarce. This study aims to explore the response rate of patients to such a tool, describe their sociodemographic and healthcare characteristics, and assess the tool's ability to detect potential mental health diagnoses. METHODS The study employed an online survey delivered through the patient portals of four teaching hospitals in Madrid. The sample included 8749 patients, of whom 66.77% were female and 31.21% were middle-aged adults. RESULTS In total, 60.56% responded to the Smart Screening tool. Respondents were predominantly middle-aged women who had been contacted by mental health services multiple times but had not exhibited suicidal behaviour. These patients demonstrated a higher appointment attendance rate and generated low healthcare costs. The tool most often identified probable low-level depression and mild anxiety (72.16%), and individuals aged 50-65 exhibited higher levels of mental health problems, such as psychosis and suicidality, although not all of these results were significant with respect to previous mental health diagnoses. LIMITATIONS The Smart Screening tool collects anonymous online data through short questionnaires, to which sophisticated algorithms are applied to determine probable mental health diagnoses. CONCLUSIONS The response rate to the Smart Screening tool was higher than in previous studies. The typical respondent was a middle-aged or older woman with moderate mental health problems, although suicidality was also identified. Future research should focus on those who did not respond to the tool and explore the link between previous psychiatric diagnoses and the accuracy of the Smart Screening tool.
Affiliation(s)
- Jaime Chamorro-Delmo
- Department of Psychiatry, University Hospital Jimenez Diaz Foundation, Madrid, Spain
- Olatz Lopez-Fernandez
- Department of Psychiatry, University Hospital Jimenez Diaz Foundation, Madrid, Spain; Department of Personality, Assessment and Clinical Psychology, Faculty of Psychology, Universidad Complutense de Madrid, Madrid, Spain; Faculty of Education and Psychology, Universidad Francisco de Vitoria, Madrid, Spain; Faculty of Psychology, Centro de Enseñanza Superior Cardenal Cisneros, Universidad Complutense de Madrid, Madrid, Spain
- Paula Villasante-Soriano
- Department of Psychiatry, University Hospital Jimenez Diaz Foundation, Madrid, Spain; ROSAN International Consulting & Research, Valencia, Spain
- Alejandro Porras-Segovia
- Department of Psychiatry, University Hospital Rey Juan Carlos, Móstoles, Spain; Translational Psychiatry Research Group, Health Research Institute Jimenez Diaz Foundation, Madrid, Spain
- Enrique Baca-García
- Department of Psychiatry, University Hospital Jimenez Diaz Foundation, Madrid, Spain; Department of Psychiatry, University Hospital Rey Juan Carlos, Móstoles, Spain; Translational Psychiatry Research Group, Health Research Institute Jimenez Diaz Foundation, Madrid, Spain; Department of Psychiatry, University Hospital Infanta Elena, Valdemoro, Spain; CIBERSAM, research group CB/07/09/0025, Madrid, Spain; Nimes University Hospital, Nimes, France; Department of Psychiatry, Universidad Autónoma de Madrid, Madrid, Spain; Department of Psychiatry, Central Hospital de Villalba, Villalba, Spain
2
Hindelang M, Sitaru S, Zink A. Transforming Health Care Through Chatbots for Medical History-Taking and Future Directions: Comprehensive Systematic Review. JMIR Med Inform 2024;12:e56628. [PMID: 39207827; PMCID: PMC11393511; DOI: 10.2196/56628]
Abstract
BACKGROUND The integration of artificial intelligence and chatbot technology in health care has attracted significant attention due to its potential to improve patient care and streamline history-taking. As artificial intelligence-driven conversational agents, chatbots offer the opportunity to revolutionize history-taking, necessitating a comprehensive examination of their impact on medical practice. OBJECTIVE This systematic review aims to assess the role, effectiveness, usability, and patient acceptance of chatbots in medical history-taking. It also examines potential challenges and future opportunities for integration into clinical practice. METHODS A systematic search of PubMed, Embase, MEDLINE (via Ovid), CENTRAL, Scopus, and Open Science covered studies through July 2024. The inclusion and exclusion criteria for the studies reviewed were based on the PICOS (participants, interventions, comparators, outcomes, and study design) framework. The population included individuals using health care chatbots for medical history-taking. Interventions focused on chatbots designed to facilitate medical history-taking. The outcomes of interest were the feasibility, acceptance, and usability of chatbot-based medical history-taking; studies not reporting on these outcomes were excluded. All study designs except conference papers were eligible for inclusion, and only English-language studies were considered. There were no specific restrictions on study duration. Key search terms included "chatbot*," "conversational agent*," "virtual assistant," "artificial intelligence chatbot," "medical history," and "history-taking." The quality of observational studies was classified using the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) criteria (eg, sample size, design, data collection, and follow-up), and the RoB 2 (Risk of Bias 2) tool was used to assess the risk of bias in randomized controlled trials (RCTs). RESULTS The review included 15 observational studies and 3 RCTs and synthesized evidence from different medical fields and populations. Chatbots systematically collect information through targeted queries and data retrieval, improving patient engagement and satisfaction. The results show that chatbots have great potential for history-taking and that the efficiency and accessibility of the health care system can be improved by 24/7 automated data collection. Bias assessments revealed that of the 15 observational studies, 5 (33%) were of high quality, 5 (33%) of moderate quality, and 5 (33%) of low quality. Of the RCTs, 2 had a low risk of bias, while 1 had a high risk. CONCLUSIONS This systematic review provides critical insights into the potential benefits and challenges of using chatbots for medical history-taking. The included studies showed that chatbots can increase patient engagement, streamline data collection, and improve health care decision-making. For effective integration into clinical practice, it is crucial to design user-friendly interfaces, ensure robust data security, and maintain empathetic patient-physician interactions. Future research should focus on refining chatbot algorithms, improving their emotional intelligence, and extending their application to different health care settings to realize their full potential in modern medicine. TRIAL REGISTRATION PROSPERO CRD42023410312; www.crd.york.ac.uk/prospero.
Affiliation(s)
- Michael Hindelang
- Department of Dermatology and Allergy, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
- Pettenkofer School of Public Health, Munich, Germany
- Institute for Medical Information Processing, Biometry and Epidemiology (IBE), Faculty of Medicine, Ludwig-Maximilian University, LMU, Munich, Germany
- Sebastian Sitaru
- Department of Dermatology and Allergy, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
- Alexander Zink
- Department of Dermatology and Allergy, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
- Division of Dermatology and Venereology, Department of Medicine Solna, Karolinska Institute, Stockholm, Sweden
3
Szumilas D, Ochmann A, Zięba K, Bartoszewicz B, Kubrak A, Makuch S, Agrawal S, Mazur G, Chudek J. Evaluation of AI-Driven LabTest Checker for Diagnostic Accuracy and Safety: Prospective Cohort Study. JMIR Med Inform 2024;12:e57162. [PMID: 39149851; PMCID: PMC11337233; DOI: 10.2196/57162]
Abstract
Background In recent years, the implementation of artificial intelligence (AI) in health care has been progressively transforming medical fields, with clinical decision support systems (CDSSs) as a notable application. Laboratory tests are vital for accurate diagnoses, but the increasing reliance on them presents challenges. The need for effective strategies for managing laboratory test interpretation is evident from the millions of monthly searches on the significance of test results. However, as the potential role of CDSSs in laboratory diagnostics grows, more research is needed to explore this area. Objective The primary objective of our study was to assess the accuracy and safety of LabTest Checker (LTC), a CDSS designed to support medical diagnoses by analyzing both laboratory test results and patients' medical histories. Methods This cohort study used a prospective data collection approach. A total of 101 patients aged ≥18 years, in stable condition, and requiring comprehensive diagnosis were enrolled. A panel of blood laboratory tests was conducted for each participant, and participants used LTC to interpret the results. The accuracy and safety of the tool were assessed by comparing the AI-generated suggestions to the recommendations of an experienced physician (consultant), which were considered the gold standard. Results The system achieved 74.3% accuracy, with 100% sensitivity for emergency cases and 92.3% sensitivity for urgent cases. It potentially reduced unnecessary medical visits by 41.6% (42/101) and achieved 82.9% accuracy in identifying underlying pathologies. Conclusions This study underscores the transformative potential of AI-based CDSSs in laboratory diagnostics, contributing to enhanced patient care, more efficient health care systems, and improved medical outcomes. LTC's performance evaluation highlights the advancing role of AI in laboratory medicine.
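The headline numbers above are simple functions of the agreement table between the tool's advice and the consultant's gold standard. A minimal sketch of the two calculations, using hypothetical counts (the paper's full confusion matrix is not reproduced here):

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of gold-standard positive cases that the tool flagged."""
    return tp / (tp + fn)

def accuracy(correct: int, total: int) -> float:
    """Share of all cases where the tool agreed with the consultant."""
    return correct / total

# Hypothetical counts for illustration: 12 true emergencies, none missed,
# and 75 of 101 overall recommendations matching the consultant's.
print(sensitivity(tp=12, fn=0))         # 1.0    -> "100% sensitivity"
print(accuracy(correct=75, total=101))  # ~0.743 -> "74.3% accuracy"
```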
Affiliation(s)
- Dawid Szumilas
- Department of Internal Medicine and Oncological Chemotherapy, Medical University of Silesia, Katowice, Poland
- Anna Ochmann
- Department of Internal Medicine and Oncological Chemotherapy, Medical University of Silesia, Katowice, Poland
- Katarzyna Zięba
- Department of Internal Medicine and Oncological Chemotherapy, Medical University of Silesia, Katowice, Poland
- Sebastian Makuch
- Department of Clinical and Experimental Pathology, Wroclaw Medical University, Wroclaw, Poland
- Grzegorz Mazur
- Labplus R&D, Wroclaw, Poland
- Department and Clinic of Internal Medicine, Occupational Diseases, Hypertension and Clinical Oncology, Wroclaw Medical University, Wroclaw, Poland
- Jerzy Chudek
- Department of Internal Medicine and Oncological Chemotherapy, Medical University of Silesia, Katowice, Poland
4
Knauer J, Baumeister H, Schmitt A, Terhorst Y. Acceptance of smart sensing, its determinants, and the efficacy of an acceptance-facilitating intervention in people with diabetes: results from a randomized controlled trial. Front Digit Health 2024;6:1352762. [PMID: 38863954; PMCID: PMC11165071; DOI: 10.3389/fdgth.2024.1352762]
Abstract
Background Mental health problems are prevalent among people with diabetes, yet often under-diagnosed. Smart sensing, utilizing passively collected digital markers through digital devices, is an innovative diagnostic approach that can support mental health screening and intervention. However, the acceptance of this technology remains unclear. Grounded in the Unified Theory of Acceptance and Use of Technology (UTAUT), this study aimed to investigate (1) the acceptance of smart sensing in a diabetes sample, (2) the determinants of acceptance, and (3) the effectiveness of an acceptance-facilitating intervention (AFI). Methods A total of N = 132 participants with diabetes were randomized to an intervention group (IG) or a control group (CG). The IG received a video-based AFI on smart sensing and the CG received an educational video on mindfulness. Acceptance and its potential determinants were assessed through an online questionnaire as a single post-measurement. Self-reported behavioral intention, interest in using a smart sensing application, and installation of such an application were assessed as outcomes. The data were analyzed using latent structural equation modeling (SEM) and t-tests. Results Overall acceptance of smart sensing was moderate (M = 12.64, SD = 4.24), with 27.8% of participants showing low, 40.3% moderate, and 31.9% high acceptance. Performance expectancy (γ = 0.64, p < 0.001), social influence (γ = 0.23, p = .032), and trust (γ = 0.27, p = .040) were identified as potential determinants of acceptance, explaining 84% of the variance. The SEM model fit was acceptable (RMSEA = 0.073, SRMR = 0.059). The intervention had no significant effect on acceptance (γ = 0.25, 95% CI -0.16 to 0.65, p = .233), interest (OR = 0.76, 95% CI 0.38 to 1.52, p = .445), or app installation rates (OR = 1.13, 95% CI 0.47 to 2.73, p = .777). Discussion The high variance in acceptance supports a need for acceptance-facilitating procedures. The analyzed model supported performance expectancy, social influence, and trust as potential determinants of smart sensing acceptance; performance expectancy (ie, perceived benefit) was the most influential factor. The AFI had no significant effect. Future research should further explore factors contributing to smart sensing acceptance and address implementation barriers.
Affiliation(s)
- Johannes Knauer
- Department of Clinical Psychology and Psychotherapy, Institute of Psychology and Education, University Ulm, Ulm, Germany
- Harald Baumeister
- Department of Clinical Psychology and Psychotherapy, Institute of Psychology and Education, University Ulm, Ulm, Germany
- Andreas Schmitt
- Research Institute Diabetes Academy Mergentheim (FIDAM), Bad Mergentheim, Germany
- Yannik Terhorst
- Department of Psychological Methods and Assessment, Ludwig-Maximilian University Munich, Munich, Germany
5
Hammoud M, Douglas S, Darmach M, Alawneh S, Sanyal S, Kanbour Y. Evaluating the Diagnostic Performance of Symptom Checkers: Clinical Vignette Study. JMIR AI 2024;3:e46875. [PMID: 38875676; PMCID: PMC11091811; DOI: 10.2196/46875]
Abstract
BACKGROUND Medical self-diagnostic tools (or symptom checkers) are becoming an integral part of digital health and our daily lives, whereby patients increasingly use them to identify the underlying causes of their symptoms. As such, it is essential to rigorously investigate and comprehensively report the diagnostic performance of symptom checkers using standard clinical and scientific approaches. OBJECTIVE This study aims to evaluate and report the accuracies of several known and new symptom checkers using a standard and transparent methodology, which allows the scientific community to cross-validate and reproduce the reported results, a step much needed in health informatics. METHODS We propose a 4-stage experimentation methodology that capitalizes on the standard clinical vignette approach to evaluate 6 symptom checkers. To this end, we developed and peer-reviewed 400 vignettes, each approved by at least 5 out of 7 independent and experienced primary care physicians. To establish a frame of reference and interpret the results of symptom checkers accordingly, we further compared the best-performing symptom checker against 3 primary care physicians with an average experience of 16.6 (SD 9.42) years. To measure accuracy, we used 7 standard metrics, including M1 as a measure of a symptom checker's or a physician's ability to return a vignette's main diagnosis at the top of their differential list, F1-score as a trade-off measure between recall and precision, and Normalized Discounted Cumulative Gain (NDCG) as a measure of a differential list's ranking quality, among others. RESULTS The diagnostic accuracies of the 6 tested symptom checkers varied significantly. For instance, the differences between the best-performing and worst-performing symptom checkers (ie, the ranges) in M1, F1-score, and NDCG were 65.3%, 39.2%, and 74.2%, respectively. Among the participating human physicians, the corresponding M1, F1-score, and NDCG ranges were 22.8%, 15.3%, and 21.3%. When compared against each other, physicians outperformed the best-performing symptom checker by an average of 1.2% on F1-score, whereas the best-performing symptom checker outperformed physicians by averages of 10.2% and 25.1% on M1 and NDCG, respectively. CONCLUSIONS The performance variation between symptom checkers is substantial, suggesting that symptom checkers cannot be treated as a single entity. On a different note, the best-performing symptom checker was an artificial intelligence (AI)-based one, shedding light on the promise of AI for improving the diagnostic capabilities of symptom checkers, especially as AI continues to advance rapidly.
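Of the metrics named above, M1 and NDCG are the least self-explanatory. The sketch below shows one plausible reading for a single vignette, assuming binary relevance (only the vignette's main diagnosis counts as relevant), since the abstract does not spell out the exact grading scheme:

```python
import math

def m1(differential, main_diagnosis):
    """M1: is the main diagnosis ranked first in the differential list?"""
    return 1.0 if differential and differential[0] == main_diagnosis else 0.0

def ndcg(differential, main_diagnosis, k=5):
    """NDCG with binary relevance: rewards placing the main diagnosis
    near the top of the list; 1.0 means it is ranked first."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank 0 contributes 1/log2(2) = 1.0
        for rank, dx in enumerate(differential[:k])
        if dx == main_diagnosis
    )
    return dcg / (1.0 / math.log2(2))  # ideal DCG: main diagnosis first

# Hypothetical vignette: the checker ranks the true diagnosis second
differential = ["gastroenteritis", "acute appendicitis", "renal colic"]
print(m1(differential, "acute appendicitis"))    # 0.0
print(ndcg(differential, "acute appendicitis"))  # ~0.63
```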
6
Miller NE, North F, Curry EN, Thompson MC, Pecina JL. Recommendation endpoints and safety of an online self-triage for depression symptoms. J Telemed Telecare 2024. [PMID: 38646705; DOI: 10.1177/1357633X241245161]
Abstract
INTRODUCTION Online symptom checkers are a way to address patient concerns and potentially offload a burdened healthcare system. However, the safety outcomes of self-triage are unknown, so we reviewed the triage recommendations and outcomes of our institution's depression symptom checker. METHODS We examined endpoint recommendations and follow-up encounters within seven days between 2 December 2021 and 13 December 2022. For patients with an emergency department visit or hospitalization within seven days of self-triaging, we manually reviewed the electronic health record to determine whether the visit was related to depression, suicidal ideation, or suicide attempt. Charts were also reviewed for deaths within seven days of self-triage. RESULTS There were 287 unique encounters from 263 unique patients. In 86.1% (247/287) of encounters, the endpoint was an instruction to call nurse triage; in 3.1% (9/287), it was to seek emergency care. Only 20.2% (58/287) followed the recommendations given. Of the 229 encounters in which the endpoint recommendations were not followed, 121 (52.8%) had some type of follow-up within seven days. Nearly 11% (31/287) were triaged to endpoints not requiring urgent contact, and 9.1% (26/287) to an endpoint that did not need any healthcare team input. No patients died during the study period. CONCLUSIONS Most patients did not follow the recommendations for follow-up care, although most ultimately received care within seven days. Self-triage appears to appropriately sort patients with depressed mood to emergency care. Online self-triage tools for depression have the potential to safely offload some work from clinic personnel.
Affiliation(s)
- Frederick North
- Division of Community Internal Medicine, Geriatrics, and Palliative Care, Mayo Clinic, Rochester, MN, USA
- Matthew C Thompson
- Mayo Clinic Enterprise Office of Access Management, Mayo Clinic, Rochester, MN, USA
7
Sarkar S, Gaur M, Chen LK, Garg M, Srivastava B. A review of the explainability and safety of conversational agents for mental health to identify avenues for improvement. Front Artif Intell 2023;6:1229805. [PMID: 37899961; PMCID: PMC10601652; DOI: 10.3389/frai.2023.1229805]
Abstract
Virtual Mental Health Assistants (VMHAs) continuously evolve to support the overloaded global healthcare system, which receives approximately 60 million primary care visits and 6 million emergency room visits annually. These systems, developed by clinical psychologists, psychiatrists, and AI researchers, are designed to aid in Cognitive Behavioral Therapy (CBT). The main focus of VMHAs is to provide relevant information to mental health professionals (MHPs) and engage in meaningful conversations to support individuals with mental health conditions. However, certain gaps prevent VMHAs from fully delivering on their promise during active communications. One such gap is their inability to explain their decisions to patients and MHPs, making conversations less trustworthy. VMHAs can also be prone to providing unsafe responses to patient queries, further undermining their reliability. In this review, we assess the current state of VMHAs with respect to user-level explainability and safety, a set of properties desirable for the broader adoption of VMHAs. This includes an examination of ChatGPT, a conversational agent built on the GPT-3.5 and GPT-4 models, which has been proposed for use in providing mental health services. By harnessing the collaborative and impactful contributions of AI, natural language processing, and the MHP community, the review identifies opportunities for technological progress in VMHAs to ensure their capabilities include explainable and safe behaviors. It also emphasizes the importance of measures to guarantee that these advancements align with the promise of fostering trustworthy conversations.
Affiliation(s)
- Surjodeep Sarkar
- Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD, United States
- Manas Gaur
- Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD, United States
- Lujie Karen Chen
- Department of Information Systems, University of Maryland, Baltimore County, Baltimore, MD, United States
- Muskan Garg
- Department of AI & Informatics, Mayo Clinic, Rochester, MN, United States
- Biplav Srivastava
- AI Institute, University of South Carolina, Columbia, SC, United States
8
Määttä J, Lindell R, Hayward N, Martikainen S, Honkanen K, Inkala M, Hirvonen P, Martikainen TJ. Diagnostic Performance, Triage Safety, and Usability of a Clinical Decision Support System Within a University Hospital Emergency Department: Algorithm Performance and Usability Study. JMIR Med Inform 2023;11:e46760. [PMID: 37656018; PMCID: PMC10501486; DOI: 10.2196/46760]
Abstract
Background Computerized clinical decision support systems (CDSSs) are increasingly adopted in health care to optimize resources and streamline patient flow. However, they often lack scientific validation against standard medical care. Objective The purpose of this study was to assess the performance, safety, and usability of a CDSS in a university hospital emergency department setting in Kuopio, Finland. Methods Patients entering the emergency department were asked to voluntarily participate in this study. Patients aged 17 years or younger, patients with cognitive impairments, and patients who entered the unit in an ambulance or with the need for immediate care were excluded. Patients completed the CDSS web-based form and usability questionnaire while waiting for the triage nurse's evaluation. The CDSS data were anonymized and did not affect the patients' usual evaluation or treatment. Retrospectively, 2 medical doctors evaluated the urgency of each patient's condition by using the triage nurse's information, and urgent and nonurgent groups were created. The International Statistical Classification of Diseases, Tenth Revision diagnoses were collected from the electronic health records. Usability was assessed by using a positive version of the System Usability Scale questionnaire. Results In total, our analyses included 248 patients. Regarding urgency, the mean sensitivities were 85% for urgent and 19% for nonurgent cases when the CDSS evaluations were compared with those of the physicians; between the two physicians, the mean sensitivities were 85% and 35%, respectively. Our CDSS did not miss any cases that physicians evaluated as emergencies; all such cases were evaluated as either urgent or emergency cases by the CDSS. In differential diagnosis, the CDSS had an exact match accuracy of 45.5% (97/213). Usability was good, with a mean System Usability Scale score of 78.2 (SD 16.8). Conclusions In a university hospital emergency department setting with a large real-world population, our CDSS was equally as sensitive as physicians in urgent patient cases and had acceptable differential diagnosis accuracy, with good usability. These results suggest that this CDSS can be safely assessed further in a real-world setting. A CDSS could accelerate triage by collecting patient-reported data before the initial consultation and categorizing patient cases as urgent or nonurgent upon arrival at the emergency department.
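The usability figure above is a System Usability Scale (SUS) score. As a hedged sketch, assuming the positive-only SUS variant used here is scored like the standard scale (10 items rated 1-5, each contributing 0-4 points, the sum scaled to 0-100):

```python
def sus_score(ratings):
    """SUS score for the all-positively-worded variant: each of the
    10 items, rated 1 (strongly disagree) to 5 (strongly agree),
    contributes (rating - 1) points; the sum is scaled to 0-100."""
    assert len(ratings) == 10 and all(1 <= r <= 5 for r in ratings)
    return sum(r - 1 for r in ratings) * 2.5

# Hypothetical respondent who mostly agrees with the usability statements
print(sus_score([4, 5, 4, 4, 5, 4, 4, 3, 4, 5]))  # 80.0
```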
Affiliation(s)
- Rony Lindell
- Klinik Healthcare Solutions Oy, Helsinki, Finland
- Nick Hayward
- Klinik Healthcare Solutions Oy, Helsinki, Finland
- Susanna Martikainen
- Department of Health and Social Management, University of Eastern Finland, Kuopio, Finland
- Katri Honkanen
- Department of Emergency Care, Kuopio University Hospital, Kuopio, Finland
- Matias Inkala
- Department of Emergency Care, Kuopio University Hospital, Kuopio, Finland
- Tero J Martikainen
- Department of Emergency Care, Kuopio University Hospital, Kuopio, Finland
9
Terhorst Y, Weilbacher N, Suda C, Simon L, Messner EM, Sander LB, Baumeister H. Acceptance of smart sensing: a barrier to implementation-results from a randomized controlled trial. Front Digit Health 2023;5:1075266. [PMID: 37519894; PMCID: PMC10373890; DOI: 10.3389/fdgth.2023.1075266]
Abstract
Background Accurate and timely diagnostics are essential for effective mental healthcare. Given a resource- and time-limited mental healthcare system, novel digital and scalable diagnostic approaches such as smart sensing, which utilizes digital markers collected via sensors from digital devices, are being explored. While the predictive accuracy of smart sensing is promising, its acceptance remains unclear. Based on the unified theory of acceptance and use of technology, the present study investigated (1) the effectiveness of an acceptance facilitating intervention (AFI), (2) the determinants of acceptance, and (3) the acceptance of adults toward smart sensing. Methods The participants (N = 202) were randomly assigned to a control group (CG) or intervention group (IG). The IG received a video AFI on smart sensing, and the CG a video on mindfulness. A reliable online questionnaire was used to assess acceptance, performance expectancy, effort expectancy, facilitating conditions, social influence, and trust. Self-reported interest in using and installation of a smart sensing app were assessed as behavioral outcomes. The intervention effects on acceptance were investigated using t-tests for observed data and latent structural equation modeling (SEM) with full information maximum likelihood to handle missing data. The behavioral outcomes were analyzed with logistic regression, and the determinants of acceptance with SEM. The root mean square error of approximation (RMSEA) and standardized root mean square residual (SRMR) were used to evaluate the model fit. Results The intervention did not significantly affect acceptance (p = 0.357), interest (OR = 0.75, 95% CI 0.42 to 1.32, p = 0.314), or installation rate (OR = 0.29, 95% CI 0.01 to 2.35, p = 0.294). Performance expectancy (γ = 0.45, p < 0.001), trust (γ = 0.24, p = 0.002), and social influence (γ = 0.32, p = 0.008) were identified as the core determinants of acceptance, explaining 68% of its variance. The SEM model fit was excellent (RMSEA = 0.06, SRMR = 0.05). Overall acceptance was M = 10.9 (SD = 3.73), with 35.41% of participants showing low, 47.92% moderate, and 10.41% high acceptance. Discussion The present AFI was not effective. The low to moderate acceptance of smart sensing poses a major barrier to its implementation. Performance expectancy, social influence, and trust should be targeted as the core factors of acceptance. Further studies are needed to identify effective ways to foster the acceptance of smart sensing and to develop successful implementation strategies. Clinical Trial Registration 10.17605/OSF.IO/GJTPH.
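The behavioral outcomes above are odds ratios from logistic regression. A minimal sketch of that style of analysis on hypothetical counts (chosen only to illustrate the mechanics, not the study's data):

```python
# pip install pandas statsmodels
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical trial data: app installation (0/1) by randomized group
# (1 = intervention video, 0 = control video); counts are illustrative.
df = pd.DataFrame({
    "group":     [1] * 101 + [0] * 101,
    "installed": [1] * 8 + [0] * 93 + [1] * 10 + [0] * 91,
})

X = sm.add_constant(df[["group"]])  # intercept + group indicator
fit = sm.Logit(df["installed"], X).fit(disp=False)

or_group = np.exp(fit.params["group"])           # odds ratio for the AFI
ci_low, ci_high = np.exp(fit.conf_int().loc["group"])
print(f"OR = {or_group:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```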
Affiliation(s)
- Yannik Terhorst
- Department of Clinical Psychology and Psychotherapy, Institute of Psychology and Education, University Ulm, Ulm, Germany
- Nadine Weilbacher
- Department of Clinical Psychology and Psychotherapy, Institute of Psychology and Education, University Ulm, Ulm, Germany
- Carolin Suda
- Department of Rehabilitation Psychology and Psychotherapy, Institute of Psychology, Albert-Ludwigs University Freiburg, Freiburg, Germany
- Laura Simon
- Department of Clinical Psychology and Psychotherapy, Institute of Psychology and Education, University Ulm, Ulm, Germany
- Eva-Maria Messner
- Department of Clinical Psychology and Psychotherapy, Institute of Psychology and Education, University Ulm, Ulm, Germany
- Lasse Bosse Sander
- Medical Psychology and Medical Sociology, Faculty of Medicine, Albert-Ludwigs University Freiburg, Freiburg, Germany
- Harald Baumeister
- Department of Clinical Psychology and Psychotherapy, Institute of Psychology and Education, University Ulm, Ulm, Germany
10
Painter A, Hayhoe B, Riboli-Sasco E, El-Osta A. Online Symptom Checkers: Recommendations for a Vignette-Based Clinical Evaluation Standard. J Med Internet Res 2022;24:e37408. [DOI: 10.2196/37408]
Abstract
The use of patient-facing online symptom checkers (OSCs) has expanded in recent years, but their accuracy, safety, and impact on patient behaviors and health care systems remain unclear. The lack of a standardized process of clinical evaluation has resulted in significant variation in approaches to OSC validation and evaluation. The aim of this paper is to characterize a set of congruent requirements for a standardized vignette-based clinical evaluation process for OSCs. Discrepancies in the findings of comparative studies to date suggest that differences in OSC evaluation methodology can significantly influence outcomes. A standardized process with a clear specification for vignette-based clinical evaluation is urgently needed to guide developers and facilitate the objective comparison of OSCs. We propose 15 recommended requirements for an OSC evaluation standard. A third-party evaluation process and protocols for prospective real-world evidence studies should also be prioritized to quality-assure OSC assessment.
11
Fraser HSF, Cohan G, Koehler C, Anderson J, Lawrence A, Pateña J, Bacher I, Ranney ML. Evaluation of Diagnostic and Triage Accuracy and Usability of a Symptom Checker in an Emergency Department: Observational Study. JMIR Mhealth Uhealth 2022;10:e38364. [PMID: 36121688; PMCID: PMC9531004; DOI: 10.2196/38364]
Abstract
Background Symptom checkers are clinical decision support apps for patients, used by tens of millions of people annually. They are designed to provide diagnostic and triage advice and assist users in seeking the appropriate level of care. Little evidence is available regarding their diagnostic and triage accuracy when used directly by patients for urgent conditions. Objective The aim of this study is to determine the diagnostic and triage accuracy and usability of a symptom checker used by patients presenting to an emergency department (ED). Methods We recruited a convenience sample of English-speaking patients presenting for care in an urban ED. Each consenting patient used a leading symptom checker from Ada Health before the ED evaluation. Diagnostic accuracy was evaluated by comparing the symptom checker's diagnoses, and those of 3 independent emergency physicians viewing the patient-entered symptom data, with the final diagnoses from the ED evaluation. The Ada diagnoses and triage were also critiqued by the independent physicians. The patients completed a usability survey based on the Technology Acceptance Model. Results A total of 40 (80%) of the 50 participants approached completed the symptom checker assessment and usability survey. Their mean age was 39.3 (SD 15.9; range 18-76) years, and they were 65% (26/40) female, 68% (27/40) White, 48% (19/40) Hispanic or Latino, and 13% (5/40) Black or African American. Some cases had missing data or lacked a clear ED diagnosis; 75% (30/40) were included in the analysis of diagnosis and 93% (37/40) in the analysis of triage. The sensitivity of Ada for at least one of the final ED diagnoses (based on its top 5 diagnoses) was 70% (95% CI 54%-86%), close to the mean sensitivity of 68.9% for the 3 physicians (based on their top 3 diagnoses). The physicians fully agreed with 62% (23/37) of the Ada triage decisions and rated 24% (9/37) as safe but too cautious. The triage advice was rated as unsafe and too risky in 22% (8/37) of cases by at least one physician, in 14% (5/37) by at least two physicians, and in 5% (2/37) by all 3 physicians. Usability was rated highly; participants agreed or strongly agreed with the 7 Technology Acceptance Model usability questions, with a mean score of 84.6%, although “satisfaction” and “enjoyment” were rated low. Conclusions This study provides preliminary evidence that a symptom checker can provide acceptable usability and diagnostic accuracy for patients with various urgent conditions. A total of 14% (5/37) of symptom checker triage recommendations were deemed unsafe and too risky by at least two physicians based on the symptoms recorded, similar to the results of studies on telephone and nurse triage. Larger studies of diagnostic and triage performance with direct patient use in different clinical environments are needed.
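The reported top-5 sensitivity and its confidence interval are easy to reconstruct. This sketch assumes a simple normal-approximation (Wald) interval, which the abstract does not specify but which reproduces the reported 70% (95% CI 54%-86%) if 21 of the 30 analyzable cases were hits:

```python
import math

def top_k_hit(checker_differential, ed_diagnoses, k=5):
    """Per-case hit: any final ED diagnosis appears in the checker's top k."""
    return any(dx in checker_differential[:k] for dx in ed_diagnoses)

def wald_ci(hits, n, z=1.96):
    """Normal-approximation 95% CI for a proportion."""
    p = hits / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half), min(1.0, p + half)

# Hypothetical case: ED diagnosed migraine; the checker ranked it third
print(top_k_hit(["tension headache", "sinusitis", "migraine"], ["migraine"]))  # True

p, lo, hi = wald_ci(21, 30)  # 21 hits in 30 analyzable cases
print(f"top-5 sensitivity {p:.0%} (95% CI {lo:.0%}-{hi:.0%})")  # 70% (54%-86%)
```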
Affiliation(s)
- Hamish S F Fraser
- Brown Center for Biomedical Informatics, Warren Alpert Medical School, Brown University, Providence, RI, United States
- School of Public Health, Brown University, Providence, RI, United States
- Gregory Cohan
- Warren Alpert Medical School, Brown University, Providence, RI, United States
- Christopher Koehler
- Department of Emergency Medicine, Brown University, Providence, RI, United States
- Jared Anderson
- Department of Emergency Medicine, Brown University, Providence, RI, United States
- Alexis Lawrence
- Harvard Medical Faculty Physicians, Department of Emergency Medicine, St Luke's Hospital, New Bedford, MA, United States
- John Pateña
- Brown-Lifespan Center for Digital Health, Providence, RI, United States
- Ian Bacher
- Brown Center for Biomedical Informatics, Warren Alpert Medical School, Brown University, Providence, RI, United States
- Megan L Ranney
- School of Public Health, Brown University, Providence, RI, United States
- Department of Emergency Medicine, Brown University, Providence, RI, United States
- Brown-Lifespan Center for Digital Health, Providence, RI, United States
12
Zielasek J, Reinhardt I, Schmidt L, Gouzoulis-Mayfrank E. Adapting and Implementing Apps for Mental Healthcare. Curr Psychiatry Rep 2022;24:407-417. [PMID: 35835898; PMCID: PMC9283030; DOI: 10.1007/s11920-022-01350-3]
Abstract
PURPOSE OF REVIEW To describe examples of adapting apps for use in mental healthcare and to formulate recommendations for successful adaptation in mental healthcare settings. RECENT FINDINGS There are only a few published examples of adapting apps for use in mental healthcare; international examples are given to explore implementation procedures that address the multitude of challenges involved. From these examples, and from results of implementation science studies in general clinical settings, it can be concluded that the process of adapting apps for mental healthcare needs to address clinician training and information needs; user needs, which include cultural adaptation that goes beyond mere translation; and organizational needs for blending app use into everyday clinical mental healthcare workflows.
Affiliation(s)
- Jürgen Zielasek
- Section of Healthcare Research, LVR-Institute for Research and Education, Cologne, Germany
- Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Isabelle Reinhardt
- Section of Healthcare Research, LVR-Institute for Research and Education, Cologne, Germany
- Laura Schmidt
- Section of Healthcare Research, LVR-Institute for Research and Education, Cologne, Germany
- Euphrosyne Gouzoulis-Mayfrank
- Section of Healthcare Research, LVR-Institute for Research and Education, Cologne, Germany
13
Schmieding ML, Kopka M, Schmidt K, Schulz-Niethammer S, Balzer F, Feufel MA. Triage Accuracy of Symptom Checker Apps: 5-Year Follow-up Evaluation. J Med Internet Res 2022;24:e31810. [PMID: 35536633; PMCID: PMC9131144; DOI: 10.2196/31810]
Abstract
BACKGROUND Symptom checkers are digital tools assisting laypersons in self-assessing the urgency and potential causes of their medical complaints. They are widely used but face concerns from both patients and health care professionals, especially regarding their accuracy. A 2015 landmark study substantiated these concerns using case vignettes to demonstrate that symptom checkers commonly err in their triage assessment. OBJECTIVE This study aims to revisit the landmark index study to investigate whether and how symptom checkers' capabilities have evolved since 2015 and how they currently compare with laypersons' stand-alone triage appraisal. METHODS In early 2020, we searched for smartphone and web-based applications providing triage advice. We evaluated these apps on the same 45 case vignettes as the index study. Using descriptive statistics, we compared our findings with those of the index study and with publicly available data on laypersons' triage capability. RESULTS We retrieved 22 symptom checkers providing triage advice. The median triage accuracy in 2020 (55.8%, IQR 15.1%) was close to that in 2015 (59.1%, IQR 15.5%). The apps in 2020 were less risk averse (odds 1.11:1, the ratio of overtriage errors to undertriage errors) than those in 2015 (odds 2.82:1), missing >40% of emergencies. Few apps outperformed laypersons in either deciding whether emergency care was required or whether self-care was sufficient. No apps outperformed the laypersons on both decisions. CONCLUSIONS Triage performance of symptom checkers has, on average, not improved over the course of 5 years. It decreased in 2 use cases (advice on when emergency care is required and when no health care is needed for the moment). However, triage capability varies widely within the sample of symptom checkers. Whether it is beneficial to seek advice from symptom checkers depends on the app chosen and on the specific question to be answered. Future research should develop resources (eg, case vignette repositories) to audit the capabilities of symptom checkers continuously and independently and provide guidance on when and to whom they should be recommended.
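The risk-aversion odds quoted above are a ratio of error counts: overtriage errors (the app more urgent than the gold standard) to undertriage errors (the app less urgent). A minimal sketch over the three triage tiers used in such vignette studies, with hypothetical counts chosen to reproduce the 1.11:1 figure:

```python
# Ordinal triage tiers, least to most urgent
LEVELS = {"self-care": 0, "non-emergency care": 1, "emergency care": 2}

def triage_error_odds(app_advice, gold_standard):
    """Ratio of overtriage errors (app more urgent than gold) to
    undertriage errors (app less urgent than gold)."""
    over = sum(LEVELS[a] > LEVELS[g] for a, g in zip(app_advice, gold_standard))
    under = sum(LEVELS[a] < LEVELS[g] for a, g in zip(app_advice, gold_standard))
    return over / under

# Hypothetical results on 45 vignettes: 10 overtriage, 9 undertriage errors
app  = ["emergency care"] * 10 + ["self-care"] * 9 + ["self-care"] * 26
gold = ["non-emergency care"] * 19 + ["self-care"] * 26
print(f"{triage_error_odds(app, gold):.2f}:1")  # 1.11:1
```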
Affiliation(s)
- Malte L Schmieding
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Marvin Kopka
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Cognitive Psychology and Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Konrad Schmidt
- Institute of General Practice and Family Medicine, Jena University Hospital, Jena, Germany
- Institute of General Practice and Family Medicine, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Sven Schulz-Niethammer
- Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Felix Balzer
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Markus A Feufel
- Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
14
Millen E, Salim N, Azadzoy H, Bane MM, O'Donnell L, Schmude M, Bode P, Tuerk E, Vaidya R, Gilbert SH. Study protocol for a pilot prospective, observational study investigating the condition suggestion and urgency advice accuracy of a symptom assessment app in sub-Saharan Africa: the AFYA-'Health' Study. BMJ Open 2022;12:e055915. [PMID: 35410928; PMCID: PMC9003603; DOI: 10.1136/bmjopen-2021-055915]
Abstract
INTRODUCTION Due to a global shortage of healthcare workers, 4 billion people worldwide lack basic healthcare, with low-income and middle-income countries particularly affected. The utilisation of AI-based healthcare tools such as symptom assessment applications (SAAs) has the potential to reduce the burden on healthcare systems. The purpose of the AFYA Study (AI-based Assessment oF health sYmptoms in TAnzania) is to evaluate the accuracy of the condition suggestions and urgency advice provided to users by a Swahili-language Ada SAA. METHODS AND ANALYSIS This study is designed as an observational prospective clinical study set in the waiting room of a Tanzanian district hospital. It will include patients of various age groups entering the outpatient clinic with various conditions, including children and adolescents. Patients will be asked to use the SAA before proceeding to usual care; after usual care, they will have a consultation with a study-provided physician. Patients and healthcare practitioners will be blinded to the SAA's results. An expert panel will compare the Ada SAA's condition suggestions and urgency advice with the differential diagnoses and triage decisions from usual care and from the study-provided physician. The primary outcome measures are the accuracy and comprehensiveness of the Ada SAA evaluated against the gold-standard differential diagnoses. ETHICS AND DISSEMINATION Ethical approval was received from the ethics committee (EC) of Muhimbili University of Health and Allied Sciences (approval number MUHAS-REC-09-2019-044) and from the National Institute for Medical Research (NIMR/HQ/R.8c/Vol. I/922). All amendments to the protocol are reported and adapted on the basis of the requirements of the EC. The results from this study will be submitted to peer-reviewed journals and local and international stakeholders, and will be communicated in editorials/articles by Ada Health. TRIAL REGISTRATION NUMBER NCT04958577.
Affiliation(s)
- Nahya Salim
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, United Republic of Tanzania
- Mustafa Miraji Bane
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, United Republic of Tanzania
- Stephen Henry Gilbert
- Ada Health GmbH, Berlin, Germany
- EKFZ for Digital Health, Technische Universität Dresden, Dresden, Germany