1. Lee TJ, Campbell DJ, Rao AK, Hossain A, Elkattawy O, Radfar N, Lee P, Gardin JM. Evaluating ChatGPT Responses on Atrial Fibrillation for Patient Education. Cureus 2024;16:e61680. PMID: 38841294; PMCID: PMC11151148; DOI: 10.7759/cureus.61680.
Abstract
Background: ChatGPT is a language model that has gained widespread popularity for its fine-tuned conversational abilities. However, a known drawback of the artificial intelligence (AI) chatbot is its tendency to confidently present users with inaccurate information. We evaluated the quality of ChatGPT responses to questions pertaining to atrial fibrillation for patient education, including the accuracy and estimated grade level of answers and whether references were provided.

Methodology: ChatGPT was prompted in four different ways and asked 16 frequently asked questions on atrial fibrillation from the American Heart Association. The prompting conditions were Form 1 (no prompt), Form 2 (patient-friendly prompt), Form 3 (physician-level prompt), and Form 4 (prompting for statistics/references). Responses were scored as incorrect, partially correct, correct, or correct with references (perfect). Flesch-Kincaid grade level, unique word count, and response length were recorded for each answer. Proportions of responses at each score were compared using chi-square analysis, and the relationship between form and grade level was assessed using analysis of variance (ANOVA).

Results: Across all forms, scoring frequencies were one (1.6%) incorrect, five (7.8%) partially correct, 55 (85.9%) correct, and three (4.7%) perfect. The proportion of responses that were at least correct did not differ by form (p = 0.350), but the proportion of perfect responses did (p = 0.001). Form 2 answers had a lower mean grade level (12.80 ± 3.38) than Forms 1 (14.23 ± 2.34), 3 (16.73 ± 2.65), and 4 (14.85 ± 2.76) (p < 0.05). Across all forms, references were provided in only three (4.7%) answers; notably, even when explicitly prompted for sources or references, ChatGPT provided them in only three of 16 responses (18.8%).

Conclusions: ChatGPT holds significant potential for enhancing patient education through accurate, adaptive responses. Its ability to alter response complexity based on user input, combined with high accuracy rates, supports its use as an informational resource in healthcare settings. Future advancements and continuous monitoring of AI capabilities will be crucial in maximizing the benefits while mitigating the risks of AI-driven patient education.
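A minimal sketch, assuming scipy is installed, of the two analyses this abstract names: a chi-square comparison of score proportions across the four prompt forms and a one-way ANOVA relating form to Flesch-Kincaid grade level. All counts and grade-level values below are hypothetical placeholders, not the study's data.

```python
# Hypothetical illustration of the abstract's statistics; not the study's data.
from scipy.stats import chi2_contingency, f_oneway

# Rows: Forms 1-4; columns: [at-least-correct, below-correct] out of
# 16 questions per form (placeholder counts).
score_table = [
    [15, 1],
    [16, 0],
    [14, 2],
    [13, 3],
]
chi2, p_chi2, dof, expected = chi2_contingency(score_table)
print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p_chi2:.3f}")

# One-way ANOVA: does mean Flesch-Kincaid grade level differ by form?
# (placeholder grade levels, four answers per form)
form1 = [14.1, 13.9, 14.5, 14.4]
form2 = [12.5, 13.0, 12.9, 12.8]
form3 = [16.9, 16.5, 16.8, 16.7]
form4 = [14.6, 15.0, 14.9, 14.9]
f_stat, p_anova = f_oneway(form1, form2, form3, form4)
print(f"F = {f_stat:.2f}, p = {p_anova:.4f}")
```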
Affiliation(s)
- Thomas J Lee
- Department of Medicine, Rutgers University New Jersey Medical School, Newark, USA
- Daniel J Campbell
- Department of Otolaryngology-Head and Neck Surgery, Thomas Jefferson University Hospital, Philadelphia, USA
- Abhinav K Rao
- Department of Medicine, Trident Medical Center, Charleston, USA
- Afif Hossain
- Department of Medicine/Division of Cardiology, Rutgers University New Jersey Medical School, Newark, USA
- Omar Elkattawy
- Department of Medicine, Rutgers University New Jersey Medical School, Newark, USA
- Navid Radfar
- Department of Medicine, Rutgers University New Jersey Medical School, Newark, USA
- Paul Lee
- Department of Medicine, Rutgers University New Jersey Medical School, Newark, USA
- Julius M Gardin
- Department of Medicine/Division of Cardiology, Rutgers University New Jersey Medical School, Newark, USA
2. Lee TJ, Rao AK, Campbell DJ, Radfar N, Dayal M, Khrais A. Evaluating ChatGPT-3.5 and ChatGPT-4.0 Responses on Hyperlipidemia for Patient Education. Cureus 2024;16:e61067. PMID: 38803402; PMCID: PMC11128363; DOI: 10.7759/cureus.61067.
Abstract
Introduction: Hyperlipidemia is prevalent worldwide and affects a significant number of US adults, contributing substantially to ischemic heart disease and millions of deaths annually. With the increasing use of the internet for health information, tools like ChatGPT (OpenAI, San Francisco, CA, USA) have gained traction. ChatGPT version 4.0, launched in March 2023, offers enhanced features over its predecessor but requires a monthly fee. This study compares the accuracy, comprehensibility, and response length of the free and paid versions of ChatGPT for patient education on hyperlipidemia.

Materials and methods: ChatGPT versions 3.5 and 4.0 were each prompted in three different ways and asked 25 questions from the Cleveland Clinic's frequently asked questions (FAQs) on hyperlipidemia. Prompts included no prompting (Form 1), patient-friendly prompting (Form 2), and physician-level prompting (Form 3). Responses were categorized as incorrect, partially correct, or correct. The grade level and word count of each response were also recorded for analysis.

Results: Scoring frequencies for ChatGPT version 3.5 were five (6.67%) incorrect, 18 (24.00%) partially correct, and 52 (69.33%) correct; for ChatGPT version 4.0, they were one (1.33%) incorrect, 18 (24.00%) partially correct, and 56 (74.67%) correct. The proportion of correct answers did not differ significantly between versions (p = 0.586). ChatGPT version 3.5 had a significantly higher grade reading level (p = 0.0002) and a significantly higher word count (p = 0.0073) than version 4.0.

Discussion: There was no significant difference in accuracy between the free and paid versions' responses to hyperlipidemia FAQs. Both versions provided accurate but sometimes incomplete responses. Version 4.0 offered more concise and readable information, aligning with the readability of most online medical resources despite exceeding the National Institutes of Health's (NIH's) recommended eighth-grade reading level. The paid version also demonstrated superior adaptability in tailoring responses to the prompt.

Conclusion: Both versions of ChatGPT provide reliable medical information, with the paid version offering more adaptable and readable responses. Healthcare providers can recommend ChatGPT as a source of patient education, regardless of the version used. Future research should explore diverse question formulations and ChatGPT's handling of incorrect information.
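For the version comparison above, the correct-response counts reported in the abstract (52/75 for version 3.5 vs. 56/75 for version 4.0) can be checked with a chi-square test, and the readability comparison follows the same pattern as an unpaired t test. A sketch assuming scipy; the grade-level values are hypothetical placeholders.

```python
from scipy.stats import chi2_contingency, ttest_ind

# Correct vs. not-correct counts per version, taken from the abstract:
# v3.5: 52 of 75 correct; v4.0: 56 of 75 correct.
accuracy_table = [[52, 23], [56, 19]]
chi2, p_acc, dof, _ = chi2_contingency(accuracy_table)  # Yates-corrected 2x2
print(f"accuracy: chi-square = {chi2:.2f}, p = {p_acc:.3f}")
# p comes out near 0.59, consistent with the abstract's p = 0.586.

# Unpaired t test on per-response grade levels (hypothetical values).
fkgl_v35 = [14.2, 13.8, 15.1, 14.5, 13.9]
fkgl_v40 = [11.9, 12.3, 11.5, 12.1, 11.8]
t_stat, p_grade = ttest_ind(fkgl_v35, fkgl_v40)
print(f"grade level: t = {t_stat:.2f}, p = {p_grade:.4f}")
```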
Affiliation(s)
- Thomas J Lee
- Department of Medicine, Rutgers University New Jersey Medical School, Newark, USA
- Abhinav K Rao
- Department of Medicine, Trident Medical Center, Charleston, USA
- Daniel J Campbell
- Department of Otolaryngology-Head and Neck Surgery, Thomas Jefferson University Hospital, Philadelphia, USA
- Navid Radfar
- Department of Medicine, Rutgers University New Jersey Medical School, Newark, USA
- Manik Dayal
- Department of Medicine, Rutgers University New Jersey Medical School, Newark, USA
- Ayham Khrais
- Department of Medicine, Rutgers University New Jersey Medical School, Newark, USA
3. Zaleski AL, Berkowsky R, Craig KJT, Pescatello LS. Comprehensiveness, Accuracy, and Readability of Exercise Recommendations Provided by an AI-Based Chatbot: Mixed Methods Study. JMIR Med Educ 2024;10:e51308. PMID: 38206661; PMCID: PMC10811574; DOI: 10.2196/51308.
Abstract
BACKGROUND: Regular physical activity is critical for health and disease prevention, yet health care providers and patients face barriers to implementing evidence-based lifestyle recommendations. The increased availability of artificial intelligence (AI) technologies offers considerable potential to augment care; however, the suitability of AI-generated exercise recommendations has yet to be explored.

OBJECTIVE: The purpose of this study was to assess the comprehensiveness, accuracy, and readability of individualized exercise recommendations generated by a novel AI chatbot.

METHODS: A coding scheme was developed to score AI-generated exercise recommendations across ten categories informed by gold-standard exercise recommendations: (1) health condition-specific benefits of exercise, (2) exercise preparticipation health screening, (3) frequency, (4) intensity, (5) time, (6) type, (7) volume, (8) progression, (9) special considerations, and (10) references to the primary literature. The AI chatbot was prompted to provide individualized exercise recommendations for 26 clinical populations using an open-source application programming interface. Two independent reviewers coded the AI-generated content for each category and calculated comprehensiveness (%) and factual accuracy (%) on a scale of 0%-100%. Readability was assessed using the Flesch-Kincaid formula. Qualitative analysis identified and categorized themes from the AI-generated output.

RESULTS: AI-generated exercise recommendations were 41.2% (107/260) comprehensive and 90.7% (146/161) accurate, with the majority (8/15, 53%) of inaccuracies related to the need for exercise preparticipation medical clearance. The average readability of AI-generated exercise recommendations was at the college level (Flesch-Kincaid grade level: mean 13.7, SD 1.7), with an average Flesch reading ease score of 31.1 (SD 7.7). Recurring themes in the AI-generated output included concern for liability and safety, preference for aerobic exercise, and potential bias and direct discrimination against certain age-based populations and individuals with disabilities.

CONCLUSIONS: There were notable gaps in the comprehensiveness, accuracy, and readability of the AI-generated exercise recommendations. Exercise and health care professionals should be aware of these limitations when using and endorsing AI-based technologies as a tool to support lifestyle change involving exercise.
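A sketch of how the ten-category coding scheme could be tallied into the comprehensiveness percentage reported above (107 of 260 category cells across 26 populations). The category names come from the abstract; the per-population flags are hypothetical.

```python
# The ten coding categories named in the abstract.
CATEGORIES = [
    "condition-specific benefits", "preparticipation health screening",
    "frequency", "intensity", "time", "type", "volume",
    "progression", "special considerations", "primary-literature references",
]

def comprehensiveness(flags_by_population):
    """Share of covered category cells across all populations (0-1)."""
    covered = sum(sum(flags) for flags in flags_by_population)
    total = sum(len(flags) for flags in flags_by_population)
    return covered / total

# Hypothetical coding: two populations, 1 = category addressed, 0 = absent.
coded = [
    [1, 0, 1, 1, 0, 1, 0, 0, 0, 0],  # population A: 4/10 covered
    [1, 0, 1, 1, 1, 1, 0, 1, 0, 0],  # population B: 6/10 covered
]
print(f"comprehensiveness = {comprehensiveness(coded):.1%}")  # 50.0%
```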
Affiliation(s)
- Amanda L Zaleski
- Clinical Evidence Development, Aetna Medical Affairs, CVS Health Corporation, Hartford, CT, United States
- Department of Preventive Cardiology, Hartford Hospital, Hartford, CT, United States
- Rachel Berkowsky
- Department of Kinesiology, University of Connecticut, Storrs, CT, United States
- Kelly Jean Thomas Craig
- Clinical Evidence Development, Aetna Medical Affairs, CVS Health Corporation, Hartford, CT, United States
- Linda S Pescatello
- Department of Kinesiology, University of Connecticut, Storrs, CT, United States
4. Seneviratne NU, Ho SY, Boro A, Correa DJ. Readability and content gaps in online epilepsy surgery materials as potential health literacy and shared-decision-making barriers. Epilepsia Open 2023;8:1566-1575. PMID: 37805810; PMCID: PMC10690683; DOI: 10.1002/epi4.12842.
Abstract
OBJECTIVE: Epilepsy surgery is an effective albeit underused treatment for refractory epilepsy, and online materials are vital to patient understanding of the complex process. Our goal was to analyze the readability and content inclusion of online patient health education materials on epilepsy surgery.

METHODS: A private browser setting was used on Google and Bing to identify the top 100 search results for the term "epilepsy+surgery". Scientific papers, insurance pages, paywalled sites, and non-text content were excluded. Website text was reformatted to exclude graphics, contact information, links, and headers, and readability metrics were calculated using an online tool. Text content was analyzed for inclusion of important concepts (presurgical evaluation, complications, risks of continued seizures, types of surgery, and complementary diagrams/audiovisual material). Readability and content inclusion were compared as a function of organization type (epilepsy center, community health organization, pediatric-specific) and location (region, country).

RESULTS: The browser search yielded 82 distinct websites with information regarding epilepsy surgery, 98.7% of which exceeded the recommended 6th-grade reading level for health information. Epilepsy centers had significantly worse readability (Flesch-Kincaid Grade Level (FKGL), P < 0.01; Flesch Reading Ease (FRE), P < 0.05). Content analysis showed that only 37% of websites discussed surgical side effects and only 23% mentioned the risks of continued seizures. Epilepsy centers were less likely to report information on surgical side effects (P < 0.001). UK-based websites had better readability (FKGL, P < 0.01; FRE, P < 0.01) and were more likely to discuss side effects (P = 0.01) than US-based websites.

SIGNIFICANCE: The majority of online health content on epilepsy surgery is overly complex and relatively incomplete in multiple key areas important to health literacy and understanding of surgical candidacy. Our findings suggest that academic organizations, including level 4 epilepsy centers, need to simplify and broaden their online education resources. More comprehensive, publicly accessible, and readable information may lead to better shared decision-making.
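The content analysis above was presumably performed by human reviewers; as a rough stand-in, the sketch below flags whether a website's text mentions each key concept via keyword matching. The concept keywords are assumptions for illustration only, not the study's coding criteria.

```python
# Rough keyword stand-in for manual content-inclusion coding;
# keyword lists are illustrative assumptions, not the study's criteria.
CONCEPTS = {
    "presurgical evaluation": ("presurgical", "pre-surgical", "evaluation"),
    "complications/side effects": ("complication", "side effect"),
    "risks of continued seizures": ("continued seizure", "ongoing seizure"),
    "types of surgery": ("resection", "lobectomy", "laser ablation"),
}

def content_inclusion(page_text: str) -> dict:
    """Map each concept to True if any of its keywords appears in the text."""
    text = page_text.lower()
    return {concept: any(kw in text for kw in keywords)
            for concept, keywords in CONCEPTS.items()}

sample = ("Temporal lobectomy is one type of resection. Complications are "
          "rare, but your team will review them during evaluation.")
print(content_inclusion(sample))
```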
Affiliation(s)
- Sophey Y. Ho
- Albert Einstein College of Medicine, The Bronx, New York, USA
- Alexis Boro
- Saul R. Korey Department of Neurology, Albert Einstein College of Medicine, The Bronx, New York, USA
- Daniel J. Correa
- Saul R. Korey Department of Neurology, Albert Einstein College of Medicine, The Bronx, New York, USA
5. Kaya E, Görmez S. Quality and readability of online information on plantar fasciitis and calcaneal spur. Rheumatol Int 2022;42:1965-1972. PMID: 35763090; DOI: 10.1007/s00296-022-05165-6.
Abstract
Plantar fasciitis and calcaneal spur are common causes of heel pain in the community, and people use the internet to obtain medical information about these conditions. We reviewed internet information sources on plantar fasciitis and calcaneal spur for quality and readability. The first 50 websites for each search term ("calcaneal spur", "heel spur", and "plantar fasciitis") were retrieved from www.google.com. Six validated tools were used to assess information quality and readability, and included websites were checked for HONcode (Health On the Net Foundation Code of Conduct) certification. The total mean DISCERN score was 50.52 ± 14.62, and the total mean JAMA (Journal of the American Medical Association) benchmark score was 2.42 ± 1.26. In total, 25.72% of the 97 websites carried the HONcode seal. Average readability scores were Flesch-Kincaid Grade Level (FKGL) 7.27 ± 1.71, Gunning Fog 8.46 ± 2.17, Simple Measure of Gobbledygook (SMOG) 6.89 ± 1.24, and Coleman-Liau Index 15.56 ± 1.85. For-profit websites were the most common source type, and overall website quality and readability were moderate. A significant proportion of the websites have a financial bias and provide low-quality information; a mechanism for systematically monitoring the quality and readability of online information should be established.
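The four readability indices reported above have standard published formulas. The sketch below implements them with a naive vowel-group syllable counter; production tools (e.g., the textstat package) tokenize more carefully, so treat this as an approximation.

```python
import math
import re

def words(text):
    return re.findall(r"[A-Za-z]+", text)

def sentence_count(text):
    # Count runs of sentence terminators; assume at least one sentence.
    return max(1, len(re.findall(r"[.!?]+", text)))

def syllables(word):
    # Naive heuristic: count runs of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text):
    ws = words(text)
    n_w, n_s = len(ws), sentence_count(text)
    n_syll = sum(syllables(w) for w in ws)
    n_complex = sum(1 for w in ws if syllables(w) >= 3)  # 3+ syllables
    n_letters = sum(len(w) for w in ws)
    return {
        # Flesch-Kincaid Grade Level
        "FKGL": 0.39 * n_w / n_s + 11.8 * n_syll / n_w - 15.59,
        # Gunning Fog index
        "Fog": 0.4 * (n_w / n_s + 100 * n_complex / n_w),
        # Simple Measure of Gobbledygook
        "SMOG": 1.043 * math.sqrt(n_complex * 30 / n_s) + 3.1291,
        # Coleman-Liau index (L = letters/100 words, S = sentences/100 words)
        "CLI": 0.0588 * (100 * n_letters / n_w)
               - 0.296 * (100 * n_s / n_w) - 15.8,
    }

print(readability("Plantar fasciitis is a common cause of heel pain. "
                  "Conservative treatment usually relieves the symptoms."))
```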
Affiliation(s)
- Erhan Kaya
- Department of Public Health, Faculty of Medicine, Kahramanmaras Sutcu Imam University, Kahramanmaras, Turkey
- Sinan Görmez
- Department of Orthopedics and Traumatology, Bulancak State Hospital, Giresun, Turkey
6. Shneyderman M, Snow GE, Davis R, Best S, Akst LM. Readability of Online Materials Related to Vocal Cord Leukoplakia. OTO Open 2021;5:2473974X211032644. PMID: 34396027; PMCID: PMC8358515; DOI: 10.1177/2473974X211032644.
Abstract
Objectives: To assess the readability and understandability of online materials for vocal cord leukoplakia.

Study Design: Review of online materials.

Setting: Academic medical center.

Methods: A Google search of "vocal cord leukoplakia" was performed, and the first 50 websites were considered for analysis. Readability was measured by the Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), and Simple Measure of Gobbledygook (SMOG). Understandability and actionability were assessed by 2 independent reviewers with the PEMAT-P (Patient Education Materials Assessment Tool for Printable Materials). Unpaired t tests compared scores between sites aimed at physicians and those aimed at patients, and Cohen's kappa was calculated to measure interrater reliability.

Results: Twenty-two websites (17 patient oriented, 5 physician oriented) met inclusion criteria. For the entire cohort, FRES, FKGL, and SMOG scores (mean ± SD) were 36.90 ± 20.65, 12.96 ± 3.28, and 15.65 ± 3.57, respectively, indicating that materials were difficult to read and written above a 12th-grade level. PEMAT-P understandability and actionability scores were 73.65% ± 7.05% and 13.63% ± 22.47%, respectively. Patient-oriented sites were statistically easier to read than physician-oriented sites (P < .02 for each of the FRES, FKGL, and SMOG comparisons); understandability and actionability scores did not differ between these categories.

Conclusion: Online materials for vocal cord leukoplakia are written at a level more advanced than what is recommended for patient education materials. Awareness of the ways these online materials currently fail patients may lead to improved education materials in the future.
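For the interrater-reliability step above, Cohen's kappa can be computed from the two reviewers' binary PEMAT-P item ratings. A minimal sketch assuming scikit-learn is available; the ratings below are hypothetical.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical PEMAT-P item ratings for one website:
# 1 = item satisfied, 0 = not satisfied, one entry per instrument item.
reviewer_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
reviewer_b = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]

kappa = cohen_kappa_score(reviewer_a, reviewer_b)
print(f"Cohen's kappa = {kappa:.2f}")
```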
Affiliation(s)
- Grace E Snow
- Department of Otolaryngology-Head and Neck Surgery, School of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
- Ruth Davis
- Department of Otolaryngology-Head and Neck Surgery, School of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
- Simon Best
- Department of Otolaryngology-Head and Neck Surgery, School of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
- Lee M Akst
- Department of Otolaryngology-Head and Neck Surgery, School of Medicine, Johns Hopkins University, Baltimore, Maryland, USA