1. Ko TK, Tan DJY, Fan KS. Evaluation of the Quality and Readability of Web-Based Information Regarding Foreign Bodies of the Ear, Nose, and Throat: Qualitative Content Analysis. JMIR Form Res 2024; 8:e55535. PMID: 39145998; PMCID: PMC11362703; DOI: 10.2196/55535.
Abstract
BACKGROUND Foreign body (FB) inhalation, ingestion, and insertion account for 11% of emergency admissions for ear, nose, and throat conditions. Children are disproportionately affected, and urgent intervention may be needed to maintain airway patency and prevent blood vessel occlusion. High-quality, readable online information could help reduce poor outcomes from FBs. OBJECTIVE We aim to evaluate the quality and readability of available online health information relating to FBs. METHODS In total, 6 search phrases were queried using the Google search engine. For each search term, the first 30 results were captured. Websites in the English language and displaying health information were included. The provider and country of origin were recorded. The modified 36-item Ensuring Quality Information for Patients tool was used to assess information quality. Readability was assessed using a combination of tools: Flesch Reading Ease score, Flesch-Kincaid Grade Level, Gunning-Fog Index, and Simple Measure of Gobbledygook. RESULTS After the removal of duplicates, 73 websites were assessed, with the majority originating from the United States (n=46, 63%). Overall, the content was of moderate quality, with a median Ensuring Quality Information for Patients score of 21 (IQR 18-25; maximum achieved 29) out of a possible 36. Precautionary measures were not mentioned on 41% (n=30) of websites, and 30% (n=22) did not identify disk batteries as a high-risk FB. Red flags necessitating urgent care were identified on 95% (n=69) of websites, with 89% (n=65) advising patients to seek medical attention and 38% (n=28) advising on safe FB removal. Readability scores (Flesch Reading Ease score=12.4, Flesch-Kincaid Grade Level=6.2, Gunning-Fog Index=6.5, and Simple Measure of Gobbledygook=5.9 years) showed that most websites (56%) were above the recommended sixth-grade reading level. CONCLUSIONS The current quality and readability of information regarding FBs is inadequate. More than half of the websites were above the recommended sixth-grade reading level, and important information regarding high-risk FBs such as disk batteries and magnets was frequently excluded. Strategies should be developed to improve access to high-quality information that informs patients and parents about risks and when to seek medical help. Strategies to promote high-quality websites in search results also have the potential to improve outcomes.
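For reference, the four readability measures used in this and several of the studies below are computed from simple text statistics. In their standard published forms (W = words, S = sentences, Y = syllables, C = words of three or more syllables):

```latex
\begin{align*}
\text{FRE}  &= 206.835 - 1.015\,\frac{W}{S} - 84.6\,\frac{Y}{W} \\
\text{FKGL} &= 0.39\,\frac{W}{S} + 11.8\,\frac{Y}{W} - 15.59 \\
\text{GFI}  &= 0.4\left(\frac{W}{S} + 100\,\frac{C}{W}\right) \\
\text{SMOG} &= 1.0430\sqrt{C \times \frac{30}{S}} + 3.1291
\end{align*}
```

Higher FRE means easier text (0-100 scale); the other three approximate US school grade levels, so lower is easier.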
Affiliation(s)
- Tsz Ki Ko
- Department of Surgery, Royal Stoke Hospital, Stoke, United Kingdom
- Ka Siu Fan
- Department of Surgery, Royal Surrey County Hospital, Guildford, United Kingdom
2. Tao BKL, Hua N, Milkovich J, Micieli JA. ChatGPT-3.5 and Bing Chat in ophthalmology: an updated evaluation of performance, readability, and informative sources. Eye (Lond) 2024; 38:1897-1902. PMID: 38509182; PMCID: PMC11226422; DOI: 10.1038/s41433-024-03037-w.
Abstract
BACKGROUND/OBJECTIVES Experimental investigation. Bing Chat's (Microsoft) integration of ChatGPT-4 (OpenAI) has conferred the capability of accessing online data past 2021. We investigated its performance against ChatGPT-3.5 on a multiple-choice question ophthalmology exam. SUBJECTS/METHODS In August 2023, ChatGPT-3.5 and Bing Chat were evaluated against 913 questions derived from the American Academy of Ophthalmology's Basic and Clinical Science Course (BCSC) collection. For each response, the sub-topic, performance, Simple Measure of Gobbledygook readability score (measuring years of required education to understand a given passage), and cited resources were collected. The primary outcomes were the comparative scores between models and, qualitatively, the resources referenced by Bing Chat. Secondary outcomes included performance stratified by response readability, question type (explicit or situational), and BCSC sub-topic. RESULTS Across 913 questions, ChatGPT-3.5 scored 59.69% [95% CI 56.45, 62.94] while Bing Chat scored 73.60% [95% CI 70.69, 76.52]. Both models performed significantly better on explicit than on clinical reasoning questions, and both performed better on general medicine questions than on ophthalmology subsections. Bing Chat referenced 927 online entities and provided at least one citation for 836 of the 913 questions. The use of more reliable (peer-reviewed) sources was associated with a higher likelihood of a correct response. The most-cited resources were eyewiki.aao.org, aao.org, wikipedia.org, and ncbi.nlm.nih.gov. Bing Chat showed significantly better readability than ChatGPT-3.5, averaging a reading level of grade 11.4 [95% CI 7.14, 15.7] versus 12.4 [95% CI 8.77, 16.1], respectively (p-value < 0.0001, ρ = 0.25). CONCLUSIONS The online access, improved readability, and citation feature of Bing Chat confer additional utility for ophthalmology learners. We recommend critical appraisal of cited sources when interpreting responses.
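A minimal sketch of the headline comparison above, reconstructing the score difference from counts back-calculated from the reported percentages (545/913 ≈ 59.69%, 672/913 ≈ 73.60%); the normal-approximation intervals and two-proportion z-test are assumptions about the method, which the abstract does not specify:

```python
# pip install statsmodels numpy
import numpy as np
from statsmodels.stats.proportion import proportion_confint, proportions_ztest

n_questions = 913
correct = {"ChatGPT-3.5": 545, "Bing Chat": 672}  # back-calculated counts

for model, k in correct.items():
    low, high = proportion_confint(k, n_questions, alpha=0.05, method="normal")
    print(f"{model}: {k / n_questions:.2%} [95% CI {low:.2%}, {high:.2%}]")

# Two-proportion z-test for the difference between the two models
z, p = proportions_ztest(np.array(list(correct.values())), np.array([n_questions] * 2))
print(f"z = {z:.2f}, p = {p:.2e}")
```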
Affiliation(s)
- Brendan Ka-Lok Tao
- Faculty of Medicine, The University of British Columbia, 317-2194 Health Sciences Mall, Vancouver, BC, V6T 1Z3, Canada
- Nicholas Hua
- Temerty Faculty of Medicine, University of Toronto, 1 King's College Circle, Toronto, ON, M5S 1A8, Canada
- John Milkovich
- Temerty Faculty of Medicine, University of Toronto, 1 King's College Circle, Toronto, ON, M5S 1A8, Canada
- Jonathan Andrew Micieli
- Temerty Faculty of Medicine, University of Toronto, 1 King's College Circle, Toronto, ON, M5S 1A8, Canada
- Department of Ophthalmology and Vision Sciences, University of Toronto, 340 College Street, Toronto, ON, M5T 3A9, Canada
- Division of Neurology, Department of Medicine, University of Toronto, 6 Queen's Park Crescent West, Toronto, ON, M5S 3H2, Canada
- Kensington Vision and Research Center, 340 College Street, Toronto, ON, M5T 3A9, Canada
- St. Michael's Hospital, 36 Queen Street East, Toronto, ON, M5B 1W8, Canada
- Toronto Western Hospital, 399 Bathurst Street, Toronto, ON, M5T 2S8, Canada
- University Health Network, 190 Elizabeth Street, Toronto, ON, M5G 2C4, Canada
3. Ichhpujani P, Parmar UPS, Kumar S. Appropriateness and readability of Google Bard and ChatGPT-3.5 generated responses for surgical treatment of glaucoma. Rom J Ophthalmol 2024; 68:243-248. PMID: 39464759; PMCID: PMC11503238; DOI: 10.22336/rjo.2024.45.
Abstract
Aim To evaluate the appropriateness and readability of the medical knowledge provided by ChatGPT-3.5 and Google Bard, artificial-intelligence-powered conversational search engines, regarding surgical treatment for glaucoma. Methods In this retrospective, cross-sectional study, 25 common questions related to the surgical management of glaucoma were posed to ChatGPT-3.5 and Google Bard. Glaucoma specialists graded the appropriateness of the responses, and readability was assessed with several scores. Results Appropriate answers to the posed questions were obtained in 68% of the responses from Google Bard and 96% from ChatGPT-3.5. On average, the responses generated by Google Bard had significantly lower proportions of sentences with more than 30 syllables (23% vs 66%) and more than 20 syllables (52% vs 82%) than those generated by ChatGPT-3.5. Google Bard also had significantly (p<0.0001) lower readability grade scores and a significantly higher Flesch Reading Ease score, implying that its answers were easier to read. Discussion Many patients and their families turn to LLM chatbots for information, necessitating clear and accurate content. Assessments of online glaucoma information have shown variability in quality and readability, with institutional websites generally performing better than private ones. We found that ChatGPT-3.5, while precise, has lower readability than Google Bard, which is more accessible but less precise. For example, the Flesch Reading Ease score was 57.6 for Google Bard and 22.6 for ChatGPT-3.5, indicating that Google Bard's content is easier to read. Moreover, the Gunning Fog Index scores suggested that Google Bard's text is suitable for a broader audience. ChatGPT-3.5's knowledge is limited to data up to 2021, whereas Google Bard, trained with real-time data, offers more current information. Further research is needed to evaluate these tools across various medical topics. Conclusion The answers generated by ChatGPT-3.5 are more accurate than those given by Google Bard. However, comprehension of ChatGPT-3.5's answers may be difficult for the public with glaucoma. This study emphasized the importance of verifying the accuracy and clarity of the online information that glaucoma patients rely on to make informed decisions about their ocular health. This is an exciting new area for patient education and health literacy.
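To make the quoted Flesch Reading Ease values concrete, here is a small helper mapping FRE scores onto the conventional Flesch difficulty bands (the bands are the standard interpretation table; the function itself is ours):

```python
def fre_band(score: float) -> str:
    """Map a Flesch Reading Ease score to its conventional difficulty band."""
    bands = [
        (90, "very easy (5th grade)"),
        (80, "easy (6th grade)"),
        (70, "fairly easy (7th grade)"),
        (60, "standard (8th-9th grade)"),
        (50, "fairly difficult (10th-12th grade)"),
        (30, "difficult (college)"),
    ]
    for cutoff, label in bands:
        if score >= cutoff:
            return label
    return "very difficult (college graduate)"

print(fre_band(57.6))  # Google Bard: fairly difficult (10th-12th grade)
print(fre_band(22.6))  # ChatGPT-3.5: very difficult (college graduate)
```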
Affiliation(s)
- Parul Ichhpujani
- Department of Ophthalmology, Government Medical College and Hospital, Chandigarh, India
- Suresh Kumar
- Department of Ophthalmology, Government Medical College and Hospital, Chandigarh, India
4. Zhang B, Naderi N, Mishra R, Teodoro D. Online Health Search Via Multidimensional Information Quality Assessment Based on Deep Language Models: Algorithm Development and Validation. JMIR AI 2024; 3:e42630. PMID: 38875551; PMCID: PMC11099810; DOI: 10.2196/42630.
Abstract
BACKGROUND Widespread misinformation in web resources can lead to serious implications for individuals seeking health advice. Despite this, information retrieval models often rank results using only the query-document relevance dimension. OBJECTIVE We investigated a multidimensional information quality retrieval model based on deep learning to enhance the effectiveness of online health care information search results. METHODS In this study, we simulated online health information search scenarios with a topic set of 32 different health-related inquiries and a corpus containing 1 billion web documents from the April 2019 snapshot of Common Crawl. Using state-of-the-art pretrained language models, we assessed the quality of the retrieved documents according to their usefulness, supportiveness, and credibility dimensions for a given search query on 6030 human-annotated query-document pairs. We evaluated this approach using transfer learning and more specific domain adaptation techniques. RESULTS In the transfer learning setting, the usefulness model provided the largest distinction between help- and harm-compatible documents, with a difference of +5.6%, leading to a majority of helpful documents in the top 10 retrieved. The supportiveness model achieved the best harm compatibility (+2.4%), while the combination of usefulness, supportiveness, and credibility models achieved the largest distinction between help- and harm-compatibility on helpful topics (+16.9%). In the domain adaptation setting, the linear combination of different models showed robust performance, with help-harm compatibility above +4.4% for all dimensions, reaching as high as +6.8%. CONCLUSIONS These results suggest that integrating automatic ranking models created for specific information quality dimensions can increase the effectiveness of health-related information retrieval. Thus, our approach could be used to enhance searches made by individuals seeking online health information.
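A minimal sketch of the kind of score fusion described above: per-dimension quality scores are combined linearly with the base relevance score to re-rank candidates. The weights and document scores below are hypothetical placeholders, not the paper's fitted values:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    doc_id: str
    relevance: float       # query-document relevance from the base retriever
    usefulness: float      # per-dimension quality scores from dedicated models
    supportiveness: float
    credibility: float

# Hypothetical fusion weights; the paper learns and validates its own combination.
WEIGHTS = {"relevance": 0.4, "usefulness": 0.3, "supportiveness": 0.15, "credibility": 0.15}

def fused_score(c: Candidate) -> float:
    return (WEIGHTS["relevance"] * c.relevance
            + WEIGHTS["usefulness"] * c.usefulness
            + WEIGHTS["supportiveness"] * c.supportiveness
            + WEIGHTS["credibility"] * c.credibility)

candidates = [
    Candidate("doc-a", relevance=0.92, usefulness=0.30, supportiveness=0.40, credibility=0.20),
    Candidate("doc-b", relevance=0.85, usefulness=0.80, supportiveness=0.70, credibility=0.90),
]
ranked = sorted(candidates, key=fused_score, reverse=True)
print([c.doc_id for c in ranked])  # doc-b outranks doc-a once quality dimensions count
```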
Affiliation(s)
- Boya Zhang
- Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Nona Naderi
- Department of Computer Science, Université Paris-Saclay, Centre national de la recherche scientifique, Laboratoire Interdisciplinaire des Sciences du Numérique, Orsay, France
- Rahul Mishra
- Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Douglas Teodoro
- Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
5. Khan S, Walters RK, Walker AM, Nguyen SA, Liu SY, Tremont TJ, Abdelwahab MA. The readability of online patient education materials on maxillomandibular advancement surgery. Sleep Breath 2024; 28:745-751. PMID: 38062224; DOI: 10.1007/s11325-023-02952-8.
Abstract
STUDY OBJECTIVES Maxillomandibular advancement (MMA) is an effective surgical option for patients suffering from obstructive sleep apnea (OSA). As a relatively new treatment option, patients may turn to the Internet to learn more. However, online patient education materials (OPEMs) on MMA may be written at a higher literacy level than recommended for patients. The aim of this study was to analyze the readability of OPEMs on MMA. METHODS A Google search of "maxillomandibular advancement" was performed, and the first 100 results were screened. Websites that met eligibility criteria were analyzed for readability using the Automated Readability Index (ARI), Coleman-Liau Index (CLI), Flesch-Kincaid Grade Level (FKGL), Gunning Fog (GF), and Simple Measure of Gobbledygook (SMOG), and compared with the recommended sixth-grade reading level using one-tailed t-tests. Readability scores were also compared across website types (university/hospital, physician clinic, and other) using ANOVA. RESULTS The mean (SD) for ARI, CLI, FKGL, GF, and SMOG was 11.91 (2.43), 13.42 (1.81), 11.91 (2.06), 14.32 (2.34), and 13.99 (1.56), respectively. All readability scores were significantly higher than a sixth-grade reading level (p < 0.001). No statistically significant difference in readability scores was found between website types. CONCLUSIONS The available OPEMs on MMA surgery for OSA are above the recommended sixth-grade reading level. Identifying and reducing the gap between the reading level of OPEMs and the reading level of the patient is needed to encourage a more active role, informed decisions, and better patient satisfaction.
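A minimal sketch of the one-tailed comparison against the sixth-grade benchmark. The per-website FKGL scores here are hypothetical; the authors used the actual scores from their 100-result screen:

```python
# pip install scipy
from scipy import stats

# Hypothetical Flesch-Kincaid Grade Level scores for a handful of websites
fkgl_scores = [11.2, 13.5, 9.8, 12.4, 14.1, 10.6, 12.9, 11.7]

# One-sample, one-tailed t-test: is the mean reading level above grade 6?
t_stat, p_value = stats.ttest_1samp(fkgl_scores, popmean=6.0, alternative="greater")
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.4g}")
```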
Affiliation(s)
- Sofia Khan
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 500, Charleston, SC, 29425, USA
- Rameen K Walters
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 500, Charleston, SC, 29425, USA
- Angelica M Walker
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 500, Charleston, SC, 29425, USA
- Shaun A Nguyen
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 500, Charleston, SC, 29425, USA
- Stanley Y Liu
- Department of Otolaryngology-Head and Neck Surgery, Stanford University, Stanford, CA, 94305, USA
- Timothy J Tremont
- Department of Orthodontics, Medical University of South Carolina, Charleston, SC, 29425, USA
- Mohamed A Abdelwahab
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 500, Charleston, SC, 29425, USA
6. DiSipio T, Scholte C, Diaz A. Evaluation of online text-based information resources of gynaecological cancer symptoms. Cancer Med 2024; 13:e7167. PMID: 38676385; PMCID: PMC11053368; DOI: 10.1002/cam4.7167.
Abstract
BACKGROUND Gynaecological cancer symptoms are often vague and non-specific. Quality health information is central to timely cancer diagnosis and treatment. The aim of this study was to identify and evaluate the quality of online text-based patient information resources regarding gynaecological cancer symptoms. METHODS A targeted website search and Google search were conducted to identify health information resources published by the Australian government and non-government health organisations. Resources were classified by topic (gynaecological health, gynaecological cancers, cancer, general health) and assessed for reading level (Simple Measure of Gobbledygook, SMOG), reading difficulty (Flesch Reading Ease, FRE), and understandability and actionability (Patient Education Materials Assessment Tool, PEMAT, scored 0-100, with higher scores indicating better understandability/actionability). Seven criteria were used to assess cultural inclusivity specific to Aboriginal and Torres Strait Islander people; resources meeting 3-5 items were deemed moderately inclusive and those meeting 6 or more items inclusive. RESULTS A total of 109 resources were identified, and 76% provided information on symptoms in the context of gynaecological cancers. The average readability was equivalent to a grade 10 reading level on the SMOG and classified as 'difficult to read' on the FRE. The mean PEMAT scores were 95% (range 58-100) for understandability and 13% (range 0-80) for actionability. Five resources were evaluated as moderately culturally inclusive. No resource met all the benchmarks. CONCLUSIONS This study highlights the inadequate quality of online resources available on pre-diagnosis gynaecological cancer symptom information. Resources should be revised in line with the recommended standards for readability, understandability, and actionability, and to meet the needs of a culturally diverse population.
Affiliation(s)
- Tracey DiSipio
- School of Public Health, The University of Queensland, Brisbane, Queensland, Australia
- Cate Scholte
- School of Public Health, The University of Queensland, Brisbane, Queensland, Australia
- Abbey Diaz
- School of Public Health, The University of Queensland, Brisbane, Queensland, Australia
7. Shin A, Banubakode S, Taveras Alam S, Gonzalez AO. Evaluating the Readability of Online Blood Cancer Education Materials Across Different Readability Measures. Cureus 2024; 16:e58488. PMID: 38765438; PMCID: PMC11101262; DOI: 10.7759/cureus.58488.
Abstract
Introduction The National Institutes of Health and the American Medical Association recommend that patient education materials (EMs) be at or below the sixth-grade reading level. The American Cancer Society, Leukemia & Lymphoma Society, and National Comprehensive Cancer Network provide accurate blood cancer EMs. Methods One hundred one (101) blood cancer EMs from the above organizations were assessed using the following: Flesch Reading Ease Formula (FREF), Flesch-Kincaid Grade Level (FKGL), Gunning Fog Index (GFI), Simple Measure of Gobbledygook Index (SMOG), and the Coleman-Liau Index (CLI). Results Only 3.96% of patient EMs scored at or below the seventh-grade reading level in all modalities. Healthcare professional education materials (HPEMs) averaged around the college to graduate level. For leukemia and lymphoma patient EMs, there were significant differences for FKGL vs. SMOG, FKGL vs. GFI, FKGL vs. CLI, SMOG vs. CLI, and GFI vs. CLI. For HPEMs, there were significant differences for FKGL vs. GFI and GFI vs. CLI. Conclusion The majority of patient EMs were above the seventh-grade reading level. A lack of easily readable patient EMs could lead to a poor understanding of disease and, thus, adverse health outcomes. Overall, patient EMs should not replace physician counseling. Physicians must close the gaps in patients' understanding throughout their cancer treatment.
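Where different formulas disagree, as reported above, it is easy to see why by computing all five on the same passage. A minimal sketch using the textstat package (the sample sentence is ours; note that scores on such a short snippet are only illustrative, since SMOG in particular is designed for 30-sentence samples):

```python
# pip install textstat
import textstat

text = (
    "Chemotherapy uses powerful medicines to destroy rapidly dividing "
    "leukemia cells, but it can also affect healthy cells in your body."
)

print("FREF:", textstat.flesch_reading_ease(text))
print("FKGL:", textstat.flesch_kincaid_grade(text))
print("GFI: ", textstat.gunning_fog(text))
print("SMOG:", textstat.smog_index(text))
print("CLI: ", textstat.coleman_liau_index(text))
```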
Affiliation(s)
- Ashley Shin
- Division of Hematology/Oncology, McGovern Medical School at UTHealth Houston, Houston, USA
- Surbhi Banubakode
- Division of Hematology/Oncology, McGovern Medical School at UTHealth Houston, Houston, USA
- Sara Taveras Alam
- Division of Hematology/Oncology, McGovern Medical School at UTHealth Houston, Houston, USA
- Anneliese O Gonzalez
- Division of Hematology/Oncology, McGovern Medical School at UTHealth Houston, Houston, USA
8. Rouhi AD, Ghanem YK, Yolchieva L, Saleh Z, Joshi H, Moccia MC, Suarez-Pierre A, Han JJ. Can Artificial Intelligence Improve the Readability of Patient Education Materials on Aortic Stenosis? A Pilot Study. Cardiol Ther 2024; 13:137-147. PMID: 38194058; PMCID: PMC10899139; DOI: 10.1007/s40119-023-00347-0.
Abstract
INTRODUCTION The advent of generative artificial intelligence (AI) dialogue platforms and large language models (LLMs) may help facilitate ongoing efforts to improve health literacy. Additionally, recent studies have highlighted inadequate health literacy among patients with cardiac disease. The aim of the present study was to ascertain whether two freely available generative AI dialogue platforms could rewrite online aortic stenosis (AS) patient education materials (PEMs) to meet recommended reading skill levels for the public. METHODS Online PEMs were gathered from a professional cardiothoracic surgical society and academic institutions in the USA. PEMs were then inputted into two AI-powered LLMs, ChatGPT-3.5 and Bard, with the prompt "translate to 5th-grade reading level". Readability of PEMs before and after AI conversion was measured using the validated Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (FKGL), Simple Measure of Gobbledygook Index (SMOGI), and Gunning-Fog Index (GFI) scores. RESULTS Overall, 21 PEMs on AS were gathered. Original readability measures indicated difficult readability at the 10th-12th grade reading level. ChatGPT-3.5 successfully improved readability across all four measures (p < 0.001) to the approximately 6th-7th grade reading level. Bard successfully improved readability across all measures (p < 0.001) except for SMOGI (p = 0.729) to the approximately 8th-9th grade level. Neither platform generated PEMs written below the recommended 6th-grade reading level. ChatGPT-3.5 demonstrated significantly more favorable post-conversion readability scores, percentage change in readability scores, and conversion time compared to Bard (all p < 0.001). CONCLUSION AI dialogue platforms can enhance the readability of PEMs for patients with AS but may not fully meet recommended reading skill levels, highlighting potential tools to help strengthen cardiac health literacy in the future.
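A minimal sketch of the before/after measurement loop described above. The `simplify_text` helper is a hypothetical stand-in for whichever LLM is used (the prompt is taken from the abstract), and textstat is our assumed source for the four readability scores:

```python
# pip install textstat
import textstat

def simplify_text(pem: str) -> str:
    """Hypothetical stand-in for an LLM call with the prompt
    'translate to 5th-grade reading level'."""
    raise NotImplementedError("wire this to your LLM client of choice")

def readability_panel(text: str) -> dict:
    return {
        "FRE":   textstat.flesch_reading_ease(text),
        "FKGL":  textstat.flesch_kincaid_grade(text),
        "SMOGI": textstat.smog_index(text),
        "GFI":   textstat.gunning_fog(text),
    }

def compare(pem: str) -> None:
    before = readability_panel(pem)
    after = readability_panel(simplify_text(pem))
    for metric in before:
        print(f"{metric}: {before[metric]:.1f} -> {after[metric]:.1f}")
```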
Affiliation(s)
- Armaun D Rouhi
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Yazid K Ghanem
- Department of Surgery, Cooper University Hospital, Camden, NJ, USA
- Laman Yolchieva
- College of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Zena Saleh
- Department of Surgery, Cooper University Hospital, Camden, NJ, USA
- Hansa Joshi
- Department of Surgery, Cooper University Hospital, Camden, NJ, USA
- Matthew C Moccia
- Department of Surgery, Cooper University Hospital, Camden, NJ, USA
- Jason J Han
- Division of Cardiovascular Surgery, Department of Surgery, Perelman School of Medicine, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
9. Meyer A, Riese J, Streichert T. Comparison of the Performance of GPT-3.5 and GPT-4 With That of Medical Students on the Written German Medical Licensing Examination: Observational Study. JMIR Med Educ 2024; 10:e50965. PMID: 38329802; PMCID: PMC10884900; DOI: 10.2196/50965.
Abstract
BACKGROUND The potential of artificial intelligence (AI)-based large language models, such as ChatGPT, has gained significant attention in the medical field. This enthusiasm is driven not only by recent breakthroughs and improved accessibility but also by the prospect of democratizing medical knowledge and promoting equitable health care. However, the performance of ChatGPT is substantially influenced by the input language, and given the growing public trust in this AI tool compared to that in traditional sources of information, investigating its medical accuracy across different languages is of particular importance. OBJECTIVE This study aimed to compare the performance of GPT-3.5 and GPT-4 with that of medical students on the written German medical licensing examination. METHODS To assess GPT-3.5's and GPT-4's medical proficiency, we used 937 original multiple-choice questions from 3 written German medical licensing examinations in October 2021, April 2022, and October 2022. RESULTS GPT-4 achieved an average score of 85% and ranked in the 92.8th, 99.5th, and 92.6th percentiles among medical students who took the same examinations in October 2021, April 2022, and October 2022, respectively. This represents a substantial improvement of 27% over GPT-3.5, which passed only 1 of the 3 examinations. While GPT-3.5 performed well on psychiatry questions, GPT-4 exhibited strengths in internal medicine and surgery but showed weakness in academic research. CONCLUSIONS The study results highlight ChatGPT's remarkable improvement from moderate (GPT-3.5) to high competency (GPT-4) in answering medical licensing examination questions in German. While its predecessor GPT-3.5 was imprecise and inconsistent, GPT-4 demonstrates considerable potential to improve medical education and patient care, provided that medically trained users critically evaluate its results. As the replacement of search engines by AI tools seems possible in the future, further studies with nonprofessional questions are needed to assess the safety and accuracy of ChatGPT for the general population.
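A minimal sketch of the percentile placement reported above: given a cohort's score distribution, the model's score is located within it. The student scores below are hypothetical; scipy's percentileofscore does the ranking:

```python
# pip install scipy numpy
import numpy as np
from scipy.stats import percentileofscore

rng = np.random.default_rng(0)
# Hypothetical cohort: student exam scores in percent for one sitting
student_scores = rng.normal(loc=75, scale=8, size=2000).clip(0, 100)

gpt4_score = 85.0
print(f"GPT-4 percentile: {percentileofscore(student_scores, gpt4_score):.1f}")
```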
Affiliation(s)
- Annika Meyer
- Institute for Clinical Chemistry, University Hospital Cologne, Cologne, Germany
- Janik Riese
- Department of General, Visceral, Thoracic and Vascular Surgery, University Hospital Greifswald, Greifswald, Germany
- Thomas Streichert
- Institute for Clinical Chemistry, University Hospital Cologne, Cologne, Germany
10. Rauzi A, Powell LE, White M, Prathibha S, Hui JYC. Readability Analysis of Online Breast Cancer Surgery Patient Education Materials from National Cancer Institute-Designated Cancer Centers Compared with Top Internet Search Results. Ann Surg Oncol 2023; 30:8061-8066. PMID: 37707665; DOI: 10.1245/s10434-023-14279-5.
Abstract
BACKGROUND The National Institutes of Health (NIH) recommends that patient education materials reflect the average reading grade level of the US population. Given the importance of shared decision-making in breast cancer surgery, this study evaluates the reading level of patient education materials from National Cancer Institute-designated cancer centers (NCI-DCC) compared with top Internet search results. METHODS Online materials from NCI-DCC and top Internet search results on breast cancer, staging, surgical options, and pre- and postoperative expectations were analyzed using three validated readability algorithms: the Simple Measure of Gobbledygook readability formula, Coleman-Liau index, and Flesch-Kincaid grade level. Mean readability was compared between source groups using an unpaired t-test, with statistical significance set at p < 0.05, and across information subcategories using one-way analysis of variance. RESULTS Mean readability scores from the NCI-DCC and Internet groups ranged from the 9th to 12th grade level, significantly above the NIH-recommended reading level of 6th-7th grade. There was no significant difference between reading levels from the two sources. The discrepancy between actual and recommended reading levels was most pronounced for "surgical options", at a 10th-12th grade level from both sources. CONCLUSIONS Patient education materials on breast cancer from both NCI-DCC and top Internet search results were written several reading grade levels higher than the NIH recommendation. Materials should be revised to enhance patient comprehension of breast cancer surgical treatment and guide patients in this important decision-making process, ultimately improving health outcomes.
Affiliation(s)
- Anna Rauzi
- University of Minnesota School of Medicine, Minneapolis, MN, USA
- Lauren E Powell
- Division of Plastic Surgery, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
- McKenzie White
- Division of Surgical Oncology, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
- Saranya Prathibha
- Division of Surgical Oncology, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
- Jane Yuet Ching Hui
- Division of Surgical Oncology, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
11. Ayre J, Bonner C, Gonzalez J, Vaccaro T, Cousins M, McCaffery K, Muscat DM. Integrating consumer perspectives into a large-scale health literacy audit of health information materials: learnings and next steps. BMC Health Serv Res 2023; 23:416. PMID: 37120520; PMCID: PMC10148726; DOI: 10.1186/s12913-023-09434-3.
Abstract
BACKGROUND Health information is less effective when it does not meet the health literacy needs of its consumers. For health organisations, assessing the appropriateness of their existing health information resources is a key step to addressing this issue. This study describes novel methods for a consumer-centred, large-scale health literacy audit of existing resources and reflects on opportunities to further refine the method. METHODS This audit focused on resources developed by NPS MedicineWise, an Australian not-for-profit that promotes safe and informed use of medicines. The audit comprised 4 stages, with consumers engaged at each stage: 1) select a sample of resources for assessment; 2) assess the sample using subjective (Patient Education Materials Assessment Tool) and objective (Sydney Health Literacy Lab Health Literacy Editor) assessment tools; 3) review audit findings through workshops and identify priority areas for future work; and 4) reflect and gather feedback on the audit process via interviews. RESULTS Of 147 resources, consumers selected 49 for detailed assessment; these covered a range of health topics, health literacy skills, and formats, and had varied web usage. Overall, 42 resources (85.7%) were assessed as easy to understand, but only 26 (53.1%) as easy to act on. A typical text was written at a grade 12 reading level and used the passive voice 6 times. About one in five words in a typical text was considered complex (19%). Workshops identified three key areas for action: make resources easier to understand and act on; consider the readers' context, needs, and skills; and improve inclusiveness and representation. Interviews with workshop attendees highlighted that audit methods could be further improved by setting clear expectations about the project rationale, objectives, and consumer roles; providing consumers with a simpler subjective health literacy assessment tool; and addressing issues related to diverse representation. CONCLUSIONS This audit yielded valuable consumer-centred priorities for improving organisational health literacy with regard to updating a large existing database of health information resources. We also identified important opportunities to further refine the process. Study findings provide valuable practical insights that can inform organisational health actions for the upcoming Australian National Health Literacy Strategy.
Affiliation(s)
- Julie Ayre
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
- Carissa Bonner
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
- Menzies Centre for Health Policy and Economics, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Kirsten McCaffery
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
- Danielle M Muscat
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
12. Zubiena L, Lewin O, Coleman R, Phezulu J, Ogunfiditimi G, Blackburn T, Joseph L. Development and testing of the health information website evaluation tool on neck pain websites - An analysis of reliability, validity, and utility. Patient Educ Couns 2023; 113:107762. PMID: 37087877; DOI: 10.1016/j.pec.2023.107762.
Abstract
OBJECTIVE Online health information contributes to patient education and knowledge of disease management. The aims of this study were to design the Health Information Website Evaluation Tool (HIWET) to evaluate the quality of online information, and to investigate the reliability, validity, and utility of HIWET. METHODS HIWET was developed through a literature search and small-scale pilot testing. Once developed, the psychometric properties of HIWET were evaluated on 20 neck pain websites. Reliability was analysed using the intraclass correlation coefficient (ICC). Validity was analysed using Pearson and Spearman correlation coefficients. Utility was analysed using an independent-samples t-test. RESULTS HIWET demonstrated excellent intra-rater reliability (0.94 (0.98-0.99), p < .001) and fair inter-rater reliability (0.55 (0.10-0.88), p = .04). HIWET demonstrated validity, with strong correlations against DISCERN (r = 0.656, n = 20, p = .002) and LIDA (r = 0.564, n = 20, p = .010). HIWET was time-efficient compared with the three comparison tools combined. CONCLUSION HIWET is a reliable and valid tool for evaluating the quality of online health information. PRACTICAL IMPLICATIONS HIWET has the advantage of being a simple, quick-to-use, and freely accessible tool. It can be implemented in clinical practice, education, and research to evaluate the quality of online health information.
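A minimal sketch of the validity analysis described above: scores from a new tool are correlated against an established one. The scores for the 20 websites are hypothetical; scipy provides both coefficients:

```python
# pip install scipy
from scipy.stats import pearsonr, spearmanr

# Hypothetical quality scores for the same 20 websites
hiwet   = [14, 18, 9, 22, 17, 11, 25, 13, 19, 16, 8, 21, 15, 24, 10, 20, 12, 23, 18, 14]
discern = [38, 45, 30, 52, 47, 33, 60, 36, 50, 44, 28, 55, 41, 58, 31, 49, 35, 57, 46, 40]

r, p_r = pearsonr(hiwet, discern)
rho, p_rho = spearmanr(hiwet, discern)
print(f"Pearson r = {r:.3f} (p = {p_r:.3g}); Spearman rho = {rho:.3f} (p = {p_rho:.3g})")
```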
Affiliation(s)
- Luke Zubiena
- School of Sports and Health Science, University of Brighton, Eastbourne, East Sussex, United Kingdom
- Olivia Lewin
- School of Sports and Health Science, University of Brighton, Eastbourne, East Sussex, United Kingdom
- Robert Coleman
- School of Sports and Health Science, University of Brighton, Eastbourne, East Sussex, United Kingdom
- James Phezulu
- School of Sports and Health Science, University of Brighton, Eastbourne, East Sussex, United Kingdom
- Gbemisola Ogunfiditimi
- School of Sports and Health Science, University of Brighton, Eastbourne, East Sussex, United Kingdom
- Tiffany Blackburn
- School of Sports and Health Science, University of Brighton, Eastbourne, East Sussex, United Kingdom
- Leonard Joseph
- School of Sports and Health Science, University of Brighton, Eastbourne, East Sussex, United Kingdom
13. Mavragani A, Bonner C, Muscat DM, Dunn AG, Harrison E, Dalmazzo J, Mouwad D, Aslani P, Shepherd HL, McCaffery KJ. Multiple Automated Health Literacy Assessments of Written Health Information: Development of the SHeLL (Sydney Health Literacy Lab) Health Literacy Editor v1. JMIR Form Res 2023; 7:e40645. PMID: 36787164; PMCID: PMC9975914; DOI: 10.2196/40645.
Abstract
Producing health information that people can easily understand is challenging and time-consuming. Existing guidance is often subjective and lacks specificity. With advances in software that reads and analyzes text, there is an opportunity to develop tools that provide objective, specific, and automated guidance on the complexity of health information. This paper outlines the development of the SHeLL (Sydney Health Literacy Lab) Health Literacy Editor, an automated tool to facilitate the implementation of health literacy guidelines for the production of easy-to-read written health information. Target users were any person or organization that develops consumer-facing education materials, with or without prior experience with health literacy concepts. Anticipated users included health professionals, staff, and government and nongovernment agencies. To develop this tool, existing health literacy and relevant writing guidelines were collated, and items amenable to programmable automated assessment were incorporated into the Editor. A set of natural language processing methods was also adapted for use in the SHeLL Editor, though the approach was primarily procedural (rule-based). As a result of this process, the Editor comprises 6 assessments: readability (school-grade reading score calculated using the Simple Measure of Gobbledygook, SMOG), complex language (percentage of the text that contains public health thesaurus entries, words that are uncommon in English, or acronyms), passive voice, text structure (eg, use of long paragraphs), lexical density and diversity, and person-centered language. These are presented as global scores, with additional, more specific feedback flagged in the text itself. Feedback is provided in real time so that users can iteratively revise and improve the text. The design also includes a "text preparation" mode, which allows users to quickly make adjustments to ensure accurate calculation of readability. A hierarchy of assessments also helps users prioritize the most important feedback. Lastly, the Editor has a function that exports the analysis and revised text. The SHeLL Health Literacy Editor is a new tool that can help improve the quality and safety of written health information. It provides objective, immediate feedback on a range of factors, complementing readability with other less widely used but important objective assessments such as complex and person-centered language. It can be used as a scalable intervention to support the uptake of health literacy guidelines by health services and providers of health information. This early prototype can be further refined by expanding the thesaurus and leveraging new machine learning methods for assessing the complexity of written text. User testing with health professionals is needed before evaluating the Editor's ability to improve the health literacy of written health information and evaluating its implementation in existing Australian health services.
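A minimal sketch of the kinds of rule-based checks such an editor combines: readability via SMOG, a passive-voice flag, acronym detection, and lexical diversity. The regex heuristics are simplified illustrations of the general technique, not the SHeLL Editor's actual rules:

```python
# pip install textstat
import re
import textstat

def assess(text: str) -> dict:
    words = re.findall(r"[A-Za-z']+", text)
    # Passive voice heuristic: a form of "to be" followed by a past participle
    passive = re.findall(r"\b(?:is|are|was|were|been|being|be)\s+\w+(?:ed|en)\b", text, re.I)
    # Acronym heuristic: runs of 2+ capital letters
    acronyms = set(re.findall(r"\b[A-Z]{2,}\b", text))
    return {
        "smog_grade": textstat.smog_index(text),
        "passive_constructions": len(passive),
        "acronyms": sorted(acronyms),
        "lexical_diversity": len({w.lower() for w in words}) / max(len(words), 1),
    }

sample = "The medication was prescribed by your GP. MRI scans were ordered to rule out injury."
print(assess(sample))
```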
Affiliation(s)
- Carissa Bonner
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Danielle M Muscat
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Adam G Dunn
- Biomedical Informatics and Digital Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Eliza Harrison
- Biomedical Informatics and Digital Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Jason Dalmazzo
- Biomedical Informatics and Digital Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Dana Mouwad
- Western Sydney Local Health District, Health Literacy Hub, Sydney, Australia
- Parisa Aslani
- School of Pharmacy, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Heather L Shepherd
- Susan Wakil School of Nursing and Midwifery, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Kirsten J McCaffery
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
14. Rosenberg A, Walker J, Griffiths S, Jenkins R. Plain language summaries: Enabling increased diversity, equity, inclusion and accessibility in scholarly publishing. Learned Publishing 2023. DOI: 10.1002/leap.1524.
Affiliation(s)
- Joanne Walker
- Publishing Department, Becaris Publishing Ltd., Royston, UK
15. Jawad D, Cheng H, Wen LM, Rissel C, Baur L, Mihrshahi S, Taki S. Interactivity, Quality, and Content of Websites Promoting Health Behaviors During Infancy: 6-Year Update of the Systematic Assessment. J Med Internet Res 2022; 24:e38641. PMID: 36206031; PMCID: PMC9587494; DOI: 10.2196/38641.
Abstract
BACKGROUND As of 2021, 89% of the Australian population are active internet users. Although the internet is widely used, there are concerns about the quality, accuracy, and credibility of health-related websites. A 2015 systematic assessment of infant feeding websites and apps available in Australia found that 61% of websites were of poor quality and readability, with minimal coverage of infant feeding topics and a lack of author credibility. OBJECTIVE We aimed to systematically assess the quality, interactivity, readability, and comprehensibility of information targeting infant health behaviors on websites globally and to provide an update of the 2015 systematic assessment. METHODS Keywords related to infant milk feeding behaviors, solid feeding behaviors, active play, screen time, and sleep were used to identify websites targeting infant health behaviors on the Google search engine on Safari. The websites were assessed by a subset of the authors using predetermined criteria between July 2021 and February 2022 and assessed for information content based on the Australian Infant Feeding Guidelines and National Physical Activity Recommendations. The Suitability Assessment of Materials, the Quality Component Scoring System, the Health-Related Website Evaluation Form, and adherence to the Health on the Net code were used to evaluate the suitability and quality of information. Readability was assessed using 3 web-based readability tools. RESULTS Of the 450 websites screened, 66 were included based on the selection criteria and evaluated. Overall, the quality of websites was mostly adequate. Media-related sources, nongovernmental organizations, hospitals, and privately owned websites had the highest median quality scores, whereas university websites received the lowest median score (35%). The information covered within the websites was predominantly poor: 91% (60/66) of the websites received an overall score of ≤74% (mean 53%, SD 18%). The suitability of health information was mostly rated adequate for literacy demand, layout, and learning and motivation of readers. The median readability score for the websites was grade 8.5, which is higher than government recommendations. CONCLUSIONS Quality, content, readability, and interactivity of websites promoting health behaviors during infancy ranged between poor and adequate. Since the 2015 systematic assessment, there has been a slight improvement in the quality of websites but no difference in the Suitability Assessment of Materials rating or readability of information. There is a need for researchers and health care providers to leverage innovative web-based platforms to provide culturally competent, evidence-based information based on government guidelines that is accessible to those with limited English proficiency.
Affiliation(s)
- Danielle Jawad
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Health Promotion Unit, Population Health Research & Evaluation Hub, Sydney Local Health District, Sydney, Australia
- National Health and Medical Research Council Centre of Research Excellence in the Early Prevention of Obesity in Childhood - Translate, The University of Sydney, Sydney, Australia
- Heilok Cheng
- National Health and Medical Research Council Centre of Research Excellence in the Early Prevention of Obesity in Childhood - Translate, The University of Sydney, Sydney, Australia
- Susan Wakil School of Nursing and Midwifery, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Sydney Institute for Women, Children and their Families, Sydney Local Health District, Sydney, Australia
- Li Ming Wen
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Health Promotion Unit, Population Health Research & Evaluation Hub, Sydney Local Health District, Sydney, Australia
- National Health and Medical Research Council Centre of Research Excellence in the Early Prevention of Obesity in Childhood - Translate, The University of Sydney, Sydney, Australia
- Sydney Institute for Women, Children and their Families, Sydney Local Health District, Sydney, Australia
- Chris Rissel
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- College of Medicine and Public Health, Rural and Remote Health South Australia and Northern Territory, Flinders University, Darwin, Australia
- Louise Baur
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- National Health and Medical Research Council Centre of Research Excellence in the Early Prevention of Obesity in Childhood - Translate, The University of Sydney, Sydney, Australia
- Specialty of Child and Adolescent Health, Sydney Medical School, The University of Sydney, Sydney, Australia
- Seema Mihrshahi
- Department of Health Sciences, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
- Sarah Taki
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Health Promotion Unit, Population Health Research & Evaluation Hub, Sydney Local Health District, Sydney, Australia
- National Health and Medical Research Council Centre of Research Excellence in the Early Prevention of Obesity in Childhood - Translate, The University of Sydney, Sydney, Australia
- Sydney Institute for Women, Children and their Families, Sydney Local Health District, Sydney, Australia
16. Santos DF, Santos Malave GF, Asif N, Izquierdo N. An Analysis of the Readability of Phacoemulsification Online Resources. Cureus 2022; 14:e29223. PMID: 36225456; PMCID: PMC9536863; DOI: 10.7759/cureus.29223.
Abstract
Introduction: Cataract is the leading cause of blindness worldwide. Phacoemulsification is now the gold standard for cataract extraction and is greatly needed among low socioeconomic status (SES) communities, rural and older patient populations, and patients with poor vision. This greatly increases the importance of high readability for online resources on this topic. This study aims to assess the readability of online information about phacoemulsification based on readability scores for each resource. Methods: We conducted a retrospective cross-sectional study. The term "phacoemulsification" was searched online, and each website was categorized by type: academic, physician, non-physician, commercial, social media, and unspecified. The readability scores for each website were calculated using six different readability tests, and a composite score reflecting reading grade level was obtained. To evaluate differences between website categories, analysis of variance (ANOVA) testing was used. All test scores were compared with the standard sixth-grade recommendation using a one-sample t-test. Results: A total of 20 websites were analyzed. Three websites (3/20; 15%) had a score corresponding to a 6th grade reading level or below. Seventeen websites (17/20; 85%) had a score corresponding to a college reading level or above. None of the readability scores had a mean below a 6th grade reading level, and no category had an average readability score at or below a 6th grade reading level. None of the mean readability scores differed significantly across categories. All readability tests had an average score significantly different from a 6th grade reading level (p<0.001). Conclusions: This is the first study to focus on the accessibility of online English resources on phacoemulsification and to apply multiple standardized readability scores to cataract surgery resources. It provides further overwhelming evidence that online resources on phacoemulsification are too complex for the average patient to understand. Interventions should be implemented to improve readability.
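A minimal sketch of the across-category comparison described above, using a one-way ANOVA over composite grade-level scores grouped by website category (the scores and category groupings are hypothetical):

```python
# pip install scipy
from scipy.stats import f_oneway

# Hypothetical composite reading grade levels by website category
academic   = [13.1, 14.2, 12.8, 13.7]
physician  = [12.5, 13.9, 14.4, 12.9]
commercial = [13.8, 12.2, 14.0, 13.3]

f_stat, p_value = f_oneway(academic, physician, commercial)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")  # large p: no significant category effect
```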
17. Downey T, Millar BC, Moore JE. Improving health literacy with mumps, measles and rubella (MMR) vaccination: comparison of the readability of MMR patient-facing literature and MMR scientific abstracts. Ther Adv Vaccines Immunother 2022; 10:25151355221118812. PMID: 36035444; PMCID: PMC9400405; DOI: 10.1177/25151355221118812.
Abstract
Background: Historically, many factors have influenced mumps, measles and rubella (MMR) vaccine uptake, including media bias, social/economic determinants, parental education level, deprivation, and concerns over vaccine safety. Readability metrics from online tools are now emerging as a means for healthcare professionals to determine the readability of patient-facing vaccine information. The aim of this study was to examine the readability of patient-facing materials describing MMR vaccination, through employment of nine readability and text-parameter metrics, and to compare these with MMR vaccination literature for healthcare professionals and scientific abstracts relating to MMR vaccination. Materials and methods: The subscription-based online Readable program (readable.com) was used to determine nine readability indices: established readability metrics (n = 5: Flesch–Kincaid Grade Level, Gunning Fog Index, SMOG Index, Flesch Reading Ease, and New Dale-Chall Score) and text parameters (n = 4: sentence count, word count, number of words per sentence, and number of syllables per word), applied to 47 MMR vaccination texts (patient-facing literature, n = 22; healthcare professional-focused literature, n = 8; scientific abstracts, n = 17). Results: Patient-facing vaccination literature had a Flesch Reading Ease score of 58.4 and a Flesch–Kincaid Grade Level of 8.1, compared with poorer readability scores for healthcare professional literature of 30.7 and 12.6, respectively. MMR scientific abstracts had the poorest readability (24.0 and 14.8, respectively). Sentence structure was also considered: better readability scores correlated with significantly fewer words per sentence and fewer syllables per word. Conclusion: Use of these readability tools enables authors to ensure their research is more readable to a lay audience. Patient co-production initiatives would help ensure not only that the target audience can read the literature, but also that they understand the content. More patient-centric focus groups would give better insight into the reasons for MMR-associated vaccine hesitancy and vaccine refusal.
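A minimal sketch of the two text parameters the correlation above relies on, using a simple vowel-group syllable heuristic (an approximation; commercial tools such as Readable use more careful, dictionary-informed counting):

```python
import re

def text_parameters(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    # Approximate syllables as groups of consecutive vowels in each word
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return {
        "sentences": len(sentences),
        "words": len(words),
        "words_per_sentence": len(words) / len(sentences),
        "syllables_per_word": syllables / len(words),
    }

print(text_parameters("The MMR vaccine protects against measles. Two doses are recommended."))
```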
Affiliation(s)
- Tina Downey
- School of Biomedical Sciences, Ulster University, Coleraine, UK
- John E Moore
- Laboratory for Disinfection and Pathogen Elimination Studies, Northern Ireland Public Health Laboratory, Nightingale (Belfast City) Hospital, Corry Building, Lisburn Road, Belfast BT9 7AD, UK
18. Outcomes and Critical Factors for Successful Implementation of Organizational Health Literacy Interventions: A Scoping Review. Int J Environ Res Public Health 2021; 18:11906. PMID: 34831658; PMCID: PMC8622809; DOI: 10.3390/ijerph182211906.
Abstract
Organizational health literacy (OHL) interventions can reduce the inequalities and demands that patients encounter in health care. However, an overview of their impact and of the critical factors for organization-wide implementation is lacking. The aim of this scoping review is to summarize the evidence on: (1) the outcomes of OHL interventions at the patient, professional, and organizational levels; and (2) the factors and strategies that affect the implementation and outcomes of OHL interventions. We reviewed empirical studies following the five-stage framework of Arksey and O'Malley. The databases Scopus, PubMed, PsycINFO, and CINAHL were searched from 1 January 2010 to 31 December 2019, focusing on OHL interventions using terms related to "health literacy", "health care organization", and "intervention characteristics". After a full-text review, we selected 24 descriptive studies. Of these, 23 studies reported health literacy problems in relation to OHL assessment tools. Nine out of thirteen studies reported that the use of interventions resulted in positive changes in OHL domains regarding comprehensible communication, professionals' competencies and practices, and strategic organizational changes. Organization-wide OHL interventions resulted in some improvement in patient outcomes, but evidence was scarce. Critical factors for organization-wide implementation of OHL interventions were leadership support, combined top-down and bottom-up approaches, a change champion, and staff commitment. Organization-wide interventions led to more positive change in OHL domains, but the evidence regarding OHL outcomes needs strengthening.
19
Brown W, Balyan R, Karter AJ, Crossley S, Semere W, Duran ND, Lyles C, Liu J, Moffet HH, Daniels R, McNamara DS, Schillinger D. Challenges and solutions to employing natural language processing and machine learning to measure patients' health literacy and physician writing complexity: The ECLIPPSE study. J Biomed Inform 2021; 113:103658. [PMID: 33316421 PMCID: PMC8186847 DOI: 10.1016/j.jbi.2020.103658] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2020] [Revised: 12/07/2020] [Accepted: 12/08/2020] [Indexed: 11/18/2022]
Abstract
OBJECTIVE In the National Library of Medicine-funded ECLIPPSE Project (Employing Computational Linguistics to Improve Patient-Provider Secure Email exchange), we attempted to create novel, valid, and scalable measures of both patients' health literacy (HL) and physicians' linguistic complexity by employing natural language processing (NLP) techniques and machine learning (ML). We applied these techniques to >400,000 patients' and physicians' secure messages (SMs) exchanged via an electronic patient portal, developing and validating an automated patient literacy profile (LP) and physician complexity profile (CP). Herein, we describe the challenges faced and the solutions implemented during this innovative endeavor. MATERIALS AND METHODS To describe challenges and solutions, we used two data sources: study documents and interviews with study investigators. Over the five years of the project, the team tracked their research process using a combination of Google Docs tools and an online team organization, tracking, and management tool (Asana). In year 5, the team convened a number of times to discuss, categorize, and code primary challenges and solutions. RESULTS We identified 23 challenges and associated approaches that emerged from three overarching process domains: (1) Data Mining related to the SM corpus; (2) Analyses using NLP indices on the SM corpus; and (3) Interdisciplinary Collaboration. With respect to Data Mining, problems included cleaning SMs to enable analyses, removing hidden caregiver proxies (e.g., other family members) and Spanish-language SMs, and culling SMs to ensure that only patients' primary care physicians were included. With respect to Analyses, critical decisions needed to be made as to which computational linguistic indices and ML approaches should be selected; how to enable the NLP-based linguistic indices tools to run smoothly and to extract meaningful data from a large corpus of medical text; and how best to assess the content and predictive validities of both the LP and the CP. With respect to Interdisciplinary Collaboration, because the research required engagement between clinicians, health services researchers, biomedical informaticians, linguists, and cognitive scientists, continual effort was needed to identify and reconcile differences in scientific terminologies and resolve confusion; arrive at a common understanding of the tasks to be completed and their priorities; reach compromises regarding what represents "meaningful findings" in health services vs. cognitive science research; and address constraints regarding the potential transportability of the final LP and CP to different health care settings. DISCUSSION Our study represents a process evaluation of an innovative research initiative to harness "big linguistic data" to estimate patient HL and physician linguistic complexity. Any of the challenges we identified, if left unaddressed, would have either made it impossible to generate the LPs and CPs or invalidated the analytic results related to them. Investigators undertaking similar research in HL or using computational linguistic methods to assess patient-clinician exchange will face similar challenges and may find our solutions helpful when designing and executing their health communications research.
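The corpus-cleaning challenges described above (dropping non-English messages and messages written by caregiver proxies) are common to any secure-message corpus. Below is a minimal, hypothetical sketch of such a filtering pass using the langdetect package; the proxy-marker phrases are invented stand-ins, not the ECLIPPSE team's actual criteria, which the abstract describes only at a high level.

```python
# pip install langdetect
from langdetect import detect, LangDetectException

# Invented phrases suggesting a caregiver proxy wrote the message;
# stand-ins only, not the ECLIPPSE team's actual criteria.
PROXY_MARKERS = ("on behalf of my", "writing for my")

def keep_message(text: str) -> bool:
    """Keep English-language messages that appear patient-authored."""
    try:
        if detect(text) != "en":    # drop non-English (e.g., Spanish) SMs
            return False
    except LangDetectException:     # message too short to classify
        return False
    lowered = text.lower()
    return not any(marker in lowered for marker in PROXY_MARKERS)

corpus = [
    "I have been dizzy since my dose was changed.",
    "Escribo sobre los resultados de mi examen.",
    "I am writing for my father about his insulin.",
]
print([m for m in corpus if keep_message(m)])  # keeps only the first message
```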
Affiliation(s)
- William Brown, Center for AIDS Prevention Studies, University of California, San Francisco, San Francisco, CA, United States; Bakar Computational Health Science Institute, University of California, San Francisco, San Francisco, CA, United States; University of California San Francisco Center for Vulnerable Populations, Zuckerberg San Francisco General Hospital, San Francisco, CA, United States; Department of Medicine, University of California, San Francisco, San Francisco, CA, United States
- Renu Balyan, State University of New York Old Westbury, NY, United States; Department of Psychology, Arizona State University, Tempe, AZ, United States
- Andrew J Karter, Division of Research, Kaiser Permanente Northern California, Oakland, CA, United States
- Scott Crossley, Department of Applied Linguistics and English as a Second Language, Georgia State University, Atlanta, GA, United States
- Wagahta Semere, Department of Medicine, University of California, San Francisco, San Francisco, CA, United States
- Nicholas D Duran, School of Social and Behavioral Sciences, Arizona State University, Glendale, AZ, United States
- Courtney Lyles, University of California San Francisco Center for Vulnerable Populations, Zuckerberg San Francisco General Hospital, San Francisco, CA, United States; Department of Medicine, University of California, San Francisco, San Francisco, CA, United States; Division of Research, Kaiser Permanente Northern California, Oakland, CA, United States
- Jennifer Liu, Division of Research, Kaiser Permanente Northern California, Oakland, CA, United States
- Howard H Moffet, Division of Research, Kaiser Permanente Northern California, Oakland, CA, United States
- Ryane Daniels, University of California San Francisco Center for Vulnerable Populations, Zuckerberg San Francisco General Hospital, San Francisco, CA, United States
- Danielle S McNamara, Department of Psychology, Arizona State University, Tempe, AZ, United States
- Dean Schillinger, University of California San Francisco Center for Vulnerable Populations, Zuckerberg San Francisco General Hospital, San Francisco, CA, United States; Department of Medicine, University of California, San Francisco, San Francisco, CA, United States; Division of Research, Kaiser Permanente Northern California, Oakland, CA, United States
20
Zhang M, Chow A, Smith H. COVID-19 Contact-Tracing Apps: Analysis of the Readability of Privacy Policies. J Med Internet Res 2020; 22:e21572. [PMID: 33170798 PMCID: PMC7717894 DOI: 10.2196/21572] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2020] [Revised: 09/25/2020] [Accepted: 10/08/2020] [Indexed: 12/30/2022] Open
Abstract
Apps that enable contact tracing are instrumental in mitigating the transmission of COVID-19, but there have been concerns among users about the data collected by these apps and their management. Contact tracing is of paramount importance when dealing with a pandemic, as it allows for rapid identification of cases based on the information collected from infected individuals about other individuals they may have had recent contact with. Advances in digital technology have enabled devices such as mobile phones to be used in the contact-tracing process. However, there is a potential risk of users' personal information and sensitive data being stolen should hackers be in the near vicinity of these devices. Thus, there is a need to develop privacy-preserving apps. Meanwhile, privacy policies that outline the risks associated with the use of contact-tracing apps are needed, in formats that are easily readable and comprehensible by the public. To our knowledge, no previous study has examined the readability of the privacy policies of contact-tracing apps. Therefore, we performed a readability analysis to evaluate the comprehensibility of the privacy policies of 7 contact-tracing apps currently in use. The contents of the privacy policies of these apps were assessed for readability using the Readability Test Tool, a free web-based readability calculator, which computes scores based on a number of statistics (ie, word count and the number of complex words) and indices (ie, Flesch Reading Ease, Flesch-Kincaid Reading Grade Level, Gunning Fog Index, and Simplified Measure of Gobbledygook index). Our analysis revealed that the explanations used in the privacy policies of these apps require a reading grade between 7 and 14, considerably higher than the reading ability of the average individual. We believe that improving the readability of these privacy policies could reassure users and may help facilitate the increased use of such apps.
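For reference, two of the indices reported here reduce to simple text statistics. The sketch below shows the published SMOG and Gunning Fog formulas given counts of sentences, words, and complex words (three or more syllables, e.g., from the syllable counter sketched earlier); it illustrates the formulas themselves, not the Readability Test Tool's internals, and the counts in the example are invented.

```python
import math

def smog(polysyllables: int, sentences: int) -> float:
    # Published SMOG formula (McLaughlin, 1969)
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291

def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    # Published Gunning Fog formula: 0.4 * (mean sentence length + % complex words)
    return 0.4 * (words / sentences + 100 * complex_words / words)

# Invented counts for a hypothetical privacy-policy excerpt:
# 900 words, 40 sentences, 130 words of three or more syllables.
print(f"SMOG: {smog(130, 40):.1f}")              # ~13.4
print(f"Fog:  {gunning_fog(900, 40, 130):.1f}")  # ~14.8, within the 7-14 range above
```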
Affiliation(s)
- Melvyn Zhang, Family Medicine and Primary Care, Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
- Aloysius Chow, Family Medicine and Primary Care, Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
- Helen Smith, Family Medicine and Primary Care, Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
21
Readability Metrics of Provider Postoperative Handouts in Urology. Urology 2020; 146:49-53. [DOI: 10.1016/j.urology.2020.08.044] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2020] [Revised: 08/19/2020] [Accepted: 08/26/2020] [Indexed: 11/21/2022]
22
Mac OA, Thayre A, Tan S, Dodd RH. Web-Based Health Information Following the Renewal of the Cervical Screening Program in Australia: Evaluation of Readability, Understandability, and Credibility. J Med Internet Res 2020; 22:e16701. [PMID: 32442134 PMCID: PMC7381085 DOI: 10.2196/16701] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2019] [Revised: 02/13/2020] [Accepted: 04/09/2020] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Three main changes were implemented in the Australian National Cervical Screening Program (NCSP) in December 2017: an increase in the recommended age to start screening, extended screening intervals, and change from the Papanicolaou (Pap) test to primary human papillomavirus screening (cervical screening test). The internet is a readily accessible source of information to explain the reasons for these changes to the public. It is important that web-based health information about changes to national screening programs is accessible and understandable for the general population. OBJECTIVE This study aimed to evaluate Australian web-based resources that provide information about the changes to the cervical screening program. METHODS The term cervical screening was searched in 3 search engines. The first 10 relevant results across the first 3 pages of each search engine were selected. Overall, 2 authors independently evaluated each website for readability (Flesch Reading Ease [FRE], Flesch-Kincaid Grade Level, and Simple Measure of Gobbledygook [SMOG] index), quality of information (Patient Education Materials Assessment Tool [PEMAT] for printable materials), credibility (Journal of the American Medical Association [JAMA] benchmark criteria and presence of Health on the Net Foundation code of conduct [HONcode] certification), website design, and usability with 5 simulation questions to assess the relevance of information. A descriptive analysis was conducted for the readability measures, PEMAT, and the JAMA benchmark criteria. RESULTS Of the 49 websites identified in the search, 15 were eligible for inclusion. The consumer-focused websites were classed as fairly difficult to read (mean FRE score 51.8, SD 13.3). The highest FRE score (easiest to read) was 70.4 (Cancer Council Australia Cervical Screening Consumer Site), and the lowest FRE score (most difficult to read) was 33.0 (NCSP Clinical Guidelines). A total of 9 consumer-focused websites and 4 health care provider-focused websites met the recommended threshold (sixth to eighth grade; SMOG index) for readability. The mean PEMAT understandability scores were 87.7% (SD 6.0%) for consumer-focused websites and 64.9% (SD 13.8%) for health care provider-focused websites. The mean actionability scores were 58.1% (SD 19.1%) for consumer-focused websites and 36.7% (SD 11.0%) for health care provider-focused websites. Moreover, 9 consumer-focused and 3 health care provider-focused websites scored above 70% for understandability, and 2 consumer-focused websites had an actionability score above 70%. A total of 3 websites met all 4 of the JAMA benchmark criteria, and 2 websites displayed the HONcode. CONCLUSIONS It is important for women to have access to information that is at an appropriate reading level to better understand the implications of the changes to the cervical screening program. These findings can help health care providers direct their patients toward websites that provide information on cervical screening that is written at accessible reading levels and has high understandability.
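The PEMAT percentages reported above follow a simple scoring rule: each applicable item is rated agree (1) or disagree (0), not-applicable items are excluded, and the score is the percentage of applicable items rated agree. A minimal sketch of that rule, with made-up ratings for a single website, follows.

```python
def pemat_score(ratings):
    """PEMAT domain score: percentage of applicable items rated 'agree'.

    Each rating is 1 (agree), 0 (disagree), or None (not applicable);
    N/A items are excluded from the denominator.
    """
    applicable = [r for r in ratings if r is not None]
    return 100 * sum(applicable) / len(applicable)

# Made-up understandability ratings for a single website
understandability = [1, 1, 0, 1, None, 1, 1, 0, 1, 1, None, 1]
print(f"{pemat_score(understandability):.1f}%")  # 80.0%, above the 70% threshold
```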
Affiliation(s)
- Olivia A Mac, School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Amy Thayre, School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Shumei Tan, School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Rachael H Dodd, School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
23
Readability of online patient education material for the novel coronavirus disease (COVID-19): a cross-sectional health literacy study. Public Health 2020; 185:21-25. [PMID: 32516624 PMCID: PMC7260546 DOI: 10.1016/j.puhe.2020.05.041] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2020] [Revised: 05/20/2020] [Accepted: 05/20/2020] [Indexed: 11/23/2022]
Abstract
Objectives The internet has become one of the most important resources for the general population when searching for health care information. However, the information available is not always suitable for all readers because of its difficult readability. We sought to assess the readability of online information regarding the novel coronavirus disease (COVID-19) and to establish whether it meets the recommended reading levels for patient educational materials. Study design This is a cross-sectional study. Methods We searched five key terms on Google, and the first 30 results from each search were considered for analysis. Five validated readability tests were used to establish the reading level of each article. Results Of the 150 gathered articles, 61 met the inclusion criteria and were evaluated. None (0%) of the articles met the recommended 5th to 6th grade reading level (that of an 11- to 12-year-old). The mean readability scores were Flesch Reading Ease 44.14, Flesch-Kincaid Grade Level 12.04, Gunning-Fog Index 14.27, Simple Measure of Gobbledygook (SMOG) Index 10.71, and Coleman-Liau Index 12.69. Conclusions Online educational articles on COVID-19 are too difficult for the general population to read. Based on past research, health articles that are too difficult to understand can contribute to the spread of misinformation and to public panic. The readability of articles regarding COVID-19 and other diseases needs to improve so that the general population may better understand health information and respond adequately to protect themselves and limit the spread of infection.
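Unlike the syllable-based indices above, the Coleman-Liau Index depends only on letter, word, and sentence counts, so it can be computed exactly. A minimal sketch of the published formula follows; the sample text is invented.

```python
import re

def coleman_liau(text: str) -> float:
    # Published formula: CLI = 0.0588*L - 0.296*S - 15.8, where L is
    # letters per 100 words and S is sentences per 100 words.
    words = re.findall(r"[A-Za-z']+", text)
    letters = sum(len(re.findall(r"[A-Za-z]", w)) for w in words)
    sentences = len([s for s in re.split(r"[.!?]+", text) if s.strip()])
    L = 100 * letters / len(words)
    S = 100 * sentences / len(words)
    return 0.0588 * L - 0.296 * S - 15.8

text = ("Wash your hands often with soap and water. "
        "Avoid touching your face when you are outside.")
print(f"{coleman_liau(text):.1f}")  # ~7.0, about a 7th grade reading level
```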
24
Sobolewski J, Bryan JN, Duval D, O'Kell A, Tate DJ, Webb T, Moore S. Readability of consent forms in veterinary clinical research. J Vet Intern Med 2019; 33:350-355. [PMID: 30793806 PMCID: PMC6430880 DOI: 10.1111/jvim.15462] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2018] [Accepted: 02/06/2019] [Indexed: 12/03/2022] Open
Abstract
Background "Readability" of consent forms is vital to the informed consent process. The average human hospital consent form is written at a 10th grade reading level, whereas the average American adult reads at an 8th grade level. Limited information currently exists regarding the readability of veterinary general medical or clinical research consent forms. Hypothesis/Objectives The goal of this study was to assess the readability of veterinary clinical trial consent forms from a group of veterinary referral centers recently involved in a working group focused on veterinary clinical trial review and consent. We hypothesized that consent forms would not be optimized for client comprehension and would be written above the National Institutes of Health-recommended 6th grade reading level. Animals None. Methods This was a prospective study assessing a convenience sample of veterinary clinical trial consent forms. Readability was assessed using 3 methods: the Flesch-Kincaid (F-K) Grade Level, the Flesch Reading Ease Score (FRES), and the Readability Test Tool (RTT). Results were reported as mean (±SD) and compared across specialties. Results Fifty-three consent forms were evaluated. Mean FRES was 37.5 ± 6.0 (target 60 or higher). Mean F-K Grade Level was 13.0 ± 1.2 and mean RTT grade level was 12.75 ± 1.1 (target 6.0 or lower). There was substantial agreement between the F-K and RTT grade-level scores (intraclass correlation coefficient 0.8). Conclusions and Clinical Importance No form evaluated met current health literacy recommendations for readability. A simple and readily available Microsoft Word-based F-K approach to evaluating grade level was in substantial agreement with the other methods, suggesting that this approach might be sufficient for clinicians and administrators drafting forms for future studies.
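The "substantial agreement" reported here is an intraclass correlation coefficient (ICC) of 0.8 between the F-K and RTT grade levels. A sketch of how such an agreement check might be run with the pingouin package is below; the grade-level values are invented for illustration, and which ICC variant to report depends on the study design.

```python
# pip install pingouin pandas
import pandas as pd
import pingouin as pg

# Invented F-K and RTT grade levels for five consent forms
df = pd.DataFrame({
    "form":   ["A", "B", "C", "D", "E"] * 2,
    "method": ["FK"] * 5 + ["RTT"] * 5,
    "grade":  [13.1, 12.4, 14.0, 11.8, 13.5,   # F-K grade levels
               12.8, 12.6, 13.7, 12.1, 13.2],  # RTT grade levels
})

# Each form is a "target" rated by two "raters" (the two scoring methods)
icc = pg.intraclass_corr(data=df, targets="form", raters="method", ratings="grade")
print(icc[["Type", "ICC"]])
```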
Affiliation(s)
- Josey Sobolewski, Department of Veterinary Clinical Sciences, The Ohio State University, Columbus, Ohio; Department of Biology, Georgetown College, Georgetown, Kentucky
- Jeffrey N Bryan, Department of Veterinary Medicine and Surgery, University of Missouri College of Veterinary Medicine, Columbia, Missouri
- Dawn Duval, Department of Clinical Sciences, Colorado State University College of Veterinary Medicine, Fort Collins, Colorado
- Allison O'Kell, Department of Small Animal Clinical Sciences, University of Florida College of Veterinary Medicine, Gainesville, Florida
- Deborah J Tate, Department of Veterinary Medicine and Surgery, University of Missouri College of Veterinary Medicine, Columbia, Missouri
- Tracy Webb, Department of Clinical Sciences, Colorado State University College of Veterinary Medicine, Fort Collins, Colorado
- Sarah Moore, Department of Veterinary Clinical Sciences, The Ohio State University, Columbus, Ohio
25
Grabeel KL, Tester E. Patient Education: A Change in Review. Journal of Consumer Health on the Internet 2018. [DOI: 10.1080/15398285.2018.1514216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Affiliation(s)
- Kelsey L. Grabeel, Health Information Center/Preston Medical Library, University of Tennessee Medical Center/University of Tennessee Graduate School of Medicine, Knoxville, Tennessee, USA
- Emily Tester, Health Information Center/Preston Medical Library, University of Tennessee Medical Center/University of Tennessee Graduate School of Medicine, Knoxville, Tennessee, USA