1. Zhui L, Yhap N, Liping L, Zhengjie W, Zhonghao X, Xiaoshu Y, Hong C, Xuexiu L, Wei R. Impact of Large Language Models on Medical Education and Teaching Adaptations. JMIR Med Inform 2024; 12:e55933. [PMID: 39087590] [PMCID: PMC11294775] [DOI: 10.2196/55933]
Abstract
Unlabelled This viewpoint article explores the transformative role of large language models (LLMs) in the field of medical education, highlighting their potential to enhance teaching quality, promote personalized learning paths, strengthen clinical skills training, optimize teaching assessment processes, boost the efficiency of medical research, and support continuing medical education. However, the use of LLMs entails certain challenges, such as questions regarding the accuracy of information, the risk of overreliance on technology, a lack of emotional recognition capabilities, and concerns related to ethics, privacy, and data security. This article emphasizes that to maximize the potential of LLMs and overcome these challenges, educators must exhibit leadership in medical education, adjust their teaching strategies flexibly, cultivate students' critical thinking, and emphasize the importance of practical experience, thus ensuring that students can use LLMs correctly and effectively. By adopting such a comprehensive and balanced approach, educators can train health care professionals who are proficient in the use of advanced technologies and who exhibit solid professional ethics and practical skills, thus laying a strong foundation for these professionals to overcome future challenges in the health care sector.
Affiliation(s)
- Li Zhui: Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Nina Yhap: Department of General Surgery, Queen Elizabeth Hospital, St Michael, Barbados
- Liu Liping: Department of Ultrasound, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Wang Zhengjie: Department of Nuclear Medicine, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Xiong Zhonghao: Department of Acupuncture and Moxibustion, Chongqing Traditional Chinese Medicine Hospital, Chongqing, China
- Yuan Xiaoshu: Department of Anesthesia, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Cui Hong: Department of Anesthesia, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Liu Xuexiu: Department of Neonatology, Children's Hospital of Chongqing Medical University, Chongqing, China
- Ren Wei: Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
2. Shang L, Li R, Xue M, Guo Q, Hou Y. Evaluating the application of ChatGPT in China's residency training education: An exploratory study. Medical Teacher 2024:1-7. [PMID: 38994848] [DOI: 10.1080/0142159x.2024.2377808]
Abstract
OBJECTIVE The purpose of this study was to assess the utility of information generated by ChatGPT for residency education in China. METHODS We designed a three-step survey to evaluate the performance of ChatGPT in China's residency training education, covering residency final examination questions, patient cases, and resident satisfaction scores. First, 204 questions from the residency final exam were input into ChatGPT's interface to obtain the percentage of correct answers. Next, ChatGPT was asked to generate 20 clinical cases, which were subsequently evaluated by three instructors using a predesigned 5-point Likert scale. The quality of the cases was assessed based on criteria including clarity, relevance, logicality, credibility, and comprehensiveness. Finally, interaction sessions between 31 third-year residents and ChatGPT were conducted. Residents' perceptions of ChatGPT's feedback were assessed using a Likert scale, focusing on aspects such as ease of use, accuracy and completeness of responses, and its effectiveness in enhancing understanding of medical knowledge. RESULTS Our results showed ChatGPT-3.5 correctly answered 45.1% of exam questions. In the virtual patient cases, ChatGPT received mean ratings of 4.57 ± 0.50, 4.68 ± 0.47, 4.77 ± 0.46, 4.60 ± 0.53, and 3.95 ± 0.59 points for clarity, relevance, logicality, credibility, and comprehensiveness from clinical instructors, respectively. Among training residents, ChatGPT scored 4.48 ± 0.70, 4.00 ± 0.82, and 4.61 ± 0.50 points for ease of use, accuracy and completeness, and usefulness, respectively. CONCLUSION Our findings demonstrate ChatGPT's immense potential for personalized Chinese medical education.
Affiliation(s)
- Luxiang Shang: Department of Cardiology, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, China; Medical Science and Technology Innovation Center, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, China
- Rui Li: Shandong Provincial Center for Disease Control and Prevention, Jinan, China
- Mingyue Xue: Zane Cohen Centre for Digestive Diseases, Mount Sinai Hospital, Toronto, Canada
- Qilong Guo: Department of Cardiology, The Affiliated Hospital of Qingdao University (Pingdu), Qingdao, China
- Yinglong Hou: Department of Cardiology, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, China
3. Miao Y, Luo Y, Zhao Y, Li J, Liu M, Wang H, Chen Y, Wu Y. Performance of GPT-4 on Chinese Nursing Examination: Potentials for AI-Assisted Nursing Education Using Large Language Models. Nurse Educ 2024:00006223-990000000-00488. [PMID: 38981035] [DOI: 10.1097/nne.0000000000001679]
Abstract
BACKGROUND The performance of GPT-4 in nursing examinations within the Chinese context has not yet been thoroughly evaluated. OBJECTIVE To assess the performance of GPT-4 on multiple-choice and open-ended questions derived from nursing examinations in the Chinese context. METHODS The data sets of the Chinese National Nursing Licensure Examination spanning 2021 to 2023 were used to evaluate the accuracy of GPT-4 in multiple-choice questions. The performance of GPT-4 on open-ended questions was examined using 18 case-based questions. RESULTS For multiple-choice questions, GPT-4 achieved an accuracy of 71.0% (511/720). For open-ended questions, the responses were evaluated for cosine similarity, logical consistency, and information quality, all of which were found to be at a moderate level. CONCLUSION GPT-4 performed well at addressing queries on basic knowledge. However, it has notable limitations in answering open-ended questions. Nursing educators should weigh the benefits and challenges of GPT-4 for integration into nursing education.
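The open-ended responses above were scored in part by cosine similarity against reference answers. As an illustration of that metric only (the study does not report its text representation; the bag-of-words tokenization and the sample sentences below are assumptions for demonstration):

```python
from collections import Counter
import math

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Bag-of-words cosine similarity between two answer texts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical reference answer and model response for a case-based question.
reference = "administer oxygen and monitor vital signs closely"
response = "monitor vital signs and administer oxygen"
print(round(cosine_similarity(reference, response), 3))  # high overlap, score near 1
```

Real evaluations typically use sentence embeddings rather than raw token counts, which better credit paraphrases; the scoring pipeline is the same once each answer is a vector.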
Affiliation(s)
- Yiqun Miao: School of Nursing, Capital Medical University, Beijing, China (Drs Miao, Luo, Zhao, Li, Liu, Wang, and Wu); School of Nursing, Johns Hopkins University, Baltimore, USA (Dr Chen)
4. Brondani M, Alves C, Ribeiro C, Braga MM, Garcia RCM, Ardenghi T, Pattanaporn K. Artificial intelligence, ChatGPT, and dental education: Implications for reflective assignments and qualitative research. J Dent Educ 2024. [PMID: 38973069] [DOI: 10.1002/jdd.13663]
Abstract
INTRODUCTION Reflections enable students to gain additional value from a given experience. The use of Chat Generative Pre-training Transformer (ChatGPT, OpenAI Incorporated) has gained momentum, but its impact on dental education is understudied. OBJECTIVES To assess whether university instructors can differentiate reflections generated by ChatGPT from those generated by students, and to assess whether the content of a thematic analysis generated by ChatGPT differs from that generated by qualitative researchers on the same reflections. METHODS Hardcopies of 20 reflections (10 generated by undergraduate dental students and 10 generated by ChatGPT) were distributed to three instructors who had at least 5 years of teaching experience. Instructors were asked to assign either 'ChatGPT' or 'student' to each reflection. Ten of these reflections (five generated by undergraduate dental students and five generated by ChatGPT) were randomly selected and distributed to two qualitative researchers, who were asked to perform a brief thematic analysis with codes and themes. The same ten reflections were also thematically analyzed by ChatGPT. RESULTS The three instructors correctly determined whether the reflections were student or ChatGPT generated 85% of the time. Most disagreements (40%) involved reflections generated by ChatGPT that the instructors believed were generated by students. The thematic analyses did not differ substantially when comparing the codes and themes produced by the two researchers with those generated by ChatGPT. CONCLUSIONS Instructors could differentiate between reflections generated by ChatGPT or by students most of the time. The overall content of a thematic analysis generated by the artificial intelligence program ChatGPT did not differ from that generated by qualitative researchers. Overall, the promising applications of ChatGPT will likely generate a paradigm shift in (dental) health education, research, and practice.
Affiliation(s)
- Mario Brondani: Faculty of Dentistry, Department of Oral Health Sciences, University of British Columbia, Vancouver, Canada
- Claudia Alves: Faculty of Dentistry, Department of Dentistry II, Federal University of Maranhão, Sao Luis-Maranhao, Brazil
- Cecilia Ribeiro: Faculty of Dentistry, Department of Dentistry II, Federal University of Maranhão, Sao Luis-Maranhao, Brazil
- Mariana M Braga: Faculty of Dentistry, Department of Pediatric Dentistry, University of São Paulo, Sao Paulo, Brazil
- Renata C Mathes Garcia: Faculty of Dentistry, Prosthodontic and Periodontic Department, University of Campinas, Sao Paulo, Brazil
- Thiago Ardenghi: Faculty of Dentistry, Department of Pediatric Dentistry and Epidemiology, School of Dentistry, Federal University of Santa Maria, Santa Maria, Brazil
5. Balasanjeevi G, Surapaneni KM. Comparison of ChatGPT version 3.5 & 4 for utility in respiratory medicine education using clinical case scenarios. Respir Med Res 2024; 85:101091. [PMID: 38657295] [DOI: 10.1016/j.resmer.2024.101091]
Abstract
Integration of ChatGPT in respiratory medicine presents a promising avenue for enhancing clinical practice and pedagogical approaches. This study compares the performance of ChatGPT versions 3.5 and 4 in respiratory medicine, emphasizing their potential in clinical decision support and medical education using clinical cases. Results indicate moderate performance, highlighting limitations in handling complex case scenarios. Compared to ChatGPT 3.5, version 4 showed greater promise as a pedagogical tool, providing interactive learning experiences. While ChatGPT may serve as a preliminary clinical decision support tool, caution is advised, stressing the need for ongoing validation. Future research should refine its clinical capabilities for optimal integration into medical education and practice.
Affiliation(s)
- Gayathri Balasanjeevi: Department of Tuberculosis & Respiratory Diseases, Panimalar Medical College Hospital & Research Institute, Varadharajapuram, Poonamallee, Chennai 600 123, Tamil Nadu, India
- Krishna Mohan Surapaneni: Department of Biochemistry, Panimalar Medical College Hospital & Research Institute, Varadharajapuram, Poonamallee, Chennai 600 123, Tamil Nadu, India; Department of Medical Education, Panimalar Medical College Hospital & Research Institute, Varadharajapuram, Poonamallee, Chennai 600 123, Tamil Nadu, India
6. Temsah MH, Alhuzaimi AN, Almansour M, Aljamaan F, Alhasan K, Batarfi MA, Altamimi I, Alharbi A, Alsuhaibani AA, Alwakeel L, Alzahrani AA, Alsulaim KB, Jamal A, Khayat A, Alghamdi MH, Halwani R, Khan MK, Al-Eyadhy A, Nazer R. Art or Artifact: Evaluating the Accuracy, Appeal, and Educational Value of AI-Generated Imagery in DALL·E 3 for Illustrating Congenital Heart Diseases. J Med Syst 2024; 48:54. [PMID: 38780839] [DOI: 10.1007/s10916-024-02072-0]
Abstract
Artificial Intelligence (AI), particularly AI-Generated Imagery, has the potential to impact medical and patient education. This research explores the use of AI-generated imagery, from text-to-images, in medical education, focusing on congenital heart diseases (CHD). Utilizing ChatGPT's DALL·E 3, the research aims to assess the accuracy and educational value of AI-created images for 20 common CHDs. In this study, we utilized DALL·E 3 to generate a comprehensive set of 110 images, comprising ten images depicting the normal human heart and five images for each of the 20 common CHDs. The generated images were evaluated by a diverse group of 33 healthcare professionals. This cohort included cardiology experts, pediatricians, non-pediatric faculty members, trainees (medical students, interns, pediatric residents), and pediatric nurses. Utilizing a structured framework, these professionals assessed each image for anatomical accuracy, the usefulness of in-picture text, its appeal to medical professionals, and the image's potential applicability in medical presentations. Each item was assessed on a 3-point Likert scale. This produced a total of 3630 image assessments. Most AI-generated cardiac images were rated poorly: 80.8% of images were rated as anatomically incorrect or fabricated, 85.2% as having incorrect text labels, and 78.1% as not usable for medical education. The nurses and medical interns were found to have a more positive perception of the AI-generated cardiac images than the faculty members, pediatricians, and cardiology experts. Complex congenital anomalies were significantly more prone to anatomical fabrication than simple cardiac anomalies. Significant challenges in image generation were identified.
Based on our findings, we recommend a vigilant approach towards the use of AI-generated imagery in medical education at present, underscoring the imperative for thorough validation and the importance of collaboration across disciplines. While we advise against its immediate integration until further validations are conducted, the study advocates for future AI-models to be fine-tuned with accurate medical data, enhancing their reliability and educational utility.
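The assessment totals above follow directly from the design: 33 evaluators each rating 110 images yields 3630 assessments, and the reported percentages can be converted back to approximate raw counts. A quick sanity check (the raw counts below are back-calculated assumptions chosen to match the reported rates, not figures from the study):

```python
# Total assessments: every evaluator rates every image once per criterion.
raters, images = 33, 110
total_assessments = raters * images
print(total_assessments)  # 3630

def pct(count: int, total: int) -> float:
    """Percentage of assessments, rounded to one decimal as in the abstract."""
    return round(100 * count / total, 1)

# Hypothetical raw counts consistent with the reported 80.8% / 85.2% / 78.1%.
print(pct(2933, total_assessments))  # anatomically incorrect or fabricated
print(pct(3093, total_assessments))  # incorrect text labels
print(pct(2835, total_assessments))  # not usable for medical education
```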
Affiliation(s)
- Mohamad-Hani Temsah: College of Medicine, King Saud University, Riyadh, Saudi Arabia; Pediatric Department, King Saud University Medical City, King Saud University, Riyadh, Saudi Arabia; Evidence-Based Health Care & Knowledge Translation Research Chair, Family & Community Medicine Department, College of Medicine, King Saud University, 11362, Riyadh, Saudi Arabia
- Abdullah N Alhuzaimi: College of Medicine, King Saud University, Riyadh, Saudi Arabia; Division of Pediatric Cardiology, Cardiac Science Department, College of Medicine, King Saud University Medical City, 11362, Riyadh, Saudi Arabia
- Mohammed Almansour: College of Medicine, King Saud University, Riyadh, Saudi Arabia; Department of Medical Education, College of Medicine, King Saud University, Riyadh, Saudi Arabia
- Fadi Aljamaan: College of Medicine, King Saud University, Riyadh, Saudi Arabia; Critical Care Department, King Saud University Medical City, Riyadh, Saudi Arabia
- Khalid Alhasan: College of Medicine, King Saud University, Riyadh, Saudi Arabia; Pediatric Department, King Saud University Medical City, King Saud University, Riyadh, Saudi Arabia; Kidney & Pancreas Health Center, Organ Transplant Center of Excellence, King Faisal Specialist Hospital & Research Center, Riyadh, Saudi Arabia
- Munirah A Batarfi: Basic Medical Sciences, College of Medicine, King Saud bin Abdulaziz University for Health Sciences, King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
- Amani Alharbi: Pediatric Department, King Saud University Medical City, King Saud University, Riyadh, Saudi Arabia
- Leena Alwakeel: Pediatric Department, King Saud University Medical City, King Saud University, Riyadh, Saudi Arabia
- Amr Jamal: College of Medicine, King Saud University, Riyadh, Saudi Arabia; Evidence-Based Health Care & Knowledge Translation Research Chair, Family & Community Medicine Department, College of Medicine, King Saud University, 11362, Riyadh, Saudi Arabia; Department of Family and Community Medicine, King Saud University Medical City, 11362, Riyadh, Saudi Arabia
- Afnan Khayat: Health Information Management Department, Prince Sultan Military College of Health Sciences, Al Dhahran, Saudi Arabia
- Mohammed Hussien Alghamdi: College of Medicine, King Saud University, Riyadh, Saudi Arabia; Division of Pediatric Cardiology, Cardiac Science Department, College of Medicine, King Saud University Medical City, 11362, Riyadh, Saudi Arabia; Department of Medical Education, College of Medicine, King Saud University, Riyadh, Saudi Arabia
- Rabih Halwani: Department of Clinical Sciences, College of Medicine, University of Sharjah, 27272, Sharjah, United Arab Emirates; Research Institute for Medical and Health Sciences, University of Sharjah, 27272, Sharjah, United Arab Emirates
- Muhammad Khurram Khan: Center of Excellence in Information Assurance, King Saud University, 11653, Riyadh, Saudi Arabia
- Ayman Al-Eyadhy: College of Medicine, King Saud University, Riyadh, Saudi Arabia; Pediatric Department, King Saud University Medical City, King Saud University, Riyadh, Saudi Arabia
- Rakan Nazer: College of Medicine, King Saud University, Riyadh, Saudi Arabia; Department of Cardiac Science, King Fahad Cardiac Center, College of Medicine, King Saud University, Riyadh, Saudi Arabia
7. Moulin TC. Learning with AI Language Models: Guidelines for the Development and Scoring of Medical Questions for Higher Education. J Med Syst 2024; 48:45. [PMID: 38652327] [DOI: 10.1007/s10916-024-02069-9]
Abstract
In medical and biomedical education, traditional teaching methods often struggle to engage students and promote critical thinking. The use of AI language models has the potential to transform teaching and learning practices by offering an innovative, active learning approach that promotes intellectual curiosity and deeper understanding. To effectively integrate AI language models into biomedical education, it is essential for educators to understand the benefits and limitations of these tools and how they can be employed to achieve high-level learning outcomes. This article explores the use of AI language models in biomedical education, focusing on their application in both classroom teaching and learning assignments. Using the SOLO taxonomy as a framework, I discuss strategies for designing questions that challenge students to exercise critical thinking and problem-solving skills, even when assisted by AI models. Additionally, I propose a scoring rubric for evaluating student performance when collaborating with AI language models, ensuring a comprehensive assessment of their learning outcomes. AI language models offer a promising opportunity for enhancing student engagement and promoting active learning in the biomedical field. Understanding the potential use of these technologies allows educators to create learning experiences that fit their students' needs, encouraging intellectual curiosity and a deeper understanding of complex subjects. The application of these tools will be fundamental to providing more effective and engaging learning experiences for students in the future.
Affiliation(s)
- Thiago C Moulin: Department of Experimental Medical Science, Lund University, Lund, Sweden; Department of Surgical Sciences, Uppsala University, Uppsala, Sweden
8. Quttainah M, Mishra V, Madakam S, Lurie Y, Mark S. Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability Framework for Safe and Effective Large Language Models in Medical Education: Narrative Review and Qualitative Study. JMIR AI 2024; 3:e51834. [PMID: 38875562] [PMCID: PMC11077408] [DOI: 10.2196/51834]
Abstract
BACKGROUND The world has witnessed increased adoption of large language models (LLMs) in the last year. Although the products developed using LLMs have the potential to solve accessibility and efficiency problems in health care, there is a lack of available guidelines for developing LLMs for health care, especially for medical education. OBJECTIVE The aim of this study was to identify and prioritize the enablers for developing successful LLMs for medical education. We further evaluated the relationships among these identified enablers. METHODS A narrative review of the extant literature was first performed to identify the key enablers for LLM development. We additionally gathered the opinions of LLM users to determine the relative importance of these enablers using an analytical hierarchy process (AHP), which is a multicriteria decision-making method. Further, total interpretive structural modeling (TISM) was used to analyze the perspectives of product developers and ascertain the relationships and hierarchy among these enablers. Finally, the cross-impact matrix-based multiplication applied to a classification (MICMAC) approach was used to determine the relative driving and dependence powers of these enablers. A nonprobabilistic purposive sampling approach was used for recruitment of focus groups. RESULTS The AHP demonstrated that the most important enabler for LLMs was credibility, with a priority weight of 0.37, followed by accountability (0.27642) and fairness (0.10572). In contrast, usability, with a priority weight of 0.04, showed negligible importance. The results of TISM concurred with the findings of the AHP. The only striking difference between expert perspectives and user preference evaluation was that the product developers indicated that cost has the least importance as a potential enabler. The MICMAC analysis suggested that cost has a strong influence on other enablers. 
The inputs of the focus group were found to be reliable, with a consistency ratio less than 0.1 (0.084). CONCLUSIONS This study is the first to identify, prioritize, and analyze the relationships of enablers of effective LLMs for medical education. Based on the results of this study, we developed a comprehensible prescriptive framework, named CUC-FATE (Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability), for evaluating the enablers of LLMs in medical education. The study findings are useful for health care professionals, health technology experts, medical technology regulators, and policy makers.
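The AHP priority weights and the consistency ratio reported above come from a pairwise comparison matrix. A minimal sketch of that computation, using the common geometric-mean approximation of the principal eigenvector (the 3x3 matrix below is hypothetical, not the study's data, and only covers three of the seven CUC-FATE enablers):

```python
import math

# Saaty's Random Index values for matrix sizes 1..7.
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}

def ahp(matrix):
    """Return (priority weights, consistency ratio) for a reciprocal pairwise matrix."""
    n = len(matrix)
    # Geometric mean of each row approximates the principal eigenvector.
    gmeans = [math.prod(row) ** (1 / n) for row in matrix]
    total = sum(gmeans)
    weights = [g / total for g in gmeans]
    # Estimate lambda_max from A·w, then CI = (lambda_max - n) / (n - 1), CR = CI / RI.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lam = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lam - n) / (n - 1)
    return weights, ci / RI[n]

# Hypothetical judgments: credibility vs accountability vs usability.
A = [[1, 2, 5],
     [1 / 2, 1, 4],
     [1 / 5, 1 / 4, 1]]
w, cr = ahp(A)
print([round(x, 3) for x in w], round(cr, 3))
```

A consistency ratio below 0.1, as in the study's 0.084, indicates the pairwise judgments are coherent enough to trust the derived weights.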
Affiliation(s)
- Majdi Quttainah: College of Business Administration, Kuwait University, Kuwait, Kuwait
- Vinaytosh Mishra: College of Healthcare Management and Economics, Gulf Medical University, Ajman, United Arab Emirates
- Somayya Madakam: Information Technology, Birla Institute of Management Technology, Knowledge Park - II, Greater Noida, India
- Yotam Lurie: Department of Management, Ben-Gurion University, Negev, Israel
- Shlomo Mark: Department of Software Engineering, Shamoon College of Engineering, Ashdod, Israel
9. Shorey S, Mattar C, Pereira TLB, Choolani M. A scoping review of ChatGPT's role in healthcare education and research. Nurse Education Today 2024; 135:106121. [PMID: 38340639] [DOI: 10.1016/j.nedt.2024.106121]
Abstract
OBJECTIVES To examine and consolidate literature regarding the advantages and disadvantages of utilizing ChatGPT in healthcare education and research. DESIGN/METHODS We searched seven electronic databases (PubMed/Medline, CINAHL, Embase, PsycINFO, Scopus, ProQuest Dissertations and Theses Global, and Web of Science) from November 2022 until September 2023. This scoping review adhered to Arksey and O'Malley's framework and followed reporting guidelines outlined in the PRISMA-ScR checklist. For analysis, we employed Thomas and Harden's thematic synthesis framework. RESULTS A total of 100 studies were included. An overarching theme, "Forging the Future: Bridging Theory and Integration of ChatGPT" emerged, accompanied by two main themes (1) Enhancing Healthcare Education, Research, and Writing with ChatGPT, (2) Controversies and Concerns about ChatGPT in Healthcare Education Research and Writing, and seven subthemes. CONCLUSIONS Our review underscores the importance of acknowledging legitimate concerns related to the potential misuse of ChatGPT such as 'ChatGPT hallucinations', its limited understanding of specialized healthcare knowledge, its impact on teaching methods and assessments, confidentiality and security risks, and the controversial practice of crediting it as a co-author on scientific papers, among other considerations. Furthermore, our review also recognizes the urgency of establishing timely guidelines and regulations, along with the active engagement of relevant stakeholders, to ensure the responsible and safe implementation of ChatGPT's capabilities. We advocate for the use of cross-verification techniques to enhance the precision and reliability of generated content, the adaptation of higher education curricula to incorporate ChatGPT's potential, educators' need to familiarize themselves with the technology to improve their literacy and teaching approaches, and the development of innovative methods to detect ChatGPT usage. 
Furthermore, data protection measures should be prioritized when employing ChatGPT, and transparent reporting becomes crucial when integrating ChatGPT into academic writing.
Affiliation(s)
- Shefaly Shorey: Alice Lee Centre for Nursing Studies, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Citra Mattar: Division of Maternal Fetal Medicine, Department of Obstetrics and Gynaecology, National University Health Systems, Singapore; Department of Obstetrics and Gynaecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Travis Lanz-Brian Pereira: Alice Lee Centre for Nursing Studies, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Mahesh Choolani: Division of Maternal Fetal Medicine, Department of Obstetrics and Gynaecology, National University Health Systems, Singapore; Department of Obstetrics and Gynaecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
10. Gordon M, Daniel M, Ajiboye A, Uraiby H, Xu NY, Bartlett R, Hanson J, Haas M, Spadafore M, Grafton-Clarke C, Gasiea RY, Michie C, Corral J, Kwan B, Dolmans D, Thammasitboon S. A scoping review of artificial intelligence in medical education: BEME Guide No. 84. Medical Teacher 2024; 46:446-470. [PMID: 38423127] [DOI: 10.1080/0142159x.2024.2314198]
Abstract
BACKGROUND Artificial Intelligence (AI) is rapidly transforming healthcare, and there is a critical need for a nuanced understanding of how AI is reshaping teaching, learning, and educational practice in medical education. This review aimed to map the literature regarding AI applications in medical education, core areas of findings, potential candidates for formal systematic review, and gaps for future research. METHODS This rapid scoping review, conducted over 16 weeks, employed Arksey and O'Malley's framework and adhered to STORIES and BEME guidelines. A systematic and comprehensive search across PubMed/MEDLINE, EMBASE, and MedEdPublish was conducted without date or language restrictions. Publications included in the review spanned undergraduate, graduate, and continuing medical education, encompassing both original studies and perspective pieces. Data were charted by multiple author pairs and synthesized into various thematic maps and charts, ensuring a broad and detailed representation of the current landscape. RESULTS The review synthesized 278 publications, with a majority (68%) from North American and European regions. The studies covered diverse AI applications in medical education, such as AI for admissions, teaching, assessment, and clinical reasoning. The review highlighted AI's varied roles, from augmenting traditional educational methods to introducing innovative practices, and underscores the urgent need for ethical guidelines in AI's application in medical education. CONCLUSION The current literature has been charted. The findings underscore the need for ongoing research to explore uncharted areas and address potential risks associated with AI use in medical education. This work serves as a foundational resource for educators, policymakers, and researchers in navigating AI's evolving role in medical education. A framework to support future high-utility reporting, the FACETS framework, is proposed.
Affiliation(s)
- Morris Gordon: School of Medicine and Dentistry, University of Central Lancashire, Preston, UK; Blackpool Hospitals NHS Foundation Trust, Blackpool, UK
- Michelle Daniel: School of Medicine, University of California, San Diego, San Diego, CA, USA
- Aderonke Ajiboye: School of Medicine and Dentistry, University of Central Lancashire, Preston, UK
- Hussein Uraiby: Department of Cellular Pathology, University Hospitals of Leicester NHS Trust, Leicester, UK
- Nicole Y Xu: School of Medicine, University of California, San Diego, San Diego, CA, USA
- Rangana Bartlett: Department of Cognitive Science, University of California, San Diego, CA, USA
- Janice Hanson: Department of Medicine and Office of Education, School of Medicine, Washington University in Saint Louis, Saint Louis, MO, USA
- Mary Haas: Department of Emergency Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
- Maxwell Spadafore: Department of Emergency Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
- Colin Michie: School of Medicine and Dentistry, University of Central Lancashire, Preston, UK
- Janet Corral: Department of Medicine, University of Nevada Reno, School of Medicine, Reno, NV, USA
- Brian Kwan: School of Medicine, University of California, San Diego, San Diego, CA, USA
- Diana Dolmans: School of Health Professions Education, Faculty of Health, Maastricht University, Maastricht, The Netherlands
- Satid Thammasitboon: Center for Research, Innovation and Scholarship in Health Professions Education, Baylor College of Medicine, Houston, TX, USA
11
Yu J, Matava C. ChatGPT for Parents of Children Seeking Emergency Care - so much Hope, so much Caution. J Med Syst 2024; 48:17. [PMID: 38305947 DOI: 10.1007/s10916-024-02036-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Accepted: 01/17/2024] [Indexed: 02/03/2024]
Affiliation(s)
- Julie Yu
- Department of Anesthesia and Pain Medicine, The Hospital for Sick Children, 555 University Avenue, Toronto, ON, Canada
- Department of Anesthesiology and Pain Medicine, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Clyde Matava
- Department of Anesthesia and Pain Medicine, The Hospital for Sick Children, 555 University Avenue, Toronto, ON, Canada
- Department of Anesthesiology and Pain Medicine, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
12
Ting DSJ, Tan TF, Ting DSW. ChatGPT in ophthalmology: the dawn of a new era? Eye (Lond) 2024; 38:4-7. [PMID: 37369764 PMCID: PMC10764795 DOI: 10.1038/s41433-023-02619-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2023] [Revised: 05/22/2023] [Accepted: 06/02/2023] [Indexed: 06/29/2023] Open
Affiliation(s)
- Darren Shu Jeng Ting
- Birmingham and Midland Eye Centre, Birmingham, UK
- Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
- Academic Ophthalmology, School of Medicine, University of Nottingham, Nottingham, UK
- Ting Fang Tan
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
- Daniel Shu Wei Ting
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
- Department of Ophthalmology and Visual Sciences, Duke-National University of Singapore Medical School, Singapore, Singapore
13
Huang CH, Hsiao HJ, Yeh PC, Wu KC, Kao CH. Performance of ChatGPT on Stage 1 of the Taiwanese medical licensing exam. Digit Health 2024; 10:20552076241233144. [PMID: 38371244 PMCID: PMC10874144 DOI: 10.1177/20552076241233144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2023] [Accepted: 01/25/2024] [Indexed: 02/20/2024] Open
Abstract
Introduction Since its release by OpenAI in November 2022, numerous studies have subjected ChatGPT to various tests to evaluate its performance on medical exams. The objective of this study was to evaluate ChatGPT's accuracy and logical reasoning across all 10 subjects featured in Stage 1 of the Senior Professional and Technical Examinations for Medical Doctors (SPTEMD) in Taiwan, with questions presented in both Chinese and English. Methods In this study, we tasked ChatGPT-4 with completing SPTEMD Stage 1. The model was presented with multiple-choice questions extracted from three separate tests conducted in February 2022, July 2022, and February 2023. These questions encompass 10 subjects: biochemistry and molecular biology, anatomy, embryology and developmental biology, histology, physiology, microbiology and immunology, parasitology, pharmacology, pathology, and public health. We then analyzed the model's accuracy for each subject. Results In all three tests, ChatGPT achieved scores surpassing the 60% passing threshold, with an overall average score of 87.8%. Its best performance was in biochemistry, where it garnered an average score of 93.8%. Conversely, the generative pre-trained transformer (GPT)-4 assistant performed less well on anatomy, parasitology, and embryology, and its scores were highly variable in embryology and parasitology. Conclusion ChatGPT has the potential not only to facilitate exam preparation but also to improve the accessibility of medical education and support continuing education for medical professionals. This study demonstrates ChatGPT's competence across the subjects of SPTEMD Stage 1 and suggests that it could be a helpful learning and exam-preparation tool for medical students and professionals.
Affiliation(s)
- Han-Jung Hsiao
- Artificial Intelligence Center, China Medical University Hospital, China Medical University, Taichung
- Pei-Chun Yeh
- Artificial Intelligence Center, China Medical University Hospital, China Medical University, Taichung
- Kuo-Chen Wu
- Artificial Intelligence Center, China Medical University Hospital, China Medical University, Taichung
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei
- Chia-Hung Kao
- Artificial Intelligence Center, China Medical University Hospital, China Medical University, Taichung
- Graduate Institute of Biomedical Sciences, School of Medicine, College of Medicine, China Medical University, Taichung
- Department of Nuclear Medicine and PET Center, China Medical University Hospital, Taichung
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung
14
Zhang JS, Yoon C, Williams DKA, Pinkas A. Exploring the Usage of ChatGPT Among Medical Students in the United States. J Med Educ Curric Dev 2024; 11:23821205241264695. [PMID: 39092290 PMCID: PMC11292693 DOI: 10.1177/23821205241264695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/07/2024] [Accepted: 06/08/2024] [Indexed: 08/04/2024]
Abstract
OBJECTIVES Chat Generative Pretrained Transformer (ChatGPT) is a large language model developed by OpenAI that has gained widespread interest. It has been cited for its potential impact on health care and its beneficial role in medical education. However, there is limited investigation into its use among medical students. In this study, we evaluated the frequency of ChatGPT use, motivations for use, and preference for ChatGPT over existing resources among medical students in the United States. METHODS Data were collected from an original survey of 14 questions assessing the frequency and context of ChatGPT use in medical education. The survey was distributed via email lists, group messaging applications, and classroom lectures to medical students across the United States. Responses were collected between August and October 2023. RESULTS One hundred thirty-one participants completed the survey and were included in the analysis. Overall, 48.9% of respondents reported having used ChatGPT in their medical studies. Among ChatGPT users, 43.7% reported using ChatGPT weekly, several times per week, or daily. ChatGPT was most used for writing, revising, editing, and summarizing; 37.5% and 41.3% of respondents, respectively, reported using ChatGPT for more than 25% of the time spent on these tasks. Among respondents who had not used ChatGPT, more than half reported being extremely unlikely or unlikely to use it across all surveyed scenarios. ChatGPT users reported being more likely to use ChatGPT than to consult professors or attendings directly (45.3%), textbooks (42.2%), or lectures (31.7%), and least likely to prefer it over the popular flashcard application Anki (11.1%) and medical education videos (9.5%). CONCLUSIONS ChatGPT is an increasingly popular resource among medical students, with many preferring it over traditional resources such as professors, textbooks, and lectures. Its impact on medical education will only continue to grow as its capabilities improve.
Affiliation(s)
- Christine Yoon
- Albert Einstein College of Medicine, Bronx, New York, USA
- Adi Pinkas
- Albert Einstein College of Medicine, Bronx, New York, USA
15
Madrid-García A, Rosales-Rosado Z, Freites-Nuñez D, Pérez-Sancristóbal I, Pato-Cour E, Plasencia-Rodríguez C, Cabeza-Osorio L, Abasolo-Alcázar L, León-Mateos L, Fernández-Gutiérrez B, Rodríguez-Rodríguez L. Harnessing ChatGPT and GPT-4 for evaluating the rheumatology questions of the Spanish access exam to specialized medical training. Sci Rep 2023; 13:22129. [PMID: 38092821 PMCID: PMC10719375 DOI: 10.1038/s41598-023-49483-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2023] [Accepted: 12/08/2023] [Indexed: 12/17/2023] Open
Abstract
The emergence of large language models (LLMs) with remarkable performance, such as ChatGPT and GPT-4, has led to unprecedented uptake in the population. One of their most promising and studied applications is education, owing to their ability to understand and generate human-like text, which creates a multitude of opportunities for enhancing educational practices and outcomes. The objective of this study was twofold: to assess the accuracy of ChatGPT/GPT-4 in answering rheumatology questions from the access exam to specialized medical training in Spain (MIR), and to evaluate the medical reasoning these LLMs followed to answer those questions. A dataset, RheumaMIR, of 145 rheumatology-related questions extracted from the exams held between 2010 and 2023 was created for this purpose, used to prompt the LLMs, and publicly released. Six rheumatologists with clinical and teaching experience evaluated the clinical reasoning of the chatbots using a 5-point Likert scale, and their degree of agreement was analyzed. The association between variables that could influence the models' accuracy (i.e., year of the exam question, disease addressed, type of question, and genre) was studied. ChatGPT demonstrated a high level of performance in both accuracy, 66.43%, and clinical reasoning, median (Q1-Q3) 4.5 (2.33-4.67). However, GPT-4 performed better, with an accuracy of 93.71% and a median clinical reasoning score of 4.67 (4.5-4.83). These findings suggest that LLMs may serve as valuable tools in rheumatology education, aiding in exam preparation and supplementing traditional teaching methods.
Affiliation(s)
- Alfredo Madrid-García
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Zulema Rosales-Rosado
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Dalifer Freites-Nuñez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Inés Pérez-Sancristóbal
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Esperanza Pato-Cour
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Luis Cabeza-Osorio
- Medicina Interna, Hospital Universitario del Henares, Avenida de Marie Curie, 0, 28822, Madrid, Spain
- Facultad de Medicina, Universidad Francisco de Vitoria, Carretera Pozuelo, Km 1800, 28223, Madrid, Spain
- Lydia Abasolo-Alcázar
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Leticia León-Mateos
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Benjamín Fernández-Gutiérrez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Facultad de Medicina, Universidad Complutense de Madrid, Madrid, Spain
- Luis Rodríguez-Rodríguez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
16
Agarwal M, Goswami A, Sharma P. Evaluating ChatGPT-3.5 and Claude-2 in Answering and Explaining Conceptual Medical Physiology Multiple-Choice Questions. Cureus 2023; 15:e46222. [PMID: 37908959 PMCID: PMC10613833 DOI: 10.7759/cureus.46222] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/29/2023] [Indexed: 11/02/2023] Open
Abstract
Background Generative artificial intelligence (AI) systems such as ChatGPT-3.5 and Claude-2 may assist in explaining complex medical science topics. A few studies have shown that AI can solve complicated physiology problems that require critical thinking and analysis. However, further studies are required to validate the effectiveness of AI in answering conceptual multiple-choice questions (MCQs) in human physiology. Objective This study aimed to evaluate and compare the proficiency of ChatGPT-3.5 and Claude-2 in answering and explaining a curated set of MCQs in medical physiology. Methods In this cross-sectional study, a set of 55 MCQs drawn from 10 competencies of medical physiology was purposefully constructed to require comprehension, problem-solving, and analytical skills. The MCQs and a structured prompt for response generation were presented to ChatGPT-3.5 and Claude-2. The explanations provided by both AI systems were documented in an Excel spreadsheet. All three authors rated these explanations on a scale of 0 to 3: 0 for an incorrect explanation, 1 for a partially correct one, 2 for a correct explanation with some aspects missing, and 3 for a perfectly correct explanation. Both AI models were evaluated on their ability to choose the correct answer and to provide clear and comprehensive explanations of the MCQs. The Mann-Whitney U test was used to compare AI responses, and the Fleiss multi-rater kappa (κ) was used to determine score agreement among the three raters. The statistical significance level was set at P ≤ 0.05. Results Claude-2 answered 40 MCQs correctly, significantly more than the 26 correct responses from ChatGPT-3.5. The rating distribution for the explanations generated by Claude-2 was also significantly higher than that of ChatGPT-3.5. The κ values were 0.804 and 0.818 for Claude-2 and ChatGPT-3.5, respectively. Conclusion In answering and elucidating conceptual MCQs in medical physiology, Claude-2 surpassed ChatGPT-3.5. However, accessing Claude-2 from India requires the use of a virtual private network, which may raise security concerns.
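The study above has three raters score each explanation on a 0-3 scale and reports inter-rater agreement with the Fleiss multi-rater kappa. As a minimal sketch (not the authors' own code), Fleiss' kappa can be computed from a subjects-by-categories count matrix, where `counts[i][j]` is how many raters assigned item i to category j:

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects-by-categories count matrix.

    counts[i][j] = number of raters who put subject i into category j;
    every row must sum to the same number of raters n (here, 3).
    """
    N = len(counts)            # number of rated items
    n = sum(counts[0])         # raters per item
    k = len(counts[0])         # number of categories

    # Per-item observed agreement P_i, then its mean P_bar
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P_i) / N

    # Chance agreement P_e from the marginal category proportions
    p_j = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    P_e = sum(p * p for p in p_j)

    return (P_bar - P_e) / (1 - P_e)


# Hypothetical example: 3 explanations, 3 raters, rating categories 0-3.
# All three raters agree on every item, so kappa is 1.0.
ratings = [[0, 0, 3, 0],
           [0, 0, 0, 3],
           [3, 0, 0, 0]]
print(fleiss_kappa(ratings))  # 1.0 (perfect agreement)
```

The input data here are illustrative, not the study's ratings; with real data the function would be applied to the 55-row count matrix built from the three authors' scores.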
Affiliation(s)
- Mayank Agarwal
- Physiology, All India Institute of Medical Sciences, Raebareli, IND
- Ayan Goswami
- Physiology, Santiniketan Medical College, Bolpur, IND
- Priyanka Sharma
- Physiology, School of Medical Sciences & Research, Sharda University, Greater Noida, IND