1. Triplett S, Ness-Engle GL, Behnen EM. A comparison of drug information question responses by a drug information center and by ChatGPT. Am J Health Syst Pharm 2025;82:448-460. PMID: 39450858. DOI: 10.1093/ajhp/zxae316.
Abstract
PURPOSE A study was conducted to assess the accuracy and ability of Chat Generative Pre-trained Transformer (ChatGPT) to respond systematically to drug information inquiries relative to the responses of a drug information center (DIC). METHODS Ten drug information questions answered by the DIC in 2022 or 2023 were selected for analysis. Three pharmacists created new ChatGPT accounts and submitted each question to ChatGPT at the same time. Each question was submitted twice to assess consistency in responses. Two days later, the same process was conducted by a fourth pharmacist. Phase 1 of data analysis consisted of a drug information pharmacist assessing all 84 ChatGPT responses for accuracy relative to the DIC responses. In phase 2, 10 ChatGPT responses were selected to be assessed by 3 blinded reviewers, who used a predetermined 8-question rubric to evaluate the ChatGPT and DIC responses. RESULTS When comparing the ChatGPT responses (n = 84) to the DIC responses, ChatGPT had an overall accuracy rate of 50%, with accuracy varying across question types. In overall blinded scoring, ChatGPT responses scored higher on the rubric than the DIC responses (overall scores of 67.5% and 55.0%, respectively). The DIC responses scored higher in the categories of references mentioned and references identified. CONCLUSION Responses generated by ChatGPT were better than those created by the DIC in clarity and readability; however, the accuracy of ChatGPT responses was lacking. ChatGPT responses to drug information questions would need to be carefully reviewed for accuracy and completeness.
Affiliation(s)
- Samantha Triplett: Belmont University College of Pharmacy and Health Sciences and HealthTrust, Nashville, TN, USA
- Genevieve Lynn Ness-Engle: Christy Houston Foundation Drug Information Center, Nashville, TN, and Belmont University College of Pharmacy and Health Sciences, Nashville, TN, USA
- Erin M Behnen: Belmont University College of Pharmacy and Health Sciences, Nashville, TN, USA
2. Heisinger S, Salzmann SN, Senker W, Aspalter S, Oberndorfer J, Matzner MP, Stienen MN, Motov S, Huber D, Grohs JG. ChatGPT's Performance in Spinal Metastasis Cases-Can We Discuss Our Complex Cases with ChatGPT? J Clin Med 2024;13:7864. PMID: 39768787. PMCID: PMC11727723. DOI: 10.3390/jcm13247864.
Abstract
Background: The integration of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT-4, is transforming healthcare. ChatGPT's potential to assist in decision-making for complex cases, such as spinal metastasis treatment, is promising but largely untested. Especially in cancer patients who develop spinal metastases, precise and personalized treatment is essential. This study examines ChatGPT-4's performance in treatment planning for spinal metastasis cases compared with that of experienced spine surgeons. Materials and Methods: Five spine metastasis cases were randomly selected from recent literature. Five spine surgeons and ChatGPT-4 were then asked to provide treatment recommendations for each case in a standardized manner. Responses were analyzed for frequency distribution, agreement, and subjective rater opinions. Results: ChatGPT's treatment recommendations aligned with the majority of human raters in 73% of treatment choices, with moderate to substantial agreement on systemic therapy, pain management, and supportive care. However, ChatGPT's recommendations tended toward generalized statements, which the raters noted. Agreement among raters improved in sensitivity analyses excluding ChatGPT, particularly for controversial areas like surgical intervention and palliative care. Conclusions: ChatGPT shows potential in aligning with experienced surgeons on certain treatment aspects of spinal metastasis. However, its generalized approach highlights limitations, suggesting that training with specific clinical guidelines could enhance its utility in complex case management. Further studies are necessary to refine AI applications in personalized healthcare decision-making.
Affiliation(s)
- Stephan Heisinger: Department of Orthopedics and Trauma Surgery, Medical University of Vienna, 1090 Vienna, Austria
- Stephan N. Salzmann: Department of Orthopedics and Trauma Surgery, Medical University of Vienna, 1090 Vienna, Austria
- Wolfgang Senker: Department of Neurosurgery, Kepler University Hospital, 4020 Linz, Austria
- Stefan Aspalter: Department of Neurosurgery, Kepler University Hospital, 4020 Linz, Austria
- Johannes Oberndorfer: Department of Neurosurgery, Kepler University Hospital, 4020 Linz, Austria
- Michael P. Matzner: Department of Orthopedics and Trauma Surgery, Medical University of Vienna, 1090 Vienna, Austria
- Martin N. Stienen: Spine Center of Eastern Switzerland & Department of Neurosurgery, Kantonsspital St. Gallen, Medical School of St. Gallen, University of St. Gallen, 9000 St. Gallen, Switzerland
- Stefan Motov: Spine Center of Eastern Switzerland & Department of Neurosurgery, Kantonsspital St. Gallen, Medical School of St. Gallen, University of St. Gallen, 9000 St. Gallen, Switzerland
- Dominikus Huber: Division of Oncology, Department of Medicine I, Medical University of Vienna, 1090 Vienna, Austria
- Josef Georg Grohs: Department of Orthopedics and Trauma Surgery, Medical University of Vienna, 1090 Vienna, Austria
3. Munir F, Gehres A, Wai D, Song L. Evaluation of ChatGPT as a Tool for Answering Clinical Questions in Pharmacy Practice. J Pharm Pract 2024;37:1303-1310. PMID: 38775367. DOI: 10.1177/08971900241256731.
Abstract
Background: In the healthcare field, there has been growing interest in using artificial intelligence (AI)-powered tools to assist healthcare professionals, including pharmacists, in their daily tasks. Objectives: To provide commentary and insight into the potential for generative AI language models such as ChatGPT as a tool for answering practice-based, clinical questions and the challenges that need to be addressed before implementation in pharmacy practice settings. Methods: To assess ChatGPT, pharmacy-based questions were submitted to ChatGPT (version 3.5, free version) and the responses were recorded. Question types included 6 drug information questions, 6 enhanced-prompt drug information questions, 5 patient case questions, 5 calculations questions, and 10 drug knowledge questions (e.g., top 200 drugs). After all responses were collected, ChatGPT responses were assessed for appropriateness. Results: ChatGPT responses were generated from 32 questions in 5 categories and evaluated on a total of 44 possible points. Across all responses and categories, the overall score was 21 of 44 points (47.73%). ChatGPT scored higher in the pharmacy calculation (100%), drug information (83%), and top 200 drugs (80%) categories and lower in the enhanced-prompt drug information (33%) and patient case (20%) categories. Conclusion: This study suggests that ChatGPT has limited success as a tool to answer pharmacy-based questions. ChatGPT scored higher on calculation and multiple-choice questions but lower on drug information and patient case questions, generating misleading or fictional answers and citations.
Affiliation(s)
- Faria Munir: University of Illinois Chicago College of Pharmacy, Chicago, IL, USA
- Anna Gehres: College of Pharmacy, The Ohio State University, Columbus, OH, USA
- David Wai: Department of Pharmacy, Ohio State University Wexner Medical Center, Columbus, OH, USA
- Leah Song: University of Illinois Chicago College of Pharmacy, Chicago, IL, USA
4. Bellinger JR, Kwak MW, Ramos GA, Mella JS, Mattos JL. Quantitative Comparison of Chatbots on Common Rhinology Pathologies. Laryngoscope 2024;134:4225-4231. PMID: 38666768. DOI: 10.1002/lary.31470.
Abstract
OBJECTIVES Understanding the strengths and weaknesses of chatbots as a source of patient information is critical for providers in the rising artificial intelligence landscape. This study is the first to quantitatively analyze and compare four of the most widely used chatbots regarding treatments of common pathologies in rhinology. METHODS Questions on the treatment of epistaxis, chronic sinusitis, sinus infection, allergic rhinitis, allergies, and nasal polyps were posed to the chatbots ChatGPT, ChatGPT Plus, Google Bard, and Microsoft Bing in May 2023. Individual responses were analyzed by reviewers for readability, quality, understandability, and actionability using validated scoring metrics. Accuracy and comprehensiveness were evaluated for each response by two experts in rhinology. RESULTS ChatGPT, Plus, Bard, and Bing had FRE readability scores of 33.17, 35.93, 46.50, and 46.32, respectively, indicating higher readability for Bard and Bing compared with ChatGPT (p = 0.003, p = 0.008) and Plus (p = 0.025, p = 0.048). ChatGPT, Plus, and Bard had mean DISCERN quality scores of 20.42, 20.89, and 20.61, respectively, each higher than the score for Bing of 16.97 (p < 0.001). For understandability, ChatGPT and Bing had PEMAT scores of 76.67 and 66.61, respectively, which were lower than both Plus at 92.00 (p < 0.001, p < 0.001) and Bard at 92.67 (p < 0.001, p < 0.001). ChatGPT Plus had an accuracy score of 4.39, which was higher than ChatGPT (3.97, p = 0.118), Bard (3.72, p = 0.002), and Bing (3.19, p < 0.001). CONCLUSION In aggregate across the tested domains, our results suggest ChatGPT Plus and Google Bard are currently the most patient-friendly chatbots for the treatment of common pathologies in rhinology. LEVEL OF EVIDENCE N/A. Laryngoscope, 134:4225-4231, 2024.
Affiliation(s)
- Jeffrey R Bellinger: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Minhie W Kwak: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Gabriel A Ramos: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Jeffrey S Mella: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Jose L Mattos: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
5. Safrai M, Orwig KE. Utilizing artificial intelligence in academic writing: an in-depth evaluation of a scientific review on fertility preservation written by ChatGPT-4. J Assist Reprod Genet 2024;41:1871-1880. PMID: 38619763. PMCID: PMC11263262. DOI: 10.1007/s10815-024-03089-7.
Abstract
PURPOSE To evaluate the ability of ChatGPT-4 to generate a biomedical review article on fertility preservation. METHODS ChatGPT-4 was prompted to create an outline for a review on fertility preservation in men and prepubertal boys. The outline provided by ChatGPT-4 was subsequently used to prompt ChatGPT-4 to write the different parts of the review and provide five references for each section. The different parts of the article and the references provided were combined to create a single scientific review that was evaluated by the authors, who are experts in fertility preservation. The experts assessed the article and the references for accuracy and checked for plagiarism using online tools. In addition, both experts independently scored the relevance, depth, and currentness of ChatGPT-4's article using a scoring matrix ranging from 0 to 5, where higher scores indicate higher quality. RESULTS ChatGPT-4 successfully generated a relevant scientific article with references. Among 27 statements needing citations, four were inaccurate. Of 25 references, 36% were accurate, 48% had correct titles but other errors, and 16% were completely fabricated. Plagiarism was minimal (mean = 3%). Experts rated the article's relevance highly (5/5) but gave lower scores for depth (2-3/5) and currentness (3/5). CONCLUSION ChatGPT-4 can produce a scientific review on fertility preservation with minimal plagiarism. Although generally precise in content, it showed factual and contextual inaccuracies and inconsistent reference reliability. These issues limit ChatGPT-4 as a sole tool for scientific writing but suggest its potential as an aid in the writing process.
Affiliation(s)
- Myriam Safrai: Department of Obstetrics, Gynecology and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA; Department of Obstetrics and Gynecology, Chaim Sheba Medical Center (Tel Hashomer), Sackler Faculty of Medicine, Tel Aviv University, 52621 Tel Aviv, Israel
- Kyle E Orwig: Department of Obstetrics, Gynecology and Reproductive Sciences, Magee-Womens Research Institute, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
6. Lee JC, Hamill CS, Shnayder Y, Buczek E, Kakarala K, Bur AM. Exploring the Role of Artificial Intelligence Chatbots in Preoperative Counseling for Head and Neck Cancer Surgery. Laryngoscope 2024;134:2757-2761. PMID: 38126511. DOI: 10.1002/lary.31243.
Abstract
OBJECTIVE To evaluate the potential use of artificial intelligence (AI) chatbots, such as ChatGPT, in preoperative counseling for patients undergoing head and neck cancer surgery. STUDY DESIGN Cross-sectional survey study. SETTING Single-institution tertiary care center. METHODS ChatGPT was used to generate presurgical educational information, including indications, risks, and recovery time, for five common head and neck surgeries. Chatbot-generated information was compared with information gathered from a simple browser search (the first publicly available website, excluding scholarly articles). The accuracy of the information, readability, thoroughness, and number of errors were compared by five experienced head and neck surgeons in a blinded fashion. Each surgeon then chose a preference between the two information sources for each surgery. RESULTS With the exception of total word count, ChatGPT-generated presurgical information had similar readability, knowledge content, accuracy, thoroughness, and number of medical errors compared with publicly available websites. Additionally, ChatGPT was preferred 48% of the time by experienced head and neck surgeons. CONCLUSION Head and neck surgeons rated ChatGPT-generated and readily available online educational materials similarly. Further refinement in AI technology may soon open more avenues for patient counseling. Future investigation into the medical safety of AI counseling and into patients' perspectives would be of strong interest. LEVEL OF EVIDENCE N/A. Laryngoscope, 134:2757-2761, 2024.
Affiliation(s)
- Jason C Lee: Department of Otolaryngology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Chelsea S Hamill: Department of Otolaryngology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Yelizaveta Shnayder: Department of Otolaryngology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Erin Buczek: Department of Otolaryngology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Kiran Kakarala: Department of Otolaryngology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Andrés M Bur: Department of Otolaryngology, University of Kansas Medical Center, Kansas City, Kansas, USA
7. Bellinger JR, De La Chapa JS, Kwak MW, Ramos GA, Morrison D, Kesser BW. BPPV Information on Google Versus AI (ChatGPT). Otolaryngol Head Neck Surg 2024;170:1504-1511. PMID: 37622581. DOI: 10.1002/ohn.506.
Abstract
OBJECTIVE To quantitatively compare online patient education materials found using a traditional search engine (Google) versus a conversational artificial intelligence (AI) model (ChatGPT) for benign paroxysmal positional vertigo (BPPV). STUDY DESIGN The top 30 Google search results for "benign paroxysmal positional vertigo" were compared with the responses of the OpenAI conversational AI language model, ChatGPT, to 5 common patient questions about BPPV in February 2023. Metrics included readability, quality, understandability, and actionability. SETTING Online information. METHODS Validated online information metrics, including Flesch-Kincaid Grade Level (FKGL), Flesch Reading Ease (FRE), the DISCERN instrument, and the Patient Education Materials Assessment Tool for Printed Materials, were analyzed and scored by reviewers. RESULTS Mean readability scores, FKGL and FRE, for the Google webpages were 10.7 ± 2.6 and 46.5 ± 14.3, respectively. ChatGPT responses had a higher FKGL score of 13.9 ± 2.5 (P < .001) and a lower FRE score of 34.9 ± 11.2 (P = .005), both corresponding to lower readability. The Google webpages had a DISCERN part 2 score of 25.4 ± 7.5, compared with 17.5 ± 3.9 for the individual ChatGPT responses (P = .001) and 25.0 ± 0.9 for the combined ChatGPT responses (P = .928). The reviewers' average scores for all ChatGPT responses were 4.19 ± 0.82 for accuracy and 4.31 ± 0.67 for currency. CONCLUSION The results of this study suggest that information from ChatGPT is more difficult to read, of lower quality, and more difficult to comprehend than information found through Google searches.
Affiliation(s)
- Jeffrey R Bellinger: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Julian S De La Chapa: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Minhie W Kwak: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Gabriel A Ramos: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Daniel Morrison: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Bradley W Kesser: Department of Otolaryngology-Head and Neck Surgery, University of Virginia School of Medicine, Charlottesville, Virginia, USA
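For reference, the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) indices reported in entries 4, 7, and 9 are the standard Flesch formulas, computed from word, sentence, and syllable counts:

FRE = 206.835 - 1.015 × (total words / total sentences) - 84.6 × (total syllables / total words)

FKGL = 0.39 × (total words / total sentences) + 11.8 × (total syllables / total words) - 15.59

Higher FRE values (on a roughly 0-100 scale) indicate easier reading, whereas FKGL approximates the US school grade level needed to understand the text; this is why the lower FRE scores reported for ChatGPT in entries 4 and 7 are interpreted as lower readability.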
8. Ille AM, Markosian C, Burley SK, Mathews MB, Pasqualini R, Arap W. Generative artificial intelligence performs rudimentary structural biology modeling. bioRxiv [Preprint] 2024:2024.01.10.575113. PMID: 38293060. PMCID: PMC10827103. DOI: 10.1101/2024.01.10.575113.
Abstract
Natural language-based generative artificial intelligence (AI) has become increasingly prevalent in scientific research. Intriguingly, capabilities of generative pre-trained transformer (GPT) language models beyond the scope of natural language tasks have recently been identified. Here we explored how GPT-4 might be able to perform rudimentary structural biology modeling. We prompted GPT-4 to model 3D structures for the 20 standard amino acids and an α-helical polypeptide chain, with the latter incorporating Wolfram mathematical computation. We also used GPT-4 to perform structural interaction analysis between nirmatrelvir and its target, the SARS-CoV-2 main protease. Geometric parameters of the generated structures typically came close to experimental reference values. However, modeling was sporadically error-prone, and molecular complexity was not well tolerated. Interaction analysis further revealed the ability of GPT-4 to identify specific amino acid residues involved in ligand binding along with corresponding bond distances. Despite current limitations, we show the capacity of natural language generative AI to perform basic structural biology modeling and interaction analysis with atomic-scale accuracy.
Affiliation(s)
- Alexander M. Ille: School of Graduate Studies, Rutgers, The State University of New Jersey, Newark, New Jersey, USA; Rutgers Cancer Institute of New Jersey, Newark, New Jersey, USA; Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Christopher Markosian: School of Graduate Studies, Rutgers, The State University of New Jersey, Newark, New Jersey, USA; Rutgers Cancer Institute of New Jersey, Newark, New Jersey, USA; Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Stephen K. Burley: Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, New Jersey, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California, USA
- Michael B. Mathews: School of Graduate Studies, Rutgers, The State University of New Jersey, Newark, New Jersey, USA; Division of Infectious Disease, Department of Medicine, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Renata Pasqualini: Rutgers Cancer Institute of New Jersey, Newark, New Jersey, USA; Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Wadih Arap: Rutgers Cancer Institute of New Jersey, Newark, New Jersey, USA; Division of Hematology/Oncology, Department of Medicine, Rutgers New Jersey Medical School, Newark, New Jersey, USA
9. Kim MJ, Admane S, Chang YK, Shih KSK, Reddy A, Tang M, Cruz MDL, Taylor TP, Bruera E, Hui D. Chatbot Performance in Defining and Differentiating Palliative Care, Supportive Care, Hospice Care. J Pain Symptom Manage 2024;67:e381-e391. PMID: 38219964. DOI: 10.1016/j.jpainsymman.2024.01.008.
Abstract
CONTEXT Artificial intelligence (AI) chatbot platforms are increasingly used by patients as sources of information. However, there are limited data on the performance of these platforms, especially regarding palliative care terms. OBJECTIVES We evaluated the accuracy, comprehensiveness, reliability, and readability of three AI platforms in defining and differentiating "palliative care," "supportive care," and "hospice care." METHODS We asked ChatGPT, Microsoft Bing Chat, and Google Bard to define and differentiate "palliative care," "supportive care," and "hospice care" and to provide three references. Outputs were randomized and assessed by six blinded palliative care physicians using 0-10 scales (10 = best) for accuracy, comprehensiveness, and reliability. Readability was assessed using Flesch-Kincaid Grade Level and Flesch Reading Ease scores. RESULTS The mean (SD) accuracy scores for ChatGPT, Bard, and Bing Chat were 9.1 (1.3), 8.7 (1.5), and 8.2 (1.7), respectively; for comprehensiveness, the scores for the three platforms were 8.7 (1.5), 8.1 (1.9), and 5.6 (2.0), respectively; for reliability, the scores were 6.3 (2.5), 3.2 (3.1), and 7.1 (2.4), respectively. Despite generally high accuracy, we identified some major errors (e.g., Bard stated that supportive care had "the goal of prolonging life or even achieving a cure"). We found several major omissions, particularly with Bing Chat (e.g., no mention of interdisciplinary teams in palliative care or hospice care). References were often unreliable. Readability scores did not meet recommended levels for patient educational materials. CONCLUSION We identified important concerns regarding the accuracy, comprehensiveness, reliability, and readability of outputs from AI platforms. Further research is needed to improve their performance.
Affiliation(s)
- Min Ji Kim: Department of Palliative Care, Rehabilitation, and Integrative Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Sonal Admane: Department of Palliative Care, Rehabilitation, and Integrative Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Yuchieh Kathryn Chang: Department of Palliative Care, Rehabilitation, and Integrative Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Akhila Reddy: Department of Palliative Care, Rehabilitation, and Integrative Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Michael Tang: Department of Palliative Care, Rehabilitation, and Integrative Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Maxine De La Cruz: Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Terry Pham Taylor: Department of Hospital Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Eduardo Bruera: Department of Palliative Care, Rehabilitation, and Integrative Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- David Hui: Department of Palliative Care, Rehabilitation, and Integrative Medicine, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
10. Yuan S, Li F, Browning MHEM, Bardhan M, Zhang K, McAnirlin O, Patwary MM, Reuben A. Leveraging and exercising caution with ChatGPT and other generative artificial intelligence tools in environmental psychology research. Front Psychol 2024;15:1295275. PMID: 38650897. PMCID: PMC11033305. DOI: 10.3389/fpsyg.2024.1295275.
Abstract
Generative Artificial Intelligence (GAI) is an emerging and disruptive technology that has attracted considerable interest from researchers and educators across various disciplines. We discuss the relevance of ChatGPT and other GAI tools to environmental psychology research, along with the concerns they raise. We propose three use categories for GAI tools: integrated and contextualized understanding, practical and flexible implementation, and two-way external communication. These categories are exemplified by topics such as the health benefits of green space, theory building, visual simulation, and identifying practical relevance. However, we also highlight the need to balance productivity against ethical issues, as well as the need for ethical guidelines, professional training, and changes to academic performance evaluation systems. We hope this perspective can foster constructive dialogue and responsible practice of GAI tools.
Affiliation(s)
- Shuai Yuan: Virtual Reality and Nature Lab, Department of Parks, Recreation and Tourism Management, Clemson University, Clemson, SC, United States
- Fu Li: Virtual Reality and Nature Lab, Department of Parks, Recreation and Tourism Management, Clemson University, Clemson, SC, United States
- Matthew H. E. M. Browning: Virtual Reality and Nature Lab, Department of Parks, Recreation and Tourism Management, Clemson University, Clemson, SC, United States
- Mondira Bardhan: Virtual Reality and Nature Lab, Department of Parks, Recreation and Tourism Management, Clemson University, Clemson, SC, United States
- Kuiran Zhang: Virtual Reality and Nature Lab, Department of Parks, Recreation and Tourism Management, Clemson University, Clemson, SC, United States
- Olivia McAnirlin: Virtual Reality and Nature Lab, Department of Parks, Recreation and Tourism Management, Clemson University, Clemson, SC, United States
- Muhammad Mainuddin Patwary: Environment and Sustainability Research Initiative, Khulna, Bangladesh; Environmental Science Discipline, Life Science School, Khulna University, Khulna, Bangladesh
- Aaron Reuben: Department of Psychology and Neuroscience, Duke University, Durham, NC, United States
11. Ganjavi C, Eppler MB, Pekcan A, Biedermann B, Abreu A, Collins GS, Gill IS, Cacciamani GE. Publishers' and journals' instructions to authors on use of generative artificial intelligence in academic and scientific publishing: bibliometric analysis. BMJ 2024;384:e077192. PMID: 38296328. PMCID: PMC10828852. DOI: 10.1136/bmj-2023-077192.
Abstract
OBJECTIVES To determine the extent and content of academic publishers' and scientific journals' guidance for authors on the use of generative artificial intelligence (GAI). DESIGN Cross sectional, bibliometric study. SETTING Websites of academic publishers and scientific journals, screened on 19-20 May 2023, with the search updated on 8-9 October 2023. PARTICIPANTS Top 100 largest academic publishers and top 100 highly ranked scientific journals, regardless of subject, language, or country of origin. Publishers were identified by the total number of journals in their portfolio, and journals were identified through the Scimago journal rank using the Hirsch index (H index) as an indicator of journal productivity and impact. MAIN OUTCOME MEASURES The primary outcomes were the content of GAI guidelines listed on the websites of the top 100 academic publishers and scientific journals, and the consistency of guidance between the publishers and their affiliated journals. RESULTS Among the top 100 largest publishers, 24% provided guidance on the use of GAI, of which 15 (63%) were among the top 25 publishers. Among the top 100 highly ranked journals, 87% provided guidance on GAI. Of the publishers and journals with guidelines, the inclusion of GAI as an author was prohibited in 96% and 98%, respectively. Only one journal (1%) explicitly prohibited the use of GAI in the generation of a manuscript, and two (8%) publishers and 19 (22%) journals indicated that their guidelines exclusively applied to the writing process. When disclosing the use of GAI, 75% of publishers and 43% of journals included specific disclosure criteria. Where to disclose the use of GAI varied, including in the methods or acknowledgments, in the cover letter, or in a new section. Variability was also found in how to access GAI guidelines shared between journals and publishers. GAI guidelines in 12 journals directly conflicted with those developed by the publishers. The guidelines developed by top medical journals were broadly similar to those of academic journals. CONCLUSIONS Guidelines by some top publishers and journals on the use of GAI by authors are lacking. Among those that provided guidelines, the allowable uses of GAI and how it should be disclosed varied substantially, with this heterogeneity persisting in some instances among affiliated publishers and journals. Lack of standardization places a burden on authors and could limit the effectiveness of the regulations. As GAI continues to grow in popularity, standardized guidelines to protect the integrity of scientific output are needed.
Affiliation(s)
- Conner Ganjavi: Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Artificial Intelligence Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
- Michael B Eppler: Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Artificial Intelligence Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
- Asli Pekcan: Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Artificial Intelligence Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
- Brett Biedermann: Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Artificial Intelligence Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
- Andre Abreu: Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Artificial Intelligence Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
- Gary S Collins: UK EQUATOR Centre, Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Inderbir S Gill: Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Artificial Intelligence Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
- Giovanni E Cacciamani: Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Artificial Intelligence Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
12. Ferreira RM. New evidence-based practice: Artificial intelligence as a barrier breaker. World J Methodol 2023;13:384-389. PMID: 38229944. PMCID: PMC10789101. DOI: 10.5662/wjm.v13.i5.384.
Abstract
The concept of evidence-based practice has persisted over several years and remains a cornerstone in clinical practice, representing the gold standard for optimal patient care. However, despite widespread recognition of its significance, practical application faces various challenges and barriers, including a lack of skills in interpreting studies, limited resources, time constraints, linguistic competencies, and more. Recently, we have witnessed the emergence of a groundbreaking technological revolution known as artificial intelligence. Although artificial intelligence has become increasingly integrated into our daily lives, some reluctance persists among certain segments of the public. This article explores the potential of artificial intelligence as a solution to some of the main barriers encountered in the application of evidence-based practice. It highlights how artificial intelligence can assist in staying updated with the latest evidence, enhancing clinical decision-making, addressing patient misinformation, and mitigating time constraints in clinical practice. The integration of artificial intelligence into evidence-based practice has the potential to revolutionize healthcare, leading to more precise diagnoses, personalized treatment plans, and improved doctor-patient interactions. This proposed synergy between evidence-based practice and artificial intelligence may necessitate adjustments to its core concept, heralding a new era in healthcare.
Affiliation(s)
- Ricardo Maia Ferreira: Department of Sports and Exercise, Polytechnic Institute of Maia (N2i), Maia 4475-690, Porto, Portugal; Department of Physiotherapy, Polytechnic Institute of Coimbra, Coimbra Health School, Coimbra 3046-854, Coimbra, Portugal; Department of Physiotherapy, Polytechnic Institute of Castelo Branco, Dr. Lopes Dias Health School, Castelo Branco 6000-767, Castelo Branco, Portugal; Sport Physical Activity and Health Research & Innovation Center, Polytechnic Institute of Viana do Castelo, Melgaço 4960-320, Viana do Castelo, Portugal
13. Ille AM, Mathews MB. AI interprets the Central Dogma and Genetic Code. Trends Biochem Sci 2023;48:1014-1018. PMID: 37833131. DOI: 10.1016/j.tibs.2023.09.004.
Abstract
Generative artificial intelligence (AI) is a burgeoning field with widespread applications, including in science. Here, we explore two paradigms that provide insight into the capabilities and limitations of Chat Generative Pre-trained Transformer (ChatGPT): its ability to (i) define a core biological concept (the Central Dogma of molecular biology); and (ii) interpret the genetic code.
Affiliation(s)
- Alexander M Ille: School of Graduate Studies, Rutgers University, Newark, NJ, USA
- Michael B Mathews: School of Graduate Studies, Rutgers University, Newark, NJ, USA; Department of Medicine, Rutgers New Jersey Medical School, Newark, NJ, USA
14. Garg RK, Urs VL, Agarwal AA, Chaudhary SK, Paliwal V, Kar SK. Exploring the role of ChatGPT in patient care (diagnosis and treatment) and medical research: A systematic review. Health Promot Perspect 2023;13:183-191. PMID: 37808939. PMCID: PMC10558973. DOI: 10.34172/hpp.2023.22.
Abstract
BACKGROUND ChatGPT is an artificial intelligence-based tool developed by OpenAI (California, USA). This systematic review examines the potential of ChatGPT in patient care and its role in medical research. METHODS The systematic review was done according to the PRISMA guidelines. The Embase, Scopus, PubMed, and Google Scholar databases were searched, as were preprint databases. Our search aimed to identify all kinds of publications, without any restrictions, on ChatGPT and its application in medical research, medical publishing, and patient care. We used the search term "ChatGPT". We reviewed all kinds of publications, including original articles, reviews, editorials/commentaries, and even letters to the editor. Each selected record was analysed using ChatGPT, and the responses generated were compiled in a table. The Word table was converted to a PDF and further analysed using ChatPDF. RESULTS We reviewed the full texts of 118 articles. ChatGPT can assist with patient enquiries, note writing, decision-making, trial enrolment, data management, decision support, research support, and patient education. But the solutions it offers are usually insufficient and contradictory, raising questions about their originality, privacy, correctness, bias, and legality. Due to its lack of human-like qualities, ChatGPT's legitimacy as an author is questioned when used for academic writing. ChatGPT-generated content raises concerns about bias and possible plagiarism. CONCLUSION Although it can help with patient treatment and research, there are issues with accuracy, authorship, and bias. ChatGPT can serve as a "clinical assistant" and be a help in research and scholarly writing.
Affiliation(s)
- Vijeth L Urs: Department of Neurology, King George’s Medical University, Lucknow, India
- Vimal Paliwal: Department of Neurology, Sanjay Gandhi Institute of Medical Sciences, Lucknow, India
- Sujita Kumar Kar: Department of Psychiatry, King George’s Medical University, Lucknow, India
15. Jain A. ChatGPT for scientific community: Boon or bane? Med J Armed Forces India 2023;79:498-499. PMID: 37719916. PMCID: PMC10499628. DOI: 10.1016/j.mjafi.2023.06.009.
Affiliation(s)
- Ankur Jain: Assistant Professor (Clinical Haematology), Vardhman Mahavir Medical College & Safdarjung Hospital, New Delhi, India
16. Watters C, Lemanski MK. Universal skepticism of ChatGPT: a review of early literature on chat generative pre-trained transformer. Front Big Data 2023;6:1224976. PMID: 37680954. PMCID: PMC10482048. DOI: 10.3389/fdata.2023.1224976.
Abstract
ChatGPT, a new language model developed by OpenAI, has garnered significant attention in various fields since its release. This literature review provides an overview of early ChatGPT literature across multiple disciplines, exploring its applications, limitations, and ethical considerations. The review encompasses Scopus-indexed publications from November 2022 to April 2023 and includes 156 articles related to ChatGPT. The findings reveal a predominance of negative sentiment across disciplines, though subject-specific attitudes must be considered. The review highlights the implications of ChatGPT in many fields including healthcare, raising concerns about employment opportunities and ethical considerations. While ChatGPT holds promise for improved communication, further research is needed to address its capabilities and limitations. This literature review provides insights into early research on ChatGPT, informing future investigations and practical applications of chatbot technology, as well as development and usage of generative AI.
Affiliation(s)
- Casey Watters: Faculty of Law, Bond University, Gold Coast, QLD, Australia
17. Levin G, Meyer R, Yasmeen A, Yang B, Guigue PA, Bar-Noy T, Tatar A, Perelshtein Brezinov O, Brezinov Y. Chat Generative Pre-trained Transformer-written obstetrics and gynecology abstracts fool practitioners. Am J Obstet Gynecol MFM 2023;5:100993. PMID: 37127209. DOI: 10.1016/j.ajogmf.2023.100993.
Affiliation(s)
- Gabriel Levin: Division of Cardiology, Jewish General Hospital, McGill University, Montreal, QC, Canada
- Raanan Meyer: Division of Cardiology, Jewish General Hospital, McGill University, Montreal, QC, Canada
- Amber Yasmeen: Lady Davis Institute for Cancer Research, Jewish General Hospital, McGill University, Quebec, Canada
- Bowen Yang: Department of Gynecology and Obstetrics, West China Second University Hospital, Sichuan University, Chengdu 610041, China; Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, Chengdu 610041, China
- Paul-Adrien Guigue: Division of Cardiology, Jewish General Hospital, McGill University, Montreal, QC, Canada
- Tomer Bar-Noy: Division of Cardiology, Jewish General Hospital, McGill University, Montreal, QC, Canada
- Angela Tatar: Division of Cardiology, Jewish General Hospital, McGill University, Montreal, QC, Canada
- Yoav Brezinov: Experimental Surgery, McGill University, Montreal, Quebec, Canada; Lady Davis Institute, Jewish General Hospital, Montreal, Quebec, Canada; Kaplan Medical Center, Hebrew University, Jerusalem, Israel
18. Nuryana Z, Pranolo A. ChatGPT: The balance of future, honesty, and integrity. Asian J Psychiatr 2023;84:103571. PMID: 37001483. DOI: 10.1016/j.ajp.2023.103571.
Affiliation(s)
- Zalik Nuryana: Department of Islamic Education, Universitas Ahmad Dahlan, Indonesia
- Andri Pranolo: Department of Informatics, Universitas Ahmad Dahlan, Indonesia
19. Abd-Alrazaq A, AlSaad R, Alhuwail D, Ahmed A, Healy PM, Latifi S, Aziz S, Damseh R, Alabed Alrazak S, Sheikh J. Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions. JMIR Med Educ 2023;9:e48291. PMID: 37261894. DOI: 10.2196/48291.
Abstract
The integration of large language models (LLMs), such as those in the Generative Pre-trained Transformers (GPT) series, into medical education has the potential to transform learning experiences for students and elevate their knowledge, skills, and competence. Drawing on a wealth of professional and academic experience, we propose that LLMs hold promise for revolutionizing medical curriculum development, teaching methodologies, personalized study plans and learning materials, student assessments, and more. However, we also critically examine the challenges that such integration might pose by addressing issues of algorithmic bias, overreliance, plagiarism, misinformation, inequity, privacy, and copyright concerns in medical education. As we navigate the shift from an information-driven educational paradigm to an artificial intelligence (AI)-driven educational paradigm, we argue that it is paramount to understand both the potential and the pitfalls of LLMs in medical education. This paper thus offers our perspective on the opportunities and challenges of using LLMs in this context. We believe that the insights gleaned from this analysis will serve as a foundation for future recommendations and best practices in the field, fostering the responsible and effective use of AI technologies in medical education.
Affiliation(s)
- Alaa Abd-Alrazaq: AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha, Qatar
- Rawan AlSaad: AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha, Qatar; College of Computing and Information Technology, University of Doha for Science and Technology, Doha, Qatar
- Dari Alhuwail: Information Science Department, College of Life Sciences, Kuwait University, Kuwait, Kuwait
- Arfan Ahmed: AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha, Qatar
- Padraig Mark Healy: Office of Educational Development, Division of Medical Education, Weill Cornell Medicine-Qatar, Doha, Qatar
- Syed Latifi: Office of Educational Development, Division of Medical Education, Weill Cornell Medicine-Qatar, Doha, Qatar
- Sarah Aziz: AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha, Qatar
- Rafat Damseh: Department of Computer Science and Software Engineering, United Arab Emirates University, Abu Dhabi, United Arab Emirates
- Sadam Alabed Alrazak: Department of Mechanical & Industrial Engineering, Faculty of Applied Science and Engineering, University of Toronto, Toronto, ON, Canada
- Javaid Sheikh: AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha, Qatar
20. Abd-alrazaq A, Alsaad R, Alhuwail D, Ahmed A, Healy PM, Latifi S, Aziz S, Damseh R, Alabed Alrazak S, Sheikh J. Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions (Preprint). DOI: 10.2196/preprints.48291.