1. Suárez A, Jiménez J, Llorente de Pedro M, Andreu-Vázquez C, Díaz-Flores García V, Gómez Sánchez M, Freire Y. Beyond the Scalpel: Assessing ChatGPT's potential as an auxiliary intelligent virtual assistant in oral surgery. Comput Struct Biotechnol J 2024;24:46-52. [PMID: 38162955; PMCID: PMC10755495; DOI: 10.1016/j.csbj.2023.11.058]
Abstract
AI has revolutionized the way we interact with technology. Noteworthy advances in AI algorithms and large language models (LLMs) have led to the development of natural generative language (NGL) systems such as ChatGPT. Although these LLMs can simulate human conversations and generate content in real time, they face challenges related to the topicality and accuracy of the information they generate. This study aimed to assess whether ChatGPT-4 could provide accurate and reliable answers to general dentists in the field of oral surgery, and thus to explore its potential as an intelligent virtual assistant in clinical decision-making in oral surgery. Thirty questions related to oral surgery were posed to ChatGPT-4, each question repeated 30 times, yielding a total of 900 responses. Two surgeons graded the answers according to the guidelines of the Spanish Society of Oral Surgery, using a three-point Likert scale (correct, partially correct/incomplete, and incorrect). Disagreements were arbitrated by an experienced oral surgeon, who provided the final grade. Accuracy was found to be 71.7%, and the consistency of the experts' grading across iterations ranged from moderate to almost perfect. ChatGPT-4, with its potential capabilities, will inevitably be integrated into dental disciplines, including oral surgery. In the future, it could be considered an auxiliary intelligent virtual assistant, though it would never replace oral surgery experts. Proper training and expert-verified information will remain vital to the implementation of the technology. More comprehensive research is needed to ensure the safe and successful application of AI in oral surgery.
Affiliation(s)
- Ana Suárez
- Department of Pre-Clinic Dentistry, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, 28670 Madrid, Spain
- Jaime Jiménez
- Department of Clinic Dentistry, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, 28670 Madrid, Spain
- María Llorente de Pedro
- Department of Pre-Clinic Dentistry, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, 28670 Madrid, Spain
- Cristina Andreu-Vázquez
- Department of Veterinary Medicine, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, 28670 Madrid, Spain
- Víctor Díaz-Flores García
- Department of Pre-Clinic Dentistry, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, 28670 Madrid, Spain
- Margarita Gómez Sánchez
- Department of Pre-Clinic Dentistry, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, 28670 Madrid, Spain
- Yolanda Freire
- Department of Pre-Clinic Dentistry, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Calle Tajo s/n, Villaviciosa de Odón, 28670 Madrid, Spain
2. Chow JCL, Li K. Ethical Considerations in Human-Centered AI: Advancing Oncology Chatbots Through Large Language Models. JMIR Bioinformatics and Biotechnology 2024;5:e64406. [PMID: 39321336; DOI: 10.2196/64406]
Abstract
The integration of chatbots in oncology underscores the pressing need for human-centered artificial intelligence (AI) that addresses patient and family concerns with empathy and precision. Human-centered AI emphasizes ethical principles, empathy, and user-centric approaches, ensuring technology aligns with human values and needs. This review critically examines the ethical implications of using large language models (LLMs) like GPT-3 and GPT-4 (OpenAI) in oncology chatbots. It examines how these models replicate human-like language patterns, impacting the design of ethical AI systems. The paper identifies key strategies for ethically developing oncology chatbots, focusing on potential biases arising from extensive datasets and neural networks. Specific datasets, such as those sourced from predominantly Western medical literature and patient interactions, may introduce biases by overrepresenting certain demographic groups. Moreover, the training methodologies of LLMs, including fine-tuning processes, can exacerbate these biases, leading to outputs that may disproportionately favor affluent or Western populations while neglecting marginalized communities. By providing examples of biased outputs in oncology chatbots, the review highlights the ethical challenges LLMs present and the need for mitigation strategies. The study emphasizes integrating human-centric values into AI to mitigate these biases, ultimately advocating for the development of oncology chatbots that are aligned with ethical principles and capable of serving diverse patient populations equitably.
Affiliation(s)
- James C L Chow
- Department of Radiation Oncology, University of Toronto, Toronto, ON, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Kay Li
- Department of English, University of Toronto, Toronto, ON, Canada
3. Battisti ES, Roman MK, Bellei EA, Kirsten VR, De Marchi ACB, Da Silva Leal GV. A virtual assistant for primary care's food and nutrition surveillance system: Development and validation study in Brazil. Patient Education and Counseling 2024;130:108461. [PMID: 39413720; DOI: 10.1016/j.pec.2024.108461]
Abstract
OBJECTIVE The study aimed to develop and validate a conversational agent (chatbot) designed to support Food and Nutrition Surveillance (FNS) practices in primary health care settings. METHODS This mixed-methods research was conducted in three stages. Initially, the study identified barriers and challenges in FNS practices through a literature review and feedback from 655 health professionals and FNS experts across Brazil. Following this, a participatory design approach was employed to develop and validate the chatbot's content. The final stage involved evaluating the chatbot's user experience with FNS experts. RESULTS The chatbot could accurately understand and respond to 60 different intents or keywords related to FNS. Themes such as training, guidance, and access emerged as crucial for guiding FNS initiatives and addressing implementation challenges, primarily related to human resources. The chatbot achieved a Global Content Validation Index of 0.88. CONCLUSION The developed chatbot represents a significant advancement in supporting FNS practices within primary health care. PRACTICE IMPLICATIONS By providing an innovative, interactive, educational tool that is both accessible and reliable, this digital assistant has the potential to facilitate the operationalization of FNS practices, addressing the critical need for effective training and counseling in developing countries.
Affiliation(s)
- Eliza Sella Battisti
- Graduate Program in Human Aging, Institute of Health, University of Passo Fundo (UPF), Passo Fundo, RS, Brazil; Graduate Program in Gerontology, Department of Foods and Nutrition, Federal University of Santa Maria (UFSM), Palmeira das Missões, RS, Brazil
- Mateus Klein Roman
- Graduate Program in Applied Computing, Institute of Technology, University of Passo Fundo (UPF), Passo Fundo, RS, Brazil
- Ericles Andrei Bellei
- Graduate Program in Human Aging, Institute of Health, University of Passo Fundo (UPF), Passo Fundo, RS, Brazil
- Vanessa Ramos Kirsten
- Graduate Program in Gerontology, Department of Foods and Nutrition, Federal University of Santa Maria (UFSM), Palmeira das Missões, RS, Brazil
- Ana Carolina Bertoletti De Marchi
- Graduate Program in Human Aging, Institute of Health, University of Passo Fundo (UPF), Passo Fundo, RS, Brazil; Graduate Program in Applied Computing, Institute of Technology, University of Passo Fundo (UPF), Passo Fundo, RS, Brazil
- Greisse Viero Da Silva Leal
- Graduate Program in Gerontology, Department of Foods and Nutrition, Federal University of Santa Maria (UFSM), Palmeira das Missões, RS, Brazil
4. Huo B, Marfo N, Sylla P, Calabrese E, Kumar S, Slater BJ, Walsh DS, Vosburg W. Clinical artificial intelligence: teaching a large language model to generate recommendations that align with guidelines for the surgical management of GERD. Surg Endosc 2024;38:5668-5677. [PMID: 39134725; DOI: 10.1007/s00464-024-11155-5]
Abstract
BACKGROUND Large Language Models (LLMs) provide clinical guidance with inconsistent accuracy due to limitations of their training datasets. LLMs are "teachable" through customization. We compared the ability of the generic ChatGPT-4 model and a customized version of ChatGPT-4 to provide recommendations for the surgical management of gastroesophageal reflux disease (GERD) to both surgeons and patients. METHODS Sixty patient cases were developed using eligibility criteria from the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) & United European Gastroenterology (UEG)-European Association of Endoscopic Surgery (EAES) guidelines for the surgical management of GERD. Standardized prompts were engineered for physicians as the end user, with separate layperson prompts for patients. A customized GPT, called the GERD Tool for Surgery (GTS), was developed to generate recommendations based on the guidelines. Both the GTS and generic ChatGPT-4 were queried on July 21, 2024. Model performance was evaluated by comparing responses to the SAGES & UEG-EAES guideline recommendations. Outcome data were presented using descriptive statistics, including counts and percentages. RESULTS The GTS provided accurate recommendations for the surgical management of GERD for 60/60 (100.0%) surgeon inquiries and 40/40 (100.0%) patient inquiries based on guideline recommendations. The generic ChatGPT-4 model generated accurate guidance for 40/60 (66.7%) surgeon inquiries and 19/40 (47.5%) patient inquiries. The GTS produced recommendations based on the 2021 SAGES & UEG-EAES guidelines on the surgical management of GERD, while the generic ChatGPT-4 model generated guidance without citing evidence to support its recommendations. CONCLUSION ChatGPT-4 can be customized to overcome limitations of its training dataset and provide recommendations for the surgical management of GERD with reliable accuracy and consistency. The training of LLMs can help integrate this efficient technology into the creation of robust and accurate information for both surgeons and patients. Prospective data are needed to assess its effectiveness in a pragmatic clinical environment.
Affiliation(s)
- Bright Huo
- Division of General Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada
- Nana Marfo
- Ross University School of Medicine, Miramar, FL, USA
- Patricia Sylla
- Division of Colon and Rectal Surgery, Department of Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Sunjay Kumar
- Department of General Surgery, Thomas Jefferson University Hospital, Philadelphia, PA, USA
- Danielle S Walsh
- Department of Surgery, University of Kentucky, Lexington, KY, USA
- Wesley Vosburg
- Department of Surgery, Mount Auburn Hospital, Harvard Medical School, Cambridge, MA, USA
5. Haran C, Allan P, Dholakia J, Lai S, Lim E, Xu W, Hart O, Cain J, Narayanan A, Khashram M. The application and uses of telemedicine in vascular surgery: A narrative review. Semin Vasc Surg 2024;37:290-297. [PMID: 39277344; DOI: 10.1053/j.semvascsurg.2024.07.004]
Abstract
Technological advances over the past century have accelerated the pace and breadth of medical and surgical care. From the initial delivery of "telemedicine" over the radio in the 1920s, the delivery of medicine and surgery in the 21st century is no longer limited by connectivity. The COVID-19 pandemic hastened the uptake of telemedicine to ensure that health care can be maintained despite limited face-to-face contact. Like other areas of medicine, vascular surgery has adopted telemedicine, although its role is not well described in the literature. This narrative review explores how telemedicine has been delivered in vascular surgery. Specific themes of telemedicine are outlined with real-world examples, including consultation, triaging, collaboration, mentoring, monitoring and surveillance, mobile health, and education. This review also explores possible future advances in telemedicine and issues around equity of care. Finally, important ethical considerations and limitations related to the applications of telemedicine are outlined.
Affiliation(s)
- Cheyaanthan Haran
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand; Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
- Philip Allan
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand
- Jhanvi Dholakia
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand
- Simon Lai
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand; Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
- Eric Lim
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand
- William Xu
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand
- Odette Hart
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand; Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
- Justin Cain
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand
- Anantha Narayanan
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand; Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
- Manar Khashram
- Department of Vascular Surgery, Waikato Hospital, 183 Pembroke Street, Hamilton 3204, New Zealand; Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
6. Lee JW, Yoo IS, Kim JH, Kim WT, Jeon HJ, Yoo HS, Shin JG, Kim GH, Hwang S, Park S, Kim YJ. Development of AI-generated medical responses using the ChatGPT for cancer patients. Computer Methods and Programs in Biomedicine 2024;254:108302. [PMID: 38996805; DOI: 10.1016/j.cmpb.2024.108302]
Abstract
BACKGROUND AND OBJECTIVE To develop a healthcare chatbot service (AI-guide bot) that conducts real-time conversations using large language models to provide accurate health information to patients. METHODS To provide accurate and specialized medical responses, we integrated several cancer practice guidelines. The size of the integrated meta-dataset was 1.17 million tokens. The integrated and classified metadata were extracted, transformed into text, segmented to specific character lengths, and vectorized using the embedding model. The AI-guide bot was implemented using Python 3.9. To enhance scalability and incorporate the integrated dataset, we combined the AI-guide bot with OpenAI and the LangChain framework. To generate user-friendly conversations, a language model was developed based on Chat Generative Pretrained Transformer (ChatGPT), an interactive conversational chatbot powered by GPT-3.5. The AI-guide bot was implemented using ChatGPT-3.5 from September 2023 to January 2024. RESULTS The AI-guide bot allowed users to select their desired cancer type and language for conversational interactions, and was designed to expand its capabilities to encompass multiple major cancer types. The performance of the AI-guide bot responses was 90.98 ± 4.02 (obtained by summing the Likert scores). CONCLUSIONS The AI-guide bot can provide medical information quickly and accurately to patients with cancer who are concerned about their health.
Affiliation(s)
- Jae-Woo Lee
- Department of Family Medicine, Chungbuk National University Hospital, Cheongju, Republic of Korea; Department of Family Medicine, Chungbuk National University College of Medicine, Cheongju, Republic of Korea
- In-Sang Yoo
- Department of Biomedical Engineering, Chungbuk National University Hospital, Cheongju, Republic of Korea; Department of Medicine, Chungbuk National University College of Medicine, Cheongju, Republic of Korea
- Ji-Hye Kim
- Department of Biomedical Engineering, Chungbuk National University Hospital, Cheongju, Republic of Korea
- Won Tae Kim
- Department of Urology, Chungbuk National University Hospital, Cheongju, Republic of Korea; Department of Urology, Chungbuk National University College of Medicine, 1 Chungdae-ro, Seowon-gu, Cheongju, Chungcheongbuk-do 28644, Republic of Korea
- Hyun Jeong Jeon
- Department of Internal Medicine, Chungbuk National University Hospital, Cheongju, Republic of Korea; Department of Internal Medicine, College of Medicine, Chungbuk National University, Cheongju, Republic of Korea
- Hyo-Sun Yoo
- Department of Family Medicine, Chungbuk National University Hospital, Cheongju, Republic of Korea
- Jae Gwang Shin
- Department of Biomedical Engineering, Chungbuk National University Hospital, Cheongju, Republic of Korea
- Geun-Hyeong Kim
- Department of Biomedical Engineering, Chungbuk National University Hospital, Cheongju, Republic of Korea
- ShinJi Hwang
- Department of Biomedical Engineering, Chungbuk National University Hospital, Cheongju, Republic of Korea
- Seung Park
- Department of Biomedical Engineering, Chungbuk National University Hospital, Cheongju, Republic of Korea; Department of Medicine, Chungbuk National University College of Medicine, Cheongju, Republic of Korea
- Yong-June Kim
- Department of Urology, Chungbuk National University Hospital, Cheongju, Republic of Korea; Department of Urology, Chungbuk National University College of Medicine, 1 Chungdae-ro, Seowon-gu, Cheongju, Chungcheongbuk-do 28644, Republic of Korea
7. Mihalache A, Grad J, Patil NS, Huang RS, Popovic MM, Mallipatna A, Kertes PJ, Muni RH. Google Gemini and Bard artificial intelligence chatbot performance in ophthalmology knowledge assessment. Eye (Lond) 2024;38:2530-2535. [PMID: 38615098; PMCID: PMC11383935; DOI: 10.1038/s41433-024-03067-4]
Abstract
PURPOSE With the popularization of ChatGPT (OpenAI, San Francisco, California, United States) in recent months, understanding the potential of artificial intelligence (AI) chatbots in a medical context is important. Our study aims to evaluate Google Gemini and Bard's (Google, Mountain View, California, United States) knowledge in ophthalmology. METHODS In this study, we evaluated Google Gemini and Bard's performance on EyeQuiz, a platform containing ophthalmology board certification examination practice questions, when used from the United States (US). Accuracy, response length, response time, and provision of explanations were evaluated. Subspecialty-specific performance was noted. A secondary analysis was conducted using Bard from Vietnam, and Gemini from Vietnam, Brazil, and the Netherlands. RESULTS Overall, Google Gemini and Bard both had accuracies of 71% across 150 text-based multiple-choice questions. The secondary analysis revealed an accuracy of 67% using Bard from Vietnam, with 32 questions (21%) answered differently than when using Bard from the US. Moreover, the Vietnam version of Gemini achieved an accuracy of 74%, with 23 questions (15%) answered differently than with the US version of Gemini. While the Brazil (68%) and Netherlands (65%) versions of Gemini performed slightly worse than the US version, differences in performance across the various country-specific versions of Bard and Gemini were not statistically significant. CONCLUSION Google Gemini and Bard had an acceptable performance in responding to ophthalmology board examination practice questions. Subtle variability was noted in the performance of the chatbots across different countries. The chatbots also tended to provide a confident explanation even when giving an incorrect answer.
Affiliation(s)
- Andrew Mihalache
- Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Justin Grad
- Michael G. DeGroote School of Medicine, McMaster University, Hamilton, ON, Canada
- Nikhil S Patil
- Michael G. DeGroote School of Medicine, McMaster University, Hamilton, ON, Canada
- Ryan S Huang
- Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Marko M Popovic
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, ON, Canada
- Ashwin Mallipatna
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, ON, Canada
- Department of Ophthalmology, Hospital for Sick Children, University of Toronto, Toronto, ON, Canada
- Peter J Kertes
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, ON, Canada
- John and Liz Tory Eye Centre, Sunnybrook Health Sciences Centre, Toronto, ON, Canada
- Rajeev H Muni
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, ON, Canada
- Department of Ophthalmology, St. Michael's Hospital/Unity Health Toronto, Toronto, ON, Canada
8. Cherrez-Ojeda I, Gallardo-Bastidas JC, Robles-Velasco K, Osorio MF, Velez Leon EM, Leon Velastegui M, Pauletto P, Aguilar-Díaz FC, Squassi A, González Eras SP, Cordero Carrasco E, Chavez Gonzalez KL, Calderon JC, Bousquet J, Bedbrook A, Faytong-Haro M. Understanding Health Care Students' Perceptions, Beliefs, and Attitudes Toward AI-Powered Language Models: Cross-Sectional Study. JMIR Medical Education 2024;10:e51757. [PMID: 39137029; DOI: 10.2196/51757]
Abstract
BACKGROUND ChatGPT was not intended for use in health care, but it has potential benefits that depend on end-user understanding and acceptability, which is where health care students become crucial. There is still a limited amount of research in this area. OBJECTIVE The primary aim of our study was to assess the frequency of ChatGPT use, the perceived level of knowledge, the perceived risks associated with its use, and the ethical issues, as well as attitudes toward the use of ChatGPT in the context of education in the field of health. In addition, we aimed to examine whether there were differences across groups based on demographic variables. The second part of the study aimed to assess the association between the frequency of use, the level of perceived knowledge, the level of risk perception, and the level of perception of ethics as predictive factors for participants' attitudes toward the use of ChatGPT. METHODS A cross-sectional survey was conducted from May to June 2023 encompassing students of medicine, nursing, dentistry, nutrition, and laboratory science across the Americas. The study used descriptive analysis, chi-square tests, and ANOVA to assess statistical significance across different categories. The study used several ordinal logistic regression models to analyze the impact of predictive factors (frequency of use, perception of knowledge, perception of risk, and ethics perception scores) on attitude as the dependent variable. The models were adjusted for gender, institution type, major, and country. Stata was used to conduct all the analyses. RESULTS Of 2661 health care students, 42.99% (n=1144) were unaware of ChatGPT. The median score of knowledge was "minimal" (median 2.00, IQR 1.00-3.00). Most respondents (median 2.61, IQR 2.11-3.11) regarded ChatGPT as neither ethical nor unethical. Most participants (median 3.89, IQR 3.44-4.34) "somewhat agreed" that ChatGPT (1) benefits health care settings, (2) provides trustworthy data, (3) is a helpful tool for clinical and educational medical information access, and (4) makes the work easier. In total, 70% (7/10) of people used it for homework. As the perceived knowledge of ChatGPT increased, there was a stronger tendency toward a favorable attitude toward ChatGPT. Higher ethical consideration perception ratings increased the likelihood of considering ChatGPT as a source of trustworthy health care information (odds ratio [OR] 1.620, 95% CI 1.498-1.752), beneficial in medical issues (OR 1.495, 95% CI 1.452-1.539), and useful for medical literature (OR 1.494, 95% CI 1.426-1.564; P<.001 for all results). CONCLUSIONS Over 40% of American health care students (1144/2661, 42.99%) were unaware of ChatGPT despite its extensive use in the health field. Our data revealed positive attitudes toward ChatGPT and a desire to learn more about it. Medical educators must explore how chatbots may be included in undergraduate health care education programs.
Affiliation(s)
- Ivan Cherrez-Ojeda
- Universidad Espiritu Santo, Samborondon, Ecuador
- Respiralab Research Group, Guayaquil, Ecuador
- Karla Robles-Velasco
- Universidad Espiritu Santo, Samborondon, Ecuador
- Respiralab Research Group, Guayaquil, Ecuador
- María F Osorio
- Universidad Espiritu Santo, Samborondon, Ecuador
- Respiralab Research Group, Guayaquil, Ecuador
- F C Aguilar-Díaz
- Departamento Salud Pública, Escuela Nacional de Estudios Superiores, Universidad Nacional Autónoma de México, Guanajuato, Mexico
- Aldo Squassi
- Universidad de Buenos Aires, Facultad de Odontología, Cátedra de Odontología Preventiva y Comunitaria, Buenos Aires, Argentina
- Erita Cordero Carrasco
- Departamento de cirugía y traumatología bucal y maxilofacial, Universidad de Chile, Santiago, Chile
- Juan C Calderon
- Universidad Espiritu Santo, Samborondon, Ecuador
- Respiralab Research Group, Guayaquil, Ecuador
- Jean Bousquet
- Institute of Allergology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Allergology and Immunology, Berlin, Germany
- MASK-air, Montpellier, France
- Marco Faytong-Haro
- Respiralab Research Group, Guayaquil, Ecuador
- Universidad Estatal de Milagro, Cdla Universitaria "Dr. Rómulo Minchala Murillo", Milagro, Ecuador
- Ecuadorian Development Research Lab, Daule, Ecuador
9. Takahashi H, Shikino K, Kondo T, Komori A, Yamada Y, Saita M, Naito T. Educational Utility of Clinical Vignettes Generated in Japanese by ChatGPT-4: Mixed Methods Study. JMIR Medical Education 2024;10:e59133. [PMID: 39137031; DOI: 10.2196/59133]
Abstract
BACKGROUND Evaluating the accuracy and educational utility of artificial intelligence-generated medical cases, especially those produced by large language models such as ChatGPT-4 (developed by OpenAI), is crucial yet underexplored. OBJECTIVE This study aimed to assess the educational utility of ChatGPT-4-generated clinical vignettes and their applicability in educational settings. METHODS Using a convergent mixed methods design, a web-based survey was conducted from January 8 to 28, 2024, to evaluate 18 medical cases generated by ChatGPT-4 in Japanese. In the survey, 6 main question items were used to evaluate the quality of the generated clinical vignettes and their educational utility: information quality, information accuracy, educational usefulness, clinical match, terminology accuracy (TA), and diagnosis difficulty. Feedback was solicited from physicians specializing in general internal medicine or general medicine and experienced in medical education. Chi-square and Mann-Whitney U tests were performed to identify differences among cases, and linear regression was used to examine trends associated with physicians' experience. Thematic analysis of qualitative feedback was performed to identify areas for improvement and confirm the educational utility of the cases. RESULTS Of the 73 invited participants, 71 (97%) responded. The respondents, primarily male (64/71, 90%), spanned a broad range of practice years (from 1976 to 2017) and represented diverse hospital sizes throughout Japan. The majority deemed the information quality (mean 0.77, 95% CI 0.75-0.79) and information accuracy (mean 0.68, 95% CI 0.65-0.71) to be satisfactory, with these responses being based on binary data. The average scores assigned were 3.55 (95% CI 3.49-3.60) for educational usefulness, 3.70 (95% CI 3.65-3.75) for clinical match, 3.49 (95% CI 3.44-3.55) for TA, and 2.34 (95% CI 2.28-2.40) for diagnosis difficulty, based on a 5-point Likert scale. Statistical analysis showed significant variability in content quality and relevance across the cases (P<.001 after Bonferroni correction). Participants suggested improvements in generating physical findings, using natural language, and enhancing medical TA. The thematic analysis highlighted the need for clearer documentation, clinical information consistency, content relevance, and patient-centered case presentations. CONCLUSIONS ChatGPT-4-generated medical cases written in Japanese possess considerable potential as resources in medical education, with recognized adequacy in quality and accuracy. Nevertheless, there is a notable need for enhancements in the precision and realism of case details. This study emphasizes ChatGPT-4's value as an adjunctive educational tool in the medical field, requiring expert oversight for optimal application.
Affiliation(s)
- Hiromizu Takahashi
- Department of General Medicine, Juntendo University Faculty of Medicine, Tokyo, Japan
- Kiyoshi Shikino
- Department of Community-Oriented Medical Education, Chiba University Graduate School of Medicine, Chiba, Japan
- Takeshi Kondo
- Center for Postgraduate Clinical Training and Career Development, Nagoya University Hospital, Aichi, Japan
- Akira Komori
- Department of General Medicine, Juntendo University Faculty of Medicine, Tokyo, Japan
- Department of Emergency and Critical Care Medicine, Tsukuba Memorial Hospital, Tsukuba, Japan
- Yuji Yamada
- Brookdale Department of Geriatrics and Palliative Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Mizue Saita
- Department of General Medicine, Juntendo University Faculty of Medicine, Tokyo, Japan
- Toshio Naito
- Department of General Medicine, Juntendo University Faculty of Medicine, Tokyo, Japan
10
Sharma H, Ruikar M. Artificial intelligence at the pen's edge: Exploring the ethical quagmires in using artificial intelligence models like ChatGPT for assisted writing in biomedical research. Perspect Clin Res 2024; 15:108-115. [PMID: 39140014 PMCID: PMC11318783 DOI: 10.4103/picr.picr_196_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2023] [Revised: 08/09/2023] [Accepted: 08/11/2023] [Indexed: 08/15/2024] Open
Abstract
Chat generative pretrained transformer (ChatGPT) is a conversational language model powered by artificial intelligence (AI). It is a sophisticated language model that employs deep learning methods to generate human-like text in response to natural language inputs. This narrative review aims to shed light on ethical concerns about using AI models like ChatGPT for writing assistance in the health care and medical domains. Currently, AI models like ChatGPT are in their infancy; risks include inaccuracy of the generated content, lack of contextual understanding, dynamic knowledge gaps, limited discernment, lack of responsibility and accountability, issues of privacy, data security, transparency, and bias, and lack of nuance and originality. Other issues, such as authorship, unintentional plagiarism, falsified and fabricated content, and the threat of being red-flagged as AI-generated content, highlight the need for regulatory compliance, transparency, and disclosure. If these legitimate issues are proactively considered and addressed, the potential applications of AI models as writing assistants could be rewarding.
Affiliation(s)
- Hunny Sharma
- Department of Community and Family Medicine, All India Institute of Medical Sciences, Raipur, Chhattisgarh, India
- Manisha Ruikar
- Department of Community and Family Medicine, All India Institute of Medical Sciences, Raipur, Chhattisgarh, India
11
Lahat A, Sharif K, Zoabi N, Shneor Patt Y, Sharif Y, Fisher L, Shani U, Arow M, Levin R, Klang E. Assessing Generative Pretrained Transformers (GPT) in Clinical Decision-Making: Comparative Analysis of GPT-3.5 and GPT-4. J Med Internet Res 2024; 26:e54571. [PMID: 38935937 PMCID: PMC11240076 DOI: 10.2196/54571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2023] [Revised: 02/02/2024] [Accepted: 04/29/2024] [Indexed: 06/29/2024] Open
Abstract
BACKGROUND Artificial intelligence, particularly chatbot systems, is becoming an instrumental tool in health care, aiding clinical decision-making and patient engagement. OBJECTIVE This study aims to analyze the performance of ChatGPT-3.5 and ChatGPT-4 in addressing complex clinical and ethical dilemmas, and to illustrate their potential role in health care decision-making while comparing seniors' and residents' ratings, and specific question types. METHODS A total of 4 specialized physicians formulated 176 real-world clinical questions. A total of 8 senior physicians and residents assessed responses from GPT-3.5 and GPT-4 on a 1-5 scale across 5 categories: accuracy, relevance, clarity, utility, and comprehensiveness. Evaluations were conducted within internal medicine, emergency medicine, and ethics. Comparisons were made globally, between seniors and residents, and across classifications. RESULTS Both GPT models received high mean scores (4.4, SD 0.8 for GPT-4 and 4.1, SD 1.0 for GPT-3.5). GPT-4 outperformed GPT-3.5 across all rating dimensions, with seniors consistently rating responses higher than residents for both models. Specifically, seniors rated GPT-4 as more beneficial and complete (mean 4.6 vs 4.0 and 4.6 vs 4.1, respectively; P<.001), and GPT-3.5 similarly (mean 4.1 vs 3.7 and 3.9 vs 3.5, respectively; P<.001). Ethical queries received the highest ratings for both models, with mean scores reflecting consistency across accuracy and completeness criteria. Distinctions among question types were significant, particularly for the GPT-4 mean scores in completeness across emergency, internal, and ethical questions (4.2, SD 1.0; 4.3, SD 0.8; and 4.5, SD 0.7, respectively; P<.001), and for GPT-3.5's accuracy, beneficial, and completeness dimensions. CONCLUSIONS ChatGPT's potential to assist physicians with medical issues is promising, with prospects to enhance diagnostics, treatments, and ethics. 
While integration into clinical workflows may be valuable, it must complement, not replace, human expertise. Continued research is essential to ensure safe and effective implementation in clinical environments.
Affiliation(s)
- Adi Lahat
- Department of Gastroenterology, Chaim Sheba Medical Center, Affiliated with Tel Aviv University, Ramat Gan, Israel
- Department of Gastroenterology, Samson Assuta Ashdod Medical Center, Affiliated with Ben Gurion University of the Negev, Be'er Sheva, Israel
- Kassem Sharif
- Department of Gastroenterology, Chaim Sheba Medical Center, Affiliated with Tel Aviv University, Ramat Gan, Israel
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Narmin Zoabi
- Department of Gastroenterology, Chaim Sheba Medical Center, Affiliated with Tel Aviv University, Ramat Gan, Israel
- Yousra Sharif
- Department of Internal Medicine C, Hadassah Medical Center, Jerusalem, Israel
- Lior Fisher
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Uria Shani
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Mohamad Arow
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Roni Levin
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Eyal Klang
- Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, United States
12
Hussain T, Wang D, Li B. The influence of the COVID-19 pandemic on the adoption and impact of AI ChatGPT: Challenges, applications, and ethical considerations. Acta Psychol (Amst) 2024; 246:104264. [PMID: 38626597 DOI: 10.1016/j.actpsy.2024.104264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2023] [Revised: 04/08/2024] [Accepted: 04/09/2024] [Indexed: 04/18/2024] Open
Abstract
DESIGN/METHODOLOGY/APPROACH This article employs qualitative thematic modeling to gather insights from 30 informants. The study explores various aspects of the COVID-19 pandemic's impact on AI ChatGPT technologies. PURPOSE The purpose of this research is to examine how the COVID-19 pandemic has influenced the increased usage and adoption of AI ChatGPT. It aims to explore the pandemic's impact on AI ChatGPT and its applications in specific domains, as well as the challenges and opportunities it presents. FINDINGS The findings highlight that the pandemic has led to a surge in online activities, resulting in a heightened demand for AI ChatGPT. It has been widely used in areas such as healthcare, mental health support, remote collaboration, and personalized customer experiences. The article showcases examples of AI ChatGPT's application during the pandemic. STRENGTH OF STUDY This qualitative framework enables the study to delve deeply into the multifaceted dimensions of AI ChatGPT's role during the pandemic, capturing the diverse experiences and insights of users, practitioners, and experts. By embracing the qualitative nature of inquiry, this research offers a comprehensive understanding of the challenges, opportunities, and ethical considerations associated with the adoption and utilization of AI ChatGPT in crisis contexts. PRACTICAL IMPLICATIONS The insights from this research have practical implications for policymakers, developers, and researchers. This research emphasizes the need for responsible and ethical implementation of AI ChatGPT to fully harness its potential in addressing societal needs during and beyond the pandemic. SOCIAL IMPLICATIONS The increased reliance on AI ChatGPT during the pandemic has led to changes in user behavior, expectations, and interactions. However, it has also unveiled ethical considerations and potential risks. Addressing societal and ethical concerns, such as user impact and autonomy, privacy and security, bias and fairness, and transparency and accountability, is crucial for the responsible deployment of AI ChatGPT. ORIGINALITY/VALUE This research contributes to the understanding of the novel role of AI ChatGPT in times of crisis, particularly during the COVID-19 pandemic. It highlights the necessity of responsible and ethical implementation of AI ChatGPT and provides valuable insights for the development and application of AI technology in the future.
Affiliation(s)
- Talib Hussain
- School of Media and Communication, Shanghai Jiao Tong University, 800 Dongchuan Road, 200240 Shanghai, China; Department of Media Management, University of Religions and Denominations, Qom 37491-13357, Iran.
- Dake Wang
- School of Media and Communication, Shanghai Jiao Tong University, 800 Dongchuan Road, 200240 Shanghai, China.
- Benqian Li
- School of Media and Communication, Shanghai Jiao Tong University, 800 Dongchuan Road, 200240 Shanghai, China.
13
Denecke K, May R, Rivera Romero O. Potential of Large Language Models in Health Care: Delphi Study. J Med Internet Res 2024; 26:e52399. [PMID: 38739445 PMCID: PMC11130776 DOI: 10.2196/52399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2023] [Revised: 10/10/2023] [Accepted: 04/19/2024] [Indexed: 05/14/2024] Open
Abstract
BACKGROUND A large language model (LLM) is a machine learning model inferred from text data that captures subtle patterns of language use in context. Modern LLMs are based on neural network architectures that incorporate transformer methods. They allow the model to relate words together through attention to multiple words in a text sequence. LLMs have been shown to be highly effective for a range of tasks in natural language processing (NLP), including classification and information extraction tasks and generative applications. OBJECTIVE The aim of this adapted Delphi study was to collect researchers' opinions on how LLMs might influence health care and on the strengths, weaknesses, opportunities, and threats of LLM use in health care. METHODS We invited researchers in the fields of health informatics, nursing informatics, and medical NLP to share their opinions on LLM use in health care. We started the first round with open questions based on our strengths, weaknesses, opportunities, and threats framework. In the second and third round, the participants scored these items. RESULTS The first, second, and third rounds had 28, 23, and 21 participants, respectively. Almost all participants (26/28, 93% in round 1 and 20/21, 95% in round 3) were affiliated with academic institutions. Agreement was reached on 103 items related to use cases, benefits, risks, reliability, adoption aspects, and the future of LLMs in health care. Participants offered several use cases, including supporting clinical tasks, documentation tasks, and medical research and education, and agreed that LLM-based systems will act as health assistants for patient education. 
The agreed-upon benefits included increased efficiency in data handling and extraction, improved automation of processes, improved quality of health care services and overall health outcomes, provision of personalized care, accelerated diagnosis and treatment processes, and improved interaction between patients and health care professionals. In total, 5 risks to health care in general were identified: cybersecurity breaches, the potential for patient misinformation, ethical concerns, the likelihood of biased decision-making, and the risk associated with inaccurate communication. Overconfidence in LLM-based systems was recognized as a risk to the medical profession. The 6 agreed-upon privacy risks included the use of unregulated cloud services that compromise data security, exposure of sensitive patient data, breaches of confidentiality, fraudulent use of information, vulnerabilities in data storage and communication, and inappropriate access or use of patient data. CONCLUSIONS Future research related to LLMs should not only focus on testing their possibilities for NLP-related tasks but also consider the workflows the models could contribute to and the requirements regarding quality, integration, and regulations needed for successful implementation in practice.
Affiliation(s)
- Richard May
- Harz University of Applied Sciences, Wernigerode, Germany
- Octavio Rivera Romero
- Instituto de Ingeniería Informática (I3US), Universidad de Sevilla, Sevilla, Spain
- Department of Electronic Technology, Universidad de Sevilla, Sevilla, Spain
14
Pinto DS, Noronha SM, Saigal G, Quencer RM. Comparison of an AI-Generated Case Report With a Human-Written Case Report: Practical Considerations for AI-Assisted Medical Writing. Cureus 2024; 16:e60461. [PMID: 38883028 PMCID: PMC11179998 DOI: 10.7759/cureus.60461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/15/2024] [Indexed: 06/18/2024] Open
Abstract
INTRODUCTION The utility of ChatGPT has recently caused consternation in the medical world. While it has been utilized to write manuscripts, only a few studies have evaluated the quality of manuscripts generated by AI (artificial intelligence). OBJECTIVE We evaluate the ability of ChatGPT to write a case report when provided with a framework. We also provide practical considerations for manuscript writing using AI. METHODS We compared a manuscript written by a blinded human author (10 years of medical experience) with a manuscript written by ChatGPT on a rare presentation of a common disease. We used multiple iterations of the manuscript generation request to derive the best ChatGPT output. PARTICIPANTS, OUTCOMES, AND MEASURES 22 human reviewers compared the manuscripts using parameters that characterize human writing and relevant standard manuscript assessment criteria, viz., the scholarly impact quotient (SIQ). We also compared the manuscripts using the "average perplexity score" (APS), "burstiness score" (BS), and "highest perplexity of a sentence" (GPTZero parameters used to detect AI-generated content). RESULTS The human manuscript had a significantly higher quality of presentation and more nuanced writing (p<0.05). Both manuscripts had a logical flow. 12/22 reviewers were able to identify the AI-generated manuscript (p<0.05), but 4/22 reviewers wrongly identified the human-written manuscript as AI-generated. GPTZero software erroneously identified four sentences of the human-written manuscript as AI-generated. CONCLUSION Though AI showed an ability to highlight the novelty of the case report and project a logical flow comparable to the human manuscript, it could not outperform the human writer on all parameters. The human manuscript showed a better quality of presentation and more nuanced writing. The practical considerations we provide for AI-assisted medical writing will help to better utilize AI in manuscript writing.
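GPTZero's actual metrics (APS, BS) are computed with a proprietary neural language model, so they cannot be reproduced here. Purely as an informal illustration of the underlying concepts, perplexity can be sketched with a toy word-unigram model, and "burstiness" as the spread of per-sentence perplexities; every modeling choice below is our simplification, not GPTZero's method.

```python
import math
from collections import Counter

def unigram_perplexity(sentence: str, corpus: str) -> float:
    """Perplexity of a sentence under a toy word-unigram model fit on `corpus`.

    Real detectors use large neural LMs; this unigram stand-in with add-one
    smoothing is only meant to show what 'perplexity' measures.
    """
    words = corpus.lower().split()
    counts = Counter(words)
    total, vocab = len(words), len(counts)
    tokens = sentence.lower().split()
    log_prob = 0.0
    for w in tokens:
        # add-one smoothing gives unseen words a nonzero probability
        p = (counts[w] + 1) / (total + vocab)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(tokens))

def burstiness(perplexities: list) -> float:
    """Population standard deviation of per-sentence perplexities.

    Uniformly low variation across sentences is the informal intuition
    behind 'burstiness' as a weak machine-text signal.
    """
    mean = sum(perplexities) / len(perplexities)
    return math.sqrt(sum((p - mean) ** 2 for p in perplexities) / len(perplexities))
```

The intuition the study leans on: human writing tends to mix short predictable sentences with long surprising ones (high burstiness), whereas LLM output is often uniformly mid-perplexity, which is also why such detectors misfire on fluent human prose, as the four false positives above show.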
Affiliation(s)
- Gaurav Saigal
- Radiology, University of Miami Miller School of Medicine, Miami, USA
- Robert M Quencer
- Radiology, University of Miami Miller School of Medicine, Miami, USA
15
Lang S, Vitale J, Fekete TF, Haschtmann D, Reitmeir R, Ropelato M, Puhakka J, Galbusera F, Loibl M. Are large language models valid tools for patient information on lumbar disc herniation? The spine surgeons' perspective. BRAIN & SPINE 2024; 4:102804. [PMID: 38706800 PMCID: PMC11067000 DOI: 10.1016/j.bas.2024.102804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/27/2023] [Revised: 02/19/2024] [Accepted: 04/04/2024] [Indexed: 05/07/2024]
Abstract
Introduction Generative AI is revolutionizing patient education in healthcare, particularly through chatbots that offer personalized, clear medical information. Reliability and accuracy are vital in AI-driven patient education. Research question How effective are Large Language Models (LLM), such as ChatGPT and Google Bard, in delivering accurate and understandable patient education on lumbar disc herniation? Material and methods Ten Frequently Asked Questions about lumbar disc herniation were selected from 133 questions and were submitted to three LLMs. Six experienced spine surgeons rated the responses on a scale from "excellent" to "unsatisfactory," and evaluated the answers for exhaustiveness, clarity, empathy, and length. Statistical analysis involved Fleiss Kappa, Chi-square, and Friedman tests. Results Out of the responses, 27.2% were excellent, 43.9% satisfactory with minimal clarification, 18.3% satisfactory with moderate clarification, and 10.6% unsatisfactory. There were no significant differences in overall ratings among the LLMs (p = 0.90); however, inter-rater reliability was not achieved, and large differences among raters were detected in the distribution of answer frequencies. Overall, ratings varied among the 10 answers (p = 0.043). The average ratings for exhaustiveness, clarity, empathy, and length were above 3.5/5. Discussion and conclusion LLMs show potential in patient education for lumbar spine surgery, with generally positive feedback from evaluators. The new EU AI Act, enforcing strict regulation on AI systems, highlights the need for rigorous oversight in medical contexts. In the current study, the variability in evaluations and occasional inaccuracies underline the need for continuous improvement. Future research should involve more advanced models to enhance patient-physician communication.
Affiliation(s)
- Siegmund Lang
- Department of Trauma Surgery, University Hospital Regensburg, Regensburg, Germany
- Jacopo Vitale
- Spine Center, Schulthess Klinik, Zurich, Switzerland
- Jani Puhakka
- Spine Center, Schulthess Klinik, Zurich, Switzerland
- Markus Loibl
- Spine Center, Schulthess Klinik, Zurich, Switzerland
16
Valentini M, Szkandera J, Smolle MA, Scheipl S, Leithner A, Andreou D. Artificial intelligence large language model ChatGPT: is it a trustworthy and reliable source of information for sarcoma patients? Front Public Health 2024; 12:1303319. [PMID: 38584922 PMCID: PMC10995284 DOI: 10.3389/fpubh.2024.1303319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2023] [Accepted: 03/06/2024] [Indexed: 04/09/2024] Open
Abstract
Introduction Since its introduction in November 2022, the artificial intelligence large language model ChatGPT has taken the world by storm. Among other applications it can be used by patients as a source of information on diseases and their treatments. However, little is known about the quality of the sarcoma-related information ChatGPT provides. We therefore aimed at analyzing how sarcoma experts evaluate the quality of ChatGPT's responses on sarcoma-related inquiries and assess the bot's answers in specific evaluation metrics. Methods The ChatGPT responses to a sample of 25 sarcoma-related questions (5 definitions, 9 general questions, and 11 treatment-related inquiries) were evaluated by 3 independent sarcoma experts. Each response was compared with authoritative resources and international guidelines and graded on 5 different metrics using a 5-point Likert scale: completeness, misleadingness, accuracy, being up-to-date, and appropriateness. This resulted in maximum 25 and minimum 5 points per answer, with higher scores indicating a higher response quality. Scores ≥21 points were rated as very good, between 16 and 20 as good, while scores ≤15 points were classified as poor (11-15) and very poor (≤10). Results The median score that ChatGPT's answers achieved was 18.3 points (IQR, i.e., Inter-Quartile Range, 12.3-20.3 points). Six answers were classified as very good, 9 as good, while 5 answers each were rated as poor and very poor. The best scores were documented in the evaluation of how appropriate the response was for patients (median, 3.7 points; IQR, 2.5-4.2 points), which were significantly higher compared to the accuracy scores (median, 3.3 points; IQR, 2.0-4.2 points; p = 0.035). ChatGPT fared considerably worse with treatment-related questions, with only 45% of its responses classified as good or very good, compared to general questions (78% of responses good/very good) and definitions (60% of responses good/very good). 
Discussion The answers ChatGPT provided on a rare disease, such as sarcoma, were found to be of very inconsistent quality, with some answers being classified as very good and others as very poor. Sarcoma physicians should be aware of the risks of misinformation that ChatGPT poses and advise their patients accordingly.
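The banded scoring rubric above (5-25 points per answer) maps scores to categories mechanically. A minimal sketch of that mapping follows; how the study assigned fractional between-band scores (e.g., expert averages such as 20.3) is not stated, so the lower-band rule here is our assumption.

```python
def classify(score: float) -> str:
    """Map a 5-25 point answer score to the quality bands in the abstract:
    >=21 very good, 16-20 good, 11-15 poor, <=10 very poor.
    Fractional scores falling between bands drop to the lower band here
    (an assumption; the study does not specify this case).
    """
    if not 5 <= score <= 25:
        raise ValueError("score must be between 5 and 25")
    if score >= 21:
        return "very good"
    if score >= 16:
        return "good"
    if score >= 11:
        return "poor"
    return "very poor"
```

Under this rule the reported median of 18.3 points lands in the "good" band, consistent with the abstract's overall characterization.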
Affiliation(s)
- Marisa Valentini
- Department of Orthopaedics and Trauma, Medical University of Graz, Graz, Austria
- Joanna Szkandera
- Division of Oncology, Department of Internal Medicine, Medical University of Graz, Graz, Austria
- Maria Anna Smolle
- Department of Orthopaedics and Trauma, Medical University of Graz, Graz, Austria
- Susanne Scheipl
- Department of Orthopaedics and Trauma, Medical University of Graz, Graz, Austria
- Andreas Leithner
- Department of Orthopaedics and Trauma, Medical University of Graz, Graz, Austria
- Dimosthenis Andreou
- Department of Orthopaedics and Trauma, Medical University of Graz, Graz, Austria
17
Mu Y, He D. The Potential Applications and Challenges of ChatGPT in the Medical Field. Int J Gen Med 2024; 17:817-826. [PMID: 38476626 PMCID: PMC10929156 DOI: 10.2147/ijgm.s456659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Accepted: 02/26/2024] [Indexed: 03/14/2024] Open
Abstract
ChatGPT, an AI-driven conversational large language model (LLM), has garnered significant scholarly attention since its inception, owing to its manifold applications in the realm of medical science. This study primarily examines the merits, limitations, anticipated developments, and practical applications of ChatGPT in clinical practice, healthcare, medical education, and medical research. It underscores the necessity for further research and development to enhance its performance and deployment. Moreover, future research avenues encompass ongoing enhancements and standardization of ChatGPT, mitigating its limitations, and exploring its integration and applicability in translational and personalized medicine. Reflecting the narrative nature of this review, a focused literature search was performed to identify relevant publications on ChatGPT's use in medicine. This process was aimed at gathering a broad spectrum of insights to provide a comprehensive overview of the current state and future prospects of ChatGPT in the medical domain. The objective is to aid healthcare professionals in understanding the groundbreaking advancements associated with the latest artificial intelligence tools, while also acknowledging the opportunities and challenges presented by ChatGPT.
Affiliation(s)
- Yonglin Mu
- Department of Urology, Children’s Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
- Dawei He
- Department of Urology, Children’s Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
18
Bellini V, Semeraro F, Montomoli J, Cascella M, Bignami E. Between human and AI: assessing the reliability of AI text detection tools. Curr Med Res Opin 2024; 40:353-358. [PMID: 38265047 DOI: 10.1080/03007995.2024.2310086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/10/2023] [Accepted: 01/22/2024] [Indexed: 01/25/2024]
Abstract
OBJECTIVE Large language models (LLMs) such as ChatGPT-4 have raised critical questions regarding their distinguishability from human-generated content. In this research, we evaluated the effectiveness of online detection tools in identifying ChatGPT-4 vs human-written text. METHODS Two texts produced by ChatGPT-4 using differing prompts and one text created by a human author were assessed using the following online detection tools: GPTZero, ZeroGPT, Writer ACD, and Originality. RESULTS The findings revealed notable variance in the detection capabilities of the employed tools. GPTZero and ZeroGPT exhibited inconsistent assessments regarding the AI origin of the texts. Writer ACD predominantly identified texts as human-written, whereas Originality consistently recognized the AI-generated content in both samples from ChatGPT-4, highlighting Originality's enhanced sensitivity to patterns characteristic of AI-generated text. CONCLUSION The study demonstrates that while automatic detection tools may discern texts generated by ChatGPT-4, significant variability exists in their accuracy. There is an urgent need for advanced detection tools to ensure the authenticity and integrity of content, especially in scientific and academic research, and for more refined detection methodologies to prevent the misdetection of human-written content as AI-generated and vice versa.
Affiliation(s)
- Valentina Bellini
- Anesthesiology, Critical Care and Pain Medicine Division, Department of Medicine and Surgery, University of Parma, Parma, Italy
- Federico Semeraro
- Department of Anesthesia, Intensive Care and Prehospital Emergency, Maggiore Hospital Carlo Alberto Pizzardi, Bologna, Italy
- Jonathan Montomoli
- Department of Anesthesia and Intensive Care, Infermi Hospital, Romagna Local Health Authority, Rimini, Italy
- Marco Cascella
- Anesthesia and Pain Medicine, Department of Medicine, Surgery and Dentistry "Scuola Medica Salernitana", University of Salerno, Baronissi, Italy
- Elena Bignami
- Anesthesiology, Critical Care and Pain Medicine Division, Department of Medicine and Surgery, University of Parma, Parma, Italy
19
Abi-Rafeh J, Xu HH, Kazan R, Tevlin R, Furnas H. Large Language Models and Artificial Intelligence: A Primer for Plastic Surgeons on the Demonstrated and Potential Applications, Promises, and Limitations of ChatGPT. Aesthet Surg J 2024; 44:329-343. [PMID: 37562022 DOI: 10.1093/asj/sjad260] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Revised: 08/02/2023] [Accepted: 08/04/2023] [Indexed: 08/12/2023] Open
Abstract
BACKGROUND The rapidly evolving field of artificial intelligence (AI) holds great potential for plastic surgeons. ChatGPT, a recently released AI large language model (LLM), promises applications across many disciplines, including healthcare. OBJECTIVES The aim of this article was to provide a primer for plastic surgeons on AI, LLMs, and ChatGPT, including an analysis of current demonstrated and proposed clinical applications. METHODS A systematic review was performed identifying medical and surgical literature on ChatGPT's proposed clinical applications. Variables assessed included applications investigated, command tasks provided, user input information, AI-emulated human skills, output validation, and reported limitations. RESULTS The analysis included 175 articles reporting on 13 plastic surgery applications and 116 additional clinical applications, categorized by field and purpose. Thirty-four applications within plastic surgery are thus proposed, with relevance to different target audiences, including attending plastic surgeons (n = 17, 50%), trainees/educators (n = 8, 24%), researchers/scholars (n = 7, 21%), and patients (n = 2, 6%). The 15 identified limitations of ChatGPT were categorized by training data, algorithm, and ethical considerations. CONCLUSIONS Widespread use of ChatGPT in plastic surgery will depend on rigorous research of proposed applications to validate performance and address limitations. This systematic review aims to guide research, development, and regulation to safely adopt AI in plastic surgery.
20
Ajagunde J, Das NK. ChatGPT Versus Medical Professionals. Health Serv Insights 2024; 17:11786329241230161. [PMID: 38322596 PMCID: PMC10845989 DOI: 10.1177/11786329241230161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2024] Open
Affiliation(s)
- Jyoti Ajagunde
- Department of Microbiology, Dr. D Y Patil Medical College, Dr. D Y Patil Vidyapeeth, Pimpri, Pune, Maharashtra, India
- Nikunja Kumar Das
- Department of Microbiology, Dr. D Y Patil Medical College, Dr. D Y Patil Vidyapeeth, Pimpri, Pune, Maharashtra, India
21
Yan S, Du D, Liu X, Dai Y, Kim MK, Zhou X, Wang L, Zhang L, Jiang X. Assessment of the Reliability and Clinical Applicability of ChatGPT's Responses to Patients' Common Queries About Rosacea. Patient Prefer Adherence 2024; 18:249-253. [PMID: 38313827 PMCID: PMC10838492 DOI: 10.2147/ppa.s444928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/16/2023] [Accepted: 01/22/2024] [Indexed: 02/06/2024] Open
Abstract
Objective The artificial intelligence chatbot ChatGPT (Chat Generative Pre-trained Transformer) is capable of analyzing human input and generating human-like responses, which shows its potential application in healthcare. People with rosacea often have questions about alleviating symptoms and daily skin care, questions well suited for ChatGPT to answer. This study aims to assess the reliability and clinical applicability of ChatGPT 3.5 in responding to patients' common queries about rosacea and to evaluate the extent of ChatGPT's coverage of dermatology resources. Methods Based on a qualitative analysis of the literature on queries from rosacea patients, we extracted the 20 questions of greatest concern to patients, covering four main categories: treatment, triggers and diet, skincare, and special manifestations of rosacea. Each question was input into ChatGPT separately for three rounds of question-and-answer conversations. The generated answers were evaluated by three experienced dermatologists with postgraduate degrees and over five years of clinical experience in dermatology, who assessed their reliability and applicability for clinical practice. Results The reviewers unanimously agreed that ChatGPT achieved a high reliability of 92.22% to 97.78% in responding to patients' common queries about rosacea. Additionally, almost all answers were applicable for supporting rosacea patient education, with clinical applicability ranging from 98.61% to 100.00%. The consistency of the expert ratings was excellent (all significance levels were less than 0.05), with a consistency coefficient of 0.404 for content reliability and 0.456 for clinical practicality, indicating significant consistency in the results and a high level of agreement among the expert ratings. Conclusion ChatGPT 3.5 exhibits excellent reliability and clinical applicability in responding to patients' common queries about rosacea.
This artificial intelligence tool is applicable for supporting rosacea patient education.
Collapse
Affiliation(s)
- Sihan Yan
- Department of Dermatology, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Laboratory of Dermatology, Clinical Institute of Inflammation and Immunology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Dan Du
- Department of Dermatology, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Laboratory of Dermatology, Clinical Institute of Inflammation and Immunology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Xu Liu
- Department of Dermatology, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Laboratory of Dermatology, Clinical Institute of Inflammation and Immunology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Yingying Dai
- Department of Dermatology, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Laboratory of Dermatology, Clinical Institute of Inflammation and Immunology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Min-Kyu Kim
- Department of Dermatology, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Laboratory of Dermatology, Clinical Institute of Inflammation and Immunology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Xinyu Zhou
- Department of Dermatology, Nanbu County People’s Hospital, Nanbu County, Nanchong, Sichuan, People’s Republic of China
- Lian Wang
- Department of Dermatology, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Laboratory of Dermatology, Clinical Institute of Inflammation and Immunology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Lu Zhang
- Department of Dermatology, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Laboratory of Dermatology, Clinical Institute of Inflammation and Immunology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Xian Jiang
- Department of Dermatology, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
- Laboratory of Dermatology, Clinical Institute of Inflammation and Immunology, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, People’s Republic of China
22
Zaleski AL, Berkowsky R, Craig KJT, Pescatello LS. Comprehensiveness, Accuracy, and Readability of Exercise Recommendations Provided by an AI-Based Chatbot: Mixed Methods Study. JMIR MEDICAL EDUCATION 2024; 10:e51308. [PMID: 38206661 PMCID: PMC10811574 DOI: 10.2196/51308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 10/05/2023] [Accepted: 12/11/2023] [Indexed: 01/12/2024]
Abstract
BACKGROUND Regular physical activity is critical for health and disease prevention. Yet, health care providers and patients face barriers to implementing evidence-based lifestyle recommendations. The potential to augment care with the increased availability of artificial intelligence (AI) technologies is limitless; however, the suitability of AI-generated exercise recommendations has yet to be explored. OBJECTIVE The purpose of this study was to assess the comprehensiveness, accuracy, and readability of individualized exercise recommendations generated by a novel AI chatbot. METHODS A coding scheme was developed to score AI-generated exercise recommendations across ten categories informed by gold-standard exercise recommendations: (1) health condition-specific benefits of exercise, (2) exercise preparticipation health screening, (3) frequency, (4) intensity, (5) time, (6) type, (7) volume, (8) progression, (9) special considerations, and (10) references to the primary literature. The AI chatbot was prompted to provide individualized exercise recommendations for 26 clinical populations using an open-source application programming interface. Two independent reviewers coded AI-generated content for each category and calculated comprehensiveness (%) and factual accuracy (%) on a scale of 0%-100%. Readability was assessed using the Flesch-Kincaid formula. Qualitative analysis identified and categorized themes from AI-generated output. RESULTS AI-generated exercise recommendations were 41.2% (107/260) comprehensive and 90.7% (146/161) accurate, with the majority (8/15, 53%) of inaccuracy related to the need for exercise preparticipation medical clearance. The average readability level of AI-generated exercise recommendations was at the college level (mean 13.7, SD 1.7), with an average Flesch reading ease score of 31.1 (SD 7.7).
Several recurring themes and observations of AI-generated output included concern for liability and safety, preference for aerobic exercise, and potential bias and direct discrimination against certain age-based populations and individuals with disabilities. CONCLUSIONS There were notable gaps in the comprehensiveness, accuracy, and readability of AI-generated exercise recommendations. Exercise and health care professionals should be aware of these limitations when using and endorsing AI-based technologies as a tool to support lifestyle change involving exercise.
Affiliation(s)
- Amanda L Zaleski
- Clinical Evidence Development, Aetna Medical Affairs, CVS Health Corporation, Hartford, CT, United States
- Department of Preventive Cardiology, Hartford Hospital, Hartford, CT, United States
- Rachel Berkowsky
- Department of Kinesiology, University of Connecticut, Storrs, CT, United States
- Kelly Jean Thomas Craig
- Clinical Evidence Development, Aetna Medical Affairs, CVS Health Corporation, Hartford, CT, United States
- Linda S Pescatello
- Department of Kinesiology, University of Connecticut, Storrs, CT, United States
23
Younis HA, Eisa TAE, Nasser M, Sahib TM, Noor AA, Alyasiri OM, Salisu S, Hayder IM, Younis HA. A Systematic Review and Meta-Analysis of Artificial Intelligence Tools in Medicine and Healthcare: Applications, Considerations, Limitations, Motivation and Challenges. Diagnostics (Basel) 2024; 14:109. [PMID: 38201418 PMCID: PMC10802884 DOI: 10.3390/diagnostics14010109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 12/02/2023] [Accepted: 12/04/2023] [Indexed: 01/12/2024] Open
Abstract
Artificial intelligence (AI) has emerged as a transformative force in various sectors, including medicine and healthcare. Large language models like ChatGPT showcase AI's potential by generating human-like text from prompts. ChatGPT's adaptability holds promise for reshaping medical practices, improving patient care, and enhancing interactions among healthcare professionals, patients, and data. In pandemic management, ChatGPT rapidly disseminates vital information. It serves as a virtual assistant in surgical consultations, assists dental practices, simplifies medical education, and aids disease diagnosis. A systematic literature review using the PRISMA approach explored AI's transformative potential in healthcare, highlighting ChatGPT's versatile applications, limitations, motivations, and challenges. A total of 82 papers were categorised into eight major areas: G1: treatment and medicine; G2: buildings and equipment; G3: parts of the human body and areas of disease; G4: patients; G5: citizens; G6: cellular imaging, radiology, pulse and medical images; G7: doctors and nurses; and G8: tools, devices and administration. Balancing AI's role with human judgment remains a challenge. In conclusion, ChatGPT's diverse medical applications demonstrate its potential for innovation, serving as a valuable resource for students, academics, and researchers in healthcare. Additionally, this study serves as a guide, assisting students, academics, and researchers in the field of medicine and healthcare alike.
Affiliation(s)
- Hussain A. Younis
- College of Education for Women, University of Basrah, Basrah 61004, Iraq
- Maged Nasser
- Computer & Information Sciences Department, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Malaysia
- Thaeer Mueen Sahib
- Kufa Technical Institute, Al-Furat Al-Awsat Technical University, Kufa 54001, Iraq
- Ameen A. Noor
- Computer Science Department, College of Education, University of Almustansirya, Baghdad 10045, Iraq
- Sani Salisu
- Department of Information Technology, Federal University Dutse, Dutse 720101, Nigeria
- Israa M. Hayder
- Qurna Technique Institute, Southern Technical University, Basrah 61016, Iraq
- Hameed AbdulKareem Younis
- Department of Cybersecurity, College of Computer Science and Information Technology, University of Basrah, Basrah 61016, Iraq
24
Malik S, Zaheer S. ChatGPT as an aid for pathological diagnosis of cancer. Pathol Res Pract 2024; 253:154989. [PMID: 38056135 DOI: 10.1016/j.prp.2023.154989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 11/26/2023] [Accepted: 11/27/2023] [Indexed: 12/08/2023]
Abstract
Diagnostic workup of cancer patients relies heavily on the science of pathology, using cytopathology, histopathology, and other ancillary techniques such as immunohistochemistry and molecular cytogenetics. Data processing and learning by means of artificial intelligence (AI) has become a spearhead for the advancement of medicine, with pathology and laboratory medicine being no exceptions. ChatGPT, an AI-based chatbot recently launched by OpenAI, is currently the talk of the town, and its role in cancer diagnosis is also being explored meticulously. Integrating digital slides into the pathology workflow, implementing advanced algorithms, and applying computer-aided diagnostic techniques extend the frontiers of the pathologist's view beyond the microscopic slide and enable effective integration, assimilation, and utilization of knowledge beyond human limits and boundaries. Despite its numerous advantages in the pathological diagnosis of cancer, it comes with several challenges, such as the integration of digital slides with input language parameters, problems of bias, and legal issues, which must be addressed soon so that we as pathologists diagnosing malignancies stay on the same bandwagon and don't miss the train.
Affiliation(s)
- Shaivy Malik
- Department of Pathology, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi, India
- Sufian Zaheer
- Department of Pathology, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi, India
25
Alotaibi SS, Rehman A, Hasnain M. Revolutionizing ocular cancer management: a narrative review on exploring the potential role of ChatGPT. Front Public Health 2023; 11:1338215. [PMID: 38192545 PMCID: PMC10773849 DOI: 10.3389/fpubh.2023.1338215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 12/04/2023] [Indexed: 01/10/2024] Open
Abstract
This paper pioneers the exploration of ocular cancer and its management with the help of artificial intelligence (AI) technology. Existing literature reports a significant increase in new eye cancer cases and a higher incidence rate in 2023. Extensive research was conducted using online databases such as PubMed, ACM Digital Library, ScienceDirect, and Springer. The Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines were used to conduct this review. Of the 62 studies collected, only 20 documents met the inclusion criteria. The review identifies seven ocular cancer types. Important challenges associated with ocular cancer are highlighted, including limited awareness about eye cancer, restricted healthcare access, financial barriers, and insufficient infrastructure support. Financial barriers are among the most widely examined ocular cancer challenges in the literature. The potential role and limitations of ChatGPT are discussed, emphasizing its usefulness in providing general information to physicians while noting its inability to deliver up-to-date information. The paper concludes by presenting potential future applications of ChatGPT to advance research on ocular cancer globally.
Affiliation(s)
- Saud S. Alotaibi
- Information Systems Department, Umm Al-Qura University, Makkah, Saudi Arabia
- Amna Rehman
- Department of Computer Science, Lahore Leads University, Lahore, Pakistan
- Muhammad Hasnain
- Department of Computer Science, Lahore Leads University, Lahore, Pakistan
26
Chatterjee S, Bhattacharya M, Pal S, Lee SS, Chakraborty C. ChatGPT and large language models in orthopedics: from education and surgery to research. J Exp Orthop 2023; 10:128. [PMID: 38038796 PMCID: PMC10692045 DOI: 10.1186/s40634-023-00700-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 11/16/2023] [Indexed: 12/02/2023] Open
Abstract
ChatGPT has quickly gained popularity since its release in November 2022. Currently, large language models (LLMs) and ChatGPT have been applied in various domains of medical science, including cardiology, nephrology, orthopedics, ophthalmology, gastroenterology, and radiology. Researchers are exploring the potential of LLMs and ChatGPT for clinicians and surgeons in every domain. This study discusses how ChatGPT can help orthopedic clinicians and surgeons perform various medical tasks. LLMs and ChatGPT can help the patient community by providing suggestions and diagnostic guidelines. In this study, the use of LLMs and ChatGPT to enhance and expand the field of orthopedics, including orthopedic education, surgery, and research, is explored. Present LLMs have several shortcomings, which are discussed herein. However, next-generation and future domain-specific LLMs are expected to be more potent and to transform patients' quality of life.
Affiliation(s)
- Srijan Chatterjee
- Institute for Skeletal Aging & Orthopaedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-Si, 24252, Gangwon-Do, Republic of Korea
- Manojit Bhattacharya
- Department of Zoology, Fakir Mohan University, Vyasa Vihar, Balasore, 756020, Odisha, India
- Soumen Pal
- School of Mechanical Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Sang-Soo Lee
- Institute for Skeletal Aging & Orthopaedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-Si, 24252, Gangwon-Do, Republic of Korea
- Chiranjib Chakraborty
- Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, West Bengal, 700126, India
27
Au K, Yang W. Auxiliary use of ChatGPT in surgical diagnosis and treatment. Int J Surg 2023; 109:3940-3943. [PMID: 37678271 PMCID: PMC10720849 DOI: 10.1097/js9.0000000000000686] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 05/24/2023] [Accepted: 08/09/2023] [Indexed: 09/09/2023]
Abstract
ChatGPT can be used as an auxiliary tool in surgical diagnosis and treatment in several ways. One of its most valuable features is its ability to quickly process large amounts of data and provide relatively accurate information to healthcare workers. Owing to its high accuracy and ability to process big data, ChatGPT has been widely used in the healthcare industry for tasks such as assisting medical diagnosis, predicting certain diseases, and analyzing medical cases. In surgical diagnosis and treatment, it can serve as an auxiliary tool that helps healthcare professionals process large amounts of medical data, provides real-time guidance and feedback, and increases the overall speed and quality of healthcare. Although it enjoys broad acceptance, it still faces issues involving ethics, patient privacy, data security, law, trustworthiness, and accuracy. This study aimed to explore the auxiliary use of ChatGPT in surgical diagnosis and treatment.
Affiliation(s)
- Kahei Au
- School of Medicine, Jinan University
- Wah Yang
- Department of Metabolic and Bariatric Surgery, The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong Province, People’s Republic of China
28
Abu-Farha R, Fino L, Al-Ashwal FY, Zawiah M, Gharaibeh L, Harahsheh MM, Darwish Elhajji F. Evaluation of community pharmacists' perceptions and willingness to integrate ChatGPT into their pharmacy practice: A study from Jordan. J Am Pharm Assoc (2003) 2023; 63:1761-1767.e2. [PMID: 37648157 DOI: 10.1016/j.japh.2023.08.020] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/21/2023] [Revised: 08/10/2023] [Accepted: 08/22/2023] [Indexed: 09/01/2023]
Abstract
OBJECTIVES This study aimed to examine the extent of community pharmacists' awareness of Chat Generative Pretraining Transformer (ChatGPT), their willingness to embrace this new development in artificial intelligence (AI), and the barriers facing the incorporation of this nonconventional source of information into pharmacy practice. METHODS A cross-sectional study was conducted among community pharmacists in Jordanian cities between April 26, 2023, and May 10, 2023. Convenience and snowball sampling techniques were used to select study participants owing to resource and time constraints. The questionnaire was distributed by research assistants through popular social media platforms. Logistic regression analysis was used to assess predictors affecting pharmacists' willingness to use this service in the future. RESULTS A total of 221 community pharmacists participated in the study (the response rate was not calculated because opt-in recruitment strategies were used). Remarkably, nearly half of the pharmacists (n = 107, 48.4%) indicated a willingness to incorporate ChatGPT into their pharmacy practice. Nearly half of the pharmacists (n = 105, 47.5%) demonstrated a high perceived benefit score for ChatGPT, whereas approximately 37% of pharmacists (n = 81) expressed a high concern score about ChatGPT. More than 70% of pharmacists believed that ChatGPT lacked the ability to use human judgment and make complicated ethical judgments in its responses (n = 168). Finally, logistic regression analysis showed that pharmacists who had previous experience using ChatGPT were more willing to integrate ChatGPT into their pharmacy practice than those without such experience (odds ratio 2.312, P = 0.035). CONCLUSION Although pharmacists show a willingness to incorporate ChatGPT into their practice, especially those with previous experience, there are major concerns.
These mainly revolve around the tool's ability to make human-like judgments and ethical decisions. These findings are crucial for the future development and integration of AI tools in pharmacy practice.
29
Karakas C, Brock D, Lakhotia A. Leveraging ChatGPT in the Pediatric Neurology Clinic: Practical Considerations for Use to Improve Efficiency and Outcomes. Pediatr Neurol 2023; 148:157-163. [PMID: 37725885 DOI: 10.1016/j.pediatrneurol.2023.08.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 08/17/2023] [Accepted: 08/25/2023] [Indexed: 09/21/2023]
Abstract
BACKGROUND Artificial intelligence (AI) is progressively influencing healthcare sectors, including pediatric neurology. This paper aims to investigate the potential and limitations of using ChatGPT, a large language model (LLM) developed by OpenAI, in an outpatient pediatric neurology clinic. The analysis focuses on the tool's capabilities in enhancing clinical efficiency, productivity, and patient education. METHOD This is an opinion-based exploration supplemented with practical examples. We assessed ChatGPT's utility in administrative and educational tasks such as drafting medical necessity letters and creating patient educational materials. RESULTS ChatGPT showed efficacy in streamlining administrative work, particularly in drafting administrative letters and formulating personalized patient education materials. However, the model has limitations in performing higher-order tasks like formulating nuanced differential diagnoses. Additionally, ethical and legal concerns, including data privacy and the potential dissemination of misinformation, warrant cautious implementation. CONCLUSIONS The integration of AI tools like ChatGPT in pediatric neurology clinics has demonstrated promising results in boosting efficiency and patient education, despite present limitations and ethical concerns. As technology advances, we anticipate future applications may extend to more complex clinical tasks like precise differential diagnoses and treatment strategy guidance. Careful, patient-centered implementation is essential for leveraging the potential benefits of AI in pediatric neurology effectively.
Affiliation(s)
- Cemal Karakas
- Division of Pediatric Neurology, Department of Neurology, University of Louisville, Louisville, Kentucky; Norton Neuroscience Institute, Louisville, Kentucky.
- Dylan Brock
- Division of Pediatric Neurology, Department of Neurology, University of Louisville, Louisville, Kentucky; Norton Neuroscience Institute, Louisville, Kentucky
- Arpita Lakhotia
- Division of Pediatric Neurology, Department of Neurology, University of Louisville, Louisville, Kentucky; Norton Neuroscience Institute, Louisville, Kentucky
30
Mese I, Taslicay CA, Sivrioglu AK. Improving radiology workflow using ChatGPT and artificial intelligence. Clin Imaging 2023; 103:109993. [PMID: 37812965 DOI: 10.1016/j.clinimag.2023.109993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 08/19/2023] [Accepted: 09/28/2023] [Indexed: 10/11/2023]
Abstract
Artificial intelligence is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence. One of its branches is natural language processing, which studies the interaction between computers and human language. ChatGPT is a sophisticated natural language processing tool that can understand and respond to complex questions and commands in natural language. Radiology is a vital aspect of modern medicine that uses imaging technologies to diagnose and treat medical conditions. Artificial intelligence, including ChatGPT, can be integrated into radiology workflows to improve efficiency, accuracy, and patient care. ChatGPT can streamline various radiology workflow steps, including patient registration, scheduling, patient check-in, image acquisition, interpretation, and reporting. While ChatGPT has the potential to transform radiology workflows, the technology has limitations that must be addressed, such as the potential for bias in artificial intelligence algorithms and ethical concerns. As technology continues to advance, ChatGPT is likely to become an increasingly important tool in the field of radiology, and in healthcare more broadly.
Affiliation(s)
- Ismail Mese
- Department of Radiology, Health Sciences University, Erenkoy Mental Health and Neurology Training and Research Hospital, 19 Mayıs, Sinan Ercan Cd. No: 23, Kadıköy/Istanbul 34736, Turkey.
- Ali Kemal Sivrioglu
- Department of Radiology, Liv Hospital Vadistanbul, Ayazağa Mahallesi, Kemerburgaz Caddesi, Vadistanbul Park Etabı, 7F Blok, 34396 Sarıyer/İstanbul, Turkey
31
Chakraborty C, Pal S, Bhattacharya M, Dash S, Lee SS. Overview of Chatbots with special emphasis on artificial intelligence-enabled ChatGPT in medical science. Front Artif Intell 2023; 6:1237704. [PMID: 38028668 PMCID: PMC10644239 DOI: 10.3389/frai.2023.1237704] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 06/09/2023] [Accepted: 10/05/2023] [Indexed: 12/01/2023] Open
Abstract
The release of ChatGPT has initiated new thinking about AI-based chatbots and their applications and has drawn huge public attention worldwide. Over the past few months, researchers and doctors have begun considering the promise and applications of AI-related large language models in medicine. This comprehensive review provides an overview of chatbots and ChatGPT and their current role in medicine. First, the general idea of chatbots, their evolution, architecture, and medical uses are discussed. Second, ChatGPT is discussed with special emphasis on its application in medicine, its architecture and training methods, medical diagnosis and treatment, and research ethical issues, and a comparison of ChatGPT with other NLP models is presented. The article also discusses the limitations and prospects of ChatGPT. In the future, these large language models and ChatGPT will hold immense promise in healthcare. However, more research is needed in this direction.
Affiliation(s)
- Chiranjib Chakraborty
- Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, West Bengal, India
- Soumen Pal
- School of Mechanical Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Snehasish Dash
- School of Mechanical Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Sang-Soo Lee
- Institute for Skeletal Aging and Orthopedic Surgery, Hallym University Chuncheon Sacred Heart Hospital, Chuncheon-si, Gangwon-do, Republic of Korea
32
Irfan B, Yaqoob A. ChatGPT's Epoch in Rheumatological Diagnostics: A Critical Assessment in the Context of Sjögren's Syndrome. Cureus 2023; 15:e47754. [PMID: 38022092 PMCID: PMC10676288 DOI: 10.7759/cureus.47754] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 10/26/2023] [Indexed: 12/01/2023] Open
Abstract
INTRODUCTION The rise of artificial intelligence in medical practice is reshaping clinical care. Large language models (LLMs) like ChatGPT have the potential to assist in rheumatology by personalizing scientific information retrieval, particularly in the context of Sjögren's Syndrome. This study aimed to evaluate the efficacy of ChatGPT in providing insights into Sjögren's Syndrome and differentiating it from other rheumatological conditions. MATERIALS AND METHODS A database of peer-reviewed articles and clinical guidelines focused on Sjögren's Syndrome was compiled. Clinically relevant questions were presented to ChatGPT, with responses assessed for accuracy, relevance, and comprehensiveness. Techniques such as blinding, random control queries, and temporal analysis ensured unbiased evaluation. ChatGPT's responses were also assessed using the 15-item DISCERN questionnaire. RESULTS ChatGPT effectively highlighted key immunopathological and histopathological characteristics of Sjögren's Syndrome, though some crucial data and citation inconsistencies were noted. For a given clinical vignette, ChatGPT correctly identified potential etiological considerations, with Sjögren's Syndrome being prominent. DISCUSSION LLMs like ChatGPT offer rapid access to vast amounts of data, benefiting both patients and providers. While they democratize information, limitations such as potential oversimplification and reference inaccuracies were observed. The balance between LLM insights and clinical judgment, as well as continuous model refinement, is crucial. CONCLUSION LLMs like ChatGPT offer significant potential in rheumatology, providing swift and broad medical insights. However, a cautious approach is vital, ensuring rigorous training and ethical application for optimal patient care and clinical practice.
Affiliation(s)
- Bilal Irfan
- Microbiology and Immunology, University of Michigan, Ann Arbor, USA
33
Turner JH. Cancer Care by Committee to be Superseded by Personal Physician-Patient Partnership Informed by Artificial Intelligence. Cancer Biother Radiopharm 2023; 38:497-505. [PMID: 37366774 DOI: 10.1089/cbr.2023.0058] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/28/2023] Open
Abstract
Multidisciplinary tumor boards (MTBs) have become the reference standard of cancer management, founded upon randomized controlled trial (RCT) evidence-based guidelines. The inordinate delays inherent in awaiting formal regulatory agency approvals of novel therapeutic agents, and the rigidities and nongeneralizability of this regimented approach, often deny cancer patients timely access to effective innovative treatment. Reluctance of MTBs to accept theranostic care of patients with advanced neuroendocrine tumors (NETs) and metastatic castrate-resistant prostate cancer resulted in decades of delay in the incorporation of 177Lu-octreotate and 177Lu-prostate-specific membrane antigen (PSMA) into routine clinical oncology practice. Recent developments in immunotherapy and molecular targeted precision therapy, based on N-of-One individual multifactorial genome analyses, have greatly increased the complexity of decision-making. Burgeoning specialist workload and tight time frames now threaten to overwhelm the logistically, and emotionally, demanding MTB system. It is hypothesized that the advent of advanced artificial intelligence technology and Chatbot natural language algorithms will shift the cancer care paradigm from a MTB management model toward a personal physician-patient shared-care partnership for real-world practice of precision individualized holistic oncology.
Affiliation(s)
- J Harvey Turner
- Department of Nuclear Medicine, Fiona Stanley Fremantle Hospitals Group, The University of Western Australia, Murdoch, Australia
34
Chou YH, Lin C, Lee SH, Chang Chien YW, Cheng LC. Potential Mobile Health Applications for Improving the Mental Health of the Elderly: A Systematic Review. Clin Interv Aging 2023; 18:1523-1534. [PMID: 37727447 PMCID: PMC10506600 DOI: 10.2147/cia.s410396] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/03/2023] [Accepted: 09/05/2023] [Indexed: 09/21/2023] Open
Abstract
The rapid aging of the global population presents challenges in providing mental health care resources for older adults aged 65 and above. The COVID-19 pandemic has further exacerbated the global population's psychological distress due to social isolation and distancing. Thus, there is an urgent need to update scholarly knowledge on the effectiveness of mHealth applications to improve older people's mental health. This systematic review summarizes recent literature on chatbots aimed at enhancing mental health and well-being. Sixteen papers describing six apps or prototypes were reviewed, indicating the practicality, feasibility, and acceptance of chatbots for promoting mental health in older adults. Engaging with chatbots led to improvements in well-being and stress reduction, as well as a decrement in depressive symptoms. The mobile health applications described in these studies are categorized for reference.
Affiliation(s)
- Ya-Hsin Chou
  - Department of Psychiatry, Taoyuan Chang Gung Memorial Hospital, Taoyuan County, Taiwan
- Chemin Lin
  - College of Medicine, Chang Gung University, Taoyuan County, Taiwan
  - Department of Psychiatry, Keelung Chang Gung Memorial Hospital, Keelung City, Taiwan
  - Community Medicine Research Center, Chang Gung Memorial Hospital, Keelung, Taiwan
- Shwu-Hua Lee
  - College of Medicine, Chang Gung University, Taoyuan County, Taiwan
  - Department of Psychiatry, Linkou Chang Gung Memorial Hospital, Taoyuan County, Taiwan
- Ya-Wen Chang Chien
  - Department of Photography and Virtual Reality Design, Huafan University, New Taipei, Taiwan
- Li-Chen Cheng
  - Department of Information and Finance Management, National Taipei University of Technology, Taipei, Taiwan
35
Stanbrook MB, Weinhold M, Kelsall D. [New policy on the use of artificial intelligence tools in manuscripts submitted to CMAJ]. [Article in French]. CMAJ 2023; 195:E1168-E1169. [PMID: 37669792] [PMCID: PMC10479997] [DOI: 10.1503/cmaj.230949-f]
36
Lim ZW, Pushpanathan K, Yew SME, Lai Y, Sun CH, Lam JSH, Chen DZ, Goh JHL, Tan MCJ, Sheng B, Cheng CY, Koh VTC, Tham YC. Benchmarking large language models' performances for myopia care: a comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google Bard. EBioMedicine 2023; 95:104770. [PMID: 37625267] [PMCID: PMC10470220] [DOI: 10.1016/j.ebiom.2023.104770]
Abstract
BACKGROUND Large language models (LLMs) are garnering wide interest due to their human-like and contextually relevant responses. However, LLMs' accuracy across specific medical domains has not yet been thoroughly evaluated. Myopia is a frequent topic on which patients and parents commonly seek information online. Our study evaluated the performance of three LLMs, namely ChatGPT-3.5, ChatGPT-4.0, and Google Bard, in delivering accurate responses to common myopia-related queries. METHODS We curated thirty-one commonly asked myopia care-related questions, which were categorised into six domains: pathogenesis, risk factors, clinical presentation, diagnosis, treatment and prevention, and prognosis. Each question was posed to the LLMs, and their responses were independently graded by three consultant-level paediatric ophthalmologists on a three-point accuracy scale (poor, borderline, good). A majority consensus approach was used to determine the final rating for each response. 'Good' rated responses were further evaluated for comprehensiveness on a five-point scale. Conversely, 'poor' rated responses were further prompted for self-correction and then re-evaluated for accuracy. FINDINGS ChatGPT-4.0 demonstrated superior accuracy, with 80.6% of responses rated as 'good', compared to 61.3% in ChatGPT-3.5 and 54.8% in Google Bard (Pearson's chi-squared test, all p ≤ 0.009). All three LLM-Chatbots showed high mean comprehensiveness scores (Google Bard: 4.35; ChatGPT-4.0: 4.23; ChatGPT-3.5: 4.11, out of a maximum score of 5). All LLM-Chatbots also demonstrated substantial self-correction capabilities: 66.7% (2 in 3) of ChatGPT-4.0's, 40% (2 in 5) of ChatGPT-3.5's, and 60% (3 in 5) of Google Bard's responses improved after self-correction. The LLM-Chatbots performed consistently across domains, except for 'treatment and prevention'. However, ChatGPT-4.0 still performed superiorly in this domain, receiving 70% 'good' ratings, compared to 40% in ChatGPT-3.5 and 45% in Google Bard (Pearson's chi-squared test, all p ≤ 0.001). INTERPRETATION Our findings underscore the potential of LLMs, particularly ChatGPT-4.0, for delivering accurate and comprehensive responses to myopia-related queries. Continuous strategies and evaluations to improve LLMs' accuracy remain crucial. FUNDING Dr Yih-Chung Tham was supported by the National Medical Research Council of Singapore (NMRC/MOH/HCSAINV21nov-0001).
Affiliation(s)
- Zhi Wei Lim
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Krithi Pushpanathan
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore
- Samantha Min Er Yew
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore
- Yien Lai
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Chen-Hsin Sun
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Janice Sing Harn Lam
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- David Ziyou Chen
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Marcus Chun Jin Tan
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Bin Sheng
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China; Department of Endocrinology and Metabolism, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai Diabetes Institute, Shanghai Clinical Center for Diabetes, Shanghai, China; MoE Key Lab of Artificial Intelligence, Artificial Intelligence Institute, Shanghai Jiao Tong University, Shanghai, China
- Ching-Yu Cheng
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Eye Academic Clinical Program (Eye ACP), Duke NUS Medical School, Singapore
- Victor Teck Chang Koh
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Yih-Chung Tham
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Eye Academic Clinical Program (Eye ACP), Duke NUS Medical School, Singapore
37
Chow JCL, Wong V, Sanders L, Li K. Developing an AI-Assisted Educational Chatbot for Radiotherapy Using the IBM Watson Assistant Platform. Healthcare (Basel) 2023; 11:2417. [PMID: 37685452] [PMCID: PMC10487627] [DOI: 10.3390/healthcare11172417]
Abstract
Objectives: This study aims to make knowledge about radiotherapy in healthcare accessible to the general public by developing an AI-powered chatbot. The interactive nature of the chatbot is expected to facilitate a better understanding of radiotherapy information through communication with users. Methods: Using the IBM Watson Assistant platform on IBM Cloud, the chatbot was constructed following a pre-designed flowchart that outlines the conversation flow. This approach ensured that the chatbot was developed with a clear design and allowed for effective tracking of the conversation. The chatbot is equipped to furnish users with information and quizzes on radiotherapy to assess their understanding of the subject. Results: By adopting a question-and-answer approach, the chatbot can engage in human-like communication with users seeking information about radiotherapy. As some users may feel anxious and struggle to articulate their queries, the chatbot is designed to be user-friendly and reassuring, providing a list of questions for the user to choose from. Feedback on the chatbot's content was mostly positive, despite a few limitations. The chatbot performed well and successfully conveyed knowledge as intended. Conclusions: There is a need to enhance the chatbot's conversation approach to improve user interaction. Including translation capabilities to cater to individuals with different first languages would also be advantageous. Lastly, the newly launched ChatGPT could potentially be developed into a medical chatbot to facilitate knowledge transfer.
Affiliation(s)
- James C. L. Chow
  - Radiation Medicine Program, Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 1X6, Canada
  - Department of Radiation Oncology, University of Toronto, Toronto, ON M5T 1P5, Canada
- Valerie Wong
  - Department of Physics, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
- Leslie Sanders
  - Department of Humanities, York University, Toronto, ON M3J 1P3, Canada
- Kay Li
  - Department of English, University of Toronto, Toronto, ON M5R 2M8, Canada
38
Nazir T, Ahmad U, Mal M, Rehman MU, Saeed R, Kalia J. Microsoft Bing vs Google Bard in Neurology: A Comparative Study of AI-Generated Patient Education Material. Preprint. [DOI: 10.1101/2023.08.25.23294641]
Abstract
BACKGROUND Patient education is an essential component of healthcare, and artificial intelligence (AI) language models such as Google Bard and Microsoft Bing have the potential to improve information transmission and enhance patient care. However, it is crucial to evaluate the quality, accuracy, and understandability of the materials generated by these models before applying them in medical practice. This study aimed to assess and compare the quality of patient education materials produced by Google Bard and Microsoft Bing in response to questions related to neurological conditions. METHODS A cross-sectional study design was used to evaluate and compare the ability of Google Bard and Microsoft Bing to generate patient education materials. The study included the top ten prevalent neurological diseases based on WHO prevalence data. Ten board-certified neurologists and four neurology residents evaluated the responses generated by the models on six quality metrics. The scores for each model were compiled and averaged across all measures, and the significance of any observed variations was assessed using an independent t-test. RESULTS Google Bard performed better than Microsoft Bing on all six quality metrics, with overall mean scores of 79% and 69%, respectively. Google Bard outperformed Microsoft Bing in all measures for eight questions, while Microsoft Bing performed marginally better in terms of objectivity and clarity for the epilepsy query. CONCLUSION This study showed that Google Bard performs better than Microsoft Bing in generating patient education materials for neurological diseases. However, healthcare professionals should take into account both AI models' advantages and disadvantages when providing support for health information requirements. Future studies can help determine the underlying causes of these variations and guide cooperative initiatives to create more user-focused AI-generated patient education materials. Finally, researchers should consider patients' perceptions of AI-generated patient education material and its impact on implementing these solutions in healthcare settings.
39
Watters C, Lemanski MK. Universal skepticism of ChatGPT: a review of early literature on chat generative pre-trained transformer. Front Big Data 2023; 6:1224976. [PMID: 37680954] [PMCID: PMC10482048] [DOI: 10.3389/fdata.2023.1224976]
Abstract
ChatGPT, a new language model developed by OpenAI, has garnered significant attention in various fields since its release. This literature review provides an overview of early ChatGPT literature across multiple disciplines, exploring its applications, limitations, and ethical considerations. The review encompasses Scopus-indexed publications from November 2022 to April 2023 and includes 156 articles related to ChatGPT. The findings reveal a predominance of negative sentiment across disciplines, though subject-specific attitudes must be considered. The review highlights the implications of ChatGPT in many fields including healthcare, raising concerns about employment opportunities and ethical considerations. While ChatGPT holds promise for improved communication, further research is needed to address its capabilities and limitations. This literature review provides insights into early research on ChatGPT, informing future investigations and practical applications of chatbot technology, as well as development and usage of generative AI.
Affiliation(s)
- Casey Watters
  - Faculty of Law, Bond University, Gold Coast, QLD, Australia
40
Erren TC, Lewis P, Shaw DM. Brave (in a) new world: an ethical perspective on chatbots for medical advice. Front Public Health 2023; 11:1254334. [PMID: 37663854] [PMCID: PMC10470018] [DOI: 10.3389/fpubh.2023.1254334]
Affiliation(s)
- Thomas C. Erren
  - University of Cologne, University Hospital of Cologne, Cologne, North Rhine-Westphalia, Germany
- Philip Lewis
  - University of Cologne, University Hospital of Cologne, Cologne, North Rhine-Westphalia, Germany
- David M. Shaw
  - Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
  - Institute for Biomedical Ethics, University of Basel, Basel, Switzerland
41
Rawashdeh B, Kim J, AlRyalat SA, Prasad R, Cooper M. ChatGPT and Artificial Intelligence in Transplantation Research: Is It Always Correct? Cureus 2023; 15:e42150. [PMID: 37602076] [PMCID: PMC10438857] [DOI: 10.7759/cureus.42150]
Abstract
INTRODUCTION ChatGPT (OpenAI, San Francisco, California, United States) is a chatbot powered by language-based artificial intelligence (AI). It generates text based on the information provided by users. It is currently being evaluated in medical research, publishing, and healthcare. However, there has been no prior study evaluating its ability to help in kidney transplant research. This feasibility study aimed to evaluate the application and accuracy of ChatGPT in the field of kidney transplantation. METHODS On two separate dates, February 21 and March 2, 2023, ChatGPT 3.5 was questioned regarding the medical treatment of kidney transplants and related scientific facts. The responses provided by the chatbot were compiled, and a panel of two specialists reviewed the correctness of each answer. RESULTS We demonstrated that ChatGPT possessed substantial general knowledge of kidney transplantation; however, it lacked sufficient information and produced inaccurate statements on topics that require a deeper understanding. Moreover, ChatGPT failed to provide references for any of the scientific data it provided regarding kidney transplants, and when asked for references, it supplied inaccurate ones. CONCLUSION The results of this short feasibility study indicate that ChatGPT may have the ability to assist in data collection when a particular query is posed. However, caution should be exercised, and it should not be used in isolation to support research or healthcare decisions, because there are still challenges with data accuracy and missing information.
Affiliation(s)
- Badi Rawashdeh
  - Transplant Surgery, Medical College of Wisconsin, Milwaukee, USA
- Joohyun Kim
  - Transplant Surgery, Medical College of Wisconsin, Milwaukee, USA
- Raj Prasad
  - Transplant Surgery, Medical College of Wisconsin, Milwaukee, USA
- Matthew Cooper
  - Transplant Surgery, Medical College of Wisconsin, Milwaukee, USA
42
Martínez-Ezquerro JD. Response to: Impact of ChatGPT and Artificial Intelligence in the Contemporary Medical Landscape. Arch Med Res 2023; 54:102838. [PMID: 37364482] [DOI: 10.1016/j.arcmed.2023.06.003]
Affiliation(s)
- José Darío Martínez-Ezquerro
  - Epidemiological Research and Health Services Unit, Aging Area, 21st Century National Medical Center, Instituto Mexicano del Seguro Social, Mexico City, Mexico
43
Abstract
The OpenAI chatbot ChatGPT is an artificial intelligence (AI) application that uses state-of-the-art language processing AI. It can perform a vast number of tasks, from writing poetry and explaining complex quantum mechanics, to translating language and writing research articles with a human-like understanding and legitimacy. Since its initial release to the public in November 2022, ChatGPT has garnered considerable attention due to its ability to mimic the patterns of human language, and it has attracted billion-dollar investments from Microsoft and PricewaterhouseCoopers. The scope of ChatGPT and other large language models appears infinite, but there are several important limitations. This editorial provides an introduction to the basic functionality of ChatGPT and other large language models, their current applications and limitations, and the associated implications for clinical practice and research.
Affiliation(s)
- Kyle N Kunze
  - Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA
- Seong J Jang
  - Weill Cornell Medical College, New York, New York, USA
- Jonathan M Vigdorchik
  - Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA
  - Adult Reconstruction and Joint Replacement Service, Hospital for Special Surgery, New York, New York, USA
- Fares S Haddad
  - The Bone & Joint Journal, London, UK
  - University College London Hospitals, and The NIHR Biomedical Research Centre at UCLH, London, UK
  - Princess Grace Hospital, London, UK