1. Kral J, Hradis M, Buzga M, Kunovsky L. Exploring the benefits and challenges of AI-driven large language models in gastroenterology: Think out of the box. Biomed Pap Med Fac Univ Palacky Olomouc Czech Repub 2024. [PMID: 39234774] [DOI: 10.5507/bp.2024.027]
Abstract
Artificial Intelligence (AI) has evolved significantly over the past decades, from its early concepts in the 1950s to the present era of deep learning and natural language processing. Advanced large language models (LLMs), such as the Chatbot Generative Pre-Trained Transformer (ChatGPT), are trained to generate human-like text responses. This technology has the potential to revolutionize various aspects of gastroenterology, including diagnosis, treatment, education, and decision-making support. The benefits of using LLMs in gastroenterology could include accelerating diagnosis and treatment, providing personalized care, enhancing education and training, assisting in decision-making, and improving communication with patients. However, drawbacks and challenges such as limited AI capability, training on possibly biased data, data errors, security and privacy concerns, and implementation costs must be addressed to ensure the responsible and effective use of this technology. The future of LLMs in gastroenterology relies on the ability to process and analyse large amounts of data, identify patterns, and summarize information and thus assist physicians in creating personalized treatment plans. As AI advances, LLMs will become more accurate and efficient, allowing for faster diagnosis and treatment of gastroenterological conditions. Ensuring effective collaboration between AI developers, healthcare professionals, and regulatory bodies is essential for the responsible and effective use of this technology. By finding the right balance between AI and human expertise and addressing the limitations and risks associated with its use, LLMs can play an increasingly significant role in gastroenterology, contributing to better patient care and supporting doctors in their work.
Affiliation(s)
- Jan Kral
- Department of Internal Medicine, University Hospital Motol and Second Faculty of Medicine, Charles University, Prague, Czech Republic
- Department of Hepatogastroenterology, Institute for Clinical and Experimental Medicine, Prague, Czech Republic
- Michal Hradis
- MAIA LABS s.r.o., Brno, Czech Republic
- Faculty of Information Technology, University of Technology, Brno, Czech Republic
- Marek Buzga
- Department of Physiology and Pathophysiology, Faculty of Medicine, University of Ostrava, Ostrava, Czech Republic
- Institute of Laboratory Medicine, University Hospital Ostrava, Ostrava, Czech Republic
- Lumir Kunovsky
- 2nd Department of Internal Medicine - Gastroenterology and Geriatrics, University Hospital Olomouc and Faculty of Medicine and Dentistry, Palacky University Olomouc, Olomouc, Czech Republic
- Department of Surgery, University Hospital Brno and Faculty of Medicine, Masaryk University, Brno, Czech Republic
- Department of Gastroenterology and Digestive Endoscopy, Masaryk Memorial Cancer Institute, Brno, Czech Republic
2. Fatima A, Shafique MA, Alam K, Fadlalla Ahmed TK, Mustafa MS. ChatGPT in medicine: A cross-disciplinary systematic review of ChatGPT's (artificial intelligence) role in research, clinical practice, education, and patient interaction. Medicine (Baltimore) 2024; 103:e39250. [PMID: 39121303] [PMCID: PMC11315549] [DOI: 10.1097/md.0000000000039250]
Abstract
BACKGROUND ChatGPT, a powerful AI language model, has gained increasing prominence in medicine, offering potential applications in healthcare, clinical decision support, patient communication, and medical research. This systematic review aims to comprehensively assess the applications of ChatGPT in healthcare education, research, writing, patient communication, and practice while also delineating potential limitations and areas for improvement. METHOD Our comprehensive database search retrieved relevant papers from PubMed, Medline and Scopus. After the screening process, 83 studies met the inclusion criteria. This review includes original studies comprising case reports, analytical studies, and editorials with original findings. RESULT ChatGPT is useful for scientific research and academic writing, and assists with grammar, clarity, and coherence. This helps non-English speakers and improves accessibility by breaking down linguistic barriers. However, its limitations include probable inaccuracy and ethical issues, such as bias and plagiarism. ChatGPT streamlines workflows and offers diagnostic and educational potential in healthcare but exhibits biases and lacks emotional sensitivity. It is useful in patient communication but requires up-to-date data, and concerns remain about the accuracy of its information and hallucinatory responses. CONCLUSION Given the potential for ChatGPT to transform healthcare education, research, and practice, it is essential to approach its adoption in these areas with caution due to its inherent limitations.
Affiliation(s)
- Afia Fatima
- Department of Medicine, Jinnah Sindh Medical University, Karachi, Pakistan
- Khadija Alam
- Department of Medicine, Liaquat National Medical College, Karachi, Pakistan
3. Pfeffer MA, Ling SSH, Wong JKW. Exploring the frontier: Transformer-based models in EEG signal analysis for brain-computer interfaces. Comput Biol Med 2024; 178:108705. [PMID: 38865781] [DOI: 10.1016/j.compbiomed.2024.108705]
Abstract
This review systematically explores the application of transformer-based models in EEG signal processing and brain-computer interface (BCI) development, with a distinct focus on ensuring methodological rigour and adhering to empirical validations within the existing literature. By examining various transformer architectures, such as the Temporal Spatial Transformer Network (TSTN) and EEG Conformer, this review delineates their capabilities in mitigating challenges intrinsic to EEG data, such as noise and artifacts, and their subsequent implications on decoding and classification accuracies across disparate mental tasks. The analytical scope extends to a meticulous examination of attention mechanisms within transformer models, delineating their role in illuminating critical temporal and spatial EEG features and facilitating interpretability in model decision-making processes. The discourse additionally encapsulates emerging works that substantiate the efficacy of transformer models in noise reduction of EEG signals and diversifying applications beyond the conventional motor imagery paradigm. Furthermore, this review elucidates evident gaps and propounds exploratory avenues in the applications of pre-trained transformers in EEG analysis and the potential expansion into real-time and multi-task BCI applications. Collectively, this review distils extant knowledge, navigates through the empirical findings, and puts forward a structured synthesis, thereby serving as a conduit for informed future research endeavours in transformer-enhanced, EEG-based BCI systems.
Affiliation(s)
- Maximilian Achim Pfeffer
- Faculty of Engineering and Information Technology, University of Technology Sydney, CB11 81-113, Broadway, Ultimo, 2007, New South Wales, Australia
- Steve Sai Ho Ling
- Faculty of Engineering and Information Technology, University of Technology Sydney, CB11 81-113, Broadway, Ultimo, 2007, New South Wales, Australia
- Johnny Kwok Wai Wong
- Faculty of Design, Architecture and Building, University of Technology Sydney, 15 Broadway, Ultimo, 2007, New South Wales, Australia
4. Kim SE, Lee JH, Choi BS, Han HS, Lee MC, Ro DH. Performance of ChatGPT on Solving Orthopedic Board-Style Questions: A Comparative Analysis of ChatGPT 3.5 and ChatGPT 4. Clin Orthop Surg 2024; 16:669-673. [PMID: 39092297] [PMCID: PMC11262944] [DOI: 10.4055/cios23179]
Abstract
Background The application of artificial intelligence and large language models in the medical field requires an evaluation of their accuracy in providing medical information. This study aimed to assess the performance of Chat Generative Pre-trained Transformer (ChatGPT) models 3.5 and 4 in solving orthopedic board-style questions. Methods A total of 160 text-only questions from the Orthopedic Surgery Department at Seoul National University Hospital, conforming to the format of the Korean Orthopedic Association board certification examinations, were input into the ChatGPT 3.5 and ChatGPT 4 programs. The questions were divided into 11 subcategories. The accuracy rates of the initial answers provided by ChatGPT 3.5 and ChatGPT 4 were analyzed. In addition, inconsistency rates of answers were evaluated by regenerating the responses. Results ChatGPT 3.5 answered 37.5% of the questions correctly, while ChatGPT 4 showed an accuracy rate of 60.0% (p < 0.001). ChatGPT 4 demonstrated superior performance across most subcategories, except for the tumor-related questions. The rates of inconsistency in answers were 47.5% for ChatGPT 3.5 and 9.4% for ChatGPT 4. Conclusions ChatGPT 4 showed the ability to pass orthopedic board-style examinations, outperforming ChatGPT 3.5 in accuracy rate. However, inconsistencies in response generation and instances of incorrect answers with misleading explanations require caution when applying ChatGPT in clinical settings or for educational purposes.
Affiliation(s)
- Sung Eun Kim
- Department of Orthopedic Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Ji Han Lee
- Department of Orthopedic Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Byung Sun Choi
- Department of Orthopedic Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Hyuk-Soo Han
- Department of Orthopedic Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Myung Chul Lee
- Department of Orthopedic Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Du Hyun Ro
- Department of Orthopedic Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
5. Jo E, Song S, Kim JH, Lim S, Kim JH, Cha JJ, Kim YM, Joo HJ. Assessing GPT-4's Performance in Delivering Medical Advice: Comparative Analysis With Human Experts. JMIR Med Educ 2024; 10:e51282. [PMID: 38989848] [PMCID: PMC11250047] [DOI: 10.2196/51282]
Abstract
Background Accurate medical advice is paramount in ensuring optimal patient care, and misinformation can lead to misguided decisions with potentially detrimental health outcomes. The emergence of large language models (LLMs) such as OpenAI's GPT-4 has spurred interest in their potential health care applications, particularly in automated medical consultation. Yet, rigorous investigations comparing their performance to human experts remain sparse. Objective This study aims to compare the medical accuracy of GPT-4 with human experts in providing medical advice using real-world user-generated queries, with a specific focus on cardiology. It also sought to analyze the performance of GPT-4 and human experts in specific question categories, including drug or medication information and preliminary diagnoses. Methods We collected 251 pairs of cardiology-specific questions from general users and answers from human experts via an internet portal. GPT-4 was tasked with generating responses to the same questions. Three independent cardiologists (SL, JHK, and JJC) evaluated the answers provided by both human experts and GPT-4. Using a computer interface, each evaluator compared the pairs and determined which answer was superior, and they quantitatively measured the clarity and complexity of the questions as well as the accuracy and appropriateness of the responses, applying a 3-tiered grading scale (low, medium, and high). Furthermore, a linguistic analysis was conducted to compare the length and vocabulary diversity of the responses using word count and type-token ratio. Results GPT-4 and human experts displayed comparable efficacy in medical accuracy ("GPT-4 is better" at 132/251, 52.6% vs "Human expert is better" at 119/251, 47.4%). In accuracy level categorization, humans had more high-accuracy responses than GPT-4 (50/237, 21.1% vs 30/238, 12.6%) but also a greater proportion of low-accuracy responses (11/237, 4.6% vs 1/238, 0.4%; P=.001). 
GPT-4 responses were generally longer and used a less diverse vocabulary than those of human experts, potentially enhancing their comprehensibility for general users (sentence count: mean 10.9, SD 4.2 vs mean 5.9, SD 3.7; P<.001; type-token ratio: mean 0.69, SD 0.07 vs mean 0.79, SD 0.09; P<.001). Nevertheless, human experts outperformed GPT-4 in specific question categories, notably those related to drug or medication information and preliminary diagnoses. These findings highlight the limitations of GPT-4 in providing advice based on clinical experience. Conclusions GPT-4 has shown promising potential in automated medical consultation, with comparable medical accuracy to human experts. However, challenges remain, particularly in the realm of nuanced clinical judgment. Future improvements in LLMs may require the integration of specific clinical reasoning pathways and regulatory oversight for safe use. Further research is needed to understand the full potential of LLMs across various medical specialties and conditions.
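The type-token ratio reported in this abstract is a standard vocabulary-diversity measure: distinct words (types) divided by total words (tokens), so a lower value means more repeated vocabulary. A minimal sketch of the computation (the function name and whitespace tokenizer are illustrative; the study's exact tokenization is not specified here):

```python
def type_token_ratio(text: str) -> float:
    """Vocabulary diversity: distinct words (types) / total words (tokens)."""
    tokens = text.lower().split()  # naive whitespace tokenizer
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# Lower ratio = more repeated vocabulary, the pattern reported for GPT-4 responses.
print(round(type_token_ratio("the cat sat on the mat"), 2))  # 5 types / 6 tokens -> 0.83
```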
Affiliation(s)
- Eunbeen Jo
- Department of Medical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
- Sanghoun Song
- Department of Linguistics, Korea University, Seoul, Republic of Korea
- Jong-Ho Kim
- Korea University Research Institute for Medical Bigdata Science, Korea University, Seoul, Republic of Korea
- Department of Cardiology, Cardiovascular Center, Korea University College of Medicine, Seoul, Republic of Korea
- Subin Lim
- Division of Cardiology, Department of Internal Medicine, Korea University Anam Hospital, Seoul, Republic of Korea
- Ju Hyeon Kim
- Division of Cardiology, Department of Internal Medicine, Korea University Anam Hospital, Seoul, Republic of Korea
- Jung-Joon Cha
- Division of Cardiology, Department of Internal Medicine, Korea University Anam Hospital, Seoul, Republic of Korea
- Young-Min Kim
- School of Interdisciplinary Industrial Studies, Hanyang University, Seoul, Republic of Korea
- Hyung Joon Joo
- Department of Medical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
- Korea University Research Institute for Medical Bigdata Science, Korea University, Seoul, Republic of Korea
- Department of Cardiology, Cardiovascular Center, Korea University College of Medicine, Seoul, Republic of Korea
6. Yao JJ, Aggarwal M, Lopez RD, Namdari S. Current Concepts Review: Large Language Models in Orthopaedics: Definitions, Uses, and Limitations. J Bone Joint Surg Am 2024:00004623-990000000-01136. [PMID: 38896652] [DOI: 10.2106/jbjs.23.01417]
Abstract
➤ Large language models are a subset of artificial intelligence. Large language models are powerful tools that excel in natural language text processing and generation.
➤ There are many potential clinical, research, and educational applications of large language models in orthopaedics, but the development of these applications needs to be focused on patient safety and the maintenance of high standards.
➤ There are numerous methodological, ethical, and regulatory concerns with regard to the use of large language models. Orthopaedic surgeons need to be aware of the controversies and advocate for an alignment of these models with patient and caregiver priorities.
Affiliation(s)
- Jie J Yao
- Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
- Ryan D Lopez
- Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
- Surena Namdari
- Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
7. Buitrago-Esquinas EM, Puig-Cabrera M, Santos JAC, Custódio-Santos M, Yñiguez-Ovando R. Developing a hetero-intelligence methodological framework for sustainable policy-making based on the assessment of large language models. MethodsX 2024; 12:102707. [PMID: 38650999] [PMCID: PMC11033193] [DOI: 10.1016/j.mex.2024.102707]
Abstract
This work delves into the increasing relevance of Large Language Models (LLMs) in the realm of sustainable policy-making, proposing an innovative hetero-intelligence framework that blends human and artificial intelligence (AI) for tackling modern sustainability challenges. The research methodology includes a hetero-intelligence performance test, which juxtaposes human intelligence with AI in the formulation and implementation of sustainable policies. After testing this hetero-intelligence methodology, seven steps are rigorously described so that it can be replicated in any sustainability planning related context. The results underscore the capabilities and limitations of LLMs, highlighting the critical role of human intelligence in enhancing the efficacy of hetero-intelligence systems. This work fulfils the need for a rigorous methodological framework based on empirical steps that can provide unbiased outcomes to be integrated into sustainable planning and decision-making processes.
- Assesses LLMs' limitations and capabilities regarding sustainable planning issues
- Proposes a replicable methodology based on the combination of both human and artificial intelligence
- Proposes and systematises the integration of a hetero-intelligent approach into the formulation of sustainability policies to make them more efficient and effective
Affiliation(s)
- Eva M. Buitrago-Esquinas
- Faculty of Economics and Business Sciences, Universidad de Sevilla, Spain
- Research Centre for Tourism, Sustainability and Well-being (CinTurs), Universidade do Algarve, Faro, Portugal
- Miguel Puig-Cabrera
- Faculty of Economics and Business Sciences, Universidad de Sevilla, Spain
- Research Centre for Tourism, Sustainability and Well-being (CinTurs), Universidade do Algarve, Faro, Portugal
- José António C. Santos
- Faculty of Economics and Business Sciences, Universidad de Sevilla, Spain
- Research Centre for Tourism, Sustainability and Well-being (CinTurs), Universidade do Algarve, Faro, Portugal
- Margarida Custódio-Santos
- Faculty of Economics and Business Sciences, Universidad de Sevilla, Spain
- Research Centre for Tourism, Sustainability and Well-being (CinTurs), Universidade do Algarve, Faro, Portugal
- Rocío Yñiguez-Ovando
- Faculty of Economics and Business Sciences, Universidad de Sevilla, Spain
- Research Centre for Tourism, Sustainability and Well-being (CinTurs), Universidade do Algarve, Faro, Portugal
8
|
UYGUN İLİKHAN S, ÖZER M, TANBERKAN H, BOZKURT V. How to mitigate the risks of deployment of artificial intelligence in medicine? Turk J Med Sci 2024; 54:483-492. [PMID: 39050000 PMCID: PMC11265878 DOI: 10.55730/1300-0144.5814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2024] [Revised: 06/12/2024] [Accepted: 05/20/2024] [Indexed: 07/27/2024] Open
Abstract
The aim of this study is to examine the risks associated with the use of artificial intelligence (AI) in medicine and to offer policy suggestions to reduce these risks and optimize the benefits of AI technology. AI is a multifaceted technology. If harnessed effectively, it has the capacity to significantly impact the future of humanity in the field of health, as well as in several other areas. However, the rapid spread of this technology also raises significant ethical, legal, and social issues. This study examines the potential dangers of AI integration in medicine by reviewing current scientific work and exploring strategies to mitigate these risks. Biases in data sets for AI systems can lead to inequities in health care: training data that narrowly represents a single demographic group can produce biased results from AI systems for those outside that group. In addition, the concepts of explainability and accountability in AI systems could create challenges for healthcare professionals in understanding and evaluating AI-generated diagnoses or treatment recommendations. This could jeopardize patient safety and lead to the selection of inappropriate treatments. Ensuring the security of personal health information will be critical as AI systems become more widespread; improving patient privacy and security protocols for AI systems is therefore imperative. The study offers suggestions for reducing the risks associated with the increasing use of AI systems in the medical sector. These include increasing AI literacy, implementing a participatory society-in-the-loop management strategy, and creating ongoing education and auditing systems. Integrating ethical principles and cultural values into the design of AI systems can help reduce healthcare disparities and improve patient care. Implementing these recommendations will ensure the efficient and equitable use of AI systems in medicine, improve the quality of healthcare services, and ensure patient safety.
Affiliation(s)
- Sevil Uygun İlikhan
- Department of Internal Medicine Sciences, Gülhane Faculty of Medicine, University of Health Sciences, Ankara, Turkiye
- Mahmut Özer
- Commission of National Education, Culture, Youth and Sports of the Parliament, Ankara, Turkiye
- Veysel Bozkurt
- Department of Economic Sociology, Faculty of Economics, İstanbul University, İstanbul, Turkiye
9. Niko MM, Karbasi Z, Kazemi M, Zahmatkeshan M. Comparing ChatGPT and Bing, in response to the Home Blood Pressure Monitoring (HBPM) knowledge checklist. Hypertens Res 2024; 47:1401-1409. [PMID: 38438722] [DOI: 10.1038/s41440-024-01624-8]
Abstract
High blood pressure is one of the major public health problems worldwide. Due to the rapid increase in the number of users of artificial intelligence tools such as ChatGPT and Bing, patients are expected to use these tools as a source of information about high blood pressure. The purpose of this study is to check the accuracy, completeness, and reproducibility of answers provided by ChatGPT and Bing to a knowledge questionnaire on blood pressure control at home. In this study, ChatGPT's and Bing's responses to the 10-question HBPM knowledge checklist on blood pressure measurement were independently reviewed by three cardiologists. The mean accuracy and completeness scores were 5.96 (SD = 0.17) and 2.93 (SD = 0.25) for ChatGPT, and 5.31 (SD = 0.67) and 2.13 (SD = 0.53) for Bing, respectively, indicating that ChatGPT's responses were highly accurate overall, with the vast majority receiving the top score. As artificial intelligence applications expand, patients can use new tools such as ChatGPT and Bing to search for health information. We found that the answers obtained from ChatGPT are reliable and valuable for patients; while Bing is also a powerful tool, it has more limitations than ChatGPT, and its answers should be interpreted with caution.
Affiliation(s)
- Zahra Karbasi
- Department of Health Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences, Kerman, Iran
- Maryam Kazemi
- Noncommunicable Diseases Research Center, Fasa University of Medical Sciences, Fasa, Iran
- Maryam Zahmatkeshan
- Noncommunicable Diseases Research Center, Fasa University of Medical Sciences, Fasa, Iran
- School of Allied Medical Sciences, Fasa University of Medical Sciences, Fasa, Iran
10. Feng Y, Han J, Lan X. After one year of ChatGPT's launch: reflections on artificial intelligence in scientific writing. Eur J Nucl Med Mol Imaging 2024; 51:1203-1204. [PMID: 38236428] [DOI: 10.1007/s00259-023-06579-5]
Affiliation(s)
- Yuan Feng
- Department of Nuclear Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Key Laboratory of Molecular Imaging, Wuhan, China
- Key Laboratory of Biological Targeted Therapy, The Ministry of Education, Wuhan, 430022, China
- Xiaoli Lan
- Department of Nuclear Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Key Laboratory of Molecular Imaging, Wuhan, China
- Key Laboratory of Biological Targeted Therapy, The Ministry of Education, Wuhan, 430022, China
11. Cil G, Dogan K. The efficacy of artificial intelligence in urology: a detailed analysis of kidney stone-related queries. World J Urol 2024; 42:158. [PMID: 38483582] [PMCID: PMC10940482] [DOI: 10.1007/s00345-024-04847-z]
Abstract
PURPOSE The study aimed to assess the efficacy of OpenAI's advanced AI model, ChatGPT, in diagnosing urological conditions, focusing on kidney stones. MATERIALS AND METHODS A set of 90 structured questions, compliant with EAU Guidelines 2023, was curated by seasoned urologists for this investigation. We evaluated ChatGPT's performance based on the accuracy and completeness of its responses to two types of questions [binary (true/false) and descriptive (multiple-choice)], stratified into difficulty levels: easy, moderate, and complex. Furthermore, we analyzed the model's learning and adaptability capacity by reassessing the initially incorrect responses after a 2-week interval. RESULTS The model demonstrated commendable accuracy, correctly answering 80% of binary questions (n = 45) and 93.3% of descriptive questions (n = 45). The model's performance showed no significant variation across different question difficulty levels, with p-values of 0.548 for accuracy and 0.417 for completeness, respectively. Upon reassessment of the 12 initially incorrect responses (9 binary, 3 descriptive) after two weeks, ChatGPT's accuracy showed substantial improvement: the mean accuracy score significantly increased from 1.58 ± 0.51 to 2.83 ± 0.93 (p = 0.004), underlining the model's ability to learn and adapt over time. CONCLUSION These findings highlight the potential of ChatGPT in urological diagnostics but also underscore areas requiring enhancement, especially in the completeness of responses to complex queries. The study endorses AI's incorporation into healthcare, while advocating for prudence and professional supervision in its application.
Affiliation(s)
- Gökhan Cil
- Department of Urology, Bagcilar Training and Research Hospital, University of Health Sciences, Istanbul, Turkey
- Kazim Dogan
- Department of Urology, Faculty of Medicine, Istinye University, Istanbul, Turkey
12. Raza MM, Venkatesh KP, Kvedar JC. Generative AI and large language models in health care: pathways to implementation. NPJ Digit Med 2024; 7:62. [PMID: 38454007] [PMCID: PMC10920625] [DOI: 10.1038/s41746-023-00988-4]
13. Wei Q, Yao Z, Cui Y, Wei B, Jin Z, Xu X. Evaluation of ChatGPT-generated medical responses: A systematic review and meta-analysis. J Biomed Inform 2024; 151:104620. [PMID: 38462064] [DOI: 10.1016/j.jbi.2024.104620]
Abstract
OBJECTIVE Large language models (LLMs) such as ChatGPT are increasingly explored in medical domains. However, the absence of standard guidelines for performance evaluation has led to methodological inconsistencies. This study aims to summarize the available evidence on evaluating ChatGPT's performance in answering medical questions and provide direction for future research. METHODS An extensive literature search was conducted on June 15, 2023, across ten medical databases. The keyword used was "ChatGPT," without restrictions on publication type, language, or date. Studies evaluating ChatGPT's performance in answering medical questions were included. Exclusions comprised review articles, comments, patents, non-medical evaluations of ChatGPT, and preprint studies. Data were extracted on general study characteristics, question sources, conversation processes, assessment metrics, and performance of ChatGPT. An evaluation framework for LLMs in medical inquiries was proposed by integrating insights from the selected literature. This study is registered with PROSPERO, CRD42023456327. RESULTS A total of 3520 articles were identified, of which 60 were reviewed and summarized in this paper and 17 were included in the meta-analysis. ChatGPT displayed an overall integrated accuracy of 56% (95% CI: 51%-60%, I² = 87%) in addressing medical queries. However, the studies varied in question resource, question-asking process, and evaluation metrics. As per our proposed evaluation framework, many studies failed to report methodological details, such as the date of inquiry, version of ChatGPT, and inter-rater consistency. CONCLUSION This review reveals ChatGPT's potential in addressing medical inquiries, but the heterogeneity of the study designs and insufficient reporting might affect the results' reliability. Our proposed evaluation framework provides insights for future study design and transparent reporting of LLMs in responding to medical questions.
Affiliation(s)
- Qiuhong Wei: Big Data Center for Children's Medical Care, Children's Hospital of Chongqing Medical University, Chongqing, China; Children Nutrition Research Center, Children's Hospital of Chongqing Medical University, Chongqing, China; National Clinical Research Center for Child Health and Disorders, Ministry of Education Key Laboratory of Child Development and Disorders, China International Science and Technology Cooperation Base of Child Development and Critical Disorders, Chongqing Key Laboratory of Child Neurodevelopment and Cognitive Disorders, Chongqing, China
- Zhengxiong Yao: Department of Neurology, Children's Hospital of Chongqing Medical University, Chongqing, China
- Ying Cui: Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Bo Wei: Department of Global Statistics and Data Science, BeiGene USA Inc., San Mateo, CA, USA
- Zhezhen Jin: Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, USA
- Ximing Xu: Big Data Center for Children's Medical Care, Children's Hospital of Chongqing Medical University, Chongqing, China
14. Abi-Rafeh J, Xu HH, Kazan R, Tevlin R, Furnas H. Large Language Models and Artificial Intelligence: A Primer for Plastic Surgeons on the Demonstrated and Potential Applications, Promises, and Limitations of ChatGPT. Aesthet Surg J 2024;44:329-343. PMID: 37562022. DOI: 10.1093/asj/sjad260.
Abstract
BACKGROUND The rapidly evolving field of artificial intelligence (AI) holds great potential for plastic surgeons. ChatGPT, a recently released AI large language model (LLM), promises applications across many disciplines, including healthcare. OBJECTIVES The aim of this article was to provide a primer for plastic surgeons on AI, LLMs, and ChatGPT, including an analysis of current demonstrated and proposed clinical applications. METHODS A systematic review was performed identifying medical and surgical literature on ChatGPT's proposed clinical applications. Variables assessed included applications investigated, command tasks provided, user input information, AI-emulated human skills, output validation, and reported limitations. RESULTS The analysis included 175 articles reporting on 13 plastic surgery applications and 116 additional clinical applications, categorized by field and purpose. Thirty-four applications within plastic surgery are thus proposed, with relevance to different target audiences, including attending plastic surgeons (n = 17, 50%), trainees/educators (n = 8, 24%), researchers/scholars (n = 7, 21%), and patients (n = 2, 6%). The 15 identified limitations of ChatGPT were categorized by training data, algorithm, and ethical considerations. CONCLUSIONS Widespread use of ChatGPT in plastic surgery will depend on rigorous research of proposed applications to validate performance and address limitations. This systematic review aims to guide research, development, and regulation to safely adopt AI in plastic surgery.
15. Ray PP. Striking a balance: embracing LLMs while upholding scientific integrity. Jpn J Radiol 2024;42:208-209. PMID: 37775670. DOI: 10.1007/s11604-023-01489-w.
16. Younis HA, Eisa TAE, Nasser M, Sahib TM, Noor AA, Alyasiri OM, Salisu S, Hayder IM, Younis HA. A Systematic Review and Meta-Analysis of Artificial Intelligence Tools in Medicine and Healthcare: Applications, Considerations, Limitations, Motivation and Challenges. Diagnostics (Basel) 2024;14:109. PMID: 38201418. PMCID: PMC10802884. DOI: 10.3390/diagnostics14010109.
Abstract
Artificial intelligence (AI) has emerged as a transformative force in various sectors, including medicine and healthcare. Large language models like ChatGPT showcase AI's potential by generating human-like text through prompts. ChatGPT's adaptability holds promise for reshaping medical practices, improving patient care, and enhancing interactions among healthcare professionals, patients, and data. In pandemic management, ChatGPT rapidly disseminates vital information. It serves as a virtual assistant in surgical consultations, supports dental practices, simplifies medical education, and assists in disease diagnosis. A systematic literature review using the PRISMA approach explored AI's transformative potential in healthcare, highlighting ChatGPT's versatile applications, limitations, motivation, and challenges. A total of 82 papers were categorised into eight major areas: G1: treatment and medicine, G2: buildings and equipment, G3: parts of the human body and areas of disease, G4: patients, G5: citizens, G6: cellular imaging, radiology, pulse and medical images, G7: doctors and nurses, and G8: tools, devices and administration. Balancing AI's role with human judgment remains a challenge. In conclusion, ChatGPT's diverse medical applications demonstrate its potential for innovation, and this study serves as a guide and a valuable resource for students, academics, and researchers in medicine and healthcare.
Affiliation(s)
- Hussain A. Younis: College of Education for Women, University of Basrah, Basrah 61004, Iraq
- Maged Nasser: Computer & Information Sciences Department, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Malaysia
- Thaeer Mueen Sahib: Kufa Technical Institute, Al-Furat Al-Awsat Technical University, Kufa 54001, Iraq
- Ameen A. Noor: Computer Science Department, College of Education, University of Almustansirya, Baghdad 10045, Iraq
- Sani Salisu: Department of Information Technology, Federal University Dutse, Dutse 720101, Nigeria
- Israa M. Hayder: Qurna Technique Institute, Southern Technical University, Basrah 61016, Iraq
- Hameed AbdulKareem Younis: Department of Cybersecurity, College of Computer Science and Information Technology, University of Basrah, Basrah 61016, Iraq
17. Khangembam BC. ChatGPT in Nuclear Medicine: Expanding Possibilities and Navigating Challenges. Indian J Nucl Med 2024;39:69-70. PMID: 38817717. PMCID: PMC11135365. DOI: 10.4103/ijnm.ijnm_67_23.
18. Bayani A, Ayotte A, Nikiema JN. Automated Credibility Assessment of Web-Based Health Information Considering Health on the Net Foundation Code of Conduct (HONcode): Model Development and Validation Study. JMIR Form Res 2023;7:e52995. PMID: 38133919. PMCID: PMC10770789. DOI: 10.2196/52995.
Abstract
BACKGROUND An increasing number of users are turning to web-based sources for health care guidance. Trustworthy sources of information should therefore be automatically identifiable using objective criteria. OBJECTIVE The purpose of this study was to automate the assessment of the Health on the Net Foundation Code of Conduct (HONcode) criteria, enhancing our ability to pinpoint trustworthy health information sources. METHODS A data set of 538 web pages displaying health content was collected from 43 health-related websites. HONcode criteria were assessed at the website level and at the web page level. For the website-level criteria (confidentiality, transparency, financial disclosure, and advertising policy), a bag of keywords was identified and the criteria were assessed using a rule-based model. For the web page-level criteria (authority, complementarity, justifiability, and attribution), several machine learning (ML) approaches were used. In total, 200 web pages were manually annotated until a balanced representation in terms of frequency was achieved. Three ML models (random forest, support vector machines (SVM), and Bidirectional Encoder Representations from Transformers (BERT)) were trained on the initial annotated data. A second training step was implemented for the complementarity criterion, using the BERT model for multiclass classification of the complementarity sentences obtained by annotation and data augmentation (positive, negative, and noncommittal sentences). Finally, the remaining web pages were classified using the selected model, and 100 sentences were randomly selected for manual review. RESULTS For the web page-level criteria, the random forest model performed well on the attribution criterion but subpar on the others. BERT and SVM performed consistently across all the criteria. BERT achieved areas under the curve (AUC) of 0.96, 0.98, and 1.00 for neutral sentences, justifiability, and attribution, respectively. SVM had the best overall performance for the classification of complementarity (AUC = 0.98), and SVM and BERT had an equal AUC of 0.98 for the authority criterion. For the website-level criteria, the rule-based model retrieved web pages with an accuracy of 0.97 for confidentiality, 0.82 for transparency, and 0.51 for both financial disclosure and advertising policy. The final evaluation of the sentences yielded a precision of 0.88, and the inter-reviewer agreement was 0.82. CONCLUSIONS Our results showed the potential of automating HONcode criteria assessment using ML approaches. This approach could be used with different types of pretrained models to accelerate text annotation and classification and to improve performance in low-resource cases. Further work is needed to determine how to assign different weights to the criteria, and to identify additional characteristics that should be considered for consolidating these criteria into a comprehensive reliability score.
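The website-level assessment described above is a bag-of-keywords rule-based model. A minimal sketch of the idea follows; the keyword lists here are invented placeholders for illustration, not the study's actual keyword bags:

```python
# Hypothetical keyword bags, one per website-level HONcode criterion.
# The real study derived its own keyword lists from annotated websites.
CRITERIA_KEYWORDS = {
    "confidentiality": ["privacy policy", "confidential", "personal data"],
    "transparency": ["contact us", "about us", "webmaster"],
    "financial_disclosure": ["funded by", "sponsor", "grant"],
    "advertising_policy": ["advertising policy", "advertisement", "sponsored content"],
}

def assess_website(text):
    """Flag which website-level criteria a page's text appears to satisfy.

    A criterion counts as satisfied if any of its keywords occurs in the
    lower-cased page text; returns {criterion: bool}.
    """
    lowered = text.lower()
    return {
        criterion: any(keyword in lowered for keyword in keywords)
        for criterion, keywords in CRITERIA_KEYWORDS.items()
    }
```

The appeal of such a rule-based layer is that site-wide policy pages tend to use stereotyped language, so simple keyword matching can reach high accuracy for criteria like confidentiality, while the more nuanced page-level criteria need trained classifiers.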
Affiliation(s)
- Azadeh Bayani: Centre de recherche en santé publique, Université de Montréal et Centre intégré universitaire de santé et de services sociaux du Centre-Sud-de-l'Île-de-Montréal, Montréal, QC, Canada; Laboratoire Transformation Numérique en Santé, Montreal, QC, Canada
- Alexandre Ayotte: Centre de recherche en santé publique, Université de Montréal et Centre intégré universitaire de santé et de services sociaux du Centre-Sud-de-l'Île-de-Montréal, Montréal, QC, Canada; Laboratoire Transformation Numérique en Santé, Montreal, QC, Canada
- Jean Noel Nikiema: Centre de recherche en santé publique, Université de Montréal et Centre intégré universitaire de santé et de services sociaux du Centre-Sud-de-l'Île-de-Montréal, Montréal, QC, Canada; Laboratoire Transformation Numérique en Santé, Montreal, QC, Canada; Department of Management, Evaluation and Health Policy, School of Public Health, Université de Montréal, Montréal, QC, Canada
19. Pushpanathan K, Lim ZW, Er Yew SM, Chen DZ, Hui'En Lin HA, Lin Goh JH, Wong WM, Wang X, Jin Tan MC, Chang Koh VT, Tham YC. Popular large language model chatbots' accuracy, comprehensiveness, and self-awareness in answering ocular symptom queries. iScience 2023;26:108163. PMID: 37915603. PMCID: PMC10616302. DOI: 10.1016/j.isci.2023.108163.
Abstract
In light of growing interest in using emerging large language models (LLMs) for self-diagnosis, we systematically assessed the performance of ChatGPT-3.5, ChatGPT-4.0, and Google Bard in delivering proficient responses to 37 common inquiries regarding ocular symptoms. Responses were masked, randomly shuffled, and then graded by three consultant-level ophthalmologists for accuracy (poor, borderline, good) and comprehensiveness. Additionally, we evaluated the self-awareness capabilities (ability to self-check and self-correct) of the LLM-Chatbots. 89.2% of ChatGPT-4.0 responses were 'good'-rated, outperforming ChatGPT-3.5 (59.5%) and Google Bard (40.5%) significantly (all p < 0.001). All three LLM-Chatbots showed optimal mean comprehensiveness scores as well (ranging from 4.6 to 4.7 out of 5). However, they exhibited subpar to moderate self-awareness capabilities. Our study underscores the potential of ChatGPT-4.0 in delivering accurate and comprehensive responses to ocular symptom inquiries. Future rigorous validation of their performance is crucial to ensure their reliability and appropriateness for actual clinical use.
Affiliation(s)
- Krithi Pushpanathan: Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Zhi Wei Lim: Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Samantha Min Er Yew: Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- David Ziyou Chen: Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Hazel Anne Hui'En Lin: Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Jocelyn Hui Lin Goh: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Wendy Meihua Wong: Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Xiaofei Wang: Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing, China; Advanced Innovation Centre for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Marcus Chun Jin Tan: Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Victor Teck Chang Koh: Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Department of Ophthalmology, National University Hospital, Singapore
- Yih-Chung Tham: Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Ophthalmology and Visual Sciences Academic Clinical Programme (Eye ACP), Duke NUS Medical School, Singapore
20. Hirata K, Kamagata K, Ueda D, Yanagawa M, Kawamura M, Nakaura T, Ito R, Tatsugami F, Matsui Y, Yamada A, Fushimi Y, Nozaki T, Fujita S, Fujioka T, Tsuboyama T, Fujima N, Naganawa S. From FDG and beyond: the evolving potential of nuclear medicine. Ann Nucl Med 2023;37:583-595. PMID: 37749301. DOI: 10.1007/s12149-023-01865-6.
Abstract
The radiopharmaceutical 2-[fluorine-18]fluoro-2-deoxy-D-glucose (FDG) has dominated positron emission tomography (PET) imaging for over 20 years, and owing to its vast utility its applications have expanded, and continue to expand, into oncology, neurology, cardiology, and infectious/inflammatory diseases. More recently, the addition of artificial intelligence (AI) has enhanced nuclear medicine diagnosis and imaging with FDG-PET, and new radiopharmaceuticals such as prostate-specific membrane antigen (PSMA) ligands and fibroblast activation protein inhibitors (FAPI) have emerged. Nuclear medicine therapy using agents such as [177Lu]-DOTATATE surpasses conventional treatments in terms of efficacy and side effects. This article reviews recently established evidence for FDG and non-FDG agents and anticipates the future trajectory of nuclear medicine.
Affiliation(s)
- Kenji Hirata: Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita 15, Nishi 7, Kita-ku, Sapporo, Hokkaido, 060-8638, Japan
- Koji Kamagata: Department of Radiology, Juntendo University Graduate School of Medicine, Bunkyo-ku, Tokyo, 113-8421, Japan
- Daiju Ueda: Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, 1-4-3 Asahi-machi, Abeno-ku, Osaka, 545-8585, Japan
- Masahiro Yanagawa: Department of Radiology, Osaka University Graduate School of Medicine, Suita, Osaka, 565-0871, Japan
- Mariko Kawamura: Department of Radiology, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, Aichi, 466-8550, Japan
- Takeshi Nakaura: Department of Diagnostic Radiology, Kumamoto University Graduate School of Medicine, 1-1-1 Honjo Chuo-ku, Kumamoto, 860-8556, Japan
- Rintaro Ito: Department of Radiology, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, Aichi, 466-8550, Japan
- Fuminari Tatsugami: Department of Diagnostic Radiology, Hiroshima University, 1-2-3 Kasumi, Minami-ku, Hiroshima, 734-8551, Japan
- Yusuke Matsui: Department of Radiology, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, 2-5-1 Shikata-cho, Kita-ku, Okayama, 700-8558, Japan
- Akira Yamada: Department of Radiology, Shinshu University School of Medicine, 3-1-1 Asahi, Matsumoto, Nagano, 390-2621, Japan
- Yasutaka Fushimi: Department of Diagnostic Imaging and Nuclear Medicine, Kyoto University Graduate School of Medicine, 54 Shogoin Kawahara-cho, Sakyo-ku, Kyoto, 606-8507, Japan
- Taiki Nozaki: Department of Radiology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-ku, Tokyo, 160-0016, Japan
- Shohei Fujita: Department of Radiology, Graduate School of Medicine and Faculty of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Tomoyuki Fujioka: Department of Diagnostic Radiology, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo, 113-8519, Japan
- Takahiro Tsuboyama: Department of Radiology, Osaka University Graduate School of Medicine, Suita, Osaka, 565-0871, Japan
- Noriyuki Fujima: Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, N15, W5, Kita-ku, Sapporo, 060-8638, Japan
- Shinji Naganawa: Department of Radiology, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, Aichi, 466-8550, Japan
21. Levkovich I, Elyoseph Z. Suicide Risk Assessments Through the Eyes of ChatGPT-3.5 Versus ChatGPT-4: Vignette Study. JMIR Ment Health 2023;10:e51232. PMID: 37728984. PMCID: PMC10551796. DOI: 10.2196/51232.
Abstract
BACKGROUND ChatGPT, a linguistic artificial intelligence (AI) model engineered by OpenAI, offers prospective contributions to mental health professionals. Although it has significant theoretical implications, ChatGPT's practical capabilities, particularly regarding suicide prevention, have not yet been substantiated. OBJECTIVE The study's aim was to evaluate ChatGPT's ability to assess suicide risk, taking into consideration 2 discernible factors (perceived burdensomeness and thwarted belongingness) over a 2-month period. In addition, we evaluated whether ChatGPT-4 evaluated suicide risk more accurately than ChatGPT-3.5. METHODS ChatGPT was tasked with assessing a vignette that depicted a hypothetical patient exhibiting differing degrees of perceived burdensomeness and thwarted belongingness. The assessments generated by ChatGPT were subsequently contrasted with standard evaluations rendered by mental health professionals. Using both ChatGPT-3.5 and ChatGPT-4 (May 24, 2023), we executed 3 evaluative procedures in June and July 2023. Our intent was to scrutinize ChatGPT-4's proficiency in assessing various facets of suicide risk relative to the evaluative abilities of both mental health professionals and an earlier version of ChatGPT-3.5 (March 14 version). RESULTS During June and July 2023, we found that the likelihood of suicide attempts as evaluated by ChatGPT-4 was similar to the norms of mental health professionals (n=379) under all conditions (average Z score of 0.01). Nonetheless, a pronounced discrepancy was observed in the assessments performed by ChatGPT-3.5 (May version), which markedly underestimated the potential for suicide attempts in comparison to the assessments carried out by the mental health professionals (average Z score of -0.83). ChatGPT-4's evaluation of the incidence of suicidal ideation and psychache was higher than that of the mental health professionals (average Z scores of 0.47 and 1.00, respectively). Conversely, the level of resilience as assessed by both ChatGPT-4 and ChatGPT-3.5 (both versions) was lower than the assessments offered by mental health professionals (average Z scores of -0.89 and -0.90, respectively). CONCLUSIONS The findings suggest that ChatGPT-4 estimates the likelihood of suicide attempts in a manner akin to evaluations provided by professionals. In terms of recognizing suicidal ideation, ChatGPT-4 appears to be more precise. However, regarding psychache, there was an observed overestimation by ChatGPT-4, indicating a need for further research. These results have implications for ChatGPT-4's potential to support gatekeepers, patients, and even mental health professionals' decision-making. Despite the clinical potential, intensive follow-up studies are necessary to establish the use of ChatGPT-4's capabilities in clinical practice. The finding that ChatGPT-3.5 frequently underestimates suicide risk, especially in severe cases, is particularly troubling: it indicates that ChatGPT may downplay a person's actual suicide risk level.
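The Z scores in this abstract standardize ChatGPT's vignette ratings against the distribution of the professionals' ratings. A minimal sketch of that comparison (the numbers in the usage note are invented; the study's norms came from n=379 professionals):

```python
from statistics import mean, stdev

def z_score(model_rating, professional_ratings):
    """Standardize a model's rating against the distribution of
    professional raters' scores for the same vignette condition.

    A Z score near 0 means the model rates like the professional norm;
    negative values indicate underestimation relative to professionals.
    """
    return (model_rating - mean(professional_ratings)) / stdev(professional_ratings)
```

For example, if professionals rated a vignette [5, 6, 7, 6, 6] and the model rated it 7, the model sits about 1.41 sample standard deviations above the professional mean; averaging such scores across conditions gives summary values like the 0.01 and -0.83 reported above.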
Affiliation(s)
- Inbar Levkovich: Oranim Academic College, Faculty of Graduate Studies, Kiryat Tivon, Israel
- Zohar Elyoseph: Department of Psychology and Educational Counseling, The Center for Psychobiological Research, Max Stern Yezreel Valley College, Emek Yezreel, Israel; Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
22. Ordak M. ChatGPT's Skills in Statistical Analysis Using the Example of Allergology: Do We Have Reason for Concern? Healthcare (Basel) 2023;11:2554. PMID: 37761751. PMCID: PMC10530997. DOI: 10.3390/healthcare11182554.
Abstract
BACKGROUND Content generated by artificial intelligence is sometimes not truthful. To date, there have been a number of medical studies on the validity of ChatGPT's responses; however, studies addressing the various aspects of statistical analysis are lacking. The aim of this study was to assess the validity of the answers provided by ChatGPT in relation to statistical analysis, and to identify recommendations to be implemented in the future in light of the results obtained. METHODS The study was divided into four parts and was based on the exemplary medical field of allergology. The first part consisted of asking ChatGPT 30 different questions related to statistical analysis. The next five questions asked ChatGPT to perform the relevant statistical analyses, and another five asked ChatGPT to indicate which statistical test should be applied to articles accepted for publication in Allergy. The final part involved asking ChatGPT the same statistical question three times. RESULTS Of the 40 general questions related to broad statistical analysis, ChatGPT did not fully answer half. Assumptions necessary for the application of specific statistical tests were not included. ChatGPT also gave completely divergent answers to one question about which test should be used. CONCLUSION The answers provided by ChatGPT to various statistical questions may give rise to the use of inappropriate statistical tests and, consequently, to misinterpretation of the research results obtained. Questions asked in this regard need to be framed more precisely.
Affiliation(s)
- Michal Ordak: Department of Pharmacotherapy and Pharmaceutical Care, Faculty of Pharmacy, Medical University of Warsaw, Banacha 1 Str., 02-097 Warsaw, Poland
23. Chatterjee S, Bhattacharya M, Lee SS, Chakraborty C. Can artificial intelligence-strengthened ChatGPT or other large language models transform nucleic acid research? Mol Ther Nucleic Acids 2023;33:205-207. PMID: 37727444. PMCID: PMC10505907. DOI: 10.1016/j.omtn.2023.06.019.
Affiliation(s)
- Srijan Chatterjee: Institute for Skeletal Aging & Orthopaedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, Gangwon-do 24252, Republic of Korea
- Manojit Bhattacharya: Department of Zoology, Fakir Mohan University, Vyasa Vihar, Balasore, Odisha 756020, India
- Sang-Soo Lee: Institute for Skeletal Aging & Orthopaedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, Gangwon-do 24252, Republic of Korea
- Chiranjib Chakraborty: Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, West Bengal 700126, India
24. Currie GM. Academic integrity and artificial intelligence: is ChatGPT hype, hero or heresy? Semin Nucl Med 2023;53:719-730. PMID: 37225599. DOI: 10.1053/j.semnuclmed.2023.04.008.
Abstract
Academic integrity in both higher education and scientific writing has been challenged by developments in artificial intelligence. The limitations associated with earlier algorithms have been largely overcome by the recently released ChatGPT, a chatbot powered by GPT-3.5 that is capable of producing accurate and human-like responses to questions in real time. Despite the potential benefits, ChatGPT confronts significant limitations to its usefulness in nuclear medicine and radiology. Most notably, ChatGPT is prone to errors and fabrication of information, which poses a risk to professionalism, ethics, and integrity. These limitations simultaneously undermine the value of ChatGPT to the user by not producing outcomes at the expected standard. Nonetheless, there are a number of exciting applications of ChatGPT in nuclear medicine across the education, clinical, and research sectors. Assimilation of ChatGPT into practice requires a redefining of norms and a re-engineering of information expectations.
Affiliation(s)
- Geoffrey M Currie: Charles Sturt University, Wagga Wagga, NSW, Australia; Baylor College of Medicine, Houston, TX
25. Li W, Zhang Y, Chen F. ChatGPT in Colorectal Surgery: A Promising Tool or a Passing Fad? Ann Biomed Eng 2023;51:1892-1897. PMID: 37162695. DOI: 10.1007/s10439-023-03232-y.
Abstract
Colorectal surgery is a specialized branch of surgery that involves the diagnosis and treatment of conditions affecting the colon, rectum, and anus. In recent years, the use of artificial intelligence (AI) has gained considerable interest in various medical specialties, including surgery. Chatbot Generative Pre-Trained Transformer (ChatGPT), an AI-based chatbot developed by OpenAI, has shown great potential for improving the quality of healthcare delivery by providing accurate and timely information to both patients and healthcare professionals. In this paper, we investigate the potential application of ChatGPT in colorectal surgery. We also discuss the potential advantages and challenges associated with implementing ChatGPT in the surgical setting. Furthermore, we address the socio-ethical implications of utilizing ChatGPT in healthcare, including concerns over patient privacy, liability, and the potential impact on the doctor-patient relationship. Our findings suggest that ChatGPT has the potential to revolutionize the field of colorectal surgery by providing personalized and precise medical information, reducing errors and complications, and improving patient outcomes.
Affiliation(s)
- Wenbo Li
- Department of Nursing, Jinzhou Medical University, Jinzhou, China
- Yinxu Zhang
- Department of Colorectal Surgery, The First Affiliated Hospital, Jinzhou Medical University, Jinzhou, 121001, China
- Fengmin Chen
- Department of Colorectal Surgery, The First Affiliated Hospital, Jinzhou Medical University, Jinzhou, 121001, China.
26
Watters C, Lemanski MK. Universal skepticism of ChatGPT: a review of early literature on chat generative pre-trained transformer. Front Big Data 2023; 6:1224976. [PMID: 37680954 PMCID: PMC10482048 DOI: 10.3389/fdata.2023.1224976] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2023] [Accepted: 07/10/2023] [Indexed: 09/09/2023] Open
Abstract
ChatGPT, a new language model developed by OpenAI, has garnered significant attention in various fields since its release. This literature review provides an overview of early ChatGPT literature across multiple disciplines, exploring its applications, limitations, and ethical considerations. The review encompasses Scopus-indexed publications from November 2022 to April 2023 and includes 156 articles related to ChatGPT. The findings reveal a predominance of negative sentiment across disciplines, though subject-specific attitudes must be considered. The review highlights the implications of ChatGPT in many fields, including healthcare, raising concerns about employment opportunities and ethical considerations. While ChatGPT holds promise for improved communication, further research is needed to address its capabilities and limitations. This literature review provides insights into early research on ChatGPT, informing future investigations and practical applications of chatbot technology, as well as the development and use of generative AI.
Affiliation(s)
- Casey Watters
- Faculty of Law, Bond University, Gold Coast, QLD, Australia
27
Perera Molligoda Arachchige AS. Large language models (LLM) and ChatGPT: a medical student perspective. Eur J Nucl Med Mol Imaging 2023; 50:2248-2249. [PMID: 37046082 DOI: 10.1007/s00259-023-06227-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2023] [Accepted: 04/04/2023] [Indexed: 04/14/2023]
28
Srivastav S, Chandrakar R, Gupta S, Babhulkar V, Agrawal S, Jaiswal A, Prasad R, Wanjari MB. ChatGPT in Radiology: The Advantages and Limitations of Artificial Intelligence for Medical Imaging Diagnosis. Cureus 2023; 15:e41435. [PMID: 37546142 PMCID: PMC10404120 DOI: 10.7759/cureus.41435] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2023] [Accepted: 07/06/2023] [Indexed: 08/08/2023] Open
Abstract
This review article provides an overview of the use of artificial intelligence (AI) in radiology and discusses the advantages and limitations of ChatGPT, a large language model, for medical imaging diagnosis. ChatGPT has shown great promise in improving the accuracy and efficiency of radiological diagnoses by reducing interpretation variability and errors and improving workflow efficiency. However, there are also limitations, including the need for high-quality training data, ethical considerations, and further research and development to improve its performance and usability. Despite these challenges, ChatGPT has the potential to significantly impact radiology and medical imaging diagnosis. The review article highlights the need for continued research and development, coupled with ethical and regulatory considerations, to ensure that ChatGPT is used to its full potential in improving radiological diagnoses and patient care.
Affiliation(s)
- Samriddhi Srivastav
- Medicine, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education & Research, Wardha, IND
- Rashi Chandrakar
- Medicine, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education & Research, Wardha, IND
- Shalvi Gupta
- Surgery, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education & Research, Wardha, IND
- Vaishnavi Babhulkar
- Medicine, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education & Research, Wardha, IND
- Sristy Agrawal
- Medicine, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education & Research, Wardha, IND
- Arpita Jaiswal
- Obstetrics and Gynaecology, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education & Research, Wardha, IND
- Roshan Prasad
- Medicine and Surgery, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education & Research, Wardha, IND
- Mayur B Wanjari
- Research and Development, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education & Research, Wardha, IND
29
Fan W, Zhang J, Wang N, Li J, Hu L. The Application of Deep Learning on CBCT in Dentistry. Diagnostics (Basel) 2023; 13:2056. [PMID: 37370951 DOI: 10.3390/diagnostics13122056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2023] [Revised: 06/06/2023] [Accepted: 06/12/2023] [Indexed: 06/29/2023] Open
Abstract
Cone beam computed tomography (CBCT) has become an essential tool in modern dentistry, allowing dentists to analyze the relationship between teeth and the surrounding tissues. However, traditional manual analysis can be time-consuming, and its accuracy depends on the user's proficiency. To address these limitations, deep learning (DL) systems have been integrated into CBCT analysis to improve accuracy and efficiency. Numerous DL models have been developed for tasks such as automatic diagnosis; segmentation and classification of the teeth, inferior alveolar nerve, bone, and airway; and preoperative planning. All research articles summarized were from PubMed, IEEE, Google Scholar, and Web of Science up to December 2022. Many studies have demonstrated that the application of deep learning technology to CBCT examination in dentistry has achieved significant progress, and its accuracy in radiology image analysis has reached the level of clinicians. However, in some fields, its accuracy still needs to be improved. Furthermore, ethical issues and differences between CBCT devices may prohibit its extensive use. DL models have the potential to be used clinically as medical decision-making aids, and the combination of DL and CBCT can greatly reduce the workload of image reading. This review provides an up-to-date overview of the current applications of DL on CBCT images in dentistry, highlighting its potential and suggesting directions for future research.
Affiliation(s)
- Wenjie Fan
- Department of Stomatology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- School of Stomatology, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Jiaqi Zhang
- Department of Stomatology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- School of Stomatology, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Nan Wang
- Department of Stomatology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- School of Stomatology, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Jia Li
- Department of Stomatology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- School of Stomatology, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Li Hu
- Department of Stomatology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- School of Stomatology, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
30
Komorowski M, Del Pilar Arias López M, Chang AC. How could ChatGPT impact my practice as an intensivist? An overview of potential applications, risks and limitations. Intensive Care Med 2023:10.1007/s00134-023-07096-7. [PMID: 37256340 DOI: 10.1007/s00134-023-07096-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2023] [Accepted: 05/05/2023] [Indexed: 06/01/2023]
Affiliation(s)
- Matthieu Komorowski
- Division of Anaesthetics, Pain Medicine, and Intensive Care, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, SW7 2AZ, UK.
- Intensive Care Unit, Charing Cross Hospital, Fulham Palace Road, London, W6 8RF, UK.
- Maria Del Pilar Arias López
- Hospital de Niños Ricardo Gutierrez, Intermediate Care Unit, Gallo 1330, C1425EFD, Buenos Aires, Argentina
- Argentina Society of Intensive Care, SATI-Q Paediatric Program. Av. Cnel, Niceto Vega 4617, C1414BEA, Buenos Aires, Argentina
- Anthony C Chang
- Children's Hospital of Orange County Sharon Disney Lund Medical Intelligence and Innovation Institute, 1201 W. La Veta Ave, Orange, CA, 92868-3874, USA
31
Alhaidry HM, Fatani B, Alrayes JO, Almana AM, Alfhaed NK. ChatGPT in Dentistry: A Comprehensive Review. Cureus 2023; 15:e38317. [PMID: 37266053 PMCID: PMC10230850 DOI: 10.7759/cureus.38317] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/29/2023] [Indexed: 06/03/2023] Open
Abstract
Chat generative pre-trained transformer (ChatGPT) is an artificial intelligence chatbot that uses natural language processing to respond to human input in a conversational manner. ChatGPT has numerous applications in the health care system, including dentistry; it is used in diagnosis and for assessing disease risk and scheduling appointments. It also has a role in scientific research. In the dental field, it has provided many benefits, such as detecting dental and maxillofacial abnormalities on panoramic radiographs and identifying different dental restorations, thereby helping to decrease the workload. But even with these benefits, one should take into consideration the risks and limitations of this chatbot. Few articles have addressed the use of ChatGPT in dentistry. This comprehensive review represents data collected from 66 relevant articles using PubMed and Google Scholar as databases, and aims to discuss all relevant published articles on the use of ChatGPT in dentistry.
Affiliation(s)
- Hind M Alhaidry
- Advanced General Dentistry, Prince Sultan Military Medical City, Riyadh, SAU
- Bader Fatani
- Dentistry, College of Dentistry, King Saud University, Riyadh, SAU
- Jenan O Alrayes
- Dentistry, College of Dentistry, King Saud University, Riyadh, SAU
- Nawaf K Alfhaed
- Dentistry, College of Dentistry, King Saud University, Riyadh, SAU
32
Sallam M. ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns. Healthcare (Basel) 2023; 11:887. [PMID: 36981544 PMCID: PMC10048148 DOI: 10.3390/healthcare11060887] [Citation(s) in RCA: 517] [Impact Index Per Article: 517.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2023] [Revised: 03/17/2023] [Accepted: 03/17/2023] [Indexed: 03/22/2023] Open
Abstract
ChatGPT is an artificial intelligence (AI)-based conversational large language model (LLM). The potential applications of LLMs in health care education, research, and practice could be promising if the associated valid concerns are proactively examined and addressed. The current systematic review aimed to investigate the utility of ChatGPT in health care education, research, and practice and to highlight its potential limitations. Using the PRISMA guidelines, a systematic search was conducted to retrieve English records in PubMed/MEDLINE and Google Scholar (published research or preprints) that examined ChatGPT in the context of health care education, research, or practice. A total of 60 records were eligible for inclusion. Benefits of ChatGPT were cited in 51/60 (85.0%) records and included: (1) improved scientific writing and enhancing research equity and versatility; (2) utility in health care research (efficient analysis of datasets, code generation, literature reviews, saving time to focus on experimental design, and drug discovery and development); (3) benefits in health care practice (streamlining the workflow, cost saving, documentation, personalized medicine, and improved health literacy); and (4) benefits in health care education, including improved personalized learning and the focus on critical thinking and problem-based learning. Concerns regarding ChatGPT use were stated in 58/60 (96.7%) records, including ethical, copyright, transparency, and legal issues, the risk of bias, plagiarism, lack of originality, inaccurate content with risk of hallucination, limited knowledge, incorrect citations, cybersecurity issues, and risk of infodemics. The promising applications of ChatGPT can induce paradigm shifts in health care education, research, and practice. However, the embrace of this AI chatbot should be conducted with extreme caution considering its potential limitations. As it currently stands, ChatGPT does not qualify to be listed as an author in scientific articles unless the ICMJE/COPE guidelines are revised or amended. An initiative involving all stakeholders in health care education, research, and practice is urgently needed to set a code of ethics guiding the responsible use of ChatGPT and other LLMs in health care and academia.
Affiliation(s)
- Malik Sallam
- Department of Pathology, Microbiology and Forensic Medicine, School of Medicine, The University of Jordan, Amman 11942, Jordan
- Department of Clinical Laboratories and Forensic Medicine, Jordan University Hospital, Amman 11942, Jordan
33
Čartolovni A, Malešević A, Poslon L. Critical analysis of the AI impact on the patient-physician relationship: A multi-stakeholder qualitative study. Digit Health 2023; 9:20552076231220833. [PMID: 38130798 PMCID: PMC10734361 DOI: 10.1177/20552076231220833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2023] [Accepted: 11/29/2023] [Indexed: 12/23/2023] Open
Abstract
Objective: This qualitative study aims to present the aspirations, expectations and critical analysis of the potential for artificial intelligence (AI) to transform the patient-physician relationship, according to multi-stakeholder insight. Methods: The study was conducted from June to December 2021, using an anticipatory ethics approach and the sociology of expectations as the theoretical frameworks. It focused on three groups of stakeholders directly involved in the adoption of AI in medicine (n = 38): physicians (n = 12), patients (n = 15) and healthcare managers (n = 11). Results: Patients made up 40% of the interviewed sample (15/38), physicians 31% (12/38) and healthcare managers 29% (11/38). The findings highlight the following: (1) the impact of AI on fundamental aspects of the patient-physician relationship and the underlying importance of a synergistic relationship between the physician and AI; (2) the potential for AI to alleviate workload and reduce administrative burden by saving time and putting the patient at the centre of the caring process; and (3) the potential risk to the holistic approach by neglecting humanness in healthcare. Conclusions: This multi-stakeholder qualitative study, which focused on the micro-level of healthcare decision-making, sheds new light on the impact of AI on healthcare and the potential transformation of the patient-physician relationship. The results highlight the need for a critically aware approach to the implementation of AI in healthcare, applying critical thinking and reasoning: it is important not to rely solely upon the recommendations of AI while neglecting clinical reasoning and physicians' knowledge of best clinical practices. Instead, it is vital that the core values of the existing patient-physician relationship - such as trust and honesty, conveyed through open and sincere communication - are preserved.
Affiliation(s)
- Anto Čartolovni
- Digital Healthcare Ethics Laboratory (Digit-HeaL), Catholic University of Croatia, Zagreb, Croatia
- School of Medicine, Catholic University of Croatia, Zagreb, Croatia
- Anamaria Malešević
- Digital Healthcare Ethics Laboratory (Digit-HeaL), Catholic University of Croatia, Zagreb, Croatia
- Luka Poslon
- Digital Healthcare Ethics Laboratory (Digit-HeaL), Catholic University of Croatia, Zagreb, Croatia