1
Kuerbanjiang W, Peng S, Jiamaliding Y, Yi Y. Performance Evaluation of Large Language Models in Cervical Cancer Management Based on a Standardized Questionnaire: Comparative Study. J Med Internet Res 2025; 27:e63626. [PMID: 39908540 DOI: 10.2196/63626]
Abstract
BACKGROUND Cervical cancer remains the fourth leading cause of death among women globally, with a particularly severe burden in low-resource settings. A comprehensive approach, from screening through diagnosis and treatment, is essential for effective prevention and management. Large language models (LLMs) have emerged as potential tools to support health care, though their specific role in cervical cancer management remains underexplored. OBJECTIVE This study aims to systematically evaluate the performance and interpretability of LLMs in cervical cancer management. METHODS Models were selected from the AlpacaEval leaderboard version 2.0, subject to local hardware constraints. The questions input to the models covered general knowledge, screening, diagnosis, and treatment, in accordance with guidelines. The prompt was developed using the Context, Objective, Style, Tone, Audience, and Response (CO-STAR) framework. Responses were evaluated for accuracy, guideline compliance, clarity, and practicality, graded A, B, C, or D with corresponding scores of 3, 2, 1, and 0. The effective rate was calculated as the ratio of A and B responses to the total number of designed questions. Local Interpretable Model-Agnostic Explanations (LIME) was used to explain model outputs and enhance physicians' trust in them within the medical context. RESULTS Nine models were included in this study, and a set of 100 standardized questions covering general information, screening, diagnosis, and treatment was designed based on international and national guidelines. Seven models (ChatGPT-4.0 Turbo, Claude 2, Gemini Pro, Mistral-7B-v0.2, Starling-LM-7B alpha, HuatuoGPT, and BioMedLM 2.7B) provided stable responses. Among all included models, ChatGPT-4.0 Turbo ranked first, with a mean score of 2.67 (95% CI 2.54-2.80; effective rate 94.00%) with a prompt and 2.52 (95% CI 2.37-2.67; effective rate 87.00%) without one, outperforming the other 8 models (P<.001). Regardless of prompts, QiZhenGPT consistently ranked among the lowest-performing models, with P<.01 in comparisons against all models except BioMedLM. Interpretability analysis showed that prompts improved alignment with human annotations for proprietary models (median intersection over union 0.43), while medical-specialized models exhibited limited improvement. CONCLUSIONS Proprietary LLMs, particularly ChatGPT-4.0 Turbo and Claude 2, show promise in clinical decision-making involving logical analysis. Prompts can enhance the accuracy of some models in cervical cancer management to varying degrees. Medical-specialized models, such as HuatuoGPT and BioMedLM, did not perform as well as expected in this study. By contrast, proprietary models, particularly those augmented with prompts, demonstrated notable accuracy and interpretability in medical tasks such as cervical cancer management. However, this study underscores the need for further research to explore the practical application of LLMs in medical practice.
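The interpretability analysis above compares LIME-highlighted tokens with human annotations via intersection over union (IoU). As a rough illustration of the metric itself (the token sets below are hypothetical, not taken from the study):

```python
def token_iou(model_tokens, human_tokens):
    """Intersection over union between two sets of highlighted tokens:
    |A ∩ B| / |A ∪ B|, ranging from 0 (no overlap) to 1 (identical)."""
    a, b = set(model_tokens), set(human_tokens)
    if not a and not b:
        return 0.0  # convention: nothing highlighted on either side
    return len(a & b) / len(a | b)

# Hypothetical example: tokens a LIME explanation highlighted in a model
# answer vs. the tokens a physician marked as clinically relevant.
lime_tokens = ["HPV", "screening", "cytology", "age"]
physician_tokens = ["HPV", "cytology", "colposcopy", "age"]
print(token_iou(lime_tokens, physician_tokens))  # 3 shared / 5 total = 0.6
```

An IoU of 0.43, as reported for prompted proprietary models, would mean a little under half of the union of highlighted tokens was shared between model explanation and human annotation.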
Affiliation(s)
- Shengzhe Peng
- Department of Gynecology, Zhongnan Hospital of Wuhan University, Wuhan, Hubei Province, China
- Yuexiong Yi
- Department of Gynecology, Zhongnan Hospital of Wuhan University, Wuhan, Hubei Province, China
2
Barbosa-Silva J, Driusso P, Ferreira EA, de Abreu RM. Exploring the Efficacy of Artificial Intelligence: A Comprehensive Analysis of CHAT-GPT's Accuracy and Completeness in Addressing Urinary Incontinence Queries. Neurourol Urodyn 2025; 44:153-164. [PMID: 39390731 DOI: 10.1002/nau.25603]
Abstract
BACKGROUND Artificial intelligence models are increasingly gaining popularity among patients and healthcare professionals. While it is impossible to restrict patients' access to different sources of information on the Internet, healthcare professionals need to be aware of the quality of the content available across different platforms. OBJECTIVE To investigate the accuracy and completeness of Chat Generative Pretrained Transformer (ChatGPT) in addressing frequently asked questions related to the management and treatment of female urinary incontinence (UI), compared with recommendations from guidelines. METHODS This is a cross-sectional study. Two researchers developed 14 frequently asked questions related to UI, which were inserted into the ChatGPT platform on September 16, 2023. The accuracy (scores from 1 to 5) and completeness (scores from 1 to 3) of ChatGPT's answers were assessed individually by two experienced researchers in the Women's Health field, following the recommendations proposed by the guidelines for UI. RESULTS Most answers were classified as "more correct than incorrect" (n = 6), followed by "more incorrect than correct" (n = 3), "approximately equal correct and incorrect" (n = 2), "nearly all correct" (n = 2), and "correct" (n = 1). Regarding appropriateness, most answers were classified as adequate, as they provided the minimum information expected to be classified as correct. CONCLUSION These results show an inconsistency in the accuracy of answers generated by ChatGPT when compared with scientific guidelines. Almost none of the answers contained the complete content expected or reported in previous guidelines, which signals to healthcare professionals and the scientific community a concern about using artificial intelligence in patient counseling.
Affiliation(s)
- Jordana Barbosa-Silva
- Women's Health Research Laboratory, Physical Therapy Department, Federal University of São Carlos, São Carlos, Brazil
- Patricia Driusso
- Women's Health Research Laboratory, Physical Therapy Department, Federal University of São Carlos, São Carlos, Brazil
- Elizabeth A Ferreira
- Department of Obstetrics and Gynecology, FMUSP School of Medicine, University of São Paulo, São Paulo, Brazil
- Department of Physiotherapy, Speech Therapy and Occupational Therapy, School of Medicine, University of São Paulo, São Paulo, Brazil
- Raphael M de Abreu
- Department of Physiotherapy, LUNEX University, International University of Health, Exercise & Sports S.A., Differdange, Luxembourg
- LUNEX ASBL Luxembourg Health & Sport Sciences Research Institute, Differdange, Luxembourg
3
Graf EM, McKinney JA, Dye AB, Lin L, Sanchez-Ramos L. Exploring the Limits of Artificial Intelligence for Referencing Scientific Articles. Am J Perinatol 2024; 41:2072-2081. [PMID: 38653452 DOI: 10.1055/s-0044-1786033]
Abstract
OBJECTIVE To evaluate the reliability of three artificial intelligence (AI) chatbots (ChatGPT, Google Bard, and Chatsonic) in generating accurate references from the existing obstetric literature. STUDY DESIGN Between mid-March and late April 2023, ChatGPT, Google Bard, and Chatsonic were prompted to provide references for specific obstetrical randomized controlled trials (RCTs) published in 2020. RCTs were considered for inclusion if they were mentioned in a previous article that primarily evaluated RCTs published in 2020 by the medical and obstetrics and gynecology journals with the highest impact factors, as well as RCTs published in a new journal focused on publishing obstetric RCTs. The three AI models were selected on the basis of their popularity, performance in natural language processing, and public availability. Data collection involved prompting the AI chatbots to provide references according to a standardized protocol. The primary evaluation metric was the accuracy of each AI model in correctly citing references, including authors, publication title, journal name, and digital object identifier (DOI). Statistical analysis was performed using a permutation test to compare the performance of the AI models. RESULTS Among the 44 RCTs analyzed, Google Bard demonstrated the highest accuracy, correctly citing 13.6% of the requested RCTs, whereas ChatGPT and Chatsonic exhibited lower accuracy rates of 2.4% and 0%, respectively. Google Bard often substantially outperformed Chatsonic and ChatGPT in correctly citing the studied reference components. The majority of references from all AI models studied provided DOIs for unrelated studies or DOIs that do not exist. CONCLUSION To ensure the reliability of scientific information being disseminated, authors must exercise caution when using AI for scientific writing and literature searches. Nevertheless, despite their limitations, collaborative partnerships between AI systems and researchers have the potential to drive synergistic advancements, leading to improved patient care and outcomes. KEY POINTS · AI chatbots often cite scientific articles incorrectly. · AI chatbots can create false references. · Responsible AI use in research is vital.
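The abstract reports a permutation test for comparing citation accuracy across the chatbots. A minimal sketch of such a test on per-reference correct/incorrect outcomes (the outcome vectors below are hypothetical, chosen only to approximate the reported rates, and the study's exact procedure may differ):

```python
import random

def permutation_test(x, y, n_iter=10_000, seed=0):
    """Two-sided permutation test for a difference in proportions.

    x, y: lists of 0/1 outcomes (1 = reference cited correctly).
    Returns the estimated p-value for the observed difference in means.
    """
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = x + y
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # relabel outcomes at random
        px, py = pooled[:len(x)], pooled[len(x):]
        if abs(sum(px) / len(px) - sum(py) / len(py)) >= observed:
            hits += 1
    return hits / n_iter

# Hypothetical outcomes for 44 requested references per model:
bard = [1] * 6 + [0] * 38     # roughly 13.6% correct
chatgpt = [1] * 1 + [0] * 43  # roughly 2% correct
print(permutation_test(bard, chatgpt))
```

The p-value estimates how often a difference at least as large as the observed one would arise if model labels were irrelevant to citation success.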
Affiliation(s)
- Emily M Graf
- Department of Obstetrics and Gynecology, University of Florida College of Medicine, Jacksonville, Florida
- Jordan A McKinney
- Department of Obstetrics and Gynecology, University of Florida College of Medicine, Jacksonville, Florida
- Alexander B Dye
- Department of Obstetrics and Gynecology, University of Florida College of Medicine, Jacksonville, Florida
- Lifeng Lin
- Department of Epidemiology and Biostatistics, University of Arizona, Tucson, Arizona
- Luis Sanchez-Ramos
- Department of Obstetrics and Gynecology, University of Florida College of Medicine, Jacksonville, Florida
4
An J, Ferrante JM, Macenat M, Ganesan S, Hudson SV, Omene C, Garcia H, Kinney AY. Promoting informed approaches in precision oncology and clinical trial participation for Black patients with cancer: Community-engaged development and pilot testing of a digital intervention. Cancer 2024; 130 Suppl 20:3561-3577. [PMID: 37837177 PMCID: PMC11686548 DOI: 10.1002/cncr.35049]
Abstract
BACKGROUND Black patients with cancer are less likely to receive precision cancer treatments than White patients and are underrepresented in clinical trials. To address these disparities, the study aimed to develop and pilot-test a digital intervention to improve Black patients' knowledge about precision oncology and clinical trials, empower patients to increase relevant discussion, and promote informed decision-making. METHODS A community-engaged approach, including a Community Advisory Board and two rounds of key informant interviews with Black patients with cancer, their relatives, and providers (n = 48) was used to develop and refine the multimedia digital intervention. Thematic analysis was conducted for qualitative data. The intervention was then pilot-tested with 30 Black patients with cancer to assess feasibility, acceptability, appropriateness, knowledge, decision self-efficacy, and patient empowerment; Wilcoxon matched pairs signed-rank test was used to analyze quantitative data. RESULTS The digital tool was found to be feasible, acceptable, and culturally appropriate. Key informants shared their preferences and recommendations for the digital intervention and helped improve cultural appropriateness through user and usability testing. In the pilot test, appreciable improvement was found in participants' knowledge about precision oncology (z = -2.04, p = .052), knowledge about clinical trials (z = -3.14, p = .001), and decisional self-efficacy for targeted/immune therapy (z = -1.96, p = .0495). CONCLUSIONS The digital intervention could be a promising interactive decision-support tool for increasing Black patients' participation in clinical trials and receipt of precision treatments, including immunotherapy. Its use in clinical practice may reduce disparities in oncology care and research. 
PLAIN LANGUAGE SUMMARY We developed a digital interactive decision-support tool for Black patients with cancer by convening a Community Advisory Board and conducting interviews with Black patients with cancer, their relatives, and providers. We then pilot-tested the intervention with newly diagnosed Black patients with cancer and found appreciable improvement in participants' knowledge about precision oncology, knowledge about clinical trials, and confidence in making decisions about targeted/immune therapy. Our digital tool has great potential to be an affordable and scalable solution for empowering and educating Black patients with cancer, helping them make informed decisions about precision oncology and clinical trials and ultimately reducing racial disparities.
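Paired pre/post changes of the kind reported above are commonly tested with the Wilcoxon matched-pairs signed-rank test. A teaching sketch of its normal-approximation z statistic follows; it assumes no tied absolute differences, and the scores shown are invented for illustration. Real analyses should use a statistics library such as scipy.stats.wilcoxon.

```python
import math

def wilcoxon_z(pre, post):
    """Normal-approximation z statistic for the Wilcoxon matched-pairs
    signed-rank test. Zero differences are dropped, as in the standard
    procedure; tied absolute differences are not rank-averaged here."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    # rank absolute differences from smallest (rank 1) to largest (rank n)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    w_plus = sum(rank + 1 for rank, i in enumerate(order) if diffs[i] > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd

# Hypothetical pre/post knowledge scores for five participants:
pre = [4, 5, 3, 6, 4]
post = [6, 8, 4, 10, 9]
print(round(wilcoxon_z(pre, post), 2))  # all differences positive → 2.02
```

A large positive z (compared against the standard normal) indicates a consistent post-intervention increase, matching the direction of the study's reported effects.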
Affiliation(s)
- Jinghua An
- Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- Jeanne M. Ferrante
- Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- Rutgers Robert Wood Johnson Medical School, The State University of New Jersey, New Brunswick, New Jersey, USA
- Myneka Macenat
- Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- Shridar Ganesan
- Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- Shawna V. Hudson
- Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- Rutgers Robert Wood Johnson Medical School, The State University of New Jersey, New Brunswick, New Jersey, USA
- Coral Omene
- Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- Rutgers Robert Wood Johnson Medical School, The State University of New Jersey, New Brunswick, New Jersey, USA
- Harold Garcia
- Lawrence Herbert School of Communication, Hofstra University, Hempstead, New York, USA
- Anita Y. Kinney
- Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- School of Public Health, The State University of New Jersey, New Brunswick, New Jersey, USA
5
Ding N, Yuan Z, Ma Z, Wu Y, Yin L. AI-Assisted Rational Design and Activity Prediction of Biological Elements for Optimizing Transcription-Factor-Based Biosensors. Molecules 2024; 29:3512. [PMID: 39124917 PMCID: PMC11313831 DOI: 10.3390/molecules29153512]
Abstract
The rational design, activity prediction, and adaptive application of biological elements (bio-elements) are crucial research fields in synthetic biology. Currently, a major challenge in the field is efficiently designing desired bio-elements and accurately predicting their activity using vast datasets. The advancement of artificial intelligence (AI) technology has enabled machine learning and deep learning algorithms to excel in uncovering patterns in bio-element data and predicting their performance. This review explores the application of AI algorithms in the rational design of bio-elements, activity prediction, and the regulation of transcription-factor-based biosensor response performance using AI-designed elements. We discuss the advantages, adaptability, and biological challenges addressed by the AI algorithms in various applications, highlighting their powerful potential in analyzing biological data. Furthermore, we propose innovative solutions to the challenges faced by AI algorithms in the field and suggest future research directions. By consolidating current research and demonstrating the practical applications and future potential of AI in synthetic biology, this review provides valuable insights for advancing both academic research and practical applications in biotechnology.
Affiliation(s)
- Nana Ding
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Hangzhou 311300, China
- Zhejiang Provincial Key Laboratory of Resources Protection and Innovation of Traditional Chinese Medicine, Zhejiang A&F University, Hangzhou 311300, China
- Zenan Yuan
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Hangzhou 311300, China
- Zhejiang Provincial Key Laboratory of Resources Protection and Innovation of Traditional Chinese Medicine, Zhejiang A&F University, Hangzhou 311300, China
- Zheng Ma
- Zhejiang Provincial Key Laboratory of Biometrology and Inspection & Quarantine, College of Life Sciences, China Jiliang University, Hangzhou 310018, China
- Yefei Wu
- Zhejiang Qianjiang Biochemical Co., Ltd., Haining 314400, China
- Lianghong Yin
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Hangzhou 311300, China
- Zhejiang Provincial Key Laboratory of Resources Protection and Innovation of Traditional Chinese Medicine, Zhejiang A&F University, Hangzhou 311300, China
6
Zhang S, Song J. A chatbot based question and answer system for the auxiliary diagnosis of chronic diseases based on large language model. Sci Rep 2024; 14:17118. [PMID: 39054346 PMCID: PMC11272932 DOI: 10.1038/s41598-024-67429-4]
Abstract
In recent years, artificial intelligence has made remarkable strides, improving various aspects of our daily lives. One notable application is in intelligent chatbots that use deep learning models. These systems have shown tremendous promise in the medical sector, enhancing healthcare quality, treatment efficiency, and cost-effectiveness. However, their role in aiding disease diagnosis, particularly of chronic conditions, remains underexplored. Addressing this issue, this study employs large language models from the GPT series, in conjunction with deep learning techniques, to design and develop a diagnostic system targeted at chronic diseases. Specifically, we performed transfer learning and fine-tuning on the GPT-2 model, enabling it to assist in accurately diagnosing 24 common chronic diseases. To provide a user-friendly interface and a seamless interactive experience, we further developed a dialog-based interface, named Chat Ella. This system can make precise predictions for chronic diseases based on the symptoms described by users. Experimental results indicate that our model achieved an accuracy rate of 97.50% on the validation set, with an area under the curve (AUC) value reaching 99.91%. Moreover, we conducted user satisfaction tests, which revealed that 68.7% of participants approved of Chat Ella, and 45.3% of participants found that the system made daily medical consultations more convenient. The system can rapidly and accurately assess a patient's condition based on the symptoms described and provide timely feedback, making it of significant value in the design of medical auxiliary products for household use.
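The reported AUC can be read through its rank (Mann-Whitney) formulation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A small self-contained sketch of that computation (the labels and scores below are invented for illustration, not the study's data):

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney formulation.
    labels: 0/1 ground truth; scores: model confidence for class 1.
    Ties between a positive and a negative score count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical validation labels and model scores:
labels = [1, 1, 1, 0, 0, 0]
scores = [0.95, 0.90, 0.40, 0.50, 0.20, 0.10]
print(auc(labels, scores))  # 8 of 9 positive/negative pairs ranked correctly
```

An AUC of 99.91%, as reported, would mean the model ranks a positive case above a negative one in nearly every such pair.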
Affiliation(s)
- Sainan Zhang
- Graduate School of Communication Design, Hanyang University, ERICA Campus, Ansan, 15588, Republic of Korea
- Jisung Song
- Graduate School of Communication Design, Hanyang University, ERICA Campus, Ansan, 15588, Republic of Korea
7
Aden D, Zaheer S, Khan S. Possible benefits, challenges, pitfalls, and future perspective of using ChatGPT in pathology. Rev Esp Patol 2024; 57:198-210. [PMID: 38971620 DOI: 10.1016/j.patol.2024.04.003]
Abstract
The much-hyped artificial intelligence (AI) model called ChatGPT, developed by OpenAI, can offer great benefits to physicians, especially pathologists, by saving time that can then be devoted to more significant work. Generative AI is a special class of AI model that uses patterns and structures learned from existing data to create new data. Utilizing ChatGPT in pathology offers a multitude of benefits, encompassing the summarization of patient records, promising prospects in digital pathology, and valuable contributions to education and research in the field. However, certain roadblocks need to be addressed, such as integrating ChatGPT with image analysis, which could revolutionize the field of pathology by increasing diagnostic accuracy and precision. The challenges of using ChatGPT include biases from its training data, the need for ample input data, potential risks related to bias and transparency, and the potential adverse outcomes arising from inaccurate content generation. Meaningful insights must be generated from textual information while efficiently processing different types of image data, such as medical images and pathology slides. Due consideration should also be given to ethical and legal issues, including bias.
Affiliation(s)
- Durre Aden
- Department of Pathology, Hamdard Institute of Medical Sciences and Research, Jamia Hamdard, New Delhi, India
- Sufian Zaheer
- Department of Pathology, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi, India
- Sabina Khan
- Department of Pathology, Hamdard Institute of Medical Sciences and Research, Jamia Hamdard, New Delhi, India
8
Abi-Rafeh J, Cattelan L, Xu HH, Bassiri-Tehrani B, Kazan R, Nahai F. Artificial Intelligence-Generated Social Media Content Creation and Management Strategies for Plastic Surgeons. Aesthet Surg J 2024; 44:769-778. [PMID: 38366026 DOI: 10.1093/asj/sjae036]
Abstract
BACKGROUND Social media platforms have come to represent integral components of the professional marketing and advertising strategy for plastic surgeons. Effective and consistent content development, however, remains technically demanding and time consuming, prompting most to employ, at non-negligible costs, social media marketing specialists for content planning and development. OBJECTIVES In the present study, we aimed to investigate the ability of presently available artificial intelligence (AI) models to assist plastic surgeons in their social media content development and sharing plans. METHODS An AI large language model was prompted on the study's objectives through a series of standardized user interactions. Social media platforms of interest, on which the AI model was prompted, included Instagram, TikTok, and X (formerly Twitter). RESULTS A 1-year, entirely AI-generated social media plan, comprising a total of 1091 posts for the 3 aforementioned social media platforms, is presented. Themes of the AI-generated content proposed for each platform were classified in 6 categories, including patient-related, practice-related, educational, "uplifting," interactive, and promotional posts. Overall, 91 publicly recognized holidays and observant and awareness days were incorporated into the content calendars. The AI model demonstrated an ability to differentiate between the distinct formats of each of the 3 social media platforms investigated, generating unique ideas for each, and providing detailed content development and posting instructions, scripts, and post captions, leveraging features specific to each platform. CONCLUSIONS By providing detailed and actionable social media content creation and posting plans to plastic surgeons, presently available AI models can be readily leveraged to assist in and significantly alleviate the burden associated with social media account management, content generation, and potentially patient conversion.
9
Mijwil MM, Naji AS, Doshi R, Hiran KK, Bala I, Guma A. The Effect of Human-Computer Interaction on New Applications by Exploring the Use Case of ChatGPT in Healthcare Services. Advances in Medical Education, Research, and Ethics 2024:74-87. [DOI: 10.4018/979-8-3693-5493-3.ch005]
Abstract
Human-computer interaction (HCI) is a domain that focuses on improving the interaction between humans and computer systems by designing and developing user interfaces that are efficient and pleasant to use. In this chapter, the authors focus on the importance of deep human-computer interaction in new applications, with an emphasis on using ChatGPT in the health services domain. The chapter details the value of deploying ChatGPT in various health-related scenarios while highlighting the importance of HCI in enhancing user interactions, such as personalized medical advice, in a ChatGPT application. It concludes that the capabilities of ChatGPT and artificial intelligence applications can revolutionize the healthcare industry by enhancing the accessibility and effectiveness of new media communication between users and applications while creating innovative solutions to improve healthcare services.
Affiliation(s)
- Indu Bala
- Lovely Professional University, India
10
Marchi F, Bellini E, Iandelli A, Sampieri C, Peretti G. Exploring the landscape of AI-assisted decision-making in head and neck cancer treatment: a comparative analysis of NCCN guidelines and ChatGPT responses. Eur Arch Otorhinolaryngol 2024; 281:2123-2136. [PMID: 38421392 DOI: 10.1007/s00405-024-08525-z]
Abstract
PURPOSE Recent breakthroughs in natural language processing and machine learning, exemplified by ChatGPT, have spurred a paradigm shift in healthcare. Released by OpenAI in November 2022, ChatGPT rapidly gained global attention. Trained on massive text datasets, this large language model holds immense potential to revolutionize healthcare. However, existing literature often overlooks the need for rigorous validation and real-world applicability. METHODS This head-to-head comparative study assesses ChatGPT's capabilities in providing therapeutic recommendations for head and neck cancers, simulating every scenario in the NCCN Guidelines. ChatGPT was queried on primary treatment, adjuvant treatment, and follow-up, with responses compared against the NCCN Guidelines. Performance metrics, including sensitivity, specificity, and F1 score, were employed for assessment. RESULTS The study includes 68 hypothetical cases and 204 clinical scenarios. ChatGPT exhibits promising capabilities in addressing NCCN-related queries, achieving high sensitivity and overall accuracy across primary treatment, adjuvant treatment, and follow-up. The study's metrics demonstrate robustness in providing relevant suggestions. However, a few inaccuracies were noted, especially in primary treatment scenarios. CONCLUSION Our study highlights the proficiency of ChatGPT in providing treatment suggestions. The model's alignment with the NCCN Guidelines sets the stage for a nuanced exploration of AI's evolving role in oncological decision support. However, challenges related to the interpretability of AI in clinical decision-making, and the importance of clinicians understanding the underlying principles of AI models, remain unexplored. As AI continues to advance, collaborative efforts between models and medical experts will be essential for unlocking new frontiers in personalized cancer care.
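The performance metrics named above derive directly from the binary confusion matrix. A quick reference sketch (the counts in the example are hypothetical, not the study's):

```python
def metrics(tp, fp, fn, tn):
    """Sensitivity (recall), specificity, precision, and F1 score from
    the four cells of a binary confusion matrix."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, precision, f1

# Hypothetical counts for one treatment-recommendation task:
sens, spec, prec, f1 = metrics(tp=90, fp=10, fn=5, tn=95)
print(round(sens, 3), round(spec, 3), round(f1, 3))  # 0.947 0.905 0.923
```

F1 is the harmonic mean of precision and sensitivity, which is why a model can score high sensitivity yet a lower F1 when false positives accumulate.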
Affiliation(s)
- Filippo Marchi
- Unit of Otorhinolaryngology-Head and Neck Surgery, IRCCS Ospedale Policlinico San Martino, Largo Rosanna Benzi, 10, 16132, Genoa, Italy
- Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, 16132, Genoa, Italy
- Elisa Bellini
- Unit of Otorhinolaryngology-Head and Neck Surgery, IRCCS Ospedale Policlinico San Martino, Largo Rosanna Benzi, 10, 16132, Genoa, Italy
- Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, 16132, Genoa, Italy
- Andrea Iandelli
- Unit of Otorhinolaryngology-Head and Neck Surgery, IRCCS Ospedale Policlinico San Martino, Largo Rosanna Benzi, 10, 16132, Genoa, Italy
- Claudio Sampieri
- Department of Experimental Medicine (DIMES), University of Genoa, Genoa, Italy
- Department of Otolaryngology-Hospital Cliníc, Barcelona, Spain
- Functional Unit of Head and Neck Tumors-Hospital Cliníc, Barcelona, Spain
- Giorgio Peretti
- Unit of Otorhinolaryngology-Head and Neck Surgery, IRCCS Ospedale Policlinico San Martino, Largo Rosanna Benzi, 10, 16132, Genoa, Italy
- Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, 16132, Genoa, Italy
11
Horgan R, Martins JG, Saade G, Abuhamad A, Kawakita T. ChatGPT in maternal-fetal medicine practice: a primer for clinicians. Am J Obstet Gynecol MFM 2024; 6:101302. [PMID: 38281582 DOI: 10.1016/j.ajogmf.2024.101302]
Abstract
ChatGPT (Generative Pre-trained Transformer), a language model that was developed by OpenAI and launched in November 2022, generates human-like responses to prompts using deep-learning technology. The integration of large language processing models into healthcare has the potential to improve the accessibility of medical information for patients and health professionals alike. In this commentary, we demonstrated the ability of ChatGPT to produce patient information sheets. Four board-certified, maternal-fetal medicine attending physicians rated the information according to 2 predefined scales of accuracy and completeness. The median score for accuracy of information was 4.8 on a 6-point scale and the median score for completeness of information was 2.2 on a 3-point scale for the 5 patient information leaflets generated by ChatGPT. Concerns raised included the omission of clinically important information for patient counseling in some patient information leaflets and the inability to verify the source of information, because ChatGPT does not provide references. ChatGPT is a powerful tool that has the potential to enhance patient care, but it requires extensive validation and is perhaps best considered an adjunct to clinical practice rather than a tool to be used freely by the public for healthcare information.
Affiliation(s)
- Rebecca Horgan
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA
- Juliana G Martins
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA
- George Saade
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA
- Alfred Abuhamad
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA
- Tetsuya Kawakita
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA
12
Lee Y, Kim SY. Potential applications of ChatGPT in obstetrics and gynecology in Korea: a review article. Obstet Gynecol Sci 2024; 67:153-159. [PMID: 38247132] [PMCID: PMC10948210] [DOI: 10.5468/ogs.23231]
Abstract
Chatbot technology, particularly the chat generative pre-trained transformer (ChatGPT) with an impressive 175 billion parameters, has garnered significant attention across various domains, including obstetrics and gynecology (OBGYN). This comprehensive review delves into the transformative potential of chatbots, with a special focus on ChatGPT as a leading artificial intelligence (AI) technology. ChatGPT harnesses the power of deep-learning algorithms to generate responses that closely mimic human language, opening up myriad applications in medicine, research, and education. In the field of medicine, ChatGPT plays a pivotal role in diagnosis, treatment, and personalized patient education. Notably, the technology has demonstrated remarkable capabilities, surpassing human performance on OBGYN examinations and delivering highly accurate diagnoses. However, challenges remain, including the need to verify the accuracy of generated information and to address ethical considerations and limitations. Across the wider scope of chatbot technology, AI systems play a vital role in healthcare processes, including documentation, diagnosis, research, and education; although promising, their limitations and occasional inaccuracies require validation by healthcare professionals. This review also examined global chatbot adoption in healthcare, emphasizing the need for user awareness to ensure patient safety. Chatbot technology holds great promise in OBGYN and medicine, offering innovative solutions while necessitating responsible integration to ensure patient care and safety.
Affiliation(s)
- YooKyung Lee
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynecology, MizMedi Hospital, Seoul, Korea
- So Yun Kim
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynecology, MizMedi Hospital, Seoul, Korea
13
Abi-Rafeh J, Xu HH, Kazan R, Tevlin R, Furnas H. Large Language Models and Artificial Intelligence: A Primer for Plastic Surgeons on the Demonstrated and Potential Applications, Promises, and Limitations of ChatGPT. Aesthet Surg J 2024; 44:329-343. [PMID: 37562022] [DOI: 10.1093/asj/sjad260]
Abstract
BACKGROUND The rapidly evolving field of artificial intelligence (AI) holds great potential for plastic surgeons. ChatGPT, a recently released AI large language model (LLM), promises applications across many disciplines, including healthcare. OBJECTIVES The aim of this article was to provide a primer for plastic surgeons on AI, LLMs, and ChatGPT, including an analysis of current demonstrated and proposed clinical applications. METHODS A systematic review was performed to identify medical and surgical literature on ChatGPT's proposed clinical applications. Variables assessed included applications investigated, command tasks provided, user input information, AI-emulated human skills, output validation, and reported limitations. RESULTS The analysis included 175 articles reporting on 13 plastic surgery applications and 116 additional clinical applications, categorized by field and purpose. Thirty-four applications within plastic surgery are thus proposed, with relevance to different target audiences, including attending plastic surgeons (n = 17, 50%), trainees/educators (n = 8, 24%), researchers/scholars (n = 7, 21%), and patients (n = 2, 6%). The 15 identified limitations of ChatGPT were categorized by training data, algorithm, and ethical considerations. CONCLUSIONS Widespread use of ChatGPT in plastic surgery will depend on rigorous research of proposed applications to validate performance and address limitations. This systematic review aims to guide research, development, and regulation to safely adopt AI in plastic surgery.
14
Kavadella A, Dias da Silva MA, Kaklamanos EG, Stamatopoulos V, Giannakopoulos K. Evaluation of ChatGPT's Real-Life Implementation in Undergraduate Dental Education: Mixed Methods Study. JMIR Med Educ 2024; 10:e51344. [PMID: 38111256] [PMCID: PMC10867750] [DOI: 10.2196/51344]
Abstract
BACKGROUND The recent artificial intelligence tool ChatGPT seems to offer a range of benefits in academic education while also raising concerns. Relevant literature encompasses issues of plagiarism and academic dishonesty, as well as pedagogy and educational affordances; yet, to our knowledge, no real-life implementation of ChatGPT in the educational process has been reported. OBJECTIVE This mixed methods study aimed to evaluate the implementation of ChatGPT in the educational process, both quantitatively and qualitatively. METHODS In March 2023, a total of 77 second-year dental students of the European University Cyprus were divided into 2 groups and asked to compose a learning assignment on "Radiation Biology and Radiation Protection in the Dental Office," working collaboratively in small subgroups, as part of the educational semester program of the Dentomaxillofacial Radiology module. Careful planning ensured a seamless integration of ChatGPT, addressing potential challenges. One group searched the internet for scientific resources to perform the task, and the other group used ChatGPT for this purpose. Both groups developed a PowerPoint (Microsoft Corp) presentation based on their research and presented it in class. The ChatGPT group additionally recorded all interactions with the language model during the prompting process and evaluated the final outcome; they also answered an open-ended evaluation questionnaire, including questions on their learning experience. Finally, all students undertook a knowledge examination on the topic; the grades of the 2 groups were compared statistically, and the free-text comments of the questionnaires were thematically analyzed. RESULTS Of the 77 students, 39 were assigned to the ChatGPT group and 38 to the literature research group. Seventy students undertook the multiple-choice knowledge examination, with grades ranging from 5 to 10 on the 0-10 grading scale. The Mann-Whitney U test showed that students in the ChatGPT group performed significantly better (P=.045) than students in the literature research group. The evaluation questionnaires revealed the benefits (human-like interface, immediate response, and wide knowledge base), the limitations (need to rephrase prompts to get a relevant answer, generic content, false citations, and inability to provide images or videos), and the prospects (in education, clinical practice, continuing education, and research) of ChatGPT. CONCLUSIONS Students using ChatGPT for their learning assignment performed significantly better in the knowledge examination than their fellow students who used the literature research methodology. Students adapted quickly to the technological environment of the language model, recognized its opportunities and limitations, and used it creatively and efficiently. Implications for practice: the study underscores the adaptability of students to technological innovations, including ChatGPT, and its potential to enhance educational outcomes. Educators should consider integrating ChatGPT into curriculum design; awareness programs are warranted to educate both students and educators about the limitations of ChatGPT, encouraging critical engagement and responsible use.
Affiliation(s)
- Argyro Kavadella
- School of Dentistry, European University Cyprus, Nicosia, Cyprus
- Marco Antonio Dias da Silva
- Research Group of Teleducation and Teledentistry, Federal University of Campina Grande, Campina Grande, Brazil
- Eleftherios G Kaklamanos
- School of Dentistry, European University Cyprus, Nicosia, Cyprus
- School of Dentistry, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, United Arab Emirates
- Vasileios Stamatopoulos
- Information Management Systems Institute, ATHENA Research and Innovation Center, Athens, Greece
15
Preiksaitis C, Rose C. Opportunities, Challenges, and Future Directions of Generative Artificial Intelligence in Medical Education: Scoping Review. JMIR Med Educ 2023; 9:e48785. [PMID: 37862079] [PMCID: PMC10625095] [DOI: 10.2196/48785]
Abstract
BACKGROUND Generative artificial intelligence (AI) technologies are increasingly being used across various fields, with considerable interest and concern regarding their potential application in medical education. These technologies, such as ChatGPT and Bard, can generate new content and have a wide range of possible applications. OBJECTIVE This study aimed to synthesize the potential opportunities and limitations of generative AI in medical education. It sought to identify prevalent themes within recent literature regarding potential applications and challenges of generative AI in medical education and use these to guide future areas for exploration. METHODS We conducted a scoping review, following the framework by Arksey and O'Malley, of English-language articles published from 2022 onward that discussed generative AI in the context of medical education. A literature search was performed using the PubMed, Web of Science, and Google Scholar databases. We screened articles for inclusion, extracted data from relevant studies, and completed a quantitative and qualitative synthesis of the data. RESULTS Thematic analysis revealed diverse potential applications for generative AI in medical education, including self-directed learning, simulation scenarios, and writing assistance. However, the literature also highlighted significant challenges, such as issues with academic integrity, data accuracy, and potential detriments to learning. Based on these themes and the current state of the literature, we propose 3 key areas for investigation: developing learners' skills to evaluate AI critically, rethinking assessment methodology, and studying human-AI interactions. CONCLUSIONS The integration of generative AI in medical education presents exciting opportunities alongside considerable challenges. There is a need to develop new skills and competencies related to AI, as well as thoughtful, nuanced approaches to examine the growing use of generative AI in medical education.
Affiliation(s)
- Carl Preiksaitis
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Christian Rose
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
16
Suppadungsuk S, Thongprayoon C, Miao J, Krisanapan P, Qureshi F, Kashani K, Cheungpasitporn W. Exploring the Potential of Chatbots in Critical Care Nephrology. Medicines (Basel) 2023; 10:58. [PMID: 37887265] [PMCID: PMC10608511] [DOI: 10.3390/medicines10100058]
Abstract
The exponential growth of artificial intelligence (AI) has allowed for its integration into multiple sectors, including, notably, healthcare. Chatbots have emerged as a pivotal resource for improving patient outcomes and assisting healthcare practitioners through various AI-based technologies. In critical care, kidney-related conditions play a significant role in determining patient outcomes. This article examines the potential for integrating chatbots into the workflows of critical care nephrology to optimize patient care. We detail their specific applications in critical care nephrology, such as managing acute kidney injury, alert systems, and continuous renal replacement therapy (CRRT); facilitating discussions around palliative care; and bolstering collaboration within a multidisciplinary team. Chatbots have the potential to augment real-time data availability, evaluate renal health, identify potential risk factors, build predictive models, and monitor patient progress. Moreover, they provide a platform for enhancing communication and education for both patients and healthcare providers, paving the way for enriched knowledge and honed professional skills. However, it is vital to recognize the inherent challenges and limitations when using chatbots in this domain. Here, we provide an in-depth exploration of the concerns tied to chatbots' accuracy, dependability, data protection and security, transparency, potential algorithmic biases, and ethical implications in critical care nephrology. While human discernment and intervention are indispensable, especially in complex medical scenarios or intricate situations, the sustained advancements in AI signal that the integration of precision-engineered chatbot algorithms within critical care nephrology has considerable potential to elevate patient care and pivotal outcome metrics in the future.
Affiliation(s)
- Supawadee Suppadungsuk
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan 10540, Thailand
- Charat Thongprayoon
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Jing Miao
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Pajaree Krisanapan
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Division of Nephrology and Hypertension, Thammasat University Hospital, Pathum Thani 12120, Thailand
- Fawad Qureshi
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Kianoush Kashani
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Wisit Cheungpasitporn
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
17
Chavez MR. ChatGPT: the good, the bad, and the potential. Am J Obstet Gynecol 2023; 229:357. [PMID: 37031760] [DOI: 10.1016/j.ajog.2023.04.005]
Affiliation(s)
- Martin R Chavez
- Director of Maternal Fetal Medicine and Fetal Surgery Program, Division of Maternal Fetal Medicine, Department of Obstetrics and Gynecology, NYU Langone Health, NYU Langone Hospital-Long Island, Mineola, NY; Professor of Obstetrics, Gynecology and Reproductive Medicine, NYU Long Island School of Medicine, 259 First Street, Mineola, NY 11501
18
Sanchez-Ramos L, Lin L, Romero R. Beware of references when using ChatGPT as a source of information to write scientific articles. Am J Obstet Gynecol 2023; 229:356-357. [PMID: 37031761] [PMCID: PMC10524915] [DOI: 10.1016/j.ajog.2023.04.004]
Affiliation(s)
- Luis Sanchez-Ramos
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, University of Florida College of Medicine, 653 West 8th St, Jacksonville, FL 32209
- Lifeng Lin
- Department of Epidemiology and Biostatistics, The University of Arizona, Tucson, AZ
- Roberto Romero
- Pregnancy Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, MD and Detroit, MI; Department of Obstetrics and Gynecology, University of Michigan, Ann Arbor, MI; Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI
19
Suppadungsuk S, Thongprayoon C, Krisanapan P, Tangpanithandee S, Garcia Valencia O, Miao J, Mekraksakit P, Kashani K, Cheungpasitporn W. Examining the Validity of ChatGPT in Identifying Relevant Nephrology Literature: Findings and Implications. J Clin Med 2023; 12:5550. [PMID: 37685617] [PMCID: PMC10488525] [DOI: 10.3390/jcm12175550]
Abstract
Literature reviews are valuable for summarizing and evaluating the available evidence in various medical fields, including nephrology. However, identifying and exploring potential sources requires focus and time devoted to literature searching by clinicians and researchers. ChatGPT is a novel artificial intelligence (AI) large language model (LLM) renowned for its exceptional ability to generate human-like responses across various tasks. However, whether ChatGPT can effectively assist medical professionals in identifying relevant literature is unclear. Therefore, this study aimed to assess the effectiveness of ChatGPT in identifying references for literature reviews in nephrology. We keyed the prompt "Please provide the references in Vancouver style and their links in recent literature on… name of the topic" into ChatGPT-3.5 (03/23 version). We selected all the results provided by ChatGPT and assessed them for existence, relevance, and author/link correctness. We recorded each resource's citations, authors, title, journal name, publication year, digital object identifier (DOI), and link. The relevance and correctness of each resource were verified by searching on Google Scholar. Of the 610 references in the nephrology literature, only 378 (62%) of the references provided by ChatGPT existed, while 31% were fabricated and 7% were incomplete. Notably, only 122 (20%) of the references were authentic. Additionally, 256 (68%) of the links in the references were incorrect, and the DOI was inaccurate in 206 (54%) of the references. Moreover, among those with a link provided, the link was correct in only 20% of cases, and 3% of the references were irrelevant. Notably, an analysis of specific topics in electrolytes, hemodialysis, and kidney stones found that >60% of the references were inaccurate or misleading, with less reliable authorship and links provided by ChatGPT. Based on our findings, the use of ChatGPT as a sole resource for identifying references for literature reviews in nephrology is not recommended. Future studies could explore ways to improve AI language models' performance in identifying relevant nephrology literature.
Affiliation(s)
- Supawadee Suppadungsuk
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan 10540, Thailand
- Charat Thongprayoon
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Pajaree Krisanapan
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Division of Nephrology, Thammasat University Hospital, Pathum Thani 12120, Thailand
- Supawit Tangpanithandee
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan 10540, Thailand
- Oscar Garcia Valencia
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Jing Miao
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Poemlarp Mekraksakit
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Kianoush Kashani
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Wisit Cheungpasitporn
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA
20
Vintzileos AM, Chavez MR, Romero R. A role for artificial intelligence chatbots in the writing of scientific articles. Am J Obstet Gynecol 2023; 229:89-90. [PMID: 37117103] [PMCID: PMC10524709] [DOI: 10.1016/j.ajog.2023.03.040]
Affiliation(s)
- Anthony M Vintzileos
- Department of Obstetrics and Gynecology, Lenox Hill Hospital Northwell Health, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New York, NY
- Martin R Chavez
- Department of Obstetrics and Gynecology, NYU Langone Hospital-Long Island, NYU Long Island School of Medicine, Mineola, NY
- Roberto Romero
- Pregnancy Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Bethesda, MD and Detroit, MI; Department of Obstetrics and Gynecology, University of Michigan, Ann Arbor, MI; Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI
21
Suhag A, Kidd J, McGath M, Rajesh R, Gelfinbein J, Cacace N, Monteleone B, Chavez MR. ChatGPT: a pioneering approach to complex prenatal differential diagnosis. Am J Obstet Gynecol MFM 2023; 5:101029. [PMID: 37257586] [DOI: 10.1016/j.ajogmf.2023.101029]
Abstract
This commentary examines how ChatGPT can assist healthcare teams in the prenatal diagnosis of rare and complex cases by creating a differential diagnosis based on deidentified clinical findings, while also acknowledging its limitations.
Affiliation(s)
- Anju Suhag
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, NYU Langone Health, NYU Langone Hospital-Long Island, NYU Long Island School of Medicine, Mineola, NY
- Jennifer Kidd
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, NYU Langone Health, NYU Langone Hospital-Long Island, NYU Long Island School of Medicine, Mineola, NY
- Meghan McGath
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, NYU Langone Health, NYU Langone Hospital-Long Island, NYU Long Island School of Medicine, Mineola, NY; Department of Clinical Genetics, NYU Langone Hospital-Long Island, Mineola, NY
- Raeshmma Rajesh
- Department of Obstetrics and Gynecology, Richmond University Medical Center, Staten Island, NY
- Nicole Cacace
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, NYU Langone Health, NYU Langone Hospital-Long Island, NYU Long Island School of Medicine, Mineola, NY; Department of Clinical Genetics, NYU Langone Hospital-Long Island, Mineola, NY
- Berrin Monteleone
- Department of Clinical Genetics, NYU Langone Hospital-Long Island, Mineola, NY
- Martin R Chavez
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, NYU Langone Health, NYU Langone Hospital-Long Island, NYU Long Island School of Medicine, Mineola, NY
22
Allahqoli L, Ghiasvand MM, Mazidimoradi A, Salehiniya H, Alkatout I. Diagnostic and Management Performance of ChatGPT in Obstetrics and Gynecology. Gynecol Obstet Invest 2023; 88:310-313. [PMID: 37494894] [DOI: 10.1159/000533177]
Abstract
OBJECTIVES The use of artificial intelligence (AI) in clinical patient management and medical education has been advancing over time. ChatGPT was developed and trained recently using a large quantity of textual data from the internet, and medical science is expected to be transformed by its use. The present study was conducted to evaluate the diagnostic and management performance of the ChatGPT AI model in obstetrics and gynecology. DESIGN A cross-sectional study was conducted. PARTICIPANTS/MATERIALS, SETTING, METHODS This study was conducted in Iran in March 2023. Medical histories and examination results for 30 cases were determined in six areas of obstetrics and gynecology. The cases were presented to a gynecologist and to ChatGPT for diagnosis and management. Answers from the gynecologist and ChatGPT were compared, and the diagnostic and management performance of ChatGPT was determined. RESULTS Ninety percent (27 of 30) of the cases in obstetrics and gynecology were correctly handled by ChatGPT. Its responses were eloquent, informed, and free of a significant number of errors or misinformation. Even when the answers provided by ChatGPT were incorrect, the responses contained a logical explanation of the case as well as information provided in the question stem. LIMITATIONS The data used in this study were taken from an electronic book and may reflect bias in the diagnoses made by ChatGPT. CONCLUSIONS This is the first evaluation of ChatGPT's performance in diagnosis and management in the field of obstetrics and gynecology. It appears that ChatGPT has potential applications in the practice of medicine and is (currently) free and simple to use. However, several ethical considerations and limitations, such as bias, validity, copyright infringement, and plagiarism, need to be addressed in future studies.
Affiliation(s)
- Leila Allahqoli
- Midwifery Department, Ministry of Health and Medical Education, Tehran, Iran
- Afrooz Mazidimoradi
- Student Research Committee, Shiraz University of Medical Sciences, Shiraz, Iran
- Hamid Salehiniya
- Social Determinants of Health Research Center, Birjand University of Medical Sciences, Birjand, Iran
- Ibrahim Alkatout
- Campus Kiel, Kiel School of Gynaecological Endoscopy, University Hospitals Schleswig-Holstein, Kiel, Germany