1
Chotcomwongse P, Ruamviboonsuk P, Grzybowski A. Utilizing Large Language Models in Ophthalmology: The Current Landscape and Challenges. Ophthalmol Ther 2024; 13:2543-2558. [PMID: 39180701 PMCID: PMC11408418 DOI: 10.1007/s40123-024-01018-6]
Abstract
A large language model (LLM) is an artificial intelligence (AI) model that uses natural language processing (NLP) to understand, interpret, and generate human-like language responses from unstructured text input. Its real-time response capabilities and fluent dialogue enhance the interactive experience of human-AI communication as never before. By drawing on numerous internet sources, LLM chatbots can respond to a wide range of queries, including problem solving, text summarization, and the creation of informative notes. Since ophthalmology is one of the medical fields integrating image analysis, telemedicine, AI, and other technologies, LLMs are likely to play an important role in eye care in the near future. This review summarizes the performance and potential applicability of LLMs in ophthalmology according to currently available publications.
Affiliation(s)
- Peranut Chotcomwongse
- Vitreoretina Unit, Department of Ophthalmology, Rajavithi Hospital, Rungsit University, Bangkok, Thailand
- Paisan Ruamviboonsuk
- Vitreoretina Unit, Department of Ophthalmology, Rajavithi Hospital, Rungsit University, Bangkok, Thailand
- Andrzej Grzybowski
- University of Warmia and Mazury, Olsztyn, Poland
- Institute for Research in Ophthalmology, Foundation for Ophthalmology Development, 61-553, Poznan, Poland
2
AlSaad R, Abd-Alrazaq A, Boughorbel S, Ahmed A, Renault MA, Damseh R, Sheikh J. Multimodal Large Language Models in Health Care: Applications, Challenges, and Future Outlook. J Med Internet Res 2024; 26:e59505. [PMID: 39321458 DOI: 10.2196/59505]
Abstract
In the complex and multidimensional field of medicine, multimodal data are prevalent and crucial for informed clinical decisions. Multimodal data span a broad spectrum of data types, including medical images (eg, MRI and CT scans), time-series data (eg, sensor data from wearable devices and electronic health records), audio recordings (eg, heart and respiratory sounds and patient interviews), text (eg, clinical notes and research articles), videos (eg, surgical procedures), and omics data (eg, genomics and proteomics). While advancements in large language models (LLMs) have enabled new applications for knowledge retrieval and processing in the medical field, most LLMs remain limited to processing unimodal data, typically text-based content, and often overlook the importance of integrating the diverse data modalities encountered in clinical practice. This paper aims to present a detailed, practical, and solution-oriented perspective on the use of multimodal LLMs (M-LLMs) in the medical field. Our investigation spanned M-LLM foundational principles, current and potential applications, technical and ethical challenges, and future research directions. By connecting these elements, we aimed to provide a comprehensive framework that links diverse aspects of M-LLMs, offering a unified vision for their future in health care. This approach aims to guide both future research and practical implementations of M-LLMs in health care, positioning them as a paradigm shift toward integrated, multimodal data-driven medical practice. We anticipate that this work will spark further discussion and inspire the development of innovative approaches in the next generation of medical M-LLM systems.
Affiliation(s)
- Rawan AlSaad
- Weill Cornell Medicine-Qatar, Education City, Doha, Qatar
- Sabri Boughorbel
- Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Arfan Ahmed
- Weill Cornell Medicine-Qatar, Education City, Doha, Qatar
- Rafat Damseh
- Department of Computer Science and Software Engineering, United Arab Emirates University, Al Ain, United Arab Emirates
- Javaid Sheikh
- Weill Cornell Medicine-Qatar, Education City, Doha, Qatar
3
Yanagita Y, Yokokawa D, Uchida S, Li Y, Uehara T, Ikusaka M. Can AI-Generated Clinical Vignettes in Japanese Be Used Medically and Linguistically? J Gen Intern Med 2024:10.1007/s11606-024-09031-y. [PMID: 39313665 DOI: 10.1007/s11606-024-09031-y]
Abstract
BACKGROUND Creating clinical vignettes requires considerable effort. Recent developments in generative artificial intelligence (AI) for natural language processing have been remarkable and may allow for the easy and immediate creation of diverse clinical vignettes. OBJECTIVE In this study, we evaluated the medical accuracy and grammatical correctness of AI-generated clinical vignettes in Japanese and verified their usefulness. METHODS Clinical vignettes were created using the generative AI model GPT-4-0613. The input prompts for the clinical vignettes specified the following seven elements: (1) age, (2) sex, (3) chief complaint and time course since onset, (4) physical findings, (5) examination results, (6) diagnosis, and (7) treatment course. The list of diseases integrated into the vignettes was based on 202 cases considered in the management of diseases and symptoms in Japan's Primary Care Physicians Training Program. The clinical vignettes were evaluated for medical and Japanese-language accuracy by three physicians using a five-point scale. A total score of 13 points or above was defined as "sufficiently beneficial and immediately usable with minor revisions," a score between 10 and 12 points as "partly insufficient and in need of modifications," and a score of 9 points or below as "insufficient." RESULTS Regarding medical accuracy, of the 202 clinical vignettes, 118 scored 13 points or above, 78 scored between 10 and 12 points, and 6 scored 9 points or below. Regarding Japanese-language accuracy, 142 vignettes scored 13 points or above, 56 scored between 10 and 12 points, and 4 scored 9 points or below. Overall, 97% (196/202) of the vignettes were usable with some modification. CONCLUSION Overall, 97% of the clinical vignettes proved practically useful after confirmation and revision by Japanese physicians. Given the considerable effort physicians need to create vignettes without AI, using GPT is expected to greatly streamline this process.
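The generation-and-scoring workflow described in this abstract is concrete enough to sketch. The following is a hypothetical Python illustration, assuming a simple prompt template over the seven specified elements and the authors' three scoring bands; the prompt wording and helper names are illustrative assumptions, not the authors' actual code:

```python
# Hypothetical sketch of the study's workflow: a prompt built from the seven
# specified elements, and the three scoring bands applied to rater totals.
# Prompt text and function names are assumptions for illustration only.

ELEMENTS = [
    "age", "sex", "chief complaint and time course since onset",
    "physical findings", "examination results", "diagnosis", "treatment course",
]

def build_prompt(disease: str) -> str:
    """Compose a vignette-generation prompt listing the seven required elements."""
    numbered = "\n".join(f"({i}) {e}" for i, e in enumerate(ELEMENTS, start=1))
    return (f"Create a clinical vignette in Japanese for: {disease}.\n"
            f"Include the following elements:\n{numbered}")

def score_band(total: int) -> str:
    """Map a three-rater total (3-15 points) onto the study's three bands."""
    if total >= 13:
        return "usable with minor revisions"
    if total >= 10:
        return "needs modifications"
    return "insufficient"
```

With three physicians each scoring 1 to 5, totals range from 3 to 15, which is why 13 serves as the "minor revisions" threshold.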
Affiliation(s)
- Yasutaka Yanagita
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
- Daiki Yokokawa
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
- Shun Uchida
- Uchida Internal Medicine Clinic, Saitama, Japan
- Yu Li
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
- Takanori Uehara
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
- Masatomi Ikusaka
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
4
Wang Z, Yang W, Li Z, Rong Z, Wang X, Han J, Ma L. A 25-Year Retrospective of the Use of AI for Diagnosing Acute Stroke: Systematic Review. J Med Internet Res 2024; 26:e59711. [PMID: 39255472 PMCID: PMC11422733 DOI: 10.2196/59711]
Abstract
BACKGROUND Stroke is a leading cause of death and disability worldwide. Rapid and accurate diagnosis is crucial for minimizing brain damage and optimizing treatment plans. OBJECTIVE This review aims to summarize the methods of artificial intelligence (AI)-assisted stroke diagnosis over the past 25 years, providing an overview of performance metrics and algorithm development trends. It also delves into existing issues and future prospects, intending to offer a comprehensive reference for clinical practice. METHODS A total of 50 representative articles published between 1999 and 2024 on using AI technology for stroke prevention and diagnosis were systematically selected and analyzed in detail. RESULTS AI-assisted stroke diagnosis has made significant advances in stroke lesion segmentation and classification, stroke risk prediction, and stroke prognosis. Before 2012, research mainly focused on segmentation using traditional thresholding and heuristic techniques. From 2012 to 2016, the focus shifted to machine learning (ML)-based approaches. After 2016, the emphasis moved to deep learning (DL), which brought significant improvements in accuracy. In stroke lesion segmentation and classification as well as stroke risk prediction, DL has shown superiority over ML. In stroke prognosis, both DL and ML have shown good performance. CONCLUSIONS Over the past 25 years, AI technology has shown promising performance in stroke diagnosis.
Affiliation(s)
- Ze Rong
- Nantong University, Nantong, China
- Lei Ma
- Nantong University, Nantong, China
5
Pool J, Indulska M, Sadiq S. Large language models and generative AI in telehealth: a responsible use lens. J Am Med Inform Assoc 2024; 31:2125-2136. [PMID: 38441296 PMCID: PMC11339524 DOI: 10.1093/jamia/ocae035]
Abstract
OBJECTIVE This scoping review aims to assess the current research landscape of the application and use of large language models (LLMs) and generative artificial intelligence (AI), through tools such as ChatGPT, in telehealth. Additionally, the review seeks to identify key areas for future research, with a particular focus on AI ethics considerations for responsible use and ensuring trustworthy AI. MATERIALS AND METHODS Following the scoping review methodological framework, a search strategy was conducted across 6 databases. To structure our review, we employed AI ethics guidelines and principles, constructing a concept matrix for investigating the responsible use of AI in telehealth. Using the concept matrix in our review enabled the identification of gaps in the literature and informed future research directions. RESULTS Twenty studies were included in the review. Among the included studies, 5 were empirical, and 15 were reviews and perspectives focusing on different telehealth applications and healthcare contexts. Benefit and reliability concepts were frequently discussed in these studies. Privacy, security, and accountability were peripheral themes, with transparency, explainability, human agency, and contestability lacking conceptual or empirical exploration. CONCLUSION The findings emphasized the potential of LLMs, especially ChatGPT, in telehealth. They provide insights into understanding the use of LLMs, enhancing telehealth services, and taking ethical considerations into account. By proposing three future research directions with a focus on responsible use, this review further contributes to the advancement of healthcare AI.
Affiliation(s)
- Javad Pool
- ARC Industrial Transformation Training Centre for Information Resilience (CIRES), The University of Queensland, Brisbane 4072, Australia
- School of Electrical Engineering and Computer Science, The University of Queensland, Brisbane 4072, Australia
- Marta Indulska
- ARC Industrial Transformation Training Centre for Information Resilience (CIRES), The University of Queensland, Brisbane 4072, Australia
- Business School, The University of Queensland, Brisbane 4072, Australia
- Shazia Sadiq
- ARC Industrial Transformation Training Centre for Information Resilience (CIRES), The University of Queensland, Brisbane 4072, Australia
- School of Electrical Engineering and Computer Science, The University of Queensland, Brisbane 4072, Australia
6
Luo MJ, Pang J, Bi S, Lai Y, Zhao J, Shang Y, Cui T, Yang Y, Lin Z, Zhao L, Wu X, Lin D, Chen J, Lin H. Development and Evaluation of a Retrieval-Augmented Large Language Model Framework for Ophthalmology. JAMA Ophthalmol 2024; 142:798-805. [PMID: 39023885 PMCID: PMC11258636 DOI: 10.1001/jamaophthalmol.2024.2513]
Abstract
Importance Although augmenting large language models (LLMs) with knowledge bases may improve medical domain-specific performance, practical methods are needed for local implementation of LLMs that address privacy concerns and enhance accessibility for health care professionals. Objective To develop an accurate, cost-effective local implementation of an LLM to mitigate privacy concerns and support its practical deployment in health care settings. Design, Setting, and Participants ChatZOC (Sun Yat-Sen University Zhongshan Ophthalmology Center), a retrieval-augmented LLM framework, was developed by enhancing a baseline LLM with a comprehensive ophthalmic dataset and evaluation framework (CODE), which includes over 30 000 pieces of ophthalmic knowledge. This LLM was benchmarked against 10 representative LLMs, including GPT-4 and GPT-3.5 Turbo (OpenAI), across 300 clinical questions in ophthalmology. The evaluation, involving a panel of medical experts and biomedical researchers, focused on accuracy, utility, and safety. A double-masked approach was used to minimize assessment bias across all models. The study used a comprehensive knowledge base derived from ophthalmic clinical practice, without directly involving clinical patients. Exposures LLM response to clinical questions. Main Outcomes and Measures Accuracy, utility, and safety of LLMs in responding to clinical questions. Results The baseline model achieved a human ranking score of 0.48. The retrieval-augmented LLM had a score of 0.60, a difference of 0.12 (95% CI, 0.02-0.22; P = .02) from baseline and not different from GPT-4, which scored 0.61 (difference = 0.01; 95% CI, -0.11 to 0.13; P = .89). For scientific consensus, the retrieval-augmented LLM reached 84.0% compared with 46.5% for the baseline model (difference = 37.5%; 95% CI, 29.0%-46.0%; P < .001) and was not different from GPT-4 at 79.2% (difference = 4.8%; 95% CI, -0.3% to 10.0%; P = .06).
Conclusions and Relevance Results of this quality improvement study suggest that the integration of high-quality knowledge bases improved the LLM's performance in medical domains. This study highlights the transformative potential of augmented LLMs in clinical practice by providing reliable, safe, and practical clinical information. Further research is needed to explore the broader application of such frameworks in the real world.
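The retrieval-augmented pattern behind frameworks like ChatZOC can be illustrated in miniature. The following is a hedged sketch of generic retrieval-augmented generation, assuming naive keyword-overlap retrieval over a toy knowledge base; the real system's retriever, embeddings, and 30 000-item CODE dataset are not reproduced here:

```python
# Minimal retrieval-augmented generation sketch. The snippets and the
# word-overlap scoring are illustrative assumptions, not the ChatZOC system.

KNOWLEDGE_BASE = [
    "Primary open-angle glaucoma is managed first-line with topical prostaglandin analogues.",
    "Acute angle-closure glaucoma is an ophthalmic emergency requiring urgent pressure lowering.",
    "Diabetic retinopathy screening is recommended annually for patients with diabetes.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank knowledge snippets by naive word overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(KNOWLEDGE_BASE,
                    key=lambda doc: len(q & set(doc.lower().split())),
                    reverse=True)
    return scored[:k]

def augment_prompt(question: str) -> str:
    """Prepend retrieved context to the question before it reaches the LLM."""
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer using the context."
```

In a real deployment the overlap scorer would be replaced by a dense-embedding retriever, but the prompt-assembly step is structurally the same.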
Affiliation(s)
- Ming-Jie Luo
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Jianyu Pang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Shaowei Bi
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Yunxi Lai
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Jiaman Zhao
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Yuanrui Shang
- The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
- Tingxin Cui
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Yahan Yang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Zhenzhe Lin
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Lanqin Zhao
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Xiaohang Wu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Duoru Lin
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Jingjing Chen
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Haotian Lin
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Center for Precision Medicine and Department of Genetics and Biomedical Informatics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Hainan Eye Hospital and Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Haikou, China
7
Kikuchi T, Nakao T, Nakamura Y, Hanaoka S, Mori H, Yoshikawa T. Toward Improved Radiologic Diagnostics: Investigating the Utility and Limitations of GPT-3.5 Turbo and GPT-4 with Quiz Cases. AJNR Am J Neuroradiol 2024:ajnr.A8332. [PMID: 38719605 DOI: 10.3174/ajnr.a8332]
Abstract
BACKGROUND AND PURPOSE The rise of large language models such as generative pretrained transformers (GPTs) has sparked considerable interest in radiology, especially in interpreting radiologic reports and image findings. While existing research has focused on GPTs estimating diagnoses from radiologic descriptions, exploring alternative diagnostic information sources is also crucial. This study introduces the use of GPTs (GPT-3.5 Turbo and GPT-4) for information retrieval and summarization, searching relevant case reports via PubMed, and investigates their potential to aid diagnosis. MATERIALS AND METHODS From October 2021 to December 2023, we selected 115 cases from the "Case of the Week" series on the American Journal of Neuroradiology website. Their Description and Legend sections were presented to the GPTs for the 2 tasks. For the Direct Diagnosis task, the models provided 3 differential diagnoses that were considered correct if they matched the diagnosis in the diagnosis section. For the Case Report Search task, the models generated 2 keywords per case, creating PubMed search queries to extract up to 3 relevant reports. A response was considered correct if reports containing the disease name stated in the diagnosis section were extracted. The McNemar test was used to evaluate whether adding a Case Report Search to Direct Diagnosis improved overall accuracy. RESULTS In the Direct Diagnosis task, GPT-3.5 Turbo achieved a correct response rate of 26% (30/115 cases), whereas GPT-4 achieved 41% (47/115). For the Case Report Search task, GPT-3.5 Turbo scored 10% (11/115), and GPT-4 scored 7% (8/115). Correct responses totaled 32% (37/115) with 3 overlapping cases for GPT-3.5 Turbo, whereas GPT-4 had 43% (50/115) of correct responses with 5 overlapping cases. Adding Case Report Search improved GPT-3.5 Turbo's performance (P = .023) but not that of GPT-4 (P = .248). 
CONCLUSIONS The benefit of adding Case Report Search was particularly pronounced for GPT-3.5 Turbo, suggesting its potential as an alternative GPT-based diagnostic approach, particularly when a direct diagnosis from a GPT is not obtainable. Nevertheless, the overall performance of GPT models in both the direct diagnosis and case report retrieval tasks remains suboptimal, and users should be aware of their limitations.
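The Case Report Search task reduces to building a PubMed query from two model-generated keywords. Below is a minimal sketch using NCBI's E-utilities ESearch endpoint; the query shape (AND-joined keywords plus a Case Reports publication-type filter, capped at three results) is an assumption about the authors' setup, not their actual code:

```python
from urllib.parse import urlencode

# Illustrative sketch of the Case Report Search step: model-generated keywords
# are combined into a PubMed E-utilities ESearch URL capped at three results.
# The exact query format used in the study is an assumption.

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def case_report_query(keywords: list[str], retmax: int = 3) -> str:
    """Build an ESearch URL restricted to case reports for the given keywords."""
    term = " AND ".join(keywords + ['"Case Reports"[Publication Type]'])
    return f"{EUTILS}?{urlencode({'db': 'pubmed', 'term': term, 'retmax': retmax})}"
```

Fetching the resulting URL (for example with `urllib.request`) returns PMIDs whose records could then be checked against the case's diagnosis, as the study did.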
Affiliation(s)
- Tomohiro Kikuchi
- Departments of Computational Diagnostic Radiology and Preventive Medicine (T.K., T.N., Y.N., T.Y.), The University of Tokyo Hospital, Tokyo, Japan
- Department of Radiology (T.K., H.M.), School of Medicine, Jichi Medical University, Shimotsuke, Tochigi, Japan
- Takahiro Nakao
- Departments of Computational Diagnostic Radiology and Preventive Medicine (T.K., T.N., Y.N., T.Y.), The University of Tokyo Hospital, Tokyo, Japan
- Yuta Nakamura
- Departments of Computational Diagnostic Radiology and Preventive Medicine (T.K., T.N., Y.N., T.Y.), The University of Tokyo Hospital, Tokyo, Japan
- Shouhei Hanaoka
- Department of Radiology (S.H.), The University of Tokyo Hospital, Tokyo, Japan
- Harushi Mori
- Department of Radiology (T.K., H.M.), School of Medicine, Jichi Medical University, Shimotsuke, Tochigi, Japan
- Takeharu Yoshikawa
- Departments of Computational Diagnostic Radiology and Preventive Medicine (T.K., T.N., Y.N., T.Y.), The University of Tokyo Hospital, Tokyo, Japan
8
Sallam M, Al-Mahzoum K, Alshuaib O, Alhajri H, Alotaibi F, Alkhurainej D, Al-Balwah MY, Barakat M, Egger J. Language discrepancies in the performance of generative artificial intelligence models: an examination of infectious disease queries in English and Arabic. BMC Infect Dis 2024; 24:799. [PMID: 39118057 PMCID: PMC11308449 DOI: 10.1186/s12879-024-09725-y]
Abstract
BACKGROUND Assessment of artificial intelligence (AI)-based models across languages is crucial to ensure equitable access to and accuracy of information in multilingual contexts. This study aimed to compare AI model efficiency in English and Arabic for infectious disease queries. METHODS The study employed the METRICS checklist for the design and reporting of AI-based studies in healthcare. The AI models tested included ChatGPT-3.5, ChatGPT-4, Bing, and Bard. The queries comprised 15 questions on HIV/AIDS, tuberculosis, malaria, COVID-19, and influenza. The AI-generated content was assessed by two bilingual experts using the validated CLEAR tool. RESULTS Performance varied between English and Arabic. English queries showed consistently superior performance, with Bard leading, followed by Bing, ChatGPT-4, and ChatGPT-3.5 (P = .012). The same trend was observed in Arabic, albeit without statistical significance (P = .082). Stratified analysis revealed higher scores for English in most CLEAR components, notably completeness, accuracy, appropriateness, and relevance, especially with ChatGPT-3.5 and Bard. Across the five infectious disease topics, English outperformed Arabic, except for influenza queries in Bing and Bard. The four AI models' performance in English was rated as "excellent," significantly outperforming their "above-average" Arabic counterparts (P = .002). CONCLUSIONS A disparity in AI model performance between English and Arabic was observed in response to infectious disease queries. This language gap can degrade the quality of health content delivered by AI models to native speakers of Arabic, and AI developers should address it, with the ultimate goal of enhancing health outcomes.
Affiliation(s)
- Malik Sallam
- Department of Pathology, Microbiology and Forensic Medicine, School of Medicine, The University of Jordan, Amman, 11942, Jordan
- Department of Translational Medicine, Faculty of Medicine, Lund University, Malmö, 22184, Sweden
- Department of Clinical Laboratories and Forensic Medicine, Jordan University Hospital, Queen Rania Al-Abdullah Street-Aljubeiha, P.O. Box: 13046, Amman, Jordan
- Omaima Alshuaib
- School of Medicine, The University of Jordan, Amman, 11942, Jordan
- Hawajer Alhajri
- School of Medicine, The University of Jordan, Amman, 11942, Jordan
- Fatmah Alotaibi
- School of Medicine, The University of Jordan, Amman, 11942, Jordan
- Muna Barakat
- Department of Clinical Pharmacy and Therapeutics, Faculty of Pharmacy, Applied Science Private University, Amman, 11931, Jordan
- MEU Research Unit, Middle East University, Amman, 11831, Jordan
- Jan Egger
- Institute for AI in Medicine (IKIM), University Medicine Essen (AöR), Essen, Germany
9
Sridharan K, Sivaramakrishnan G. Enhancing readability of USFDA patient communications through large language models: a proof-of-concept study. Expert Rev Clin Pharmacol 2024; 17:731-741. [PMID: 38823007 DOI: 10.1080/17512433.2024.2363840]
Abstract
BACKGROUND The US Food and Drug Administration (USFDA) communicates new drug safety concerns through drug safety communications (DSCs) and medication guides (MGs), whose complexity often challenges patients with average reading abilities. This study assesses whether large language models (LLMs) can enhance the readability of these materials. METHODS We analyzed the latest DSCs and MGs, using ChatGPT 4.0© and Gemini© to simplify them to a sixth-grade reading level. Outputs were evaluated for readability, technical accuracy, and content inclusiveness. RESULTS Original materials were difficult to read (DSCs grade level 13, MGs 22). LLMs significantly improved readability, reducing grade levels to more accessible reading levels (single prompt - DSCs: ChatGPT 4.0© 10.1, Gemini© 8; MGs: ChatGPT 4.0© 7.1, Gemini© 6.5; multiple prompts - DSCs: ChatGPT 4.0© 10.3, Gemini© 7.5; MGs: ChatGPT 4.0© 8, Gemini© 6.8). LLM outputs retained technical accuracy and key messages. CONCLUSION LLMs can significantly simplify complex health-related information, making it more accessible to patients. Future research should extend these findings to other languages and patient groups in real-world settings.
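Grade-level figures like those reported are typically produced by readability formulas such as Flesch-Kincaid. Below is a hedged sketch of the Flesch-Kincaid Grade Level computation with a naive vowel-group syllable counter; the study's actual readability tooling is not specified, so this is illustrative only:

```python
import re

# Illustrative Flesch-Kincaid Grade Level computation. The vowel-group
# syllable counter is a rough approximation; production readability tools
# use dictionary-based syllabification.

def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels; always at least one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59
```

Short, monosyllabic sentences score near (or below) grade 0, while long sentences dense with polysyllabic terminology, typical of regulatory prose, push the grade level well past 12.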
Affiliation(s)
- Kannan Sridharan
- Department of Pharmacology & Therapeutics, College of Medicine & Medical Sciences, Arabian Gulf University, Manama, Kingdom of Bahrain
- Gowri Sivaramakrishnan
- Speciality Dental Residency Program, Primary Health Care Centers, Manama, Kingdom of Bahrain
10
MohanaSundaram A, Emran TB. A commentary on 'ChatGPT in medicine: prospects and challenges: a review article' - correspondence. Int J Surg 2024; 110:5178-5179. [PMID: 38640507 PMCID: PMC11325996 DOI: 10.1097/js9.0000000000001487]
Affiliation(s)
- Talha Bin Emran
- Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka, Bangladesh
11
Leypold T, Schäfer B, Boos AM, Beier JP. Artificial Intelligence-Powered Hand Surgery Consultation: GPT-4 as an Assistant in a Hand Surgery Outpatient Clinic. J Hand Surg Am 2024:S0363-5023(24)00261-2. [PMID: 39066762 DOI: 10.1016/j.jhsa.2024.06.002]
Abstract
PURPOSE Exploring the integration of artificial intelligence in clinical settings, this study examined the feasibility of using Generative Pretrained Transformer 4 (GPT-4), a large language model, as a consultation assistant in a hand surgery outpatient clinic. METHODS The study involved 10 simulated patient scenarios with common hand conditions, in which GPT-4, enhanced through specific prompt engineering techniques, conducted medical history interviews and assisted in diagnostic processes. A panel of expert hand surgeons, each board-certified in hand surgery, evaluated GPT-4's responses on a five-criterion Likert scale with scores ranging from 1 (lowest) to 5 (highest). RESULTS GPT-4 achieved an average score of 4.6, reflecting good performance in documenting a medical history, as evaluated by the hand surgeons. CONCLUSIONS These findings suggest that GPT-4 can document medical histories to the standards of hand surgeons in a simulated environment. The findings indicate potential for future application in patient care, but GPT-4's actual performance in real clinical settings remains to be investigated. CLINICAL RELEVANCE This study provides a preliminary indication that GPT-4 could be a useful consultation assistant in a hand surgery outpatient clinic, but further research is required to establish its reliability and practicality in actual practice.
Collapse
Affiliation(s)
- Tim Leypold
- Department of Plastic Surgery, Hand Surgery-Burn Center, University Hospital Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Aachen, Germany.
| | - Benedikt Schäfer
- Department of Plastic Surgery, Hand Surgery-Burn Center, University Hospital Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Aachen, Germany
| | - Anja M Boos
- Department of Plastic Surgery, Hand Surgery-Burn Center, University Hospital Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Aachen, Germany
| | - Justus P Beier
- Department of Plastic Surgery, Hand Surgery-Burn Center, University Hospital Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Aachen, Germany
| |
|
12
|
Cheng J, Huang C, Zhang J, Wu B, Zhang W, Liu X, Zhang J, Tang Y, Zhou H, Zhang Q, Gu M, Dong J, Zhang X. Multimodal deep learning using on-chip diffractive optics with in situ training capability. Nat Commun 2024; 15:6189. [PMID: 39043669 PMCID: PMC11266606 DOI: 10.1038/s41467-024-50677-3]
Abstract
Multimodal deep learning plays a pivotal role in supporting the processing and learning of diverse data types within the realm of artificial intelligence generated content (AIGC). However, most photonic neuromorphic processors for deep learning can only handle a single data modality (either vision or audio) due to the lack of abundant parameter training in the optical domain. Here, we propose and demonstrate a trainable diffractive optical neural network (TDONN) chip based on on-chip diffractive optics with massive tunable elements to address these constraints. The TDONN chip includes one input layer, five hidden layers, and one output layer, and only one forward propagation is required to obtain the inference results without frequent optical-electrical conversion. A customized stochastic gradient descent algorithm and a drop-out mechanism were developed for photonic neurons to realize in situ training and fast convergence in the optical domain. The TDONN chip achieves a potential throughput of 217.6 tera-operations per second (TOPS) with high computing density (447.7 TOPS/mm²), high system-level energy efficiency (7.28 TOPS/W), and low optical latency (30.2 ps). The TDONN chip successfully implemented four-class classification in different modalities (vision, audio, and touch) and achieved 85.7% accuracy on multimodal test sets. Our work opens up a new avenue for multimodal deep learning with integrated photonic processors, providing a potential solution for low-power AI large models using photonic technology.
Affiliation(s)
- Junwei Cheng
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
| | - Chaoran Huang
- Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, 999077, China
| | - Jialong Zhang
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
| | - Bo Wu
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
| | - Wenkai Zhang
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
| | - Xinyu Liu
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
| | - Jiahui Zhang
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
| | - Yiyi Tang
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
| | - Hailong Zhou
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
| | - Qiming Zhang
- Institute of Photonic Chips, University of Shanghai for Science and Technology, Shanghai, 200093, China
| | - Min Gu
- Institute of Photonic Chips, University of Shanghai for Science and Technology, Shanghai, 200093, China
| | - Jianji Dong
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Optics Valley Laboratory, Wuhan, 430074, China.
| | - Xinliang Zhang
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China
- Optics Valley Laboratory, Wuhan, 430074, China
| |
|
13
|
Haider SA, Pressman SM, Borna S, Gomez-Cabello CA, Sehgal A, Leibovich BC, Forte AJ. Evaluating Large Language Model (LLM) Performance on Established Breast Classification Systems. Diagnostics (Basel) 2024; 14:1491. [PMID: 39061628 PMCID: PMC11275570 DOI: 10.3390/diagnostics14141491]
Abstract
Medical researchers are increasingly utilizing advanced LLMs like ChatGPT-4 and Gemini to enhance diagnostic processes in the medical field. This research focuses on their ability to comprehend and apply complex medical classification systems for breast conditions, which can significantly aid plastic surgeons in making informed decisions for diagnosis and treatment, ultimately leading to improved patient outcomes. Fifty clinical scenarios were created to evaluate the classification accuracy of each LLM across five established breast-related classification systems. Scores from 0 to 2 were assigned to LLM responses to denote incorrect, partially correct, or completely correct classifications. Descriptive statistics were employed to compare the performances of ChatGPT-4 and Gemini. Gemini exhibited superior overall performance, achieving 98% accuracy compared to ChatGPT-4's 71%. While both models performed well in the Baker classification for capsular contracture and UTSW classification for gynecomastia, Gemini consistently outperformed ChatGPT-4 in other systems, such as the Fischer Grade Classification for gender-affirming mastectomy, Kajava Classification for ectopic breast tissue, and Regnault Classification for breast ptosis. With further development, integrating LLMs into plastic surgery practice will likely enhance diagnostic support and decision making.
Affiliation(s)
- Syed Ali Haider
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
| | | | - Sahar Borna
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
| | | | - Ajai Sehgal
- Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
| | - Bradley C. Leibovich
- Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
- Department of Urology, Mayo Clinic, Rochester, MN 55905, USA
| | - Antonio Jorge Forte
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
| |
|
14
|
Yang Z, Wang D, Zhou F, Song D, Zhang Y, Jiang J, Kong K, Liu X, Qiao Y, Chang RT, Han Y, Li F, Tham CC, Zhang X. Understanding natural language: Potential application of large language models to ophthalmology. Asia Pac J Ophthalmol (Phila) 2024; 13:100085. [PMID: 39059558 DOI: 10.1016/j.apjo.2024.100085]
Abstract
Large language models (LLMs), a natural language processing technology based on deep learning, are currently in the spotlight. These models closely mimic natural language comprehension and generation. Their evolution has undergone several waves of innovation similar to convolutional neural networks. The transformer architecture advancement in generative artificial intelligence marks a monumental leap beyond early-stage pattern recognition via supervised learning. With the expansion of parameters and training data (terabytes), LLMs unveil remarkable human interactivity, encompassing capabilities such as memory retention and comprehension. These advances make LLMs particularly well-suited for roles in healthcare communication between medical practitioners and patients. In this comprehensive review, we discuss the trajectory of LLMs and their potential implications for clinicians and patients. For clinicians, LLMs can be used for automated medical documentation, and given better inputs and extensive validation, LLMs may be able to autonomously diagnose and treat in the future. For patient care, LLMs can be used for triage suggestions, summarization of medical documents, explanation of a patient's condition, and customizing patient education materials tailored to their comprehension level. The limitations of LLMs and possible solutions for real-world use are also presented. Given the rapid advancements in this area, this review attempts to briefly cover many roles that LLMs may play in the ophthalmic space, with a focus on improving the quality of healthcare delivery.
Affiliation(s)
- Zefeng Yang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou 510060, China
| | - Deming Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou 510060, China
| | - Fengqi Zhou
- Ophthalmology, Mayo Clinic Health System, Eau Claire, Wisconsin, USA
| | - Diping Song
- Shanghai Artificial Intelligence Laboratory, Shanghai, China
| | - Yinhang Zhang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou 510060, China
| | - Jiaxuan Jiang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou 510060, China
| | - Kangjie Kong
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou 510060, China
| | - Xiaoyi Liu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou 510060, China
| | - Yu Qiao
- Shanghai Artificial Intelligence Laboratory, Shanghai, China
| | - Robert T Chang
- Department of Ophthalmology, Byers Eye Institute at Stanford University, Palo Alto, CA, USA
| | - Ying Han
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, USA
| | - Fei Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou 510060, China.
| | - Clement C Tham
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China; Hong Kong Eye Hospital, Kowloon, Hong Kong SAR, China; Department of Ophthalmology and Visual Sciences, Prince of Wales Hospital, Shatin, Hong Kong SAR, China.
| | - Xiulan Zhang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou 510060, China.
| |
|
15
|
Ong JCL, Chang SYH, William W, Butte AJ, Shah NH, Chew LST, Liu N, Doshi-Velez F, Lu W, Savulescu J, Ting DSW. Ethical and regulatory challenges of large language models in medicine. Lancet Digit Health 2024; 6:e428-e432. [PMID: 38658283 DOI: 10.1016/s2589-7500(24)00061-x]
Abstract
With the rapid growth of interest in and use of large language models (LLMs) across various industries, we are facing some crucial and profound ethical concerns, especially in the medical field. The unique technical architecture and purported emergent abilities of LLMs differentiate them substantially from other artificial intelligence (AI) models and natural language processing techniques used, necessitating a nuanced understanding of LLM ethics. In this Viewpoint, we highlight ethical concerns stemming from the perspectives of users, developers, and regulators, notably focusing on data privacy and rights of use, data provenance, intellectual property contamination, and broad applications and plasticity of LLMs. A comprehensive framework and mitigating strategies will be imperative for the responsible integration of LLMs into medical practice, ensuring alignment with ethical principles and safeguarding against potential societal risks.
Affiliation(s)
- Jasmine Chiat Ling Ong
- Division of Pharmacy, Singapore General Hospital, Singapore; Duke-NUS Medical School, National University of Singapore, Singapore
| | - Shelley Yin-Hsi Chang
- Department of Ophthalmology, Chang Gung Memorial Hospital, Linkou Medical Center, Taoyuan, Taiwan; College of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Wasswa William
- Department of Biomedical Sciences and Engineering, Mbarara University of Science and Technology, Mbarara, Uganda
| | - Atul J Butte
- Bakar Computational Health Sciences Institute, and Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA; Center for Data-Driven Insights and Innovation, University of California Health, Oakland, CA, USA
| | - Nigam H Shah
- Stanford Health Care, Palo Alto, CA, USA; Department of Medicine, and Clinical Excellence Research Center, School of Medicine, Stanford University, Stanford, CA, USA
| | - Lita Sui Tjien Chew
- Department of Pharmacy, National University of Singapore, Singapore; Singapore Health Services, Pharmacy and Therapeutics Council Office, Singapore; Department of Pharmacy, National Cancer Centre Singapore, Singapore
| | - Nan Liu
- Duke-NUS Medical School, National University of Singapore, Singapore
| | - Finale Doshi-Velez
- Harvard Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
| | - Wei Lu
- StatNLP Research Group, Singapore University of Technology and Design, Singapore
| | - Julian Savulescu
- Murdoch Children's Research Institute, Melbourne, VIC, Australia; Centre for Biomedical Ethics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Oxford Uehiro Centre for Practical Ethics, Faculty of Philosophy, University of Oxford, Oxford, UK
| | - Daniel Shu Wei Ting
- Duke-NUS Medical School, National University of Singapore, Singapore; Artificial Intelligence and Digital Innovation, Singapore Eye Research Institute, Singapore National Eye Center, Singapore Health Service, Singapore; Byers Eye Institute, Stanford University, Palo Alto, CA, USA.
| |
|
16
|
Tan S, Xin X, Wu D. ChatGPT in medicine: prospects and challenges: a review article. Int J Surg 2024; 110:3701-3706. [PMID: 38502861 PMCID: PMC11175750 DOI: 10.1097/js9.0000000000001312]
Abstract
It has been a year since the launch of Chat Generative Pre-trained Transformer (ChatGPT), a generative artificial intelligence (AI) program. The introduction of this cross-generational product initially brought a huge shock with its incredible potential and then aroused increasing concerns. In the field of medicine, researchers have extensively explored the possible applications of ChatGPT and achieved numerous satisfactory results. However, opportunities and issues always come together. Problems have also been exposed during the application of ChatGPT, requiring cautious handling, thorough consideration, and further guidelines for safe use. Here, the authors summarized the potential applications of ChatGPT in the medical field, including revolutionizing healthcare consultation, assisting patient management and treatment, transforming medical education, and facilitating clinical research. Meanwhile, the authors also enumerated researchers' concerns arising along with its broad and satisfactory applications. As AI will inevitably permeate every aspect of modern life, the authors hope that this review can not only promote people's understanding of the potential applications of ChatGPT in the future but also remind them to be more cautious about this "Pandora's Box" in the medical field. It is necessary to establish normative guidelines for its safe use in the medical field as soon as possible.
Affiliation(s)
| | | | - Di Wu
- Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shijingshan, Beijing, China
| |
|
17
|
Naqvi WM, Shaikh SZ, Mishra GV. Large language models in physical therapy: time to adapt and adept. Front Public Health 2024; 12:1364660. [PMID: 38887241 PMCID: PMC11182445 DOI: 10.3389/fpubh.2024.1364660]
Abstract
Healthcare is experiencing a transformative phase driven by artificial intelligence (AI) and machine learning (ML), and physical therapists (PTs) stand on the brink of a paradigm shift in education, practice, and research. Rather than a threat, AI presents an opportunity for revolution. This paper examines how large language models (LLMs), such as ChatGPT and BioMedLM, driven by deep ML, can offer human-like performance yet face accuracy challenges given the vast data involved in PT and rehabilitation practice. PTs can benefit from developing and training LLMs tailored to streamlining administrative tasks, connecting globally, and customizing treatments; however, the human touch and creativity remain invaluable. This paper urges PTs to engage in learning and shaping AI models, highlighting the need for ethical use and human supervision to address potential biases. Embracing AI as a contributor, and not just a user, is crucial: by integrating AI and fostering collaboration, the PT field can be enriched by AI, provided data accuracy and the challenges associated with feeding the AI model are sensitively addressed.
Affiliation(s)
- Waqar M. Naqvi
- Department of Interdisciplinary Sciences, Datta Meghe Institute of Higher Education and Research, Wardha, India
- Department of Physiotherapy, College of Health Sciences, Gulf Medical University, Ajman, United Arab Emirates
- NKP Salve Institute of Medical Sciences and Research Center, Nagpur, India
| | - Summaiya Zareen Shaikh
- Department of Neuro-Physiotherapy, The SIA College of Health Sciences, College of Physiotherapy, Thane, India
| | - Gaurav V. Mishra
- Department of Radiodiagnosis, Datta Meghe Institute of Higher Education and Research, Wardha, India
| |
|
18
|
Leypold T, Lingens LF, Beier JP, Boos AM. Integrating AI in Lipedema Management: Assessing the Efficacy of GPT-4 as a Consultation Assistant. Life (Basel) 2024; 14:646. [PMID: 38792666 PMCID: PMC11122530 DOI: 10.3390/life14050646]
Abstract
The role of artificial intelligence (AI) in healthcare is evolving, offering promising avenues for enhancing clinical decision making and patient management. Limited knowledge about lipedema often leads to patients being frequently misdiagnosed with conditions like lymphedema or obesity rather than correctly identifying lipedema. Furthermore, patients with lipedema often present with intricate and extensive medical histories, resulting in significant time consumption during consultations. AI could, therefore, improve the management of these patients. This research investigates the utilization of OpenAI's Generative Pre-Trained Transformer 4 (GPT-4), a sophisticated large language model (LLM), as an assistant in consultations for lipedema patients. Six simulated scenarios were designed to mirror typical patient consultations commonly encountered in a lipedema clinic. GPT-4 was tasked with conducting patient interviews to gather medical histories, presenting its findings, making preliminary diagnoses, and recommending further diagnostic and therapeutic actions. Advanced prompt engineering techniques were employed to refine the efficacy, relevance, and accuracy of GPT-4's responses. A panel of experts in lipedema treatment, using a Likert Scale, evaluated GPT-4's responses across six key criteria. Scoring ranged from 1 (lowest) to 5 (highest), with GPT-4 achieving an average score of 4.24, indicating good reliability and applicability in a clinical setting. This study is one of the initial forays into applying large language models like GPT-4 in specific clinical scenarios, such as lipedema consultations. It demonstrates the potential of AI in supporting clinical practices and emphasizes the continuing importance of human expertise in the medical field, despite ongoing technological advancements.
Affiliation(s)
- Tim Leypold
- Department of Plastic Surgery, Hand Surgery–Burn Center, University Hospital RWTH Aachen, Pauwelsstraße 30, 52074 Aachen, Germany; (L.F.L.); (J.P.B.); (A.M.B.)
| | | | | | | |
|
19
|
Bitterman DS, Downing A, Maués J, Lustberg M. Promise and Perils of Large Language Models for Cancer Survivorship and Supportive Care. J Clin Oncol 2024; 42:1607-1611. [PMID: 38452323 PMCID: PMC11095890 DOI: 10.1200/jco.23.02439]
Abstract
A call to action to bring stakeholders together to plan for the future of LLM-enhanced cancer survivorship.
Affiliation(s)
- Danielle S. Bitterman
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA
- Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA
| | | | | | - Maryam Lustberg
- Department of Medical Oncology, Yale School of Medicine, New Haven, CT
| |
|
20
|
Esmaeilzadeh P. Challenges and strategies for wide-scale artificial intelligence (AI) deployment in healthcare practices: A perspective for healthcare organizations. Artif Intell Med 2024; 151:102861. [PMID: 38555850 DOI: 10.1016/j.artmed.2024.102861]
Abstract
Healthcare organizations have realized that Artificial intelligence (AI) can provide a competitive edge through personalized patient experiences, improved patient outcomes, early diagnosis, augmented clinician capabilities, enhanced operational efficiencies, or improved medical service accessibility. However, deploying AI-driven tools in the healthcare ecosystem could be challenging. This paper categorizes AI applications in healthcare and comprehensively examines the challenges associated with deploying AI in medical practices at scale. As AI continues to make strides in healthcare, its integration presents various challenges, including production timelines, trust generation, privacy concerns, algorithmic biases, and data scarcity. The paper highlights that flawed business models and wrong workflows in healthcare practices cannot be rectified merely by deploying AI-driven tools. Healthcare organizations should re-evaluate root problems such as misaligned financial incentives (e.g., fee-for-service models), dysfunctional medical workflows (e.g., high rates of patient readmissions), poor care coordination between different providers, fragmented electronic health records systems, and inadequate patient education and engagement models in tandem with AI adoption. This study also explores the need for a cultural shift in viewing AI not as a threat but as an enabler that can enhance healthcare delivery and create new employment opportunities while emphasizing the importance of addressing underlying operational issues. The necessity of investments beyond finance is discussed, emphasizing the importance of human capital, continuous learning, and a supportive environment for AI integration. The paper also highlights the crucial role of clear regulations in building trust, ensuring safety, and guiding the ethical use of AI, calling for coherent frameworks addressing transparency, model accuracy, data quality control, liability, and ethics. 
Furthermore, this paper underscores the importance of advancing AI literacy within academia to prepare future healthcare professionals for an AI-driven landscape. Through careful navigation and proactive measures addressing these challenges, the healthcare community can harness AI's transformative power responsibly and effectively, revolutionizing healthcare delivery and patient care. The paper concludes with a vision and strategic suggestions for the future of healthcare with AI, emphasizing thoughtful, responsible, and innovative engagement as the pathway to realizing its full potential to unlock immense benefits for healthcare organizations, physicians, nurses, and patients while proactively mitigating risks.
Affiliation(s)
- Pouyan Esmaeilzadeh
- Department of Information Systems and Business Analytics, College of Business, Florida International University (FIU), Modesto A. Maidique Campus, 11200 S.W. 8th St, RB 261B, Miami, FL 33199, United States.
| |
|
21
|
Sheikh MS, Thongprayoon C, Suppadungsuk S, Miao J, Qureshi F, Kashani K, Cheungpasitporn W. Evaluating ChatGPT's Accuracy in Responding to Patient Education Questions on Acute Kidney Injury and Continuous Renal Replacement Therapy. Blood Purif 2024; 53:725-731. [PMID: 38679000 DOI: 10.1159/000539065]
Abstract
INTRODUCTION Acute kidney injury (AKI) and continuous renal replacement therapy (CRRT) are critical areas in nephrology. The effectiveness of ChatGPT in simpler, patient education-oriented questions has not been thoroughly assessed. This study evaluates the proficiency of ChatGPT 4.0 in responding to such questions, subjected to various linguistic alterations. METHODS Eighty-nine questions were sourced from the Mayo Clinic Handbook for educating patients on AKI and CRRT. These questions were categorized as original, paraphrased with different interrogative adverbs, paraphrased resulting in incomplete sentences, and paraphrased containing misspelled words. Two nephrologists verified the questions for medical accuracy. A χ2 test was conducted to ascertain notable discrepancies in ChatGPT 4.0's performance across these formats. RESULTS ChatGPT showed notable accuracy in handling a variety of question formats for patient education in AKI and CRRT. Across all question types, ChatGPT demonstrated an accuracy of 97% for both original and adverb-altered questions and 98% for questions with incomplete sentences or misspellings. Specifically for AKI-related questions, the accuracy was consistently maintained at 97% for all versions. In the subset of CRRT-related questions, the tool achieved a 96% accuracy for original and adverb-altered questions, and this increased to 98% for questions with incomplete sentences or misspellings. The statistical analysis revealed no significant difference in performance across these varied question types (p value: 1.00 for AKI and 1.00 for CRRT), and there was no notable disparity between the artificial intelligence (AI)'s responses to AKI and CRRT questions (p value: 0.71). CONCLUSION ChatGPT 4.0 demonstrates consistent and high accuracy in interpreting and responding to queries related to AKI and CRRT, irrespective of linguistic modifications. These findings suggest that ChatGPT 4.0 has the potential to be a reliable support tool in the delivery of patient education, by accurately providing information across a range of question formats. Further research is needed to explore the direct impact of AI-generated responses on patient understanding and education outcomes.
Affiliation(s)
- Mohammad Salman Sheikh
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
| | - Charat Thongprayoon
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
| | - Supawadee Suppadungsuk
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Salaya, Thailand
| | - Jing Miao
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
| | - Fawad Qureshi
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
| | - Kianoush Kashani
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
| | - Wisit Cheungpasitporn
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
| |
|
22
|
Araújo R, Ramalhete L, Viegas A, Von Rekowski CP, Fonseca TAH, Calado CRC, Bento L. Simplifying Data Analysis in Biomedical Research: An Automated, User-Friendly Tool. Methods Protoc 2024; 7:36. [PMID: 38804330 PMCID: PMC11130801 DOI: 10.3390/mps7030036]
Abstract
Robust data normalization and analysis are pivotal in biomedical research to ensure that observed differences in populations are directly attributable to the target variable, rather than disparities between control and study groups. ArsHive addresses this challenge using advanced algorithms to normalize populations (e.g., control and study groups) and perform statistical evaluations between demographic, clinical, and other variables within biomedical datasets, resulting in more balanced and unbiased analyses. The tool's functionality extends to comprehensive data reporting, which elucidates the effects of data processing, while maintaining dataset integrity. Additionally, ArsHive is complemented by A.D.A. (Autonomous Digital Assistant), which employs OpenAI's GPT-4 model to assist researchers with inquiries, enhancing the decision-making process. In this proof-of-concept study, we tested ArsHive on three different datasets derived from proprietary data, demonstrating its effectiveness in managing complex clinical and therapeutic information and highlighting its versatility for diverse research fields.
Collapse
Affiliation(s)
- Rúben Araújo
- NMS—NOVA Medical School, FCM—Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Campo Mártires da Pátria 130, 1169-056 Lisbon, Portugal
- CHRC—Comprehensive Health Research Centre, Universidade NOVA de Lisboa, 1150-082 Lisbon, Portugal
- ISEL—Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, Rua Conselheiro Emídio Navarro 1, 1959-007 Lisbon, Portugal
| | - Luís Ramalhete
- NMS—NOVA Medical School, FCM—Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Campo Mártires da Pátria 130, 1169-056 Lisbon, Portugal
- Blood and Transplantation Center of Lisbon, IPST—Instituto Português do Sangue e da Transplantação, Alameda das Linhas de Torres 117, 1769-001 Lisbon, Portugal
- iNOVA4Health—Advancing Precision Medicine, RG11: Reno-Vascular Diseases Group, NMS—NOVA Medical School, FCM—Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, 1169-056 Lisbon, Portugal
| | - Ana Viegas
- CHRC—Comprehensive Health Research Centre, Universidade NOVA de Lisboa, 1150-082 Lisbon, Portugal
- ESTeSL—Escola Superior de Tecnologia da Saúde de Lisboa, Instituto Politécnico de Lisboa, Avenida D. João II, Lote 4.69.01, 1990-096 Lisbon, Portugal
- Neurosciences Area, Clinical Neurophysiology Unit, ULSSJ—Unidade Local de Saúde São José, Rua José António Serrano, 1150-199 Lisbon, Portugal
| | - Cristiana P. Von Rekowski
- NMS—NOVA Medical School, FCM—Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Campo Mártires da Pátria 130, 1169-056 Lisbon, Portugal
- CHRC—Comprehensive Health Research Centre, Universidade NOVA de Lisboa, 1150-082 Lisbon, Portugal
- ISEL—Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, Rua Conselheiro Emídio Navarro 1, 1959-007 Lisbon, Portugal
| | - Tiago A. H. Fonseca
- NMS—NOVA Medical School, FCM—Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Campo Mártires da Pátria 130, 1169-056 Lisbon, Portugal
- CHRC—Comprehensive Health Research Centre, Universidade NOVA de Lisboa, 1150-082 Lisbon, Portugal
- ISEL—Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, Rua Conselheiro Emídio Navarro 1, 1959-007 Lisbon, Portugal
| | - Cecília R. C. Calado
- ISEL—Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, Rua Conselheiro Emídio Navarro 1, 1959-007 Lisbon, Portugal
- Institute for Bioengineering and Biosciences (iBB), The Associate Laboratory Institute for Health and Bioeconomy–i4HB, Instituto Superior Técnico (IST), Universidade de Lisboa (UL), Av. Rovisco Pais, 1049-001 Lisboa, Portugal
| | - Luís Bento
- Intensive Care Department, ULSSJ—Unidade Local de Saúde São José, Rua José António Serrano, 1150-199 Lisbon, Portugal;
- Integrated Pathophysiological Mechanisms, CHRC—Comprehensive Health Research Centre, NMS—NOVA Medical School, FCM—Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Campo Mártires da Pátria 130, 1169-056 Lisbon, Portugal
| |
Collapse
|
23
|
Ostrowska M, Kacała P, Onolememen D, Vaughan-Lane K, Sisily Joseph A, Ostrowski A, Pietruszewska W, Banaszewski J, Wróbel MJ. To trust or not to trust: evaluating the reliability and safety of AI responses to laryngeal cancer queries. Eur Arch Otorhinolaryngol 2024:10.1007/s00405-024-08643-8. [PMID: 38652298 DOI: 10.1007/s00405-024-08643-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2024] [Accepted: 03/26/2024] [Indexed: 04/25/2024]
Abstract
PURPOSE As online health information-seeking surges, concerns mount over the quality and safety of accessible content, which can lead to patient harm through misinformation. On one hand, the emergence of Artificial Intelligence (AI) in healthcare could help prevent such harm; on the other hand, questions arise regarding the quality and safety of the medical information provided. As laryngeal cancer is a prevalent head and neck malignancy, this study aims to evaluate the utility and safety of three large language models (LLMs) as sources of patient information about laryngeal cancer. METHODS A cross-sectional study was conducted using three LLMs (ChatGPT 3.5, ChatGPT 4.0, and Bard). A questionnaire comprising 36 inquiries about laryngeal cancer was categorised into diagnosis (11 questions), treatment (9 questions), novelties and upcoming treatments (4 questions), controversies (8 questions), and sources of information (4 questions). Reviewers comprised three groups: ENT specialists, junior physicians, and non-medical raters, who graded the responses. Each physician evaluated each question twice for each model, while non-medical raters evaluated each question once. All reviewers were blinded to the model type, and the question order was shuffled. Outcome evaluations were based on a safety score (1-3) and a Global Quality Score (GQS, 1-5). Results were compared between LLMs. The study included iterative assessments and statistical validations. RESULTS Analysis revealed that ChatGPT 3.5 scored highest in both safety (mean: 2.70) and GQS (mean: 3.95). ChatGPT 4.0 and Bard had lower safety scores of 2.56 and 2.42, respectively, with corresponding quality scores of 3.65 and 3.38. Inter-rater reliability was consistent, with less than 3% discrepancy. About 4.2% of responses fell into the lowest safety category (1), particularly in the novelty category. Non-medical reviewers' quality assessments correlated moderately (r = 0.67) with response length.
CONCLUSIONS LLMs can be valuable resources for patients seeking information on laryngeal cancer. ChatGPT 3.5 provided the most reliable and safe responses among the models evaluated.
Collapse
Affiliation(s)
- Magdalena Ostrowska
- Department of Otolaryngology and Laryngological Oncology, Collegium Medicum, Nicolaus Copernicus University in Torun, ul. Marie Sklodowskiej-Curie 9, 85-094, Bydgoszcz, Poland
| | - Paulina Kacała
- ENT Scientific Club, Department of Otolaryngology and Laryngological Oncology, Collegium Medicum, Nicolaus Copernicus University in Torun, ul. Marie Sklodowskiej-Curie 9, 85-094, Bydgoszcz, Poland
| | - Deborah Onolememen
- ENT Scientific Club, Department of Otolaryngology and Laryngological Oncology, Collegium Medicum, Nicolaus Copernicus University in Torun, ul. Marie Sklodowskiej-Curie 9, 85-094, Bydgoszcz, Poland
| | - Katie Vaughan-Lane
- ENT Scientific Club, Department of Otolaryngology and Laryngological Oncology, Collegium Medicum, Nicolaus Copernicus University in Torun, ul. Marie Sklodowskiej-Curie 9, 85-094, Bydgoszcz, Poland
| | - Anitta Sisily Joseph
- ENT Scientific Club, Department of Otolaryngology and Laryngological Oncology, Collegium Medicum, Nicolaus Copernicus University in Torun, ul. Marie Sklodowskiej-Curie 9, 85-094, Bydgoszcz, Poland
| | - Adam Ostrowski
- Department of Urology, Collegium Medicum, Nicolaus Copernicus University in Torun, ul. Marie Sklodowskiej-Curie 9, 85-094, Bydgoszcz, Poland
| | - Wioletta Pietruszewska
- Department of Otolaryngology, Laryngological Oncology, Audiology and Phoniatrics, Medical University of Lodz, ul Żeromskiego 113, 90-549, Lodz, Poland
| | - Jacek Banaszewski
- Department of Otolaryngology, Head and Neck Oncology, Poznan University of Medical Science, ul Przybyszewskiego 49, 60-355, Poznań, Poland
| | - Maciej J Wróbel
- Department of Otolaryngology and Laryngological Oncology, Collegium Medicum, Nicolaus Copernicus University in Torun, ul. Marie Sklodowskiej-Curie 9, 85-094, Bydgoszcz, Poland
| |
Collapse
|
24
|
Mu Y, He D. The Potential Applications and Challenges of ChatGPT in the Medical Field. Int J Gen Med 2024; 17:817-826. [PMID: 38476626 PMCID: PMC10929156 DOI: 10.2147/ijgm.s456659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Accepted: 02/26/2024] [Indexed: 03/14/2024] Open
Abstract
ChatGPT, an AI-driven conversational large language model (LLM), has garnered significant scholarly attention since its inception, owing to its manifold applications in the realm of medical science. This study primarily examines the merits, limitations, anticipated developments, and practical applications of ChatGPT in clinical practice, healthcare, medical education, and medical research. It underscores the necessity for further research and development to enhance its performance and deployment. Moreover, future research avenues encompass ongoing enhancements and standardization of ChatGPT, mitigating its limitations, and exploring its integration and applicability in translational and personalized medicine. Reflecting the narrative nature of this review, a focused literature search was performed to identify relevant publications on ChatGPT's use in medicine. This process was aimed at gathering a broad spectrum of insights to provide a comprehensive overview of the current state and future prospects of ChatGPT in the medical domain. The objective is to aid healthcare professionals in understanding the groundbreaking advancements associated with the latest artificial intelligence tools, while also acknowledging the opportunities and challenges presented by ChatGPT.
Collapse
Affiliation(s)
- Yonglin Mu
- Department of Urology, Children’s Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
| | - Dawei He
- Department of Urology, Children’s Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
| |
Collapse
|
25
|
Denecke K, May R, Rivera-Romero O. Transformer Models in Healthcare: A Survey and Thematic Analysis of Potentials, Shortcomings and Risks. J Med Syst 2024; 48:23. [PMID: 38367119 PMCID: PMC10874304 DOI: 10.1007/s10916-024-02043-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2023] [Accepted: 02/10/2024] [Indexed: 02/19/2024]
Abstract
Large Language Models (LLMs) such as Generative Pre-trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT), which use transformer model architectures, have significantly advanced artificial intelligence and natural language processing. Recognized for their ability to capture associative relationships between words based on shared context, these models are poised to transform healthcare by improving diagnostic accuracy, tailoring treatment plans, and predicting patient outcomes. However, there are multiple risks and potentially unintended consequences associated with their use in healthcare applications. This study, conducted with 28 participants using a qualitative approach, explores the benefits, shortcomings, and risks of using transformer models in healthcare. It analyses responses to seven open-ended questions using a simplified thematic analysis. Our research reveals seven benefits, including improved operational efficiency, optimized processes and refined clinical documentation. Despite these benefits, there are significant concerns about the introduction of bias, auditability issues and privacy risks. Challenges include the need for specialized expertise, the emergence of ethical dilemmas and the potential reduction in the human element of patient care. For the medical profession, risks include the impact on employment, changes in the patient-doctor dynamic, and the need for extensive training in both system operation and data interpretation.
Collapse
Affiliation(s)
- Kerstin Denecke
- Institute Patient-centered Digital Health, Bern University of Applied Sciences, Quellgasse 21, Biel, 2502, Switzerland.
| | - Richard May
- Harz University of Applied Sciences, Friedrichstraße 57-59, 38855, Wernigerode, Germany
| | - Octavio Rivera-Romero
- Instituto de Ingeniería Informática (I3US), Universidad de Sevilla, Sevilla, Spain
- Department of Electronic Technology, Universidad de Sevilla, Avda Reina Mercedes s/n, ETSI Informática, G1.43, Sevilla, 41012, Spain
| |
Collapse
|
26
|
Maki S, Furuya T, Inoue M, Shiga Y, Inage K, Eguchi Y, Orita S, Ohtori S. Machine Learning and Deep Learning in Spinal Injury: A Narrative Review of Algorithms in Diagnosis and Prognosis. J Clin Med 2024; 13:705. [PMID: 38337399 PMCID: PMC10856760 DOI: 10.3390/jcm13030705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2023] [Revised: 01/14/2024] [Accepted: 01/18/2024] [Indexed: 02/12/2024] Open
Abstract
Spinal injuries, including cervical and thoracolumbar fractures, continue to be a major public health concern. Recent advancements in machine learning and deep learning technologies offer exciting prospects for improving both diagnostic and prognostic approaches in spinal injury care. This narrative review systematically explores the practical utility of these computational methods, with a focus on their application in imaging techniques such as computed tomography (CT) and magnetic resonance imaging (MRI), as well as in structured clinical data. Of the 39 studies included, 34 were focused on diagnostic applications, chiefly using deep learning to carry out tasks like vertebral fracture identification, differentiation between benign and malignant fractures, and AO fracture classification. The remaining five were prognostic, using machine learning to analyze parameters for predicting outcomes such as vertebral collapse and future fracture risk. This review highlights the potential benefit of machine learning and deep learning in spinal injury care, especially their roles in enhancing diagnostic capabilities, detailed fracture characterization, risk assessments, and individualized treatment planning.
Collapse
Affiliation(s)
- Satoshi Maki
- Department of Orthopaedic Surgery, Graduate School of Medicine, Chiba University, Chiba 260-8670, Japan
- Center for Frontier Medical Engineering, Chiba University, Chiba 263-8522, Japan
| | - Takeo Furuya
- Department of Orthopaedic Surgery, Graduate School of Medicine, Chiba University, Chiba 260-8670, Japan
| | - Masahiro Inoue
- Department of Orthopaedic Surgery, Graduate School of Medicine, Chiba University, Chiba 260-8670, Japan
| | - Yasuhiro Shiga
- Department of Orthopaedic Surgery, Graduate School of Medicine, Chiba University, Chiba 260-8670, Japan
| | - Kazuhide Inage
- Department of Orthopaedic Surgery, Graduate School of Medicine, Chiba University, Chiba 260-8670, Japan
| | - Yawara Eguchi
- Department of Orthopaedic Surgery, Graduate School of Medicine, Chiba University, Chiba 260-8670, Japan
| | - Sumihisa Orita
- Department of Orthopaedic Surgery, Graduate School of Medicine, Chiba University, Chiba 260-8670, Japan
- Center for Frontier Medical Engineering, Chiba University, Chiba 263-8522, Japan
| | - Seiji Ohtori
- Department of Orthopaedic Surgery, Graduate School of Medicine, Chiba University, Chiba 260-8670, Japan
| |
Collapse
|
27
|
Larson HJ, Lin L. Generative artificial intelligence can have a role in combating vaccine hesitancy. BMJ 2024; 384:q69. [PMID: 38228351 PMCID: PMC10789191 DOI: 10.1136/bmj.q69] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/18/2024]
Affiliation(s)
- Heidi J Larson
- London School of Hygiene and Tropical Medicine, London, UK
- Institute for Health Metrics and Evaluation, University of Washington, Seattle, USA
| | - Leesa Lin
- London School of Hygiene and Tropical Medicine, London, UK
- Laboratory of Data Discovery for Health, Science Park, Hong Kong
| |
Collapse
|
28
|
Shaban-Nejad A, Michalowski M, Bianco S. Creative and generative artificial intelligence for personalized medicine and healthcare: Hype, reality, or hyperreality? Exp Biol Med (Maywood) 2023; 248:2497-2499. [PMID: 38311873 PMCID: PMC10854468 DOI: 10.1177/15353702241226801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2024] Open
Affiliation(s)
- Arash Shaban-Nejad
- UTHSC-ORNL Center for Biomedical Informatics and Department of Pediatrics, College of Medicine, The University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Martin Michalowski
- School of Nursing, University of Minnesota—Twin Cities, Minneapolis, MN 55455, USA
| | - Simone Bianco
- Altos Labs Bay Area Institute of Science, Redwood City, CA 94065, USA
| |
Collapse
|