1
Farrow L, Zhong M, Anderson L. Use of natural language processing techniques to predict patient selection for total hip and knee arthroplasty from radiology reports. Bone Joint J 2024; 106-B:688-695. [PMID: 38945535 DOI: 10.1302/0301-620x.106b7.bjj-2024-0136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Aims: To examine whether natural language processing (NLP) using a clinically based large language model (LLM) could be used to predict patient selection for total hip or total knee arthroplasty (THA/TKA) from routinely available free-text radiology reports.
Methods: Data pre-processing and analyses were conducted according to the Artificial intelligence to Revolutionize the patient Care pathway in Hip and knEe aRthroplastY (ARCHERY) project protocol. This included use of de-identified Scottish regional clinical data of patients referred for consideration of THA/TKA, held in a secure data environment designed for artificial intelligence (AI) inference. Only preoperative radiology reports were included. NLP algorithms were based on the freely available GatorTron model, an LLM trained on over 82 billion words of de-identified clinical text. Two inference tasks were performed: assessment after model fine-tuning (50 epochs and three cycles of k-fold cross validation), and external validation.
Results: For THA, there were 5,558 patient radiology reports included, of which 4,137 were used for model training and testing, and 1,421 for external validation. Following training, model performance demonstrated average (mean across three folds) accuracy, F1 score, and area under the receiver operating characteristic curve (AUROC) values of 0.850 (95% confidence interval (CI) 0.833 to 0.867), 0.813 (95% CI 0.785 to 0.841), and 0.847 (95% CI 0.822 to 0.872), respectively. For TKA, 7,457 patient radiology reports were included, with 3,478 used for model training and testing, and 3,152 for external validation. Performance metrics included accuracy, F1 score, and AUROC values of 0.757 (95% CI 0.702 to 0.811), 0.543 (95% CI 0.479 to 0.607), and 0.717 (95% CI 0.657 to 0.778), respectively. There was a notable deterioration in performance on external validation in both cohorts.
Conclusion: The use of routinely available preoperative radiology reports provides promising potential to help screen suitable candidates for THA, but not for TKA. The external validation results demonstrate the importance of further model testing and training when confronted with new clinical cohorts.
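As a rough illustration of the three metrics named in the abstract above, the following Python sketch computes accuracy, F1 score, and AUROC for each fold and then averages them across three folds. It is not the authors' pipeline (which fine-tuned GatorTron); the function names, the 0.5 decision threshold, and the tiny label/score lists are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fold_metrics(labels, scores, threshold=0.5):
    """Return (accuracy, F1, AUROC) for one fold; threshold is an assumption."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return accuracy(tp, fp, tn, fn), f1_score(tp, fp, fn), auroc(labels, scores)


# Hypothetical toy folds: (true labels, predicted probabilities).
folds = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]),
    ([1, 0, 1, 0], [0.8, 0.2, 0.7, 0.3]),
    ([1, 1, 0, 0], [0.6, 0.7, 0.4, 0.8]),
]

per_fold = [fold_metrics(labels, scores) for labels, scores in folds]
mean_accuracy = mean(m[0] for m in per_fold)
mean_f1 = mean(m[1] for m in per_fold)
mean_auroc = mean(m[2] for m in per_fold)
print(mean_accuracy, mean_f1, mean_auroc)
```

Averaging per-fold values, as done here, matches the "mean across three folds" wording; the confidence intervals reported in the abstract would need an additional step that this sketch does not attempt.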
Affiliation(s)
- Luke Farrow
  - Grampian Orthopaedics, Aberdeen Royal Infirmary, Aberdeen, UK
  - Institute of Applied Health Sciences, University of Aberdeen, Aberdeen, UK
- Mingjun Zhong
  - Institute of Applied Health Sciences, University of Aberdeen, Aberdeen, UK
- Lesley Anderson
  - Institute of Applied Health Sciences, University of Aberdeen, Aberdeen, UK
2
AlShehri Y, Sidhu A, Lakshmanan LVS, Lefaivre KA. Applications of Natural Language Processing for Automated Clinical Data Analysis in Orthopaedics. J Am Acad Orthop Surg 2024; 32:439-446. [PMID: 38626429 DOI: 10.5435/jaaos-d-23-00839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 02/20/2024] [Indexed: 04/18/2024]
Abstract
Natural language processing is an exciting and emerging field in health care that can transform the field of orthopaedics. It can aid in the process of automated clinical data analysis, changing the way we extract data for various purposes including research and registry formation, diagnosis, and medical billing. This scoping review will look at the various applications of NLP in orthopaedics. Specific examples of NLP applications include identification of essential data elements from surgical and imaging reports, patient feedback analysis, and use of AI conversational agents for patient engagement. We will demonstrate how NLP has proven itself to be a powerful and valuable tool. Despite these potential advantages, there are drawbacks we must consider. Concerns with data quality, bias, privacy, and accessibility may stand as barriers in the way of widespread implementation of NLP technology. As natural language processing technology continues to develop, it has the potential to revolutionize orthopaedic research and clinical practices and enhance patient outcomes.
Affiliation(s)
- Yasir AlShehri
  - From the Department of Orthopedics, College of Medicine, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia (AlShehri), the Department of Orthopaedics, Faculty of Medicine, The University of British Columbia, Vancouver, BC, Canada (Sidhu and Lefaivre), and the Department of Computer Science, The University of British Columbia, Vancouver, BC, Canada (Lakshmanan)
3
Huffman N, Pasqualini I, Khan ST, Klika AK, Deren ME, Jin Y, Kunze KN, Piuzzi NS. Enabling Personalized Medicine in Orthopaedic Surgery Through Artificial Intelligence: A Critical Analysis Review. JBJS Rev 2024; 12:01874474-202403000-00006. [PMID: 38466797 DOI: 10.2106/jbjs.rvw.23.00232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
» The application of artificial intelligence (AI) in the field of orthopaedic surgery holds potential for revolutionizing health care delivery across 3 crucial domains: (I) personalized prediction of clinical outcomes and adverse events, which may optimize patient selection, surgical planning, and enhance patient safety and outcomes; (II) diagnostic automated and semiautomated imaging analyses, which may reduce time burden and facilitate precise and timely diagnoses; and (III) forecasting of resource utilization, which may reduce health care costs and increase value for patients and institutions.
» Computer vision is one of the most highly studied areas of AI within orthopaedics, with applications pertaining to fracture classification, identification of the manufacturer and model of prosthetic implants, and surveillance of prosthesis loosening and failure.
» Prognostic applications of AI within orthopaedics include identifying patients who will likely benefit from a specified treatment, predicting prosthetic implant size, postoperative length of stay, discharge disposition, and surgical complications. Not only may these applications be beneficial to patients but also to institutions and payors because they may inform potential cost expenditure, improve overall hospital efficiency, and help anticipate resource utilization.
» AI infrastructure development requires institutional financial commitment and a team of clinicians and data scientists with expertise in AI that can complement skill sets and knowledge. Once a team is established and a goal is determined, teams (1) obtain, curate, and label data; (2) establish a reference standard; (3) develop an AI model; (4) evaluate the performance of the AI model; (5) externally validate the model, and (6) reinforce, improve, and evaluate the model's performance until clinical implementation is possible.
» Understanding the implications of AI in orthopaedics may eventually lead to wide-ranging improvements in patient care. However, AI, while holding tremendous promise, is not without methodological and ethical limitations that are essential to address. First, it is important to ensure external validity of programs before their use in a clinical setting. Investigators should maintain high quality data records and registry surveillance, exercise caution when evaluating others' reported AI applications, and increase transparency of the methodological conduct of current models to improve external validity and avoid propagating bias. By addressing these challenges and responsibly embracing the potential of AI, the medical field may eventually be able to harness its power to improve patient care and outcomes.
Affiliation(s)
- Nickelas Huffman
  - Cleveland Clinic, Department of Orthopaedic Surgery, Cleveland, Ohio
- Shujaa T Khan
  - Cleveland Clinic, Department of Orthopaedic Surgery, Cleveland, Ohio
- Alison K Klika
  - Cleveland Clinic, Department of Orthopaedic Surgery, Cleveland, Ohio
- Matthew E Deren
  - Cleveland Clinic, Department of Orthopaedic Surgery, Cleveland, Ohio
- Yuxuan Jin
  - Cleveland Clinic, Department of Orthopaedic Surgery, Cleveland, Ohio
- Kyle N Kunze
  - Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York
- Nicolas S Piuzzi
  - Cleveland Clinic, Department of Orthopaedic Surgery, Cleveland, Ohio
  - Department of Biomedical Engineering, Cleveland Clinic Foundation, Cleveland, Ohio
4
Warren E, Hurley ET, Park CN, Crook BS, Lorentz S, Levin JM, Anakwenze O, MacDonald PB, Klifto CS. Evaluation of information from artificial intelligence on rotator cuff repair surgery. JSES Int 2024; 8:53-57. [PMID: 38312282 PMCID: PMC10837709 DOI: 10.1016/j.jseint.2023.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2024]
Abstract
Purpose: The purpose of this study was to analyze the quality and readability of information regarding rotator cuff repair surgery available using online AI software.
Methods: An open AI model (ChatGPT) was used to answer 24 commonly asked questions from patients on rotator cuff repair. Questions were stratified into one of three categories based on the Rothwell classification system: fact, policy, or value. The answers for each category were evaluated for reliability, quality, and readability using The Journal of the American Medical Association Benchmark criteria, DISCERN score, Flesch-Kincaid Reading Ease Score, and Grade Level.
Results: The Journal of the American Medical Association Benchmark criteria score for all three categories was 0, which is the lowest score, indicating no reliable resources cited. The DISCERN score was 51 for fact, 53 for policy, and 55 for value questions, all of which are considered good scores. Across question categories, the reliability portion of the DISCERN score was low, due to a lack of resources. The Flesch-Kincaid Reading Ease Score (and Flesch-Kincaid Grade Level) was 48.3 (10.3) for the fact class, 42.0 (10.9) for the policy class, and 38.4 (11.6) for the value class.
Conclusion: The quality of information provided by the open AI chat system was generally high across all question types but had significant shortcomings in reliability due to the absence of source material citations. The DISCERN scores of the AI-generated responses matched or exceeded previously published results of studies evaluating the quality of online information about rotator cuff repairs. The responses were at a U.S. 10th grade or higher reading level, which is above the AMA and NIH recommendation of a 6th grade reading level for patient materials. The AI software commonly referred the user to seek advice from orthopedic surgeons to improve their chances of a successful outcome.
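For readers unfamiliar with the two readability measures reported above, the sketch below applies the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas to pre-computed word, sentence, and syllable counts. It is an illustration only, not the tooling used in the study; the function names and the example counts are hypothetical, and syllable counting itself is left out because it depends on the tool chosen.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate U.S. school grade."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage (not taken from the study).
ease = flesch_reading_ease(words=100, sentences=5, syllables=150)
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(ease, 1), round(grade, 1))
```

Both formulas use the same two ratios (words per sentence and syllables per word), which is why a lower Reading Ease score in the abstract goes together with a higher Grade Level.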
Affiliation(s)
- Eric Warren
  - Duke University School of Medicine, Duke University, Durham, NC, USA
- Eoghan T. Hurley
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Caroline N. Park
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Bryan S. Crook
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Samuel Lorentz
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Jay M. Levin
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Oke Anakwenze
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Peter B. MacDonald
  - Section of Orthopaedic Surgery & The Pan Am Clinic, University of Manitoba, Winnipeg, MB, Canada
5
Brameier DT, Alnasser AA, Carnino JM, Bhashyam AR, von Keudell AG, Weaver MJ. Artificial Intelligence in Orthopaedic Surgery: Can a Large Language Model "Write" a Believable Orthopaedic Journal Article? J Bone Joint Surg Am 2023; 105:1388-1392. [PMID: 37437021 DOI: 10.2106/jbjs.23.00473] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 07/14/2023]
Abstract
➢ Natural language processing with large language models is a subdivision of artificial intelligence (AI) that extracts meaning from text with use of linguistic rules, statistics, and machine learning to generate appropriate text responses. Its utilization in medicine and in the field of orthopaedic surgery is rapidly growing.
➢ Large language models can be utilized in generating scientific manuscript texts of a publishable quality; however, they suffer from AI hallucinations, in which untruths or half-truths are stated with misleading confidence. Their use raises considerable concerns regarding the potential for research misconduct and for hallucinations to insert misinformation into the clinical literature.
➢ Current editorial processes are insufficient for identifying the involvement of large language models in manuscripts. Academic publishing must adapt to encourage safe use of these tools by establishing clear guidelines for their use, which should be adopted across the orthopaedic literature, and by implementing additional steps in the editorial screening process to identify the use of these tools in submitted manuscripts.
Affiliation(s)
- Devon T Brameier
  - Department of Orthopaedic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Ahmad A Alnasser
  - Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- Jonathan M Carnino
  - Boston University Chobanian & Avedisian School of Medicine, Boston, Massachusetts
- Abhiram R Bhashyam
  - Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- Arvind G von Keudell
  - Department of Orthopaedic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
  - Bispebjerg Hospital, University of Copenhagen, Copenhagen, Denmark
- Michael J Weaver
  - Department of Orthopaedic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
6
Affiliation(s)
- Andrew S Bi
- NYU Langone Orthopedic Hospital, New York, NY
7
Swiontkowski MF, Callaghan JJ, Lewallen DG, Berry DJ. Large Database and Registry Research in Joint Arthroplasty and Orthopaedics. J Bone Joint Surg Am 2022; 104:1-3. [PMID: 36260035 DOI: 10.2106/jbjs.22.00932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]