Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kunze KN, Jang SJ, Li TY, Pareek A, Finocchiaro A, Fu MC, Taylor SA, Dines JS, Dines DM, Warren RF, Gulotta LV. Artificial intelligence for automated identification of total shoulder arthroplasty implants. J Shoulder Elbow Surg 2023;32:2115-2122. [PMID: 37172888 DOI: 10.1016/j.jse.2023.03.028] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/05/2022] [Revised: 03/03/2023] [Accepted: 03/22/2023] [Indexed: 05/15/2023]

Cited by Other Article(s)

Fiedler B, Azua EN, Phillips T, Ahmed AS. ChatGPT performance on the American Shoulder and Elbow Surgeons maintenance of certification exam. J Shoulder Elbow Surg 2024;33:1888-1893. [PMID: 38580067 DOI: 10.1016/j.jse.2024.02.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Revised: 01/24/2024] [Accepted: 02/12/2024] [Indexed: 04/07/2024]

Abstract

BACKGROUND

While multiple studies have tested the ability of large language models (LLMs), such as ChatGPT, to pass standardized medical exams at different levels of training, LLMs have never been tested on surgical sub-specialty examinations, such as the American Shoulder and Elbow Surgeons (ASES) Maintenance of Certification (MOC). The purpose of this study was to compare results of ChatGPT 3.5, GPT-4, and fellowship-trained surgeons on the 2023 ASES MOC self-assessment exam.

METHODS

ChatGPT 3.5 and GPT-4 were given the same set of text-only questions from the ASES MOC exam, and GPT-4 was additionally given image-based MOC exam questions. Question responses from both models were compared against the correct answers. Performance of both models was compared to the corresponding average human performance on the same question subsets. One-sided proportion z-tests were used to analyze the data.
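As an illustration of the one-sided proportion z-test named above, the following is a minimal sketch that assumes a pooled two-proportion form; the abstract does not state which variant the authors used. The function name and the counts are hypothetical and are not taken from the study.

```python
import math


def one_sided_two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test of H1: proportion A > proportion B.

    Returns the z statistic and the upper-tail p-value.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical counts for illustration only (not the study's data).
z_stat, p_val = one_sided_two_proportion_z(80, 100, 60, 100)
print(f"z = {z_stat:.3f}, one-sided p = {p_val:.4f}")
```

A one-sided test only asks whether group A scores higher than group B, so it reports the upper-tail probability rather than doubling it as a two-sided test would.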

RESULTS

Humans performed significantly better than ChatGPT 3.5 on exclusively text-based questions (76.4% vs. 60.8%, P = .044). Humans also performed significantly better than GPT-4 on image-based questions (73.9% vs. 53.2%, P = .019). There was no significant difference between humans and GPT-4 on text-based questions (76.4% vs. 66.7%, P = .136). Accounting for all questions, humans significantly outperformed GPT-4 (75.3% vs. 60.2%, P = .012). GPT-4 did not perform significantly better than ChatGPT 3.5 on text-only questions (66.7% vs. 60.8%, P = .268).

DISCUSSION

Although human performance was overall superior, ChatGPT demonstrated the capacity to analyze orthopedic information and answer specialty-specific questions on the ASES MOC exam for both text and image-based questions. With continued advancements in deep learning, LLMs may someday rival exam performance of fellowship-trained surgeons.

Sassi M, Villa Corta M, Pisani MG, Nicodemi G, Schena E, Pecchia L, Longo UG. Advanced Home-Based Shoulder Rehabilitation: A Systematic Review of Remote Monitoring Devices and Their Therapeutic Efficacy. Sensors (Basel, Switzerland) 2024;24:2936. [PMID: 38733040 PMCID: PMC11086333 DOI: 10.3390/s24092936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 04/30/2024] [Accepted: 05/02/2024] [Indexed: 05/13/2024]

Yang L, Oeding JF, de Marinis R, Marigi E, Sanchez-Sotelo J. Deep learning to automatically classify very large sets of preoperative and postoperative shoulder arthroplasty radiographs. J Shoulder Elbow Surg 2024;33:773-780. [PMID: 37879598 DOI: 10.1016/j.jse.2023.09.021] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/15/2023] [Revised: 09/06/2023] [Accepted: 09/10/2023] [Indexed: 10/27/2023]

Abstract

BACKGROUND

Joint arthroplasty registries usually lack information on medical imaging owing to the laborious process of observing and recording, as well as the lack of standard methods to transfer the imaging information to the registries, which can limit the investigation of various research questions. Artificial intelligence (AI) algorithms can automate imaging-feature identification with high accuracy and efficiency. With the purpose of enriching shoulder arthroplasty registries with organized imaging information, it was hypothesized that an automated AI algorithm could be developed to classify and organize preoperative and postoperative radiographs from shoulder arthroplasty patients according to laterality, radiographic projection, and implant type.

METHODS

This study used a cohort of 2303 shoulder radiographs from 1724 shoulder arthroplasty patients. Two observers manually labeled all radiographs according to (1) laterality (left or right), (2) projection (anteroposterior, axillary, or lateral), and (3) whether the radiograph was a preoperative radiograph or showed an anatomic total shoulder arthroplasty or a reverse shoulder arthroplasty. All these labeled radiographs were randomly split into developmental and testing sets at the patient level and based on stratification. By use of 10-fold cross-validation, a 3-task deep-learning algorithm was trained on the developmental set to classify the 3 aforementioned characteristics. The trained algorithm was then evaluated on the testing set using quantitative metrics and visual evaluation techniques.
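The patient-level split and 10-fold cross-validation described above can be sketched as follows. This is an illustrative outline only: the function names and the toy records are hypothetical, stratification is omitted for brevity, and the abstract does not describe the authors' actual implementation.

```python
import random


def patient_level_split(records, test_fraction=0.2, seed=0):
    """Split (patient_id, image_id) records so no patient appears in both sets."""
    patients = sorted({patient for patient, _ in records})
    random.Random(seed).shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])
    development = [r for r in records if r[0] not in test_patients]
    testing = [r for r in records if r[0] in test_patients]
    return development, testing


def patient_k_folds(records, k=10, seed=0):
    """Yield (train, validation) pairs, holding out one group of patients per fold."""
    patients = sorted({patient for patient, _ in records})
    random.Random(seed).shuffle(patients)
    # Deal the shuffled patients into k groups, round-robin.
    groups = [set(patients[i::k]) for i in range(k)]
    for held_out in groups:
        train = [r for r in records if r[0] not in held_out]
        validation = [r for r in records if r[0] in held_out]
        yield train, validation


# Toy data: 20 hypothetical patients, some with two radiographs each.
records = [(f"P{n:02d}", f"img{n}_{j}") for n in range(20) for j in range(1 + n % 2)]
development, testing = patient_level_split(records)
for fold_index, (train, validation) in enumerate(patient_k_folds(development, k=10)):
    print(fold_index, len(train), len(validation))
```

Splitting by patient rather than by image keeps multiple radiographs of the same shoulder from appearing on both sides of a split, which would otherwise inflate the measured performance.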

RESULTS

The trained algorithm perfectly classified laterality (F1 scores [harmonic mean values of precision and sensitivity] of 100% on the testing set). When classifying the imaging projection, the algorithm achieved F1 scores of 99.2%, 100%, and 100% on anteroposterior, axillary, and lateral views, respectively. When classifying the implant type, the model achieved F1 scores of 100%, 95.2%, and 100% on preoperative radiographs, anatomic total shoulder arthroplasty radiographs, and reverse shoulder arthroplasty radiographs, respectively. Visual evaluation using integrated maps showed that the algorithm focused on the relevant patient body and prosthesis parts for classification. It took the algorithm 20.3 seconds to analyze 502 images.
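For readers unfamiliar with the metric, the F1 score mentioned above (the harmonic mean of precision and sensitivity) can be computed from confusion-matrix counts as in this short sketch. The function name and the example counts are hypothetical and are not the study's data.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and sensitivity (recall) for one class."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive
    sensitivity = true_positives / actual_positive
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# Hypothetical counts for illustration only.
print(f1_score(true_positives=8, false_positives=2, false_negatives=2))
```

An F1 score of 100% for a class therefore requires both zero false positives and zero false negatives for that class.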

CONCLUSIONS

We developed an efficient, accurate, and reliable AI algorithm to automatically identify key imaging features of laterality, imaging view, and implant type in shoulder radiographs. This algorithm represents the first step to automatically classify and organize shoulder radiographs on a large scale in very little time, which will profoundly enrich shoulder arthroplasty registries.

Oeding JF, Yang L, Sanchez-Sotelo J, Camp CL, Karlsson J, Samuelsson K, Pearle AD, Ranawat AS, Kelly BT, Pareek A. A practical guide to the development and deployment of deep learning models for the orthopaedic surgeon: Part III, focus on registry creation, diagnosis, and data privacy. Knee Surg Sports Traumatol Arthrosc 2024;32:518-528. [PMID: 38426614 DOI: 10.1002/ksa.12085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 01/22/2024] [Accepted: 01/23/2024] [Indexed: 03/02/2024]

Abstract

Deep learning is a subset of artificial intelligence (AI) with enormous potential to transform orthopaedic surgery. As has already become evident with the deployment of Large Language Models (LLMs) like ChatGPT (OpenAI Inc.), deep learning can rapidly enter clinical and surgical practices. As such, it is imperative that orthopaedic surgeons acquire a deeper understanding of the technical terminology, capabilities and limitations associated with deep learning models. The focus of this series thus far has been providing surgeons with an overview of the steps needed to implement a deep learning-based pipeline, emphasizing some of the important technical details for surgeons to understand as they encounter, evaluate or lead deep learning projects. However, this series would be remiss without providing practical examples of how deep learning models have begun to be deployed and highlighting the areas where the authors feel deep learning may have the most profound potential. While computer vision applications of deep learning were the focus of Parts I and II, due to the enormous impact that natural language processing (NLP) has had in recent months, NLP-based deep learning models are also discussed in this final part of the series. In this review, three applications that the authors believe can be impacted the most by deep learning but with which many surgeons may not be familiar are discussed: (1) registry construction, (2) diagnostic AI and (3) data privacy. Deep learning-based registry construction will be essential for the development of more impactful clinical applications, with diagnostic AI being one of those applications likely to augment clinical decision-making in the near future. As the applications of deep learning continue to grow, the protection of patient information will become increasingly essential; as such, applications of deep learning to enhance data privacy are likely to become more important than ever before. Level of Evidence: Level IV.

Oeding JF, Krych AJ, Pearle AD, Kelly BT, Kunze KN. Medical Imaging Applications Developed Using Artificial Intelligence Demonstrate High Internal Validity Yet Are Limited in Scope and Lack External Validation. Arthroscopy 2024:S0749-8063(24)00099-9. [PMID: 38325497 DOI: 10.1016/j.arthro.2024.01.043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Revised: 01/21/2024] [Accepted: 01/29/2024] [Indexed: 02/09/2024]

Abstract

PURPOSE

To (1) review definitions and concepts necessary to interpret applications of deep learning (DL; a domain of artificial intelligence that leverages neural networks to make predictions on media inputs such as images) and (2) identify knowledge and translational gaps in the literature to provide insight into specific areas for improvement as adoption of this technology continues.

METHODS

A comprehensive search of the literature was performed in December 2023 for articles regarding the use of DL in sports medicine. For each study, information regarding the joint of focus, specific anatomic structure/pathology to which DL was applied, imaging modality utilized, source of images used for model training and testing, data set size, model performance, and whether the DL model was externally validated was recorded. A numerical scale was used to rate each DL model's clinical impact, with 1 corresponding to proof-of-concept studies with little to no direct clinical impact and 5 corresponding to practice-changing clinical impact and ready for clinical deployment.

RESULTS

Fifty-five studies were identified, all of which were published within the past 5 years, while 82% were published within the past 3 years. Of the DL models identified, 84% were developed for classification tasks, 9% for automated measurements, and 7% for segmentation. A total of 62% of studies utilized magnetic resonance imaging as the imaging modality, 25% radiographs, and 7% ultrasound, while 1 study each used computed tomography, arthroscopic images, or arthroscopic video. Sixty-five percent of studies focused on the detection of tears (anterior cruciate ligament [ACL], rotator cuff [RC], and meniscus). Diagnostic performance, as determined by the area under the receiver operating characteristic curve (AUROC), ranged from 0.81 to 0.99 for ACL tears (excellent to near perfect), from 0.83 to 0.94 for RC tears (excellent), and from 0.75 to 0.96 for meniscus tears (acceptable to excellent). In addition, 3 studies focused on detection of cartilage lesions had AUROC values ranging from 0.90 to 0.92 (excellent performance). However, only 4 (7%) studies externally validated their models, suggesting that the remaining models may not be generalizable or may not perform well when applied to populations other than the one used to develop the model. Finally, the mean clinical impact score was 2 (range, 1-3) on a scale of 1 to 5, corresponding to limited clinical applicability.
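The AUROC values reported above can be read as the probability that a model scores a randomly chosen positive case higher than a randomly chosen negative case. The sketch below computes it directly from that definition; the function name, labels, and scores are hypothetical and are not drawn from any of the reviewed studies.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 ground-truth values; scores: model outputs.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores for illustration only.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a high AUROC on internal test data still says nothing about performance on an external population.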

CONCLUSIONS

DL models in orthopaedic sports medicine show generally excellent performance (high internal validity) but require external validation to facilitate clinical deployment. In addition, current models have low clinical applicability and fail to advance the field due to a focus on routine tasks and a narrow conceptual framework.

LEVEL OF EVIDENCE

Level IV, scoping review of Level I to IV studies.

Bi AS, Kunze KN, Jazrawi LM. Editorial Commentary: Artificial Intelligence Models Show Impressive Results for Musculoskeletal Pathology Detection. Arthroscopy 2024;40:579-580. [PMID: 38296452 DOI: 10.1016/j.arthro.2023.07.042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 07/26/2023] [Indexed: 02/08/2024]

Kunze KN, Williams RJ, Ranawat AS, Pearle AD, Kelly BT, Karlsson J, Martin RK, Pareek A. Artificial intelligence (AI) and large data registries: Understanding the advantages and limitations of contemporary data sets for use in AI research. Knee Surg Sports Traumatol Arthrosc 2024;32:13-18. [PMID: 38226678 DOI: 10.1002/ksa.12018] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/09/2023] [Accepted: 11/27/2023] [Indexed: 01/17/2024]