For: Shah RF, Bini S, Vail T. Data for registry and quality review can be retrospectively collected using natural language processing from unstructured charts of arthroplasty patients. Bone Joint J 2020;102-B:99-104. [DOI: 10.1302/0301-620x.102b7.bjj-2019-1574.r1] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/05/2022]

Cited by Other Article(s)

Lam BD, Chrysafi P, Chiasakul T, Khosla H, Karagkouni D, McNichol M, Adamski A, Reyes N, Abe K, Mantha S, Vlachos IS, Zwicker JI, Patell R. Machine learning natural language processing for identifying venous thromboembolism: systematic review and meta-analysis. Blood Adv 2024;8:2991-3000. [PMID: 38522096 PMCID: PMC11215191 DOI: 10.1182/bloodadvances.2023012200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 02/22/2024] [Accepted: 02/22/2024] [Indexed: 03/26/2024]

Abstract

Venous thromboembolism (VTE) is a leading cause of preventable in-hospital mortality. Monitoring VTE cases is limited by the challenges of manual medical record review and diagnosis code interpretation. Natural language processing (NLP) can automate the process. Rule-based NLP methods are effective but time consuming. Machine learning (ML)-NLP methods present a promising solution. We conducted a systematic review and meta-analysis of studies published before May 2023 that use ML-NLP to identify VTE diagnoses in the electronic health records. Four reviewers screened all manuscripts, excluding studies that only used a rule-based method. A meta-analysis evaluated the pooled performance of each study's best performing model that evaluated for pulmonary embolism and/or deep vein thrombosis. Pooled sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) with confidence interval (CI) were calculated by DerSimonian and Laird method using a random-effects model. Study quality was assessed using an adapted TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) tool. Thirteen studies were included in the systematic review and 8 had data available for meta-analysis. Pooled sensitivity was 0.931 (95% CI, 0.881-0.962), specificity 0.984 (95% CI, 0.967-0.992), PPV 0.910 (95% CI, 0.865-0.941) and NPV 0.985 (95% CI, 0.977-0.990). All studies met at least 13 of the 21 NLP-modified TRIPOD items, demonstrating fair quality. The highest performing models used vectorization rather than bag-of-words and deep-learning techniques such as convolutional neural networks. There was significant heterogeneity in the studies, and only 4 validated their model on an external data set. Further standardization of ML studies can help progress this novel technology toward real-world implementation.
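
The DerSimonian and Laird random-effects pooling named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the logit transform, the 0.5 continuity correction, the function name `dersimonian_laird`, and the per-study counts in the usage lines are all assumptions introduced here.

```python
import math


def dersimonian_laird(events, totals, z=1.96):
    """Pool per-study proportions (e.g. sensitivities) with a random-effects model.

    Assumptions of this sketch: proportions are pooled on the logit scale and
    0.5 is added to each cell so that studies with 0% or 100% stay finite.
    Returns (pooled, ci_low, ci_high, tau2).
    """
    effects, variances = [], []
    for e, n in zip(events, totals):
        a = e + 0.5          # corrected count with the outcome
        b = n - e + 0.5      # corrected count without the outcome
        effects.append(math.log(a / b))      # logit of the proportion
        variances.append(1 / a + 1 / b)      # approximate variance of the logit

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # DerSimonian-Laird moment estimator of between-study variance tau^2.
    k = len(effects)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - z * se), inv_logit(mu + z * se), tau2


# Hypothetical true-positive counts and diseased totals for four studies.
pooled, low, high, tau2 = dersimonian_laird([45, 88, 130, 60], [50, 92, 150, 61])
print(f"pooled sensitivity {pooled:.3f} (95% CI {low:.3f}-{high:.3f}), tau^2 {tau2:.3f}")
```

A larger `tau2` widens the interval, which is how heterogeneity between studies, as reported in this review, shows up in the pooled estimate.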

Affiliation(s)

Barbara D. Lam: Division of Hematology, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA; Division of Clinical Informatics, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA
Pavlina Chrysafi: Department of Medicine, Mount Auburn Hospital, Harvard Medical School, Boston, MA
Thita Chiasakul: Center of Excellence in Translational Hematology, Division of Hematology, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
Harshit Khosla: Department of Medicine, Saint Vincent Hospital, Worcester, MA
Dimitra Karagkouni: Department of Pathology, Cancer Research Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA
Megan McNichol: Library Sciences, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA
Alys Adamski: Division of Blood Disorders, National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA
Nimia Reyes: Division of Blood Disorders, National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA
Karon Abe: Division of Blood Disorders, National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA
Simon Mantha: Division of Hematology, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
Ioannis S. Vlachos: Department of Pathology, Cancer Research Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA
Jeffrey I. Zwicker: Division of Hematology, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
Rushad Patell: Division of Hematology, Department of Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA

Berhouet J, Samargandi R. Emerging Innovations in Preoperative Planning and Motion Analysis in Orthopedic Surgery. Diagnostics (Basel) 2024;14:1321. [PMID: 39001212 PMCID: PMC11240316 DOI: 10.3390/diagnostics14131321] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/17/2024] [Revised: 06/15/2024] [Accepted: 06/20/2024] [Indexed: 07/16/2024]

AlShehri Y, Sidhu A, Lakshmanan LVS, Lefaivre KA. Applications of Natural Language Processing for Automated Clinical Data Analysis in Orthopaedics. J Am Acad Orthop Surg 2024;32:439-446. [PMID: 38626429 DOI: 10.5435/jaaos-d-23-00839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 02/20/2024] [Indexed: 04/18/2024]

Tavabi N, Pruneski J, Golchin S, Singh M, Sanborn R, Heyworth B, Landschaft A, Kimia A, Kiapour A. Building large-scale registries from unstructured clinical notes using a low-resource natural language processing pipeline. Artif Intell Med 2024;151:102847. [PMID: 38658131 DOI: 10.1016/j.artmed.2024.102847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 02/06/2024] [Accepted: 03/19/2024] [Indexed: 04/26/2024]

Abstract

Building clinical registries is an important step in clinical research and improvement of patient care quality. Natural Language Processing (NLP) methods have shown promising results in extracting valuable information from unstructured clinical notes. However, the structure and nature of clinical notes are very different from the regular text that state-of-the-art NLP models are trained and tested on, and they have their own set of challenges. In this study, we propose Sentence Extractor with Keywords (SE-K), an efficient and interpretable classification approach for extracting information from clinical notes, and show that it outperforms more computationally expensive methods in text classification. Following Institutional Review Board (IRB) approval, we used SE-K and two embedding-based NLP approaches (Sentence Extractor with Embeddings (SE-E) and Bidirectional Encoder Representations from Transformers (BERT)) to develop a comprehensive registry of anterior cruciate ligament surgeries from 20 years of unstructured clinical data at a multi-site tertiary-care regional children's hospital. The low-resource approach (SE-K) had better performance (average AUROC of 0.94 ± 0.04) than the embedding-based approaches (SE-E: 0.93 ± 0.04 and BERT: 0.87 ± 0.09) for out-of-sample validation, with a minimal performance drop between test and out-of-sample validation. Moreover, the SE-K approach was at least six times faster (on CPU) than SE-E (on CPU) and BERT (on GPU) and provides interpretability. Our proposed approach, SE-K, can be effectively used to extract relevant variables from clinic notes to build large-scale registries, with consistently better performance compared to the more resource-intensive approaches (e.g., BERT). Such approaches can facilitate information extraction from unstructured notes for registry building, quality improvement and adverse event monitoring.

Pruneski JA, Tavabi N, Heyworth BE, Kocher MS, Kramer DE, Christino MA, Milewski MD, Yen YM, Micheli L, Murray MM, Garcia Andujar RA, Kiapour AM. Prevalence and Predictors of Concomitant Meniscal Surgery During Pediatric and Adolescent ACL Reconstruction: Analysis of 4729 Patients Over 20 Years at a Tertiary-Care Regional Children's Hospital. Orthop J Sports Med 2024;12:23259671241236496. [PMID: 38515604 PMCID: PMC10956158 DOI: 10.1177/23259671241236496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Accepted: 09/11/2023] [Indexed: 03/23/2024]

Abstract

Background

The rate of concomitant meniscal procedures performed in conjunction with anterior cruciate ligament (ACL) reconstruction is increasing. Few studies have examined these procedures in high-risk pediatric cohorts.

Hypotheses

That (1) the rates of meniscal repair compared with meniscectomy would increase throughout the study period and (2) patient-related factors would be able to predict the type of meniscal operation, which would differ according to age.

Study Design

Cohort study (prevalence); Level of evidence, 2.

Methods

Natural language processing was used to extract clinical variables from notes of patients who underwent ACL reconstruction between 2000 and 2020 at a single institution. Patients were stratified to pediatric (5-13 years) and adolescent (14-19 years) cohorts. Linear regression was used to evaluate changes in the prevalence of concomitant meniscal surgery during the study period. Logistic regression was used to determine predictors of the need for and type of meniscal procedure.

Results

Of 4729 patients (mean age, 16 ± 2 years; 54.7% female) identified, 2458 patients (52%) underwent concomitant meniscal procedures (55% repair rate). The prevalence of lateral meniscal (LM) procedures increased in both pediatric and adolescent cohorts, whereas the prevalence of medial meniscal (MM) repair increased in the adolescent cohort (P = .02). In the adolescent cohort, older age was predictive of concomitant medial meniscectomy (P = .031). In the pediatric cohort, female sex was predictive of concomitant MM surgery and of undergoing lateral meniscectomy versus repair (P≤ .029). Female sex was associated with decreased odds of concomitant LM surgery in both cohorts (P≤ .018). Revision ACLR was predictive of concomitant MM surgery and of meniscectomy (medial and lateral) in the adolescent cohort (P < .001). Higher body mass index was associated with increased odds of undergoing medial meniscectomy versus repair in the pediatric cohort (P = .03).

Conclusion

More than half of the young patients who underwent ACLR had meniscal pathology warranting surgical intervention. The prevalence of MM repair compared with meniscectomy in adolescents increased throughout the study period. Patients who underwent revision ACLR were more likely to undergo concomitant meniscal surgeries, which were more often meniscectomy. Female sex had mixed effects in both the pediatric and adolescent cohorts.

Zgouridou A, Kenanidis E, Potoupnis M, Tsiridis E. Global mapping of institutional and hospital-based (Level II-IV) arthroplasty registries: a scoping review. Eur J Orthop Surg Traumatol 2024;34:1219-1251. [PMID: 37768398 PMCID: PMC10858160 DOI: 10.1007/s00590-023-03691-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 08/13/2023] [Indexed: 09/29/2023]

Macri CZ, Teoh SC, Bacchi S, Tan I, Casson R, Sun MT, Selva D, Chan W. A case study in applying artificial intelligence-based named entity recognition to develop an automated ophthalmic disease registry. Graefes Arch Clin Exp Ophthalmol 2023;261:3335-3344. [PMID: 37535181 PMCID: PMC10587337 DOI: 10.1007/s00417-023-06190-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Revised: 06/23/2023] [Accepted: 07/23/2023] [Indexed: 08/04/2023]

Pruneski JA, Pareek A, Nwachukwu BU, Martin RK, Kelly BT, Karlsson J, Pearle AD, Kiapour AM, Williams RJ. Natural language processing: using artificial intelligence to understand human language in orthopedics. Knee Surg Sports Traumatol Arthrosc 2022;31:1203-1211. [PMID: 36477347 DOI: 10.1007/s00167-022-07272-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/07/2022] [Accepted: 11/30/2022] [Indexed: 12/12/2022]

Haddad FS. Looking back over the past year. Bone Joint J 2022;104-B:1279-1280. [DOI: 10.1302/0301-620x.104b12.bjj-2022-1161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]

The development and deployment of machine learning models. Knee Surg Sports Traumatol Arthrosc 2022;30:3917-3923. [PMID: 36083354 DOI: 10.1007/s00167-022-07155-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/23/2022] [Accepted: 08/31/2022] [Indexed: 10/14/2022]

Polisetty TS, Jain S, Pang M, Karnuta JM, Vigdorchik JM, Nawabi DH, Wyles CC, Ramkumar PN. Concerns surrounding application of artificial intelligence in hip and knee arthroplasty. Bone Joint J 2022;104-B:1292-1303. [DOI: 10.1302/0301-620x.104b12.bjj-2022-0922.r1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/05/2022]

Abstract

Literature surrounding artificial intelligence (AI)-related applications for hip and knee arthroplasty has proliferated. However, meaningful advances that fundamentally transform the practice and delivery of joint arthroplasty are yet to be realized, despite the broad range of applications as we continue to search for meaningful and appropriate use of AI. AI literature in hip and knee arthroplasty between 2018 and 2021 regarding image-based analyses, value-based care, remote patient monitoring, and augmented reality was reviewed. Concerns surrounding meaningful use and appropriate methodological approaches of AI in joint arthroplasty research are summarized. Of the 233 AI-related orthopaedics articles published, 178 (76%) constituted original research, while the rest consisted of editorials or reviews. A total of 52% of original AI-related research concerns hip and knee arthroplasty (n = 92), and a narrative review is described. Three studies were externally validated. Pitfalls surrounding present-day research include conflating vernacular (“AI/machine learning”), repackaging limited registry data, prematurely releasing internally validated prediction models, appraising model architecture instead of inputted data, withholding code, and evaluating studies using antiquated regression-based guidelines. While AI has been applied to a variety of hip and knee arthroplasty applications with limited clinical impact, the future remains promising if the question is meaningful, the methodology is rigorous and transparent, the data are rich, and the model is externally validated. Simple checkpoints for meaningful AI adoption include ensuring applications focus on: administrative support over clinical evaluation and management; necessity of the advanced model; and the novelty of the question being answered. Cite this article: Bone Joint J 2022;104-B(12):1292–1303.

Karhade AV, Oosterhoff JHF, Groot OQ, Agaronnik N, Ehresman J, Bongers MER, Jaarsma RL, Poonnoose SI, Sciubba DM, Tobert DG, Doornberg JN, Schwab JH. Can We Geographically Validate a Natural Language Processing Algorithm for Automated Detection of Incidental Durotomy Across Three Independent Cohorts From Two Continents? Clin Orthop Relat Res 2022;480:1766-1775. [PMID: 35412473 PMCID: PMC9384904 DOI: 10.1097/corr.0000000000002200] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/24/2021] [Accepted: 03/11/2022] [Indexed: 01/31/2023]

Abstract

BACKGROUND

Incidental durotomy is an intraoperative complication in spine surgery that can lead to postoperative complications, increased length of stay, and higher healthcare costs. Natural language processing (NLP) is an artificial intelligence method that assists in understanding free-text notes and may be useful in the automated surveillance of adverse events in orthopaedic surgery. A previously developed NLP algorithm is highly accurate in the detection of incidental durotomy on internal validation and external validation in an independent cohort from the same country. External validation in a cohort with linguistic differences, referred to as geographical validation, is required to assess the transportability of the developed algorithm. Ideally, the performance of a prediction model, the NLP algorithm, is constant across geographic regions to ensure reproducibility and model validity.

QUESTION/PURPOSE

Can we geographically validate an NLP algorithm for the automated detection of incidental durotomy across three independent cohorts from two continents?

METHODS

Patients 18 years or older undergoing a primary procedure of (thoraco)lumbar spine surgery were included. In Massachusetts, between January 2000 and June 2018, 1000 patients were included from two academic and three community medical centers. In Maryland, between July 2016 and November 2018, 1279 patients were included from one academic center, and in Australia, between January 2010 and December 2019, 944 patients were included from one academic center. The authors retrospectively studied the free-text operative notes of included patients for the primary outcome that was defined as intraoperative durotomy. Incidental durotomy occurred in 9% (93 of 1000), 8% (108 of 1279), and 6% (58 of 944) of the patients, respectively, in the Massachusetts, Maryland, and Australia cohorts. No missing reports were observed. Three datasets (Massachusetts, Australian, and combined Massachusetts and Australian) were divided into training and holdout test sets in an 80:20 ratio. An extreme gradient boosting (an efficient and flexible tree-based algorithm) NLP algorithm was individually trained on each training set, and the performance of the three NLP algorithms (respectively American, Australian, and combined) was assessed by discrimination via area under the receiver operating characteristic curves (AUC-ROC; this measures the model's ability to distinguish patients who obtained the outcomes from those who did not), calibration metrics (which plot the predicted and the observed probabilities) and Brier score (a composite of discrimination and calibration). In addition, the sensitivity (true positives, recall), specificity (true negatives), positive predictive value (also known as precision), negative predictive value, F1-score (composite of precision and recall), positive likelihood ratio, and negative likelihood ratio were calculated.

RESULTS

The combined NLP algorithm (the combined Massachusetts and Australian data) achieved excellent performance on independent testing data from Australia (AUC-ROC 0.97 [95% confidence interval 0.87 to 0.99]), Massachusetts (AUC-ROC 0.99 [95% CI 0.80 to 0.99]) and Maryland (AUC-ROC 0.95 [95% CI 0.93 to 0.97]). The NLP developed based on the Massachusetts cohort had excellent performance in the Maryland cohort (AUC-ROC 0.97 [95% CI 0.95 to 0.99]) but worse performance in the Australian cohort (AUC-ROC 0.74 [95% CI 0.70 to 0.77]).

CONCLUSION

We demonstrated the clinical utility and reproducibility of an NLP algorithm with combined datasets retaining excellent performance in individual countries relative to algorithms developed in the same country alone for detection of incidental durotomy. Further multi-institutional, international collaborations can facilitate the creation of universal NLP algorithms that improve the quality and safety of orthopaedic surgery globally. The combined NLP algorithm has been incorporated into a freely accessible web application that can be found at https://sorg-apps.shinyapps.io/nlp_incidental_durotomy/ . Clinicians and researchers can use the tool to help incorporate the model in evaluating spine registries or quality and safety departments to automate detection of incidental durotomy and optimize prevention efforts.

LEVEL OF EVIDENCE

Level III, diagnostic study.
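
The performance measures listed in the Methods of this abstract (sensitivity, specificity, positive and negative predictive value, F1-score, likelihood ratios, and Brier score) can be computed from a 2×2 confusion matrix and a set of predicted probabilities. The sketch below is illustrative only: the function names and the example counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic accuracy measures from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one true case, one
    non-case, one predicted positive, and one predicted negative).
    """
    sensitivity = tp / (tp + fn)   # recall: share of true cases detected
    specificity = tn / (tn + fp)   # share of non-cases correctly ruled out
    ppv = tp / (tp + fp)           # precision
    npv = tn / (tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of precision and recall
    # Likelihood ratios; LR+ is infinite when there are no false positives.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and observed 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical holdout set: 100 notes with durotomy, 900 without.
metrics = diagnostic_metrics(tp=90, fp=10, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print("brier:", brier_score([0.9, 0.2, 0.1, 0.7], [1, 0, 0, 1]))
```

Because predictive values depend on prevalence, the same sensitivity and specificity would give different PPV and NPV in cohorts with different durotomy rates, such as the 6% to 9% range reported here.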

DeMik DE, Carender CN, Glass NA, Brown TS, Elkins JM, Bedard NA. Not all Total Hip and Knee Arthroplasties Are the Same: What Are the Implications in Large Database Studies? J Arthroplasty 2022;37:1247-1252.e2. [PMID: 35271975 DOI: 10.1016/j.arth.2022.02.119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/30/2021] [Revised: 02/26/2022] [Accepted: 02/28/2022] [Indexed: 02/02/2023]

Abstract

BACKGROUND

The use of claims databases for research after total hip and knee arthroplasty (THA, TKA) has increased exponentially. These studies rely on accurate coding, and inadvertent inclusion of patients with nonroutine indications may influence results. The purpose of this study was to evaluate the complexity of THA and TKA captured by CPT code and determine if complication rates vary based on the indication.

METHODS

The NSQIP database was queried using CPT codes 27130 and 27447 to identify patients undergoing THA and TKA from 2018 to 2019. The surgical indication was classified based on the ICD-10 diagnosis code as routine primary, complex primary, inflammatory, fracture, oncologic, revision, infection, or indeterminant. Patient factors and 30-day complications, readmission, reoperation, and wound complications were compared.

RESULTS

A total of 86,009 THA patients had 703 ICD-10 diagnosis codes and 91.4% were routine primary indications. Complication rates were: routine primary 7.4%, complex primary 11.3%, inflammatory 12.5%, fracture 23.9%, oncologic 32.4%, revision 26.9%, infection 38.7%, and indeterminant 10.3% (P < .0001). 137,500 TKA patients had 552 ICD-10 diagnosis codes and 96.1% were routine primary cases. Complication rates were: routine primary 5.9%, complex primary 8.0%, inflammatory 7.2%, fracture 38.9%, oncologic 32.7%, revision 13.3%, infection 37.7%, and indeterminant 9.6% (P < .0001). Routine primary arthroplasty had significantly lower rates of reoperation, readmission, and wound complications.

CONCLUSION

Using CPT code alone captures 10% of THA and 4% of TKA patients with procedures for nonroutine primary indications. It is essential to recognize that identifying patients simply by CPT code has the potential to inadvertently introduce bias, and surgeons should critically assess the methods used to define study populations.

Greenberg JK, Otun A, Ghogawala Z, Yen PY, Molina CA, Limbrick DD, Foraker RE, Kelly MP, Ray WZ. Translating Data Analytics Into Improved Spine Surgery Outcomes: A Roadmap for Biomedical Informatics Research in 2021. Global Spine J 2022;12:952-963. [PMID: 33973491 PMCID: PMC9344511 DOI: 10.1177/21925682211008424] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/29/2023]

Polce EM, Kunze KN, Dooley MS, Piuzzi NS, Boettner F, Sculco PK. Efficacy and Applications of Artificial Intelligence and Machine Learning Analyses in Total Joint Arthroplasty: A Call for Improved Reporting. J Bone Joint Surg Am 2022;104:821-832. [PMID: 35045061 DOI: 10.2106/jbjs.21.00717] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 02/01/2023]

Abstract

BACKGROUND

There has been a considerable increase in total joint arthroplasty (TJA) research using machine learning (ML). Therefore, the purposes of this study were to synthesize the applications and efficacies of ML reported in the TJA literature, and to assess the methodological quality of these studies.

METHODS

PubMed, OVID/MEDLINE, and Cochrane libraries were queried in January 2021 for articles regarding the use of ML in TJA. Study demographics, topic, primary and secondary outcomes, ML model development and testing, and model presentation and validation were recorded. The TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines were used to assess the methodological quality.

RESULTS

Fifty-five studies were identified: 31 investigated clinical outcomes and resource utilization; 11, activity and motion surveillance; 10, imaging detection; and 3, natural language processing. For studies reporting the area under the receiver operating characteristic curve (AUC), the median AUC (and range) was 0.80 (0.60 to 0.97) among 26 clinical outcome studies, 0.99 (0.83 to 1.00) among 6 imaging-based studies, and 0.88 (0.76 to 0.98) among 3 activity and motion surveillance studies. Twelve studies compared ML to logistic regression, with 9 (75%) reporting that ML was superior. The average number of TRIPOD guidelines met was 11.5 (range: 5 to 18), with 38 (69%) meeting greater than half of the criteria. Presentation and explanation of the full model for individual predictions and assessments of model calibration were poorly reported (<30%).

CONCLUSIONS

The performance of ML models was good to excellent when applied to a wide variety of clinically relevant outcomes in TJA. However, reporting of certain key methodological and model presentation criteria was inadequate. Despite the recent surge in TJA literature utilizing ML, the lack of consistent adherence to reporting guidelines needs to be addressed to bridge the gap between model development and clinical implementation.

Han P, Fu S, Kolis J, Hughes R, Hallstrom BR, Carvour M, Maradit-Kremers H, Sohn S, Vydiswaran VGV. Multi-Center Validation of Natural Language Processing Algorithms for Detection of Common Data Elements in Operative Notes for Total Hip Arthroplasty (Preprint). JMIR Med Inform 2022;10:e38155. [PMID: 36044253 PMCID: PMC9475406 DOI: 10.2196/38155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 05/30/2022] [Accepted: 07/12/2022] [Indexed: 11/18/2022]

Abstract

Background

Natural language processing (NLP) methods are powerful tools for extracting and analyzing critical information from free-text data. MedTaggerIE, an open-source NLP pipeline for information extraction based on text patterns, has been widely used in the annotation of clinical notes. A rule-based system, MedTagger-total hip arthroplasty (THA), developed based on MedTaggerIE, was previously shown to correctly identify the surgical approach, fixation, and bearing surface from the THA operative notes at Mayo Clinic.

Objective

This study aimed to assess the implementability, usability, and portability of MedTagger-THA at two external institutions, Michigan Medicine and the University of Iowa, and provide lessons learned for best practices.

Methods

We conducted iterative test-apply-refinement processes with three involved sites—the development site (Mayo Clinic) and two deployment sites (Michigan Medicine and the University of Iowa). Mayo Clinic was the primary NLP development site, with the THA registry as the gold standard. The activities at the two deployment sites included the extraction of the operative notes, gold standard development (Michigan: registry data; Iowa: manual chart review), the refinement of NLP algorithms on training data, and the evaluation of test data. Error analyses were conducted to understand language variations across sites. To further assess the model specificity for approach and fixation, we applied the refined MedTagger-THA to arthroscopic hip procedures and periacetabular osteotomy cases, as neither of these operative notes should contain any approach or fixation keywords.

Results

MedTagger-THA algorithms were implemented and refined independently for both sites. At Michigan, the study comprised THA-related notes for 2569 patient-date pairs. Before model refinement, MedTagger-THA algorithms demonstrated excellent accuracy for approach (96.6%, 95% CI 94.6%-97.9%) and fixation (95.7%, 95% CI 92.4%-97.6%). These results were comparable with internal accuracy at the development site (99.2% for approach and 90.7% for fixation). Model refinement improved accuracies slightly for both approach (99%, 95% CI 97.6%-99.6%) and fixation (98%, 95% CI 95.3%-99.3%). The specificity of approach identification was 88.9% for arthroscopy cases, and the specificity of fixation identification was 100% for both periacetabular osteotomy and arthroscopy cases. At the Iowa site, the study comprised an overall data set of 100 operative notes (50 training notes and 50 test notes). MedTagger-THA algorithms achieved moderate-high performance on the training data. After model refinement, the model achieved high performance for approach (100%, 95% CI 91.3%-100%), fixation (98%, 95% CI 88.3%-100%), and bearing surface (92%, 95% CI 80.5%-97.3%).

Conclusions

High performance across centers was achieved for the MedTagger-THA algorithms, demonstrating that they were sufficiently implementable, usable, and portable to different deployment sites. This study provided important lessons learned during the model deployment and validation processes, and it can serve as a reference for transferring rule-based electronic health record models.

Rubinger L, Gazendam A, Ekhtiari S, Bhandari M. Machine learning and artificial intelligence in research and healthcare. Injury 2022:S0020-1383(22)00076-6. [PMID: 35135685 DOI: 10.1016/j.injury.2022.01.046] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/03/2021] [Accepted: 01/29/2022] [Indexed: 02/02/2023]

Kunze KN, Orr M, Krebs V, Bhandari M, Piuzzi NS. Potential benefits, unintended consequences, and future roles of artificial intelligence in orthopaedic surgery research: a call to emphasize data quality and indications. Bone Jt Open 2022;3:93-97. [PMID: 35084227 PMCID: PMC9047073 DOI: 10.1302/2633-1462.31.bjo-2021-0123.r1] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 01/02/2023] Open

How Accurate Is ICD-10 Coding for Revision Total Knee Arthroplasty? J Arthroplasty 2021;36:3950-3958. [PMID: 34538547 DOI: 10.1016/j.arth.2021.08.021] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/21/2021] [Revised: 07/31/2021] [Accepted: 08/20/2021] [Indexed: 02/02/2023] Open

Abstract

BACKGROUND

The International Classification of Diseases-10 (ICD-10) came into effect in October 2015. The new procedural codes (ICD-10-PCS) were designed to specify granular aspects of the procedure, including laterality and revised components. This specificity could improve data collection in institutional databases, large registries, and administrative claims data. Given these possible applications, this study's purpose was to assess the accuracy of ICD-10-PCS coding for revision total knee arthroplasty (rTKA).

METHODS

This multicenter retrospective analysis utilized the rTKA databases at four academic medical centers for all aseptic rTKAs between October 1, 2015 and July 3, 2019. Operative reports were reviewed to determine laterality and revised components (tibial, femoral, liner, and patellar component), which were then compared with the ICD-10-PCS codes associated with the billing records. Proper coding required both component removal and replacement codes. The correct series of removal and replacement codes was determined using the American Joint Replacement Registry's guidelines.

RESULTS

In total, 1906 rTKAs were examined, and 98.0% had at least one proper ICD-10-PCS code, indicating an rTKA had occurred. Coding for components replaced was correct in 76.3% of cases. When examining both removal and replacement codes, accuracy dropped to 57.0%.
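The two accuracy figures above differ because the stricter measure requires the removal codes as well as the replacement codes to match the operative report. A minimal sketch of that comparison follows. The `Case` structure, the component labels, and the three example cases are hypothetical; they are neither actual ICD-10-PCS codes nor study data, and the study's exact matching rules (based on American Joint Replacement Registry guidelines) are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    revised: frozenset            # components revised per the operative report
    removal_coded: frozenset      # components with a removal code in billing
    replacement_coded: frozenset  # components with a replacement code in billing


def replacement_correct(case: Case) -> bool:
    """Replacement codes match the components revised in the operative report."""
    return case.replacement_coded == case.revised


def fully_correct(case: Case) -> bool:
    """Both removal and replacement codes match the operative report."""
    return replacement_correct(case) and case.removal_coded == case.revised


def accuracy(cases, is_correct) -> float:
    """Fraction of cases for which the predicate holds."""
    return sum(is_correct(c) for c in cases) / len(cases)


# Hypothetical cases for illustration only.
cases = [
    Case(frozenset({"tibial", "liner"}), frozenset({"tibial", "liner"}), frozenset({"tibial", "liner"})),
    Case(frozenset({"femoral"}), frozenset(), frozenset({"femoral"})),  # removal code missing
    Case(frozenset({"liner"}), frozenset({"liner"}), frozenset({"tibial"})),  # wrong replacement
]
print(f"replacement accuracy: {accuracy(cases, replacement_correct):.1%}")
print(f"removal + replacement accuracy: {accuracy(cases, fully_correct):.1%}")
```

Because the stricter predicate is a conjunction, its accuracy can never exceed the replacement-only accuracy, which is consistent with the drop from 76.3% to 57.0% reported above.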

CONCLUSION

Nearly 25% of rTKA procedures were incorrectly coded for replaced components, and over 40% were incorrectly coded for removed and replaced components. ICD-10-PCS codes can accurately identify that an rTKA has occurred; however, the inaccuracy in identifying which specific components were revised should prompt further evaluation of the coding process before utilizing ICD-10-PCS codes to report granular rTKA data.

LEVEL OF EVIDENCE

III, retrospective observational analysis.


Wyatt JM, Booth GJ, Goldman AH. Natural Language Processing and Its Use in Orthopaedic Research. Curr Rev Musculoskelet Med 2021;14:392-396. [PMID: 34755276 PMCID: PMC8577962 DOI: 10.1007/s12178-021-09734-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Accepted: 10/19/2021] [Indexed: 12/29/2022]

Giori NJ, Radin J, Callahan A, Fries JA, Halilaj E, Ré C, Delp SL, Shah NH, Harris AHS. Assessment of Extractability and Accuracy of Electronic Health Record Data for Joint Implant Registries. JAMA Netw Open 2021;4:e211728. [PMID: 33720372 PMCID: PMC7961313 DOI: 10.1001/jamanetworkopen.2021.1728] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022] Open

Abstract

IMPORTANCE

Implant registries provide valuable information on the performance of implants in a real-world setting, yet they have traditionally been expensive to establish and maintain. Electronic health records (EHRs) are widely used and may include the information needed to generate clinically meaningful reports similar to a formal implant registry.

OBJECTIVES

To quantify the extractability and accuracy of registry-relevant data from the EHR and to assess the ability of these data to track trends in implant use and the durability of implants (hereafter referred to as implant survivorship), using data stored since 2000 in the EHR of the largest integrated health care system in the United States.

DESIGN, SETTING, AND PARTICIPANTS

Retrospective cohort study of a large EHR of veterans who had 45 351 total hip arthroplasty procedures in Veterans Health Administration hospitals from 2000 to 2017. Data analysis was performed from January 1, 2000, to December 31, 2017.

EXPOSURES

Total hip arthroplasty.

MAIN OUTCOMES AND MEASURES

Number of total hip arthroplasty procedures extracted from the EHR, trends in implant use, and relative survivorship of implants.

RESULTS

A total of 45 351 total hip arthroplasty procedures were identified from 2000 to 2017 with 192 805 implant parts. Data completeness improved over time. After 2014, 85% of prosthetic heads, 91% of shells, 81% of stems, and 85% of liners used in the Veterans Health Administration health care system were identified by part number. Revision burden and trends in metal vs ceramic prosthetic femoral head use were found to reflect data from the American Joint Replacement Registry. Recalled implants were obvious negative outliers in implant survivorship using Kaplan-Meier curves.
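For readers unfamiliar with the Kaplan-Meier curves mentioned above, the following is a minimal sketch of the product-limit estimator. The function name `kaplan_meier` and the follow-up data are hypothetical and are not drawn from the study; an "event" here stands for an observed revision, and a censored record stands for a patient whose follow-up ended without one.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: follow-up time for each implant
    events:    True if revision was observed, False if the record was censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        revised = censored = 0
        # Count all records sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                revised += 1
            else:
                censored += 1
            i += 1
        if revised:
            survival *= 1 - revised / at_risk
            curve.append((t, survival))
        # Both revised and censored implants leave the risk set after time t.
        at_risk -= revised + censored
    return curve


# Hypothetical follow-up times in years, for illustration only.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, False, True, True, False])
for t, s in curve:
    print(f"t = {t}: survivorship = {s:.2f}")
```

Censored records reduce the number at risk without lowering the curve, which is how the estimator accommodates loss to follow-up.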

CONCLUSIONS AND RELEVANCE

Although loss to follow-up remains a challenge that requires additional attention to improve the quantitative nature of calculated implant survivorship, we conclude that data collected during routine clinical care and stored in the EHR of a large health system over 18 years were sufficient to provide clinically meaningful data on trends in implant use and to identify poor implants that were subsequently recalled. This automated approach was low cost and had no reporting burden. This low-cost, low-overhead method to assess implant use and performance within a large health care setting may be useful to internal quality assurance programs and, on a larger scale, to postmarket surveillance of implant performance.
