1
Nowroozi A, Salehi MA, Shobeiri P, Agahi S, Momtazmanesh S, Kaviani P, Kalra MK. Artificial intelligence diagnostic accuracy in fracture detection from plain radiographs and comparing it with clinicians: a systematic review and meta-analysis. Clin Radiol 2024; 79:579-588. [PMID: 38772766 DOI: 10.1016/j.crad.2024.04.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/09/2024] [Accepted: 04/15/2024] [Indexed: 05/23/2024]
Abstract
PURPOSE Fracture detection is one of the most commonly used and studied aspects of artificial intelligence (AI) in medicine. In this systematic review and meta-analysis, we aimed to summarize available literature and data regarding AI performance in fracture detection on plain radiographs and various factors affecting it. METHODS We systematically reviewed studies evaluating AI algorithms in detecting bone fractures in plain radiographs, combined their performance using meta-analysis (a bivariate regression approach), and compared it with that of clinicians. We also analyzed the factors potentially affecting algorithm performance using meta-regression. RESULTS Our analysis included 100 studies. In 83 studies with confusion matrices, AI algorithms showed a sensitivity of 91.43% and a specificity of 92.12% (area under the summary receiver operating characteristic curve = 0.968). After adjustment and false discovery rate (FDR) correction, tibia/fibula (excluding ankle) fractures were associated with higher (7.0%, p=0.004) AI sensitivity, while more recent publications (5.5%, p=0.003) and Xception architecture (6.6%, p<0.001) were associated with higher specificity. Clinicians and AI showed similar specificity in fracture identification, although AI tended toward higher sensitivity (7.6%, p=0.07). Radiologists, on the other hand, were more specific than AI overall and in several subgroups, and more sensitive to hip fractures before FDR correction. CONCLUSIONS Currently available AI aids could result in a significant improvement in care where radiologists are not readily available. Moreover, identifying factors affecting algorithm performance could guide AI development teams in their process of optimizing their products.
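The abstract above mentions false discovery rate correction without naming the procedure. As a minimal sketch, assuming the common Benjamini-Hochberg step-up method (an assumption, not something the abstract states), the correction can be illustrated as follows. The function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract does not
    say which FDR method the authors used.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for five covariates (not study data).
example_p = [0.001, 0.012, 0.025, 0.045, 0.30]
print(benjamini_hochberg(example_p))
```

In this illustration the fourth p-value (0.045) is below 0.05 yet is not rejected, which is the kind of change "before FDR correction" versus "after" refers to.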
Affiliation(s)
- A Nowroozi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- M A Salehi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- P Shobeiri
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- S Agahi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- S Momtazmanesh
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- P Kaviani
- Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- M K Kalra
- Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA.
2
Ruitenbeek HC, Oei EHG, Visser JJ, Kijowski R. Artificial intelligence in musculoskeletal imaging: realistic clinical applications in the next decade. Skeletal Radiol 2024:10.1007/s00256-024-04684-6. [PMID: 38902420 DOI: 10.1007/s00256-024-04684-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/06/2024] [Accepted: 04/15/2024] [Indexed: 06/22/2024]
Abstract
This article will provide a perspective review of the most extensively investigated deep learning (DL) applications for musculoskeletal disease detection that have the best potential to translate into routine clinical practice over the next decade. Deep learning methods for detecting fractures, estimating pediatric bone age, calculating bone measurements such as lower extremity alignment and Cobb angle, and grading osteoarthritis on radiographs have been shown to have high diagnostic performance with many of these applications now commercially available for use in clinical practice. Many studies have also documented the feasibility of using DL methods for detecting joint pathology and characterizing bone tumors on magnetic resonance imaging (MRI). However, musculoskeletal disease detection on MRI is difficult as it requires multi-task, multi-class detection of complex abnormalities on multiple image slices with different tissue contrasts. The generalizability of DL methods for musculoskeletal disease detection on MRI is also challenging due to fluctuations in image quality caused by the wide variety of scanners and pulse sequences used in routine MRI protocols. The diagnostic performance of current DL methods for musculoskeletal disease detection must be further evaluated in well-designed prospective studies using large image datasets acquired at different institutions with different imaging parameters and imaging hardware before they can be fully implemented in clinical practice. Future studies must also investigate the true clinical benefits of current DL methods and determine whether they could enhance quality, reduce error rates, improve workflow, and decrease radiologist fatigue and burnout with all of this weighed against the costs.
Affiliation(s)
- Huibert C Ruitenbeek
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands
- Edwin H G Oei
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands
- Jacob J Visser
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands
- Richard Kijowski
- Department of Radiology, New York University Grossman School of Medicine, 660 First Avenue, 3rd Floor, New York, NY, 10016, USA.
3
Yıldız Potter İ, Yeritsyan D, Mahar S, Kheir N, Vaziri A, Putman M, Rodriguez EK, Wu J, Nazarian A, Vaziri A. Proximal femur fracture detection on plain radiography via feature pyramid networks. Sci Rep 2024; 14:12046. [PMID: 38802519 PMCID: PMC11130146 DOI: 10.1038/s41598-024-63001-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 05/23/2024] [Indexed: 05/29/2024] Open
Abstract
Hip fractures exceed 250,000 cases annually in the United States, with the worldwide incidence projected to increase by 240-310% by 2050. Hip fractures are predominantly diagnosed by radiologist review of radiographs. In this study, we developed a deep learning model by extending the VarifocalNet Feature Pyramid Network (FPN) for detection and localization of proximal femur fractures from plain radiography with clinically relevant metrics. We used a dataset of 823 hip radiographs of 150 subjects with proximal femur fractures and 362 controls to develop and evaluate the deep learning model. Our model attained 0.94 specificity and 0.95 sensitivity in fracture detection over the diverse imaging dataset. We compared the performance of our model against five benchmark FPN models, demonstrating 6-14% sensitivity and 1-9% accuracy improvement. In addition, we demonstrated that our model outperforms a state-of-the-art transformer model based on DINO network by 17% sensitivity and 5% accuracy, while taking half the time on average to process a radiograph. The developed model can aid radiologists and support on-premise integration with hospital cloud services to enable automatic, opportunistic screening for hip fractures.
Affiliation(s)
- Diana Yeritsyan
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Sarah Mahar
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Nadim Kheir
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Aidin Vaziri
- BioSensics, LLC, 57 Chapel Street, Newton, MA, 02458, USA
- Melissa Putman
- Division of Endocrinology, Massachusetts General Hospital and Harvard Medical School, 55 Fruit Street, Boston, MA, 02114, USA
- Edward K Rodriguez
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Jim Wu
- Department of Radiology, Massachusetts General Brigham (MGB) and Harvard Medical School, 75 Francis Street, Boston, MA, 02215, USA
- Ara Nazarian
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Department of Orthopaedic Surgery, Yerevan State University, Yerevan, Armenia
- Ashkan Vaziri
- BioSensics, LLC, 57 Chapel Street, Newton, MA, 02458, USA
4
Tanner IL, Ye K, Moore MS, Rechenmacher AJ, Ramirez MM, George SZ, Bolognesi MP, Horn ME. Developing a Computer Vision Model to Automate Quantitative Measurement of Hip-Knee-Ankle Angle in Total Hip and Knee Arthroplasty Patients. J Arthroplasty 2024:S0883-5403(24)00410-8. [PMID: 38679347 DOI: 10.1016/j.arth.2024.04.062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 04/19/2024] [Accepted: 04/21/2024] [Indexed: 05/01/2024] Open
Abstract
BACKGROUND Increasing deformity of the lower extremities, as measured by the hip-knee-ankle angle (HKAA), is associated with poor patient outcomes after total hip and knee arthroplasty (THA, TKA). Automated calculation of HKAA is imperative to reduce the burden on orthopaedic surgeons. We proposed a detection-based deep learning (DL) model to calculate HKAA in THA and TKA patients and assessed the agreement between DL-derived HKAAs and manual measurement. METHODS We retrospectively identified 1,379 long-leg radiographs (LLRs) from patients scheduled for THA or TKA within an academic medical center. There were 1,221 LLRs used to develop the model (randomly split into 70% training, 20% validation, and 10% held-out test sets); 158 LLRs were considered "difficult," as the femoral head was difficult to distinguish from surrounding tissue. There were 2 raters who annotated the HKAA of both lower extremities, and inter-rater reliability was calculated to compare the DL-derived HKAAs with manual measurement within the test set. RESULTS The DL model achieved a mean average precision of 0.985 on the test set. The average HKAA of the operative leg was 173.05 ± 4.54°; the nonoperative leg was 175.55 ± 3.56°. The inter-rater reliability between manual and DL-derived HKAA measurements on the operative leg and nonoperative leg indicated excellent reliability (intraclass correlation (2,k) = 0.987 [0.96, 0.99] and intraclass correlation (2,k) = 0.987 [0.98, 0.99], respectively). The standard error of measurement for the DL-derived HKAA for the operative and nonoperative legs was 0.515° and 0.403°, respectively. CONCLUSIONS A detection-based DL algorithm can calculate the HKAA in LLRs and is comparable to that calculated by manual measurement. The algorithm can detect the bilateral femoral head, knee, and ankle joints with high precision, even in patients where the femoral head is difficult to visualize.
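Once the femoral head, knee, and ankle centers have been detected, the angle itself is a small geometric step. The sketch below is an illustration only: it assumes the HKAA is the angle at the knee center between the line to the hip center and the line to the ankle center (so a perfectly straight limb gives 180°), which is consistent with the reported averages near 173° to 176° but is not spelled out in the abstract. The function name and the pixel coordinates are hypothetical.

```python
import math


def hka_angle(hip, knee, ankle):
    """Angle in degrees at the knee between the knee->hip and knee->ankle lines.

    Each argument is an (x, y) landmark, e.g. the center of a detected
    bounding box. The geometric definition is an assumption for illustration.
    """
    v1 = (hip[0] - knee[0], hip[1] - knee[1])
    v2 = (ankle[0] - knee[0], ankle[1] - knee[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of |cross| and dot gives the unsigned angle in [0, 180] degrees.
    return math.degrees(math.atan2(abs(cross), dot))


# Hypothetical landmark coordinates in image pixels (not study data).
straight = hka_angle((0, 0), (0, 100), (0, 200))
deviated = hka_angle((0, 0), (0, 100), (10, 200))
print(round(straight, 2), round(deviated, 2))
```

Using `atan2` rather than `acos` avoids domain errors when rounding pushes the cosine slightly outside [-1, 1] for nearly straight limbs.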
Affiliation(s)
- Irene L Tanner
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, North Carolina
- Ken Ye
- Trinity College of Arts & Sciences, Duke University, Durham, North Carolina
- Miles S Moore
- Physical Therapy Division, Duke University School of Medicine, Durham, North Carolina
- Albert J Rechenmacher
- Department of Orthopaedic Surgery, Duke University School of Medicine, Durham, North Carolina
- Michelle M Ramirez
- Department of Population Health Sciences, Department of Orthopaedic Surgery, Duke University School of Medicine, Durham, North Carolina
- Steven Z George
- Department of Orthopaedic Surgery, Department of Population Health Sciences, Duke Clinical Research Institute, Duke University, Durham, North Carolina
- Maggie E Horn
- Department of Population Health Sciences, Department of Orthopaedic Surgery, Duke University School of Medicine, Durham, North Carolina
5
Lakkimsetti M, Devella SG, Patel KB, Dhandibhotla S, Kaur J, Mathew M, Kataria J, Nallani M, Farwa UE, Patel T, Egbujo UC, Meenashi Sundaram D, Kenawy S, Roy M, Khan SF. Optimizing the Clinical Direction of Artificial Intelligence With Health Policy: A Narrative Review of the Literature. Cureus 2024; 16:e58400. [PMID: 38756258 PMCID: PMC11098056 DOI: 10.7759/cureus.58400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/16/2024] [Indexed: 05/18/2024] Open
Abstract
Artificial intelligence (AI) has the ability to completely transform the healthcare industry by enhancing diagnosis, treatment, and resource allocation. To ensure patient safety and equitable access to healthcare, it also presents ethical and practical issues that need to be carefully addressed. Its integration into healthcare is a crucial topic. To realize its full potential, however, the ethical issues around data privacy, prejudice, and transparency, as well as the practical difficulties posed by workforce adaptability and statutory frameworks, must be addressed. While there is growing knowledge about the advantages of AI in healthcare, there is a significant lack of knowledge about the moral and practical issues that come with its application, particularly in the setting of emergency and critical care. The majority of current research tends to concentrate on the benefits of AI, but thorough studies that investigate the potential disadvantages and ethical issues are scarce. The purpose of our article is to identify and examine the ethical and practical difficulties that arise when implementing AI in emergency medicine and critical care, to provide solutions to these issues, and to give suggestions to healthcare professionals and policymakers. In order to responsibly and successfully integrate AI in these important healthcare domains, policymakers and healthcare professionals must collaborate to create strong regulatory frameworks, safeguard data privacy, remove prejudice, and give healthcare workers the necessary training.
Affiliation(s)
- Swati G Devella
- Medicine, Kempegowda Institute of Medical Sciences, Bangalore, IND
- Keval B Patel
- Surgery, Narendra Modi Medical College, Ahmedabad, IND
- Midhun Mathew
- Internal Medicine, Trinitas Regional Medical Center, Elizabeth, USA
- Manisha Nallani
- Medicine, Kamineni Academy of Medical Sciences and Research Center, Hyderabad, IND
- Umm E Farwa
- Emergency Medicine, Jinnah Sindh Medical University, Karachi, PAK
- Tirath Patel
- Medicine, American University of Antigua, Saint John's, ATG
- Dakshin Meenashi Sundaram
- Internal Medicine, Employees' State Insurance Corporation (ESIC) Medical College & Post Graduate Institute of Medical Science and Research (PGIMSR), Chennai, IND
- Mehak Roy
- Internal Medicine, School of Medicine Science and Research, Delhi, IND
6
Chen CC, Huang JF, Lin WC, Cheng CT, Chen SC, Fu CY, Lee MS, Liao CH, Chung CY. The Feasibility and Performance of Total Hip Replacement Prediction Deep Learning Algorithm with Real World Data. Bioengineering (Basel) 2023; 10:bioengineering10040458. [PMID: 37106645 PMCID: PMC10136253 DOI: 10.3390/bioengineering10040458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 03/15/2023] [Accepted: 04/04/2023] [Indexed: 04/29/2023] Open
Abstract
(1) Background: Hip degenerative disorder is a common geriatric disease and one of the main causes leading to total hip replacement (THR). The surgical timing of THR is crucial for post-operative recovery. Deep learning (DL) algorithms can be used to detect anomalies in medical images and predict the need for THR. Real world data (RWD) have been used to validate artificial intelligence and DL algorithms in medicine, but no previous study has demonstrated their use in THR prediction. (2) Methods: We designed a sequential two-stage hip replacement prediction deep learning algorithm to identify the possibility of THR within three months for hip joints on plain pelvic radiography (PXR). We also collected RWD to validate the performance of this algorithm. (3) Results: The RWD included a total of 3766 PXRs from 2018 to 2019. The overall accuracy of the algorithm was 0.9633; sensitivity was 0.9450; specificity was 1.000 and the precision was 1.000. The negative predictive value was 0.9009, the false negative rate was 0.0550, and the F1 score was 0.9717. The area under curve was 0.972 with 95% confidence interval from 0.953 to 0.987. (4) Conclusions: In summary, this DL algorithm can provide an accurate and reliable method for detecting hip degeneration and predicting the need for further THR. RWD offered alternative support for the algorithm and validated its function to save time and cost.
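The metrics listed in the Results are all functions of one 2x2 confusion matrix, and a short sketch makes their relationships explicit. The counts below are hypothetical: they were chosen only because they reproduce the reported ratios, and they are not the study's actual case counts, which the abstract does not give. The function name is likewise illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    fnr = fn / (tp + fn)                  # false negative rate = 1 - sensitivity
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "fnr": fnr,
        "f1": f1,
    }


# Hypothetical counts (not study data) that match the reported ratios.
metrics = diagnostic_metrics(tp=189, fp=0, tn=100, fn=11)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With zero false positives, specificity and precision are both 1, and the F1 score follows directly from sensitivity: 2 × 0.945 / 1.945 ≈ 0.9717, in agreement with the value reported above.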
Affiliation(s)
- Chih-Chi Chen
- Department of Physical Medicine and Rehabilitation, Chang Gung Memorial Hospital, Chang Gung University, Linkou, Taoyuan 33328, Taiwan
- Jen-Fu Huang
- Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Chang Gung University, Linkou, Taoyuan 33328, Taiwan
- Wei-Cheng Lin
- Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Chang Gung University, Linkou, Taoyuan 33328, Taiwan
- Department of Electrical Engineering, Chang Gung University, Taoyuan 33302, Taiwan
- Chi-Tung Cheng
- Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Chang Gung University, Linkou, Taoyuan 33328, Taiwan
- Shann-Ching Chen
- Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Chang Gung University, Linkou, Taoyuan 33328, Taiwan
- Chih-Yuan Fu
- Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Chang Gung University, Linkou, Taoyuan 33328, Taiwan
- Mel S Lee
- Department of Orthopaedic Surgery, Pao-Chien Hospital, Pingtung 90078, Taiwan
- Chien-Hung Liao
- Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Chang Gung University, Linkou, Taoyuan 33328, Taiwan
- Chia-Ying Chung
- Department of Physical Medicine and Rehabilitation, Chang Gung Memorial Hospital, Chang Gung University, Linkou, Taoyuan 33328, Taiwan
7
Kim T, Goh TS, Lee JS, Lee JH, Kim H, Jung ID. Transfer learning-based ensemble convolutional neural network for accelerated diagnosis of foot fractures. Phys Eng Sci Med 2023; 46:265-277. [PMID: 36625995 DOI: 10.1007/s13246-023-01215-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Accepted: 01/02/2023] [Indexed: 01/11/2023]
Abstract
The complex shape of the foot, consisting of 26 bones, variable ligaments, tendons, and muscles leads to misdiagnosis of foot fractures. Despite the introduction of artificial intelligence (AI) to diagnose fractures, the accuracy of foot fracture diagnosis is lower than that of conventional methods. We developed an AI assistant system that assists with consistent diagnosis and helps interns or non-experts improve their diagnosis of foot fractures, and compared the effectiveness of the AI assistance on various groups with different proficiency. Contrast-limited adaptive histogram equalization was used to improve the visibility of original radiographs and data augmentation was applied to prevent overfitting. Preprocessed radiographs were fed to an ensemble model of a transfer learning-based convolutional neural network (CNN) that was developed for foot fracture detection with three models: InceptionResNetV2, MobilenetV1, and ResNet152V2. After training the model, score class activation mapping was applied to visualize the fracture based on the model prediction. The prediction result was evaluated by the receiver operating characteristic (ROC) curve and its area under the curve (AUC), and the F1-Score. Regarding the test set, the ensemble model exhibited better classification ability (F1-Score: 0.837, AUC: 0.95, Accuracy: 86.1%) than other single models that showed an accuracy of 82.4%. With AI assistance for the orthopedic fellow, resident, intern, and student group, the accuracy of each group improved by 3.75%, 7.25%, 6.25%, and 7% respectively and diagnosis time was reduced by 21.9%, 14.7%, 24.4%, and 34.6% respectively.
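The abstract does not say how the three CNN outputs are combined into the ensemble prediction. As a hedged illustration only, the sketch below assumes soft voting, that is, averaging each model's predicted fracture probability per radiograph and thresholding the mean. The function name, the threshold, and the probabilities are hypothetical and do not come from the study.

```python
def soft_vote(model_probs, threshold=0.5):
    """Average per-image fracture probabilities across models, then threshold.

    model_probs: one list of probabilities per model, all of equal length.
    Soft voting is an assumed combination rule, not the authors' stated method.
    """
    n_models = len(model_probs)
    n_images = len(model_probs[0])
    averaged = [
        sum(probs[i] for probs in model_probs) / n_models
        for i in range(n_images)
    ]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Hypothetical outputs for three radiographs from three models
# (standing in for InceptionResNetV2, MobilenetV1, and ResNet152V2).
probabilities = [
    [0.90, 0.20, 0.60],
    [0.80, 0.40, 0.30],
    [0.70, 0.30, 0.45],
]
averaged, labels = soft_vote(probabilities)
print([round(p, 2) for p in averaged], labels)
```

In the third radiograph one model alone would call a fracture (0.60), but the averaged probability falls below the threshold, which shows how an ensemble can smooth out a single model's error.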
Affiliation(s)
- Taekyeong Kim
- Department of Mechanical Engineering, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
- Tae Sik Goh
- Department of Orthopaedic Surgery, Biomedical Research Institute, Pusan National University Hospital, Pusan National University School of Medicine, Busan, 49241, Republic of Korea
- Jung Sub Lee
- Department of Orthopaedic Surgery, Biomedical Research Institute, Pusan National University Hospital, Pusan National University School of Medicine, Busan, 49241, Republic of Korea
- Ji Hyun Lee
- Health Insurance Review & Assessment Service, Wonju, 26465, Republic of Korea
- Hayeol Kim
- Department of Mechanical Engineering, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
- Im Doo Jung
- Department of Mechanical Engineering, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea.
8
Zhang X, Yang Y, Shen YW, Zhang KR, Jiang ZK, Ma LT, Ding C, Wang BY, Meng Y, Liu H. Diagnostic accuracy and potential covariates of artificial intelligence for diagnosing orthopedic fractures: a systematic literature review and meta-analysis. Eur Radiol 2022; 32:7196-7216. [PMID: 35754091 DOI: 10.1007/s00330-022-08956-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/11/2022] [Revised: 05/07/2022] [Accepted: 06/08/2022] [Indexed: 02/05/2023]
Abstract
OBJECTIVES To systematically quantify the diagnostic accuracy and identify potential covariates affecting the performance of artificial intelligence (AI) in diagnosing orthopedic fractures. METHODS PubMed, Embase, Web of Science, and Cochrane Library were systematically searched for studies on AI applications in diagnosing orthopedic fractures from inception to September 29, 2021. Pooled sensitivity and specificity and the area under the receiver operating characteristic curves (AUC) were obtained. This study was registered in the PROSPERO database prior to initiation (CRD 42021254618). RESULTS Thirty-nine studies were eligible for quantitative analysis. The overall pooled AUC, sensitivity, and specificity were 0.96 (95% CI 0.94-0.98), 90% (95% CI 87-92%), and 92% (95% CI 90-94%), respectively. In subgroup analyses, multicenter designed studies yielded higher sensitivity (92% vs. 88%) and specificity (94% vs. 91%) than single-center studies. AI demonstrated higher sensitivity with transfer learning (with vs. without: 92% vs. 87%) or data augmentation (with vs. without: 92% vs. 87%). Utilizing plain X-rays as input images for AI achieved results comparable to CT (AUC 0.96 vs. 0.96). Moreover, AI achieved comparable results to humans (AUC 0.97 vs. 0.97) and better results than non-expert human readers (AUC 0.98 vs. 0.96; sensitivity 95% vs. 88%). CONCLUSIONS AI demonstrated high accuracy in diagnosing orthopedic fractures from medical images. Larger-scale studies with higher design quality are needed to validate our findings. KEY POINTS • Multicenter study design, application of transfer learning, and data augmentation are closely related to improving the performance of artificial intelligence models in diagnosing orthopedic fractures. • Utilizing plain X-rays as input images for AI to diagnose fractures achieved results comparable to CT (AUC 0.96 vs. 0.96). • AI achieved comparable results to humans (AUC 0.97 vs. 0.97) but was superior to non-expert human readers (AUC 0.98 vs. 0.96, sensitivity 95% vs. 88%) in diagnosing fractures.
Affiliation(s)
- Xiang Zhang
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Yi Yang
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Yi-Wei Shen
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Ke-Rui Zhang
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Ze-Kun Jiang
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610000, China
- Li-Tai Ma
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Chen Ding
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Bei-Yu Wang
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Yang Meng
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Hao Liu
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China.
9
Kuo RYL, Harrison C, Curran TA, Jones B, Freethy A, Cussons D, Stewart M, Collins GS, Furniss D. Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis. Radiology 2022; 304:50-62. [PMID: 35348381 DOI: 10.1148/radiol.211785] [Citation(s) in RCA: 67] [Impact Index Per Article: 33.5] [Indexed: 12/19/2022]
Abstract
Background Patients with fractures are a common emergency presentation and may be misdiagnosed at radiologic imaging. An increasing number of studies apply artificial intelligence (AI) techniques to fracture detection as an adjunct to clinician diagnosis. Purpose To perform a systematic review and meta-analysis comparing the diagnostic performance in fracture detection between AI and clinicians in peer-reviewed publications and the gray literature (ie, articles published on preprint repositories). Materials and Methods A search of multiple electronic databases between January 2018 and July 2020 (updated June 2021) was performed that included any primary research studies that developed and/or validated AI for the purposes of fracture detection at any imaging modality and excluded studies that evaluated image segmentation algorithms. Meta-analysis with a hierarchical model to calculate pooled sensitivity and specificity was used. Risk of bias was assessed by using a modified Prediction Model Study Risk of Bias Assessment Tool, or PROBAST, checklist. Results Included for analysis were 42 studies, with 115 contingency tables extracted from 32 studies (55 061 images). Thirty-seven studies identified fractures on radiographs and five studies identified fractures on CT images. For internal validation test sets, the pooled sensitivity was 92% (95% CI: 88, 93) for AI and 91% (95% CI: 85, 95) for clinicians, and the pooled specificity was 91% (95% CI: 88, 93) for AI and 92% (95% CI: 89, 92) for clinicians. For external validation test sets, the pooled sensitivity was 91% (95% CI: 84, 95) for AI and 94% (95% CI: 90, 96) for clinicians, and the pooled specificity was 91% (95% CI: 81, 95) for AI and 94% (95% CI: 91, 95) for clinicians. There were no statistically significant differences between clinician and AI performance. There were 22 of 42 (52%) studies that were judged to have high risk of bias. 
Meta-regression identified multiple sources of heterogeneity in the data, including risk of bias and fracture type. Conclusion Artificial intelligence (AI) and clinicians had comparable reported diagnostic performance in fracture detection, suggesting that AI technology holds promise as a diagnostic adjunct in future clinical practice. Clinical trial registration no. CRD42020186641 © RSNA, 2022 Online supplemental material is available for this article. See also the editorial by Cohen and McInnes in this issue.
Affiliation(s)
- Rachel Y L Kuo
- From the Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, Old Road Headington, Oxford OX3 7LD, UK (R.Y.L.K., C.H., M.S., G.S.C., D.F.); Department of Plastic Surgery, John Radcliffe Hospital, Oxford, UK (T.A.C., A.F.); Department of Vascular Surgery, Royal Berkshire Hospital, Reading, UK (B.J.); Department of Plastic Surgery, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire UK (D.C.); and UK EQUATOR Center, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford UK (G.S.C.)
- Conrad Harrison
- From the Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, Old Road Headington, Oxford OX3 7LD, UK (R.Y.L.K., C.H., M.S., G.S.C., D.F.); Department of Plastic Surgery, John Radcliffe Hospital, Oxford, UK (T.A.C., A.F.); Department of Vascular Surgery, Royal Berkshire Hospital, Reading, UK (B.J.); Department of Plastic Surgery, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire UK (D.C.); and UK EQUATOR Center, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford UK (G.S.C.)
- Terry-Ann Curran
- From the Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, Old Road Headington, Oxford OX3 7LD, UK (R.Y.L.K., C.H., M.S., G.S.C., D.F.); Department of Plastic Surgery, John Radcliffe Hospital, Oxford, UK (T.A.C., A.F.); Department of Vascular Surgery, Royal Berkshire Hospital, Reading, UK (B.J.); Department of Plastic Surgery, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire UK (D.C.); and UK EQUATOR Center, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford UK (G.S.C.)
| | - Benjamin Jones
- From the Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, Old Road Headington, Oxford OX3 7LD, UK (R.Y.L.K., C.H., M.S., G.S.C., D.F.); Department of Plastic Surgery, John Radcliffe Hospital, Oxford, UK (T.A.C., A.F.); Department of Vascular Surgery, Royal Berkshire Hospital, Reading, UK (B.J.); Department of Plastic Surgery, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire UK (D.C.); and UK EQUATOR Center, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford UK (G.S.C.)
| | - Alexander Freethy
- From the Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, Old Road Headington, Oxford OX3 7LD, UK (R.Y.L.K., C.H., M.S., G.S.C., D.F.); Department of Plastic Surgery, John Radcliffe Hospital, Oxford, UK (T.A.C., A.F.); Department of Vascular Surgery, Royal Berkshire Hospital, Reading, UK (B.J.); Department of Plastic Surgery, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire UK (D.C.); and UK EQUATOR Center, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford UK (G.S.C.)
| | - David Cussons
- From the Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, Old Road Headington, Oxford OX3 7LD, UK (R.Y.L.K., C.H., M.S., G.S.C., D.F.); Department of Plastic Surgery, John Radcliffe Hospital, Oxford, UK (T.A.C., A.F.); Department of Vascular Surgery, Royal Berkshire Hospital, Reading, UK (B.J.); Department of Plastic Surgery, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire UK (D.C.); and UK EQUATOR Center, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford UK (G.S.C.)
| | - Max Stewart
- From the Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, Old Road Headington, Oxford OX3 7LD, UK (R.Y.L.K., C.H., M.S., G.S.C., D.F.); Department of Plastic Surgery, John Radcliffe Hospital, Oxford, UK (T.A.C., A.F.); Department of Vascular Surgery, Royal Berkshire Hospital, Reading, UK (B.J.); Department of Plastic Surgery, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire UK (D.C.); and UK EQUATOR Center, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford UK (G.S.C.)
| | - Gary S Collins
- From the Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, Old Road Headington, Oxford OX3 7LD, UK (R.Y.L.K., C.H., M.S., G.S.C., D.F.); Department of Plastic Surgery, John Radcliffe Hospital, Oxford, UK (T.A.C., A.F.); Department of Vascular Surgery, Royal Berkshire Hospital, Reading, UK (B.J.); Department of Plastic Surgery, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire UK (D.C.); and UK EQUATOR Center, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford UK (G.S.C.)
| | - Dominic Furniss
- From the Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, Old Road Headington, Oxford OX3 7LD, UK (R.Y.L.K., C.H., M.S., G.S.C., D.F.); Department of Plastic Surgery, John Radcliffe Hospital, Oxford, UK (T.A.C., A.F.); Department of Vascular Surgery, Royal Berkshire Hospital, Reading, UK (B.J.); Department of Plastic Surgery, Stoke Mandeville Hospital, Aylesbury, Buckinghamshire UK (D.C.); and UK EQUATOR Center, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford Centre for Statistics in Medicine, Oxford UK (G.S.C.)
| |
|