1
Bharadwaj UU, Chin CT, Majumdar S. Practical Applications of Artificial Intelligence in Spine Imaging: A Review. Radiol Clin North Am 2024; 62:355-370. [PMID: 38272627 DOI: 10.1016/j.rcl.2023.10.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
Abstract
Artificial intelligence (AI), a transformative technology with unprecedented potential in medical imaging, can be applied to various spinal pathologies. AI-based approaches may improve imaging efficiency, diagnostic accuracy, and interpretation, which is essential for positive patient outcomes. This review explores AI algorithms, techniques, and applications in spine imaging, highlighting diagnostic impact and challenges with future directions for integrating AI into spine imaging workflow.
Affiliation(s)
- Upasana Upadhyay Bharadwaj
- Department of Radiology and Biomedical Imaging, University of California San Francisco, 1700 4th Street, Byers Hall, Suite 203, Room 203D, San Francisco, CA 94158, USA
- Cynthia T Chin
- Department of Radiology and Biomedical Imaging, University of California San Francisco, 505 Parnassus Avenue, Box 0628, San Francisco, CA 94143, USA
- Sharmila Majumdar
- Department of Radiology and Biomedical Imaging, University of California San Francisco, 1700 4th Street, Byers Hall, Suite 203, Room 203D, San Francisco, CA 94158, USA
2
Liu G, Wang L, You S, Wang Z, Zhu S, Chen C, Ma X, Yang L, Zhang S, Yang Q. Automatic Detection and Classification of Modic Changes in MRI Images Using Deep Learning: Intelligent Assisted Diagnosis System. Orthop Surg 2024; 16:196-206. [PMID: 37933461 PMCID: PMC10782244 DOI: 10.1111/os.13894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 08/08/2023] [Accepted: 08/10/2023] [Indexed: 11/08/2023] Open
Abstract
OBJECTIVE Modic changes (MCs) are the most prevalent classification system for describing intravertebral MRI signal intensity changes. However, interpreting these intricate MRI images is a complex and time-consuming process. This study investigates the performance of single shot multibox detector (SSD) and ResNet18 network-based automatic detection and classification of MCs. Additionally, it compares the inter-observer agreement and observer-classifier agreement in MCs diagnosis to validate the feasibility of deep learning network-assisted detection of classified MCs. METHOD A retrospective analysis of 140 patients with MCs who underwent MRI diagnosis and met the inclusion and exclusion criteria in Tianjin Hospital from June 2020 to June 2021 was used as the internal dataset. This group consisted of 55 males and 85 females, aged 25 to 89 years, with a mean age of (59.0 ± 13.7) years. An external test dataset of 28 patients, who met the same criteria and were assessed using different MRI equipment at Tianjin Hospital, was also gathered, including 11 males and 17 females, aged 31 to 84 years, with a mean age of 62.7 ± 10.9 years. After Physician 1 (with 15 years of experience) annotated all MRI images, the internal dataset was imported into the deep learning model for training. The model comprises an SSD network for lesion localization and a ResNet18 network for lesion classification. Performance metrics, including accuracy, recall, precision, F1 score, confusion matrix, and inter-observer agreement parameter Kappa value, were used to evaluate the model's performance on the internal and external datasets. Physician 2 (with 1 year of experience) re-labeled the internal and external test datasets to compare the inter-observer agreement and observer-classifier agreement. 
RESULTS In the internal dataset, when the models were used for the detection and classification of MCs, the accuracy, recall, precision and F1 score reached 86.25%, 87.77%, 84.92% and 85.60%, respectively. The Kappa value of the inter-observer agreement was 0.768 (95% CI: 0.656, 0.847), while observer-classifier agreement was 0.717 (95% CI: 0.589, 0.809). In the external test dataset, the model's accuracy, recall, precision and F1 score for diagnosing MCs reached 75%, 77.08%, 77.80% and 74.97%, respectively. The inter-observer agreement was 0.681 (95% CI: 0.512, 0.677), and observer-classifier agreement was 0.519 (95% CI: 0.290, 0.690). CONCLUSION The model demonstrated strong performance in detecting and classifying MCs, achieving high agreement with physicians in MCs diagnosis. These results suggest that deep learning models have the potential to facilitate the application of intelligent assisted diagnosis techniques in the field of spine research.
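As a rough illustration of the metrics named in this abstract (not the authors' code), the sketch below computes accuracy, recall, precision, F1 score and Cohen's kappa from a confusion matrix. The function names and the example matrix are hypothetical, and macro-averaging across classes is an assumption; the abstract does not state how per-class scores were combined.

```python
from typing import Dict, List


def cohen_kappa(matrix: List[List[int]]) -> float:
    """Cohen's kappa for a square agreement matrix (rows: rater A, columns: rater B)."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / total
    row_totals = [sum(matrix[i]) for i in range(k)]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, derived from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (total * total)
    if expected == 1.0:
        return 1.0  # degenerate case: both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def macro_metrics(matrix: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged recall, precision and F1 (rows: true, columns: predicted)."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    recalls, precisions, f1s = [], [], []
    for c in range(k):
        tp = matrix[c][c]
        actual = sum(matrix[c])                          # samples whose true class is c
        predicted = sum(matrix[r][c] for r in range(k))  # samples predicted as class c
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        precisions.append(precision)
        f1s.append(f1)
    return {
        "accuracy": sum(matrix[i][i] for i in range(k)) / total,
        "recall": sum(recalls) / k,
        "precision": sum(precisions) / k,
        "f1": sum(f1s) / k,
    }


if __name__ == "__main__":
    # Hypothetical 2x2 matrix, for illustration only (not data from the study).
    example = [[20, 5], [10, 15]]
    print(macro_metrics(example))
    print(cohen_kappa(example))
```

The same kappa function serves for both inter-observer and observer-classifier agreement; only the meaning of the rows and columns changes.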
Affiliation(s)
- Gang Liu
- Clinical School/College of Orthopaedics, Tianjin Medical University, Tianjin, China
- Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
- Lei Wang
- Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Sheng‐nan You
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Zhi Wang
- Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
- Shan Zhu
- Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
- Chao Chen
- Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
- Xin‐long Ma
- Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
- Lei Yang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Shuai Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences & Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Qiang Yang
- Department of Spine Surgery, Tianjin Hospital, Tianjin University, Tianjin, China
3
Cina A, Haschtmann D, Damopoulos D, Gerber N, Loibl M, Fekete T, Kleinstück F, Galbusera F. Comparing image normalization techniques in an end-to-end model for automated modic changes classification from MRI images. Brain & Spine 2023; 4:102738. [PMID: 38510635 PMCID: PMC10951698 DOI: 10.1016/j.bas.2023.102738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 11/07/2023] [Accepted: 12/20/2023] [Indexed: 03/22/2024]
Abstract
Introduction Modic Changes (MCs) are alterations in the MRI signal intensity of spinal vertebrae. This study introduces an end-to-end model to automatically detect and classify MCs in lumbar MRIs. The model's two-step process involves locating intervertebral regions and then categorizing MC types (MC0, MC1, MC2) using paired T1- and T2-weighted images. This approach offers a promising solution for efficient and standardized MC assessment. Research question The aim is to investigate how different MRI normalization techniques affect MC classification and how the model can be used in a clinical setting. Material and methods A combination of Faster R-CNN and a 3D Convolutional Neural Network (CNN) is employed. The model first identifies intervertebral regions and then classifies MC types (MC0, MC1, MC2) using paired T1- and T2-weighted lumbar MRIs. Two datasets are used for model development and evaluation. Results The detection model achieves high accuracy in identifying intervertebral areas, with Intersection over Union (IoU) values above 0.7, indicating strong localization alignment. Confidence scores above 0.9 demonstrate the model's accurate identification of vertebral levels. In the classification task, standardization yields the best performance for MC type assessment, achieving mean sensitivities of 0.83 for MC0, 0.85 for MC1, and 0.78 for MC2, along with a balanced accuracy of 0.80 and an F1 score of 0.88. Discussion and conclusion The study's end-to-end model shows promise in automating MC assessment, contributing to standardized diagnostics and treatment planning. Limitations include dataset size, class imbalance, and lack of external validation. Future research should focus on external validation, refining model generalization, and improving clinical applicability.
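To make two of the quantities in this abstract concrete, here is a minimal sketch (not the authors' implementation) of Intersection over Union for two axis-aligned boxes and of intensity standardization. It assumes that "standardization" means z-score normalization (zero mean, unit variance), which the abstract does not spell out; the function names and the `(x1, y1, x2, y2)` box layout are illustrative choices.

```python
import math
from typing import List, Sequence


def iou(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def standardize(intensities: Sequence[float]) -> List[float]:
    """Z-score normalization: subtract the mean, divide by the standard deviation."""
    n = len(intensities)
    mean = sum(intensities) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in intensities) / n)
    if std == 0.0:
        return [0.0] * n  # constant input carries no contrast to rescale
    return [(v - mean) / std for v in intensities]


if __name__ == "__main__":
    # Hypothetical predicted and reference boxes, for illustration only.
    predicted = (0.0, 0.0, 10.0, 10.0)
    reference = (5.0, 0.0, 15.0, 10.0)
    print(iou(predicted, reference))
    print(standardize([1.0, 2.0, 3.0]))
```

Under this reading, a detection counts as well localized when its IoU with the reference region exceeds the 0.7 level mentioned in the abstract.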
Affiliation(s)
- Andrea Cina
- ETH Zürich, Department of Health Sciences and Technologies, Zürich, Switzerland
- Schulthess Klinik, Department of Teaching, Research and Development, Zürich, Switzerland
- Daniel Haschtmann
- Schulthess Klinik, Department of Spine Surgery and Neurosurgery, Zürich, Switzerland
- Nicolas Gerber
- Personalised Medicine Research, School of Biomedical and Precision Engineering, University of Bern, Switzerland
- Markus Loibl
- Schulthess Klinik, Department of Spine Surgery and Neurosurgery, Zürich, Switzerland
- Tamas Fekete
- Schulthess Klinik, Department of Spine Surgery and Neurosurgery, Zürich, Switzerland
- Frank Kleinstück
- Schulthess Klinik, Department of Spine Surgery and Neurosurgery, Zürich, Switzerland
- Fabio Galbusera
- Schulthess Klinik, Department of Teaching, Research and Development, Zürich, Switzerland
4
Compte R, Granville Smith I, Isaac A, Danckert N, McSweeney T, Liantis P, Williams FMK. Are current machine learning applications comparable to radiologist classification of degenerate and herniated discs and Modic change? A systematic review and meta-analysis. European Spine Journal 2023; 32:3764-3787. [PMID: 37150769 PMCID: PMC10164619 DOI: 10.1007/s00586-023-07718-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/28/2022] [Revised: 02/08/2023] [Accepted: 04/09/2023] [Indexed: 05/09/2023]
Abstract
INTRODUCTION Low back pain is the leading contributor to disability burden globally. It is commonly due to degeneration of the lumbar intervertebral discs (LDD). Magnetic resonance imaging (MRI) is the current best tool to visualize and diagnose LDD, but places high time demands on clinical radiologists. Automated reading of spine MRIs could improve speed, accuracy, reliability and cost effectiveness in radiology departments. The aim of this review and meta-analysis was to determine if current machine learning algorithms perform well identifying disc degeneration, herniation, bulge and Modic change compared to radiologists. METHODS A PRISMA systematic review protocol was developed and four electronic databases and reference lists were searched. Strict inclusion and exclusion criteria were defined. A PROBAST risk of bias and applicability analysis was performed. RESULTS 1350 articles were extracted. Duplicates were removed and title and abstract searching identified original research articles that used machine learning (ML) algorithms to identify disc degeneration, herniation, bulge and Modic change from MRIs. 27 studies were included in the review; 25 and 14 studies were included in the multivariate and bivariate meta-analyses, respectively. Studies used machine learning algorithms to assess LDD, disc herniation, bulge and Modic change. Models using deep learning, support vector machine, k-nearest neighbors, random forest and naïve Bayes algorithms were included. Meta-analyses found no differences in algorithm or classification performance. When algorithms were tested in replication or external validation studies, they did not perform as well as when assessed in developmental studies. Data augmentation improved algorithm performance when compared to models used with smaller datasets; there were no performance differences between augmented data and large datasets.
DISCUSSION This review highlights several shortcomings of current approaches, including few validation attempts and limited use of large sample sizes. To the best of the authors' knowledge, this is the first systematic review to explore this topic. We suggest the utilization of deep learning coupled with semi- or unsupervised learning approaches. Use of all information contained in MRI data will improve accuracy. Clear and complete reporting of study design, statistics and results will improve the reliability and quality of published literature.
Affiliation(s)
- Roger Compte
- Department of Twin Research, King's College London, St Thomas' Hospital Campus, 4th Floor South Wing, Block D, Westminster Bridge Road, London, SE1 7EH, UK.
- Isabelle Granville Smith
- Department of Twin Research, King's College London, St Thomas' Hospital Campus, 4th Floor South Wing, Block D, Westminster Bridge Road, London, SE1 7EH, UK
- Amanda Isaac
- School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- Nathan Danckert
- Department of Twin Research, King's College London, St Thomas' Hospital Campus, 4th Floor South Wing, Block D, Westminster Bridge Road, London, SE1 7EH, UK
- Terence McSweeney
- Research Unit of Health Sciences and Technology, University of Oulu, Oulu, Finland
- Panagiotis Liantis
- Guy's and St Thomas' National Health Services Foundation Trust, London, UK
- Frances M K Williams
- Department of Twin Research, King's College London, St Thomas' Hospital Campus, 4th Floor South Wing, Block D, Westminster Bridge Road, London, SE1 7EH, UK
5
de Vries BM, Zwezerijnen GJC, Burchell GL, van Velden FHP, Menke-van der Houven van Oordt CW, Boellaard R. Explainable artificial intelligence (XAI) in radiology and nuclear medicine: a literature review. Front Med (Lausanne) 2023; 10:1180773. [PMID: 37250654 PMCID: PMC10213317 DOI: 10.3389/fmed.2023.1180773] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/06/2023] [Accepted: 04/17/2023] [Indexed: 05/31/2023] Open
Abstract
Rationale Deep learning (DL) has demonstrated a remarkable performance in diagnostic imaging for various diseases and modalities and therefore has a high potential to be used as a clinical tool. However, current practice shows low deployment of these algorithms in clinical practice, because DL algorithms lack transparency and trust due to their underlying black-box mechanism. For successful employment, explainable artificial intelligence (XAI) could be introduced to close the gap between the medical professionals and the DL algorithms. In this literature review, XAI methods available for magnetic resonance (MR), computed tomography (CT), and positron emission tomography (PET) imaging are discussed and future suggestions are made. Methods PubMed, Embase.com and Clarivate Analytics/Web of Science Core Collection were screened. Articles were considered eligible for inclusion if XAI was used (and well described) to describe the behavior of a DL model used in MR, CT and PET imaging. Results A total of 75 articles were included, of which 54 and 17 articles described post hoc and ad hoc XAI methods, respectively, and 4 articles described both XAI methods. Major variations in performance are seen between the methods. Overall, post hoc XAI lacks the ability to provide class-discriminative and target-specific explanation. Ad hoc XAI seems to tackle this because of its intrinsic ability to explain. However, quality control of the XAI methods is rarely applied and therefore systematic comparison between the methods is difficult. Conclusion There is currently no clear consensus on how XAI should be deployed in order to close the gap between medical professionals and DL algorithms for clinical implementation. We advocate for systematic technical and clinical quality assessment of XAI methods. Also, to ensure end-to-end unbiased and safe integration of XAI in clinical workflow, (anatomical) data minimization and quality control methods should be included.
Affiliation(s)
- Bart M. de Vries
- Department of Radiology and Nuclear Medicine, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Gerben J. C. Zwezerijnen
- Department of Radiology and Nuclear Medicine, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Ronald Boellaard
- Department of Radiology and Nuclear Medicine, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
6
Chen K, Zhai X, Wang S, Li X, Lu Z, Xia D, Li M. Emerging trends and research foci of deep learning in spine: bibliometric and visualization study. Neurosurg Rev 2023; 46:81. [PMID: 37000304 DOI: 10.1007/s10143-023-01987-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/15/2023] [Revised: 03/10/2023] [Accepted: 03/26/2023] [Indexed: 04/01/2023]
Abstract
As understanding of the spine develops, deep learning (DL) emerges as a powerful tool with tremendous potential for advancing research in this field. To provide a comprehensive overview of DL-spine research, our study utilized bibliometric and visual methods to retrieve relevant articles from the Web of Science database. VOSviewer and CiteSpace were primarily used for literature measurement and knowledge graph analysis. A total of 273 studies focusing on deep learning in the spine, with a combined total of 2302 citations, were retrieved. Additionally, the overall number of articles published on this topic demonstrated a continuous upward trend. China was the country with the highest number of publications, whereas the USA had the most citations. The two most prominent journals were "European Spine Journal" and "Medical Image Analysis," and the most involved research area was Radiology Nuclear Medicine Medical Imaging. VOSviewer identified three visually distinct clusters: "segmentation," "area," and "neural network." Meanwhile, CiteSpace highlighted "magnetic resonance image" and "lumbar" as the keywords with the longest usage, and "agreement" and "automated detection" as the most commonly used keywords. Although the application of DL in the spine is still in its infancy, its future is promising. Intercontinental cooperation, extensive application, and more interpretable algorithms will invigorate DL in the field of spine.
Affiliation(s)
- Kai Chen
- Department of Orthopedics, Shanghai Changhai Hospital, Shanghai, 200433, China
- Xiao Zhai
- Department of Orthopedics, Shanghai Changhai Hospital, Shanghai, 200433, China
- Sheng Wang
- Department of Emergency, Shanghai Changhai Hospital, Shanghai, China
- Xiaoyu Li
- Department of Orthopedics, Shanghai Changhai Hospital, Shanghai, 200433, China
- Zhikai Lu
- Department of Orthopedics, No. 906 Hospital of Joint Logistic Support Force of PLA, Ningbo, Zhejiang, China
- Demeng Xia
- Luodian Clinical Drug Research Center, Shanghai Baoshan Luodian Hospital, Shanghai University, Shanghai, China
- Emergency Department, Naval Hospital of Eastern Theater, Zhoushan, Zhejiang, China
- Ming Li
- Department of Orthopedics, Shanghai Changhai Hospital, Shanghai, 200433, China
7
A More Posterior Tibial Tubercle (Decreased Sagittal Tibial Tubercle-Trochlear Groove Distance) Is Significantly Associated With Patellofemoral Joint Degenerative Cartilage Change: A Deep Learning Analysis. Arthroscopy 2022; 39:1493-1501.e2. [PMID: 36581003 DOI: 10.1016/j.arthro.2022.11.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 11/23/2022] [Accepted: 11/30/2022] [Indexed: 12/27/2022]
Abstract
PURPOSE To perform patellofemoral joint (PFJ) geometric measurements on knee magnetic resonance imaging scans and determine their relations with chondral lesions in a multicenter cohort using deep learning. METHODS The sagittal tibial tubercle-trochlear groove (sTTTG) distance, tibial tubercle-trochlear groove distance, trochlear sulcus angle, trochlear depth, Caton-Deschamps Index (CDI), and flexion angle were measured by use of deep learning-generated segmentations on a subset of the Osteoarthritis Initiative study with radiologist-graded PFJ cartilage grades (n = 2,461). Kruskal-Wallis H tests were performed to compare differences in PFJ morphology between subjects without PFJ osteoarthritis (OA) and those with PFJ OA. PFJ morphology was correlated with secondary outcomes of mean patellar cartilage thickness and mean patellar cartilage T2 relaxation time using linear regression models controlling for age, sex, and body mass index. RESULTS A total of 1,626 knees did not have PFJ OA, whereas 835 knees had PFJ OA. Knees without PFJ OA had an increased (anterior) sTTTG distance (mean ± standard deviation, 11.1 ± 12.8 mm) compared with knees with PFJ OA (8.4 ± 12.7 mm) (P < .001), indicating a more posterior tibial tubercle in subjects with PFJ OA. Knees without PFJ OA had a decreased sulcus angle (127.4° ± 7.1° vs 128.0° ± 8.4°, P = .01) and increased trochlear depth (9.1 ± 1.7 mm vs 9.0 ± 2.0 mm, P = .03) compared with knees with PFJ OA. Decreased patellar cartilage thickness was associated with decreased trochlear depth (β = 0.12, P = .002) and increased CDI (β = -0.07, P < .001). Increased patellar cartilage T2 relaxation time was correlated with decreased sTTTG distance (β = -0.08, P = .01), decreased sulcus angle (β = -0.12, P = .04), and decreased CDI (β = -0.12, P < .001). CONCLUSIONS PFJ OA, patellar cartilage thickness, and patellar cartilage T2 relaxation time were shown to be associated with the underlying geometries within the PFJ. 
This large longitudinal study highlights that a decreased sTTTG distance (i.e., a more posterior tibial tubercle) is significantly associated with PFJ degenerative cartilage change. LEVEL OF EVIDENCE Level III, retrospective comparative prognostic trial.