1
Fabijan A, Zawadzka-Fabijan A, Fabijan R, Zakrzewski K, Nowosławska E, Polis B. Assessing the Accuracy of Artificial Intelligence Models in Scoliosis Classification and Suggested Therapeutic Approaches. J Clin Med 2024; 13:4013. [PMID: 39064053 PMCID: PMC11278075 DOI: 10.3390/jcm13144013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Revised: 06/30/2024] [Accepted: 07/06/2024] [Indexed: 07/28/2024]
Abstract
Background: Open-source artificial intelligence models (OSAIMs) are increasingly being applied in various fields, including IT and medicine, offering promising solutions for diagnostic and therapeutic interventions. In response to the growing interest in AI for clinical diagnostics, we evaluated several OSAIMs (ChatGPT 4, Microsoft Copilot, Gemini, PopAi, You Chat, Claude, and the specialized PMC-LLaMA 13B), assessing their abilities to classify scoliosis severity and recommend treatments based on radiological descriptions from AP radiographs. Methods: Our study employed a two-stage methodology, where descriptions of single-curve scoliosis were analyzed by AI models following their evaluation by two independent neurosurgeons. Statistical analysis involved the Shapiro-Wilk test for normality, with non-normal distributions described using medians and interquartile ranges. Inter-rater reliability was assessed using Fleiss' kappa, and performance metrics, such as accuracy, sensitivity, specificity, and F1 scores, were used to evaluate the AI systems' classification accuracy. Results: The analysis indicated that although some AI systems, such as ChatGPT 4, Copilot, and PopAi, accurately reflected the recommended Cobb angle ranges for disease severity and treatment, others, such as Gemini and Claude, required further calibration. In particular, PMC-LLaMA 13B expanded the classification range for moderate scoliosis, potentially influencing clinical decisions and delaying interventions. Conclusions: These findings highlight the need for the continuous refinement of AI models to enhance their clinical applicability.
Affiliation(s)
- Artur Fabijan
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Agnieszka Zawadzka-Fabijan
  - Department of Rehabilitation Medicine, Faculty of Health Sciences, Medical University of Lodz, 90-419 Lodz, Poland
- Krzysztof Zakrzewski
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Emilia Nowosławska
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Bartosz Polis
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
2
Hey HWD, Low TL, Soh HL, Tan KA, Tan JH, Tan TH, Thomas AC, Ka-Po Liu G, Wong HK, Tan JHJ. Prevalence and Risk Factors of Degenerative Spondylolisthesis and Retrolisthesis in the Thoracolumbar and Lumbar Spine - An EOS Study Using Updated Radiographic Parameters. Global Spine J 2024; 14:1137-1147. [PMID: 36749604 PMCID: PMC11289555 DOI: 10.1177/21925682221134044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
Abstract
STUDY DESIGN: Single centre, cross-sectional study. OBJECTIVES: The objective is to report the prevalence of spondylolisthesis and retrolisthesis, analyse both conditions in terms of the affected levels and severity, and identify their risk factors. METHODS: A review of clinical data and radiographic images of consecutive spine patients seen in outpatient clinics over a 1-month period was performed. Images were obtained using the EOS® technology under a standardised protocol, and radiographic measurements were performed by 2 independent, blinded spine surgeons. The prevalence of both conditions was shown and categorised based on the spinal level involvement and severity. Associated risk factors were identified. RESULTS: A total of 256 subjects (46.1% males) with 2304 discs from T9/10 to L5/S1 were studied. Their mean age was 52.2 (± 18.7) years. The overall prevalence of spondylolisthesis and retrolisthesis was 25.9% and 17.1% respectively. Spondylolisthesis occurs frequently at L4/5 (16.3%), and retrolisthesis at L3/4 (6.8%). The majority of the patients with spondylolisthesis had a Grade I slip (84.3%), while those with retrolisthesis had a Grade I slip. The presence of spondylolisthesis was found to be associated with increased age (P < .001), female gender (OR: 2.310; P = .005), predominantly sitting occupations (OR: 2.421; P = .008), higher American Society of Anaesthesiology grades (P = .001), and lower limb radiculopathy (OR: 2.175; P = .007). Patients with spondylolisthesis had larger Pelvic Incidence (P < .001), Pelvic Tilt (P < .001) and Knee alignment angle (P = .011), but smaller Thoracolumbar junctional angle (P = .008) and Spinocoxa angle (P = .007). Retrolisthesis was associated with a larger Thoracolumbar junctional angle (P = .039). CONCLUSION: This is the first study that details the prevalence of spondylolisthesis and retrolisthesis simultaneously, using the EOS technology and updated sagittal radiographic parameters. It allows better understanding of both conditions, their mutual relationship, and associated clinical and radiographic risk factors.
Affiliation(s)
- Hwee Weng Dennis Hey
  - Department of Orthopaedic Surgery, National University Hospital (NUH), Singapore
- Tian Ling Low
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Hui Ling Soh
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Kimberly-Anne Tan
  - Department of Orthopaedic Surgery, National University Hospital (NUH), Singapore
- Jun-Hao Tan
  - Department of Orthopaedic Surgery, National University Hospital (NUH), Singapore
- Tuan Hao Tan
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Gabriel Ka-Po Liu
  - Department of Orthopaedic Surgery, National University Hospital (NUH), Singapore
- Hee-Kit Wong
  - Department of Orthopaedic Surgery, National University Hospital (NUH), Singapore
3
Fabijan A, Zawadzka-Fabijan A, Fabijan R, Zakrzewski K, Nowosławska E, Polis B. Artificial Intelligence in Medical Imaging: Analyzing the Performance of ChatGPT and Microsoft Bing in Scoliosis Detection and Cobb Angle Assessment. Diagnostics (Basel) 2024; 14:773. [PMID: 38611686 PMCID: PMC11011528 DOI: 10.3390/diagnostics14070773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 03/24/2024] [Accepted: 04/04/2024] [Indexed: 04/14/2024]
Abstract
Open-source artificial intelligence models (OSAIM) find free applications in various industries, including information technology and medicine. Their clinical potential, especially in supporting diagnosis and therapy, is the subject of increasingly intensive research. Due to the growing interest in artificial intelligence (AI) for diagnostic purposes, we conducted a study evaluating the capabilities of AI models, including ChatGPT and Microsoft Bing, in the diagnosis of single-curve scoliosis based on posturographic radiological images. Two independent neurosurgeons assessed the degree of spinal deformation, selecting 23 cases of severe single-curve scoliosis. Each posturographic image was separately implemented onto each of the mentioned platforms using a set of formulated questions, starting from 'What do you see in the image?' and ending with a request to determine the Cobb angle. In the responses, we focused on how these AI models identify and interpret spinal deformations and how accurately they recognize the direction and type of scoliosis as well as vertebral rotation. The Intraclass Correlation Coefficient (ICC) with a 'two-way' model was used to assess the consistency of Cobb angle measurements, and its confidence intervals were determined using the F test. Differences in Cobb angle measurements between human assessments and the AI ChatGPT model were analyzed using metrics such as RMSEA, MSE, MPE, MAE, RMSLE, and MAPE, allowing for a comprehensive assessment of AI model performance from various statistical perspectives. The ChatGPT model achieved 100% effectiveness in detecting scoliosis in X-ray images, while the Bing model did not detect any scoliosis. However, ChatGPT had limited effectiveness (43.5%) in assessing Cobb angles, showing significant inaccuracy and discrepancy compared to human assessments. This model also had limited accuracy in determining the direction of spinal curvature, classifying the type of scoliosis, and detecting vertebral rotation. Overall, although ChatGPT demonstrated potential in detecting scoliosis, its abilities in assessing Cobb angles and other parameters were limited and inconsistent with expert assessments. These results underscore the need for comprehensive improvement of AI algorithms, including broader training with diverse X-ray images and advanced image processing techniques, before they can be considered as auxiliary in diagnosing scoliosis by specialists.
Affiliation(s)
- Artur Fabijan
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Agnieszka Zawadzka-Fabijan
  - Department of Rehabilitation Medicine, Faculty of Health Sciences, Medical University of Lodz, 90-419 Lodz, Poland
- Krzysztof Zakrzewski
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Emilia Nowosławska
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Bartosz Polis
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
4
Fabijan A, Polis B, Fabijan R, Zakrzewski K, Nowosławska E, Zawadzka-Fabijan A. Artificial Intelligence in Scoliosis Classification: An Investigation of Language-Based Models. J Pers Med 2023; 13:1695. [PMID: 38138922 PMCID: PMC10744696 DOI: 10.3390/jpm13121695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 12/03/2023] [Accepted: 12/07/2023] [Indexed: 12/24/2023]
Abstract
Open-source artificial intelligence models are finding free application in various industries, including computer science and medicine. Their clinical potential, especially in assisting diagnosis and therapy, is the subject of increasingly intensive research. Due to the growing interest in AI for diagnostics, we conducted a study evaluating the abilities of AI models, including ChatGPT, Microsoft Bing, and Scholar AI, in classifying single-curve scoliosis based on radiological descriptions. Fifty-six posturographic images depicting single-curve scoliosis were selected and assessed by two independent neurosurgery specialists, who classified them as mild, moderate, or severe based on Cobb angles. Subsequently, descriptions were developed that accurately characterized the degree of spinal deformation, based on the measured values of Cobb angles. These descriptions were then provided to AI language models to assess their proficiency in diagnosing spinal pathologies. The artificial intelligence models conducted classification using the provided data. Our study also focused on identifying specific sources of information and criteria applied in their decision-making algorithms, aiming for a deeper understanding of the determinants influencing AI decision processes in scoliosis classification. The classification quality of the predictions was evaluated using performance evaluation metrics such as sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and balanced accuracy. Our study strongly supported our hypothesis, showing that among four AI models, ChatGPT 4 and Scholar AI Premium excelled in classifying single-curve scoliosis with perfect sensitivity and specificity. These models demonstrated unmatched rater concordance and excellent performance metrics. In comparing real and AI-generated scoliosis classifications, they showed impeccable precision in all posturographic images, indicating total accuracy (1.0, MAE = 0.0) and remarkable inter-rater agreement, with a perfect Fleiss' Kappa score. This was consistent across scoliosis cases with a Cobb's angle range of 11-92 degrees. Despite high accuracy in classification, each model used an incorrect angular range for the mild stage of scoliosis. Our findings highlight the immense potential of AI in analyzing medical data sets. However, the diversity in competencies of AI models indicates the need for their further development to more effectively meet specific needs in clinical practice.
Affiliation(s)
- Artur Fabijan
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Bartosz Polis
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Krzysztof Zakrzewski
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Emilia Nowosławska
  - Department of Neurosurgery, Polish-Mother’s Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Agnieszka Zawadzka-Fabijan
  - Department of Rehabilitation Medicine, Faculty of Health Sciences, Medical University of Lodz, 90-419 Lodz, Poland
5
Fabijan A, Fabijan R, Zawadzka-Fabijan A, Nowosławska E, Zakrzewski K, Polis B. Evaluating Scoliosis Severity Based on Posturographic X-ray Images Using a Contrastive Language-Image Pretraining Model. Diagnostics (Basel) 2023; 13:2142. [PMID: 37443536 DOI: 10.3390/diagnostics13132142] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/11/2023] [Revised: 06/10/2023] [Accepted: 06/20/2023] [Indexed: 07/15/2023]
Abstract
Assessing severe scoliosis requires the analysis of posturographic X-ray images. One way to analyse these images may involve the use of open-source artificial intelligence models (OSAIMs), such as the contrastive language-image pretraining (CLIP) system, which was designed to combine images with text. This study aims to determine whether the CLIP model can recognise visible severe scoliosis in posturographic X-ray images. This study used 23 posturographic images of patients diagnosed with severe scoliosis that were evaluated by two independent neurosurgery specialists. Subsequently, the X-ray images were input into the CLIP system, where they were subjected to a series of questions with varying levels of difficulty and comprehension. The predictions obtained using the CLIP models in the form of probabilities ranging from 0 to 1 were compared with the actual data. To evaluate the quality of image recognition, true positives, false negatives, and sensitivity were determined. The results of this study show that the CLIP system can perform a basic assessment of X-ray images showing visible severe scoliosis with a high level of sensitivity. It can be assumed that, in the future, OSAIMs dedicated to image analysis may become commonly used to assess X-ray images, including those of scoliosis.
Affiliation(s)
- Artur Fabijan
  - Department of Neurosurgery, Polish-Mother's Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Emilia Nowosławska
  - Department of Neurosurgery, Polish-Mother's Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Krzysztof Zakrzewski
  - Department of Neurosurgery, Polish-Mother's Memorial Hospital Research Institute, 93-338 Lodz, Poland
- Bartosz Polis
  - Department of Neurosurgery, Polish-Mother's Memorial Hospital Research Institute, 93-338 Lodz, Poland
6
Tsagkaris C, Widmer J, Wanivenhaus F, Redaelli A, Lamartina C, Farshad M. The sitting vs standing spine. North American Spine Society Journal 2022; 9:100108. [PMID: 35310424 PMCID: PMC8924684 DOI: 10.1016/j.xnsj.2022.100108] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/18/2022] [Revised: 02/21/2022] [Accepted: 02/24/2022] [Indexed: 11/28/2022]
Abstract
Background: Planning of surgical procedures for spinal fusion is performed on standing radiographs, neglecting the fact that patients are mostly in the sitting position during daily life. Awareness of the differences between the standing and sitting configurations of the spine has increased in recent years. The purpose was to provide an overview of studies related to seated imaging for spinal fusion surgery, identify knowledge gaps and evaluate future research questions. Methods: A literature search according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) extension for Scoping Reviews (PRISMASc) was performed to identify reports related to seated imaging for spinal deformity surgery. A summary of the findings is presented for healthy individuals as well as patients with a spinal disorder and/or surgery. Results: The systematic search identified 30 original studies reporting on 1) the pre- and postoperative use of seated imaging of the spine (n=12), 2) seated imaging of the spine for non-surgical evaluation (n=7) and 3) seated imaging of the spine among healthy individuals (n=12). The summarized evidence shows that sitting leads to a straightening of the spine, decreasing thoracic kyphosis (TK), lumbar lordosis (LL), and the sacral slope (SS). Further, the postural change between standing and sitting is more significant in the lower segments of the spine. Also, the adjacent segment compensates for the needed postural change of the lumbar spine while sitting with hyperkyphosis. Conclusions: The spine has a different configuration in standing and sitting. This systematic review summarizes the current knowledge about such differences and reveals that there is minimal evidence about their consideration for surgical planning of spinal fusion surgery. Further, it identifies gaps in knowledge and areas of further research.
Affiliation(s)
- Christos Tsagkaris
  - Department of Orthopedics, Balgrist University Hospital, Zurich, Switzerland
  - Spine Biomechanics, Department of Orthopaedics, Balgrist University Hospital, Zurich, Switzerland
- Jonas Widmer
  - Department of Orthopedics, Balgrist University Hospital, Zurich, Switzerland
  - Spine Biomechanics, Department of Orthopaedics, Balgrist University Hospital, Zurich, Switzerland
- Florian Wanivenhaus
  - Department of Orthopedics, Balgrist University Hospital, Zurich, Switzerland
- Andrea Redaelli
  - GSpine4 - I.R.C.C.S. Istituto Ortopedico Galeazzi, Milan, Italy
- Mazda Farshad
  - Department of Orthopedics, Balgrist University Hospital, Zurich, Switzerland