1. Chen H, Zhang B, Huang J. Recent advances and applications of artificial intelligence in 3D bioprinting. Biophysics Reviews 2024; 5:031301. PMID: 39036708; PMCID: PMC11260195; DOI: 10.1063/5.0190208.
Abstract
3D bioprinting techniques enable the precise deposition of living cells, biomaterials, and biomolecules, emerging as a promising approach for engineering functional tissues and organs. Recent advances in 3D bioprinting also enable researchers to build in vitro models with finely controlled, complex micro-architecture for drug screening and disease modeling. Artificial intelligence (AI) has recently been applied to different stages of 3D bioprinting, including medical image reconstruction, bioink selection, and the printing process itself, using both classical AI and machine learning approaches. The ability of AI to handle complex datasets, perform complex computations, learn from past experience, and optimize processes dynamically makes it an invaluable tool for advancing 3D bioprinting. This review highlights the current integration of AI in 3D bioprinting and discusses future approaches to harness the synergistic capabilities of 3D bioprinting and AI for developing personalized tissues and organs.
Affiliation(s)
- Bin Zhang
- Department of Mechanical and Aerospace Engineering, Brunel University London, London, United Kingdom
- Jie Huang
- Department of Mechanical Engineering, University College London, London, United Kingdom
2. Rainey C, Bond R, McConnell J, Hughes C, Kumar D, McFadden S. Reporting radiographers' interaction with artificial intelligence: how do different forms of AI feedback impact trust and decision switching? PLOS Digital Health 2024; 3:e0000560. PMID: 39110687; PMCID: PMC11305567; DOI: 10.1371/journal.pdig.0000560.
Abstract
Artificial Intelligence (AI) has been increasingly integrated into healthcare settings, including the radiology department, to aid radiographic image interpretation, including reporting by radiographers. Trust has been cited as a barrier to effective clinical implementation of AI, and fostering appropriate trust will be important to ensure the ethical use of these systems for the benefit of the patient, clinician, and health services. Forms of explainable AI, such as heatmaps, have been proposed to increase AI transparency and trust by elucidating which parts of an image the AI 'focussed on' when making its decision. The aim of this novel study was to quantify the impact of different forms of AI feedback on expert clinicians' trust. Whilst this study was conducted in the UK, it has potential international application and impact for AI interface design, either globally or in countries with similar cultural and/or economic status to the UK. A convolutional neural network was built for this study; it was trained, validated, and tested on a publicly available dataset of MUsculoskeletal RAdiographs (MURA), with binary diagnoses and Gradient-weighted Class Activation Maps (GradCAM) as outputs. Reporting radiographers (n = 12) were recruited from all four regions of the UK. Qualtrics was used to present each participant with a total of 18 complete examinations from the MURA test dataset (each examination contained more than one radiographic image). Participants were presented with the images first, then the images with heatmaps, and finally an AI binary diagnosis, in sequential order. Perception of trust in the AI system was obtained following the presentation of each heatmap and each binary feedback, and participants were asked to indicate whether they would change their mind (decision switch) in response to the AI feedback.
Participants disagreed with the AI heatmaps for the abnormal examinations 45.8% of the time and agreed with the binary feedback on 86.7% of examinations (26/30 presentations). Only two participants indicated that they would decision switch in response to all AI feedback (GradCAM and binary) (0.7%, n = 2) across all datasets, and 22.2% (n = 32) of responses agreed with the localisation of pathology on the heatmap. The level of agreement with the GradCAM and binary diagnosis was found to be correlated with trust (GradCAM: -.515 to -.584, a significant large negative correlation at the 0.01 level, p < .01; binary diagnosis: -.309 to -.369, a significant medium negative correlation at the 0.01 level, p < .01). This study shows that, for these participants, the extent of agreement with both the AI binary diagnosis and the heatmap is correlated with trust in AI, with greater agreement with the form of AI feedback associated with greater trust, particularly for the heatmap form of feedback. Forms of explainable AI should be developed with cognisance of the need for precision and accuracy in localisation to promote appropriate trust in clinical end users.
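The GradCAM heatmaps used as feedback in this study reduce, at their core, to a gradient-weighted sum of convolutional feature maps. Below is a minimal numpy sketch of that computation (illustrative only, not the authors' implementation; the `grad_cam` function name and array shapes are assumptions for the sake of the example):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one convolutional layer.

    activations: feature maps of shape (K, H, W)
    gradients:   d(class score)/d(activations), same shape
    """
    # Channel weights alpha_k: global-average-pool the gradients
    weights = gradients.mean(axis=(1, 2))                          # shape (K,)
    # Weighted sum of feature maps, then ReLU to keep positive evidence only
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    # Normalise to [0, 1] so it can be rendered as a heatmap overlay
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam
```

In practice the low-resolution map is then upsampled to the input image size and overlaid on the radiograph, which is the form of heatmap the participants saw.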
Affiliation(s)
- Clare Rainey
- Ulster University, School of Health Sciences, York St, Belfast, Northern Ireland
- Raymond Bond
- Ulster University, School of Computing, York St, Belfast, Northern Ireland
- Ciara Hughes
- Ulster University, School of Health Sciences, York St, Belfast, Northern Ireland
- Devinder Kumar
- School of Medicine, Stanford University, California, United States of America
- Sonyia McFadden
- Ulster University, School of Health Sciences, York St, Belfast, Northern Ireland
3. Nowroozi A, Salehi MA, Shobeiri P, Agahi S, Momtazmanesh S, Kaviani P, Kalra MK. Artificial intelligence diagnostic accuracy in fracture detection from plain radiographs and comparing it with clinicians: a systematic review and meta-analysis. Clin Radiol 2024; 79:579-588. PMID: 38772766; DOI: 10.1016/j.crad.2024.04.009.
Abstract
PURPOSE Fracture detection is one of the most commonly used and studied applications of artificial intelligence (AI) in medicine. In this systematic review and meta-analysis, we aimed to summarize the available literature on AI performance in fracture detection on plain radiographs and the various factors affecting it. METHODS We systematically reviewed studies evaluating AI algorithms for detecting bone fractures in plain radiographs, pooled their performance using meta-analysis (a bivariate regression approach), and compared it with that of clinicians. We also analyzed factors potentially affecting algorithm performance using meta-regression. RESULTS Our analysis included 100 studies. In 83 studies with confusion matrices, AI algorithms showed a sensitivity of 91.43% and a specificity of 92.12% (area under the summary receiver operating characteristic curve = 0.968). After adjustment and false discovery rate (FDR) correction, tibia/fibula (excluding ankle) fractures were associated with higher AI sensitivity (7.0%, p=0.004), while more recent publications (5.5%, p=0.003) and the Xception architecture (6.6%, p<0.001) were associated with higher specificity. Clinicians and AI showed similar specificity in fracture identification, although AI tended toward higher sensitivity (7.6%, p=0.07). Radiologists, on the other hand, were more specific than AI overall and in several subgroups, and more sensitive to hip fractures before FDR correction. CONCLUSIONS Currently available AI aids could significantly improve care where radiologists are not readily available. Moreover, identifying the factors that affect algorithm performance could guide AI development teams in optimizing their products.
Affiliation(s)
- A Nowroozi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- M A Salehi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- P Shobeiri
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- S Agahi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- S Momtazmanesh
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- P Kaviani
- Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- M K Kalra
- Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
4. Tikhomirov L, Semmler C, McCradden M, Searston R, Ghassemi M, Oakden-Rayner L. Medical artificial intelligence for clinicians: the lost cognitive perspective. Lancet Digit Health 2024; 6:e589-e594. PMID: 39059890; DOI: 10.1016/s2589-7500(24)00095-5.
Abstract
The development and commercialisation of medical decision systems based on artificial intelligence (AI) far outpaces our understanding of their value for clinicians. Although applicable across many forms of medicine, we focus on characterising the diagnostic decisions of radiologists through the concept of ecologically bounded reasoning, review the differences between clinician decision making and medical AI model decision making, and reveal how these differences pose fundamental challenges for integrating AI into radiology. We argue that clinicians are contextually motivated, mentally resourceful decision makers, whereas AI models are contextually stripped, correlational decision makers, and discuss misconceptions about clinician-AI interaction stemming from this misalignment of capabilities. We outline how future research on clinician-AI interaction could better address the cognitive considerations of decision making and be used to enhance the safety and usability of AI models in high-risk medical decision-making contexts.
Affiliation(s)
- Lana Tikhomirov
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
- Carolyn Semmler
- School of Psychology, University of Adelaide, Adelaide, SA, Australia
- Melissa McCradden
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia; School of Public Health, Hospital for Sick Children, University of Toronto, Toronto, ON, Canada
- Rachel Searston
- School of Psychology, University of Adelaide, Adelaide, SA, Australia
- Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science and Institute for Medical and Evaluative Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
5. Nolin-Lapalme A, Corbin D, Tastet O, Avram R, Hussin JG. Advancing fairness in cardiac care: strategies for mitigating bias in artificial intelligence models within cardiology. Can J Cardiol 2024:S0828-282X(24)00357-X. PMID: 38735528; DOI: 10.1016/j.cjca.2024.04.026.
Abstract
In the dynamic field of medical artificial intelligence (AI), cardiology stands out as a key area for technological advancement and clinical application. In this review we explore the complex issue of data bias, specifically addressing the biases encountered during the development and implementation of AI tools in cardiology. We dissect the origins and effects of these biases, which undermine the reliability and widespread applicability of such tools in health care. Using a case study, we highlight the complexities involved in addressing these biases from a clinical viewpoint. The goal of this review is to equip researchers and clinicians with the practical knowledge needed to identify, understand, and mitigate these biases, advocating for the creation of AI solutions that are not just technologically sound but also fair and effective for all patients.
Affiliation(s)
- Alexis Nolin-Lapalme
- Department of Medicine, Montreal Heart Institute, Montreal, Quebec, Canada; Faculté de Médecine, Université de Montréal, Montreal, Quebec, Canada; Mila - Québec AI Institute, Montreal, Quebec, Canada; Heartwise (heartwise.ai), Montreal Heart Institute, Montreal, Quebec, Canada
- Denis Corbin
- Department of Medicine, Montreal Heart Institute, Montreal, Quebec, Canada
- Olivier Tastet
- Department of Medicine, Montreal Heart Institute, Montreal, Quebec, Canada
- Robert Avram
- Department of Medicine, Montreal Heart Institute, Montreal, Quebec, Canada; Faculté de Médecine, Université de Montréal, Montreal, Quebec, Canada; Heartwise (heartwise.ai), Montreal Heart Institute, Montreal, Quebec, Canada
- Julie G Hussin
- Department of Medicine, Montreal Heart Institute, Montreal, Quebec, Canada; Faculté de Médecine, Université de Montréal, Montreal, Quebec, Canada; Mila - Québec AI Institute, Montreal, Quebec, Canada
6. Hansen V, Jensen J, Kusk MW, Gerke O, Tromborg HB, Lysdahlgaard S. Deep learning performance compared to healthcare experts in detecting wrist fractures from radiographs: a systematic review and meta-analysis. Eur J Radiol 2024; 174:111399. PMID: 38428318; DOI: 10.1016/j.ejrad.2024.111399.
Abstract
OBJECTIVE To perform a systematic review and meta-analysis of the diagnostic accuracy of deep learning (DL) algorithms in the diagnosis of wrist fractures (WF) on plain wrist radiographs, taking healthcare experts' consensus as the reference standard. METHODS Embase, Medline, PubMed, Scopus, and Web of Science were searched for the period from 1 January 2012 to 9 March 2023. Eligible studies included patients with wrist radiographs for radial and ulnar fractures as the target condition, used DL algorithms based on convolutional neural networks (CNN), and used healthcare experts' consensus as the minimum reference standard. Studies were assessed with a modified QUADAS-2 tool, and we applied a bivariate random-effects model for meta-analysis of diagnostic test accuracy data. RESULTS Our study was registered at PROSPERO (ID: CRD42023431398). We included 6 unique studies in the meta-analysis, with a total of 33,026 radiographs. Compared to the reference standards of the included articles, CNNs achieved a summary sensitivity of 92% (95% CI: 80%-97%) and a summary specificity of 93% (95% CI: 76%-98%). The generalized bivariate I-squared statistic indicated considerable heterogeneity between the studies (81.90%). Four studies had one or more domains at high risk of bias, and two studies raised concerns regarding applicability. CONCLUSION The diagnostic accuracy of CNNs was comparable to that of healthcare experts in the investigation of WF on wrist radiographs. There is a need for studies with a robust reference standard, external dataset validation, and investigation of the diagnostic performance of healthcare experts aided by CNNs. CLINICAL RELEVANCE STATEMENT DL matches healthcare experts in diagnosing WF, which could benefit patient diagnosis.
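The summary sensitivity and specificity above are fitted with a bivariate random-effects model. As a rough intuition for where the per-study inputs come from, here is a minimal sketch computing sensitivity/specificity from each study's 2x2 confusion matrix, plus a naive pooled estimate (a deliberate simplification: summing cells ignores the between-study heterogeneity the bivariate model is designed to capture; function names are ours):

```python
def sens_spec(tp, fp, fn, tn):
    """Sensitivity and specificity from one study's 2x2 confusion matrix."""
    return tp / (tp + fn), tn / (tn + fp)

def naive_pooled(studies):
    """Crude pooled estimate: sum the cells across studies.

    studies: iterable of (tp, fp, fn, tn) tuples. A real meta-analysis
    would fit a bivariate random-effects model instead of summing.
    """
    tp, fp, fn, tn = (sum(cells) for cells in zip(*studies))
    return sens_spec(tp, fp, fn, tn)
```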
Affiliation(s)
- V Hansen
- Department of Radiology and Nuclear Medicine, Hospital of South West Jutland, University Hospital of Southern Denmark, Esbjerg, Denmark
- J Jensen
- Department of Radiology, Odense University Hospital, Odense, Denmark; Research and Innovation Unit of Radiology, University of Southern Denmark, Odense, Denmark
- M W Kusk
- Department of Radiology and Nuclear Medicine, Hospital of South West Jutland, University Hospital of Southern Denmark, Esbjerg, Denmark; Department of Regional Health Research, Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark; Imaging Research Initiative Southwest (IRIS), Hospital of South West Jutland, University Hospital of Southern Denmark, Esbjerg, Denmark; Radiography and Diagnostic Imaging, School of Medicine, University College Dublin, Belfield 4, Dublin, Ireland
- O Gerke
- Department of Nuclear Medicine, Odense University Hospital, Odense, Denmark; Department of Clinical Research, University of Southern Denmark, Odense, Denmark
- H B Tromborg
- Department of Clinical Research, University of Southern Denmark, Odense, Denmark; Department of Orthopedic Surgery, Odense University Hospital, Odense, Denmark
- S Lysdahlgaard
- Department of Radiology and Nuclear Medicine, Hospital of South West Jutland, University Hospital of Southern Denmark, Esbjerg, Denmark; Department of Regional Health Research, Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark; Imaging Research Initiative Southwest (IRIS), Hospital of South West Jutland, University Hospital of Southern Denmark, Esbjerg, Denmark
7. Kim JY, Hasan A, Kellogg KC, Ratliff W, Murray SG, Suresh H, Valladares A, Shaw K, Tobey D, Vidal DE, Lifson MA, Patel M, Raji ID, Gao M, Knechtle W, Tang L, Balu S, Sendak MP. Development and preliminary testing of Health Equity Across the AI Lifecycle (HEAAL): a framework for healthcare delivery organizations to mitigate the risk of AI solutions worsening health inequities. PLOS Digital Health 2024; 3:e0000390. PMID: 38723025; PMCID: PMC11081364; DOI: 10.1371/journal.pdig.0000390.
Abstract
The use of data-driven technologies such as artificial intelligence (AI) and machine learning (ML) is growing in healthcare. However, the proliferation of healthcare AI tools has outpaced the regulatory frameworks, accountability measures, and governance standards needed to ensure their safe, effective, and equitable use. To address these gaps and tackle a common challenge faced by healthcare delivery organizations, a case-based workshop was organized and a framework was developed to evaluate the potential impact of implementing an AI solution on health equity. The Health Equity Across the AI Lifecycle (HEAAL) framework was co-designed with extensive engagement of clinical, operational, technical, and regulatory leaders across healthcare delivery organizations and ecosystem partners in the US. It assesses five equity assessment domains (accountability, fairness, fitness for purpose, reliability and validity, and transparency) across eight key decision points in the AI adoption lifecycle. It is a process-oriented framework containing a total of 37 step-by-step procedures for evaluating an existing AI solution and 34 procedures for evaluating a new AI solution. Within each procedure, it identifies the relevant key stakeholders and the data sources used to conduct the procedure. HEAAL guides healthcare delivery organizations in mitigating the risk of AI solutions worsening health inequities, and it also indicates the resources and support required to assess the potential impact of AI solutions on health inequities.
Affiliation(s)
- Jee Young Kim
- Duke Institute for Health Innovation, Duke Health, Durham, North Carolina, United States of America
- Alifia Hasan
- Duke Institute for Health Innovation, Duke Health, Durham, North Carolina, United States of America
- Katherine C. Kellogg
- Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- William Ratliff
- Duke Institute for Health Innovation, Duke Health, Durham, North Carolina, United States of America
- Sara G. Murray
- Division of Hospital Medicine, University of California San Francisco, San Francisco, California, United States of America
- Harini Suresh
- Cornell University, New York, New York, United States of America
- Keo Shaw
- FDA Regulatory Group, DLA Piper, San Francisco, California, United States of America
- Danny Tobey
- AI and Data Analytics, DLA Piper, Dallas, Texas, United States of America
- David E. Vidal
- Center for Digital Health, Mayo Clinic, Rochester, Minnesota, United States of America
- Mark A. Lifson
- Center for Digital Health, Mayo Clinic, Rochester, Minnesota, United States of America
- Manesh Patel
- Division of Cardiology, Duke Health, Durham, North Carolina, United States of America
- Inioluwa Deborah Raji
- Department of Electrical Engineering and Computer Science, University of California Berkeley, Berkeley, California, United States of America
- Michael Gao
- Duke Institute for Health Innovation, Duke Health, Durham, North Carolina, United States of America
- William Knechtle
- Duke Institute for Health Innovation, Duke Health, Durham, North Carolina, United States of America
- Linda Tang
- School of Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
- Suresh Balu
- Duke Institute for Health Innovation, Duke Health, Durham, North Carolina, United States of America
- Mark P. Sendak
- Duke Institute for Health Innovation, Duke Health, Durham, North Carolina, United States of America
8. Liu XS, Nie R, Duan AW, Yang L, Li X, Zhang LT, Guo GK, Guo QS, Zhao DC, Li Y, Zhang HH. YOLOX-SwinT algorithm improves the accuracy of AO/OTA classification of intertrochanteric fractures by orthopedic trauma surgeons. Chin J Traumatol 2024:S1008-1275(24)00051-8. PMID: 38762418; DOI: 10.1016/j.cjtee.2024.04.002.
Abstract
PURPOSE Intertrochanteric fracture (ITF) classification is crucial for surgical decision-making, yet orthopedic trauma surgeons have shown lower accuracy in ITF classification than expected. The objective of this study was to use an artificial intelligence (AI) method to improve the accuracy of ITF classification. METHODS We trained a network called YOLOX-SwinT, based on the You Only Look Once X (YOLOX) object detection network with a Swin Transformer (SwinT) backbone, using 762 radiographic ITF examinations as the training set. All images were classified according to the AO/OTA 2018 classification system by 2 experienced trauma surgeons and verified by another expert in the field. Based on actual clinical needs, after discussion, we merged the 8 subgroups into 5 new subgroups, and the dataset was divided into training, validation, and test sets in an 8:1:1 ratio. We then recruited 5 senior orthopedic trauma surgeons (SOTS) and 5 junior orthopedic trauma surgeons (JOTS) to classify, in sequence, the 85 original images in the test set and the same images accompanied by the network model's predictions. Statistical analysis was performed with SPSS 20.0 (IBM Corp., Armonk, NY, USA) to compare the SOTS, JOTS, SOTS + AI, JOTS + AI, SOTS + JOTS, and SOTS + JOTS + AI groups. RESULTS The mean average precision at an intersection over union (IoU) threshold of 0.5 (mAP50) for subgroup detection reached 90.29%. The classification accuracies of the SOTS, JOTS, SOTS + AI, and JOTS + AI groups were 56.24% ± 4.02%, 35.29% ± 18.07%, 79.53% ± 7.14%, and 71.53% ± 5.22%, respectively. Paired t-tests showed statistically significant differences between the SOTS and SOTS + AI groups, between the JOTS and JOTS + AI groups, and between the SOTS + JOTS and SOTS + JOTS + AI groups; the difference between the SOTS + JOTS and SOTS + JOTS + AI groups was also statistically significant in each subgroup (all p < 0.05). Independent-samples t-tests showed a statistically significant difference between the SOTS and JOTS groups, but not between the SOTS + AI and JOTS + AI groups. With the assistance of AI, the subgroup classification accuracy of both SOTS and JOTS improved significantly, and JOTS reached the same level as SOTS. CONCLUSION The YOLOX-SwinT network algorithm enhances the accuracy of AO/OTA subgroup classification of ITF by orthopedic trauma surgeons.
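The mAP50 metric reported above counts a predicted box as correct when its intersection over union (IoU) with the ground-truth box is at least 0.5. A minimal sketch of the IoU computation (illustrative; the (x1, y1, x2, y2) box format is an assumption, not taken from the paper):

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Overlap rectangle; width/height clamp to 0 when boxes are disjoint
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0
```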
Affiliation(s)
- Xue-Si Liu
- Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Rui Nie
- Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Ao-Wen Duan
- Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Li Yang
- Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Xiang Li
- Department of Information, Southwest Hospital, Army Medical University, Chongqing, 400038, China
- Le-Tian Zhang
- Department of Radiology, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Guang-Kuo Guo
- Department of Radiology, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Qing-Shan Guo
- Division of Trauma and War Injury, Daping Hospital, Army Medical University of PLA, State Key Laboratory of Trauma and Chemical Poisoning, Chongqing, 400042, China
- Dong-Chu Zhao
- Division of Trauma and War Injury, Daping Hospital, Army Medical University of PLA, State Key Laboratory of Trauma and Chemical Poisoning, Chongqing, 400042, China
- Yang Li
- Division of Trauma and War Injury, Daping Hospital, Army Medical University of PLA, State Key Laboratory of Trauma and Chemical Poisoning, Chongqing, 400042, China
- He-Hua Zhang
- Department of Medical Engineering, Daping Hospital, Army Medical University, Chongqing, 400042, China
9. Lasko TA, Strobl EV, Stead WW. Why do probabilistic clinical models fail to transport between sites. NPJ Digit Med 2024; 7:53. PMID: 38429353; PMCID: PMC10907678; DOI: 10.1038/s41746-024-01037-4.
Abstract
The rising popularity of artificial intelligence in healthcare is highlighting the problem that a computational model achieving super-human clinical performance at its training sites may perform substantially worse at new sites. In this perspective, we argue that we should typically expect this failure to transport, and we present common sources of it, divided into those under the control of the experimenter and those inherent to the clinical data-generating process. Of the inherent sources, we look more closely at site-specific clinical practices that can affect the data distribution, and we propose a potential solution intended to isolate the imprint of those practices on the data from the patterns of disease cause and effect that are the usual target of probabilistic clinical models.
Affiliation(s)
- Thomas A Lasko
- Vanderbilt University Medical Center, Nashville, TN, USA
- Eric V Strobl
- Vanderbilt University Medical Center, Nashville, TN, USA
10. Yi PH, Garner HW, Hirschmann A, Jacobson JA, Omoumi P, Oh K, Zech JR, Lee YH. Clinical applications, challenges, and recommendations for artificial intelligence in musculoskeletal and soft-tissue ultrasound: AJR Expert Panel Narrative Review. AJR Am J Roentgenol 2024; 222:e2329530. PMID: 37436032; DOI: 10.2214/ajr.23.29530.
Abstract
Artificial intelligence (AI) is increasingly used in clinical practice for musculoskeletal imaging tasks, such as disease diagnosis and image reconstruction. AI applications in musculoskeletal imaging have focused primarily on radiography, CT, and MRI. Although musculoskeletal ultrasound stands to benefit from AI in similar ways, such applications have been relatively underdeveloped. In comparison with other modalities, ultrasound has unique advantages and disadvantages that must be considered in AI algorithm development and clinical translation. Challenges in developing AI for musculoskeletal ultrasound involve both clinical aspects of image acquisition and practical limitations in image processing and annotation. Solutions from other radiology subspecialties (e.g., crowdsourced annotations coordinated by professional societies), along with use cases (most commonly rotator cuff tendon tears and palpable soft-tissue masses), can be applied to musculoskeletal ultrasound to help develop AI. To facilitate creation of high-quality imaging datasets for AI model development, technologists and radiologists should focus on increasing uniformity in musculoskeletal ultrasound performance and increasing annotations of images for specific anatomic regions. This Expert Panel Narrative Review summarizes available evidence regarding AI's potential utility in musculoskeletal ultrasound and challenges facing its development. Recommendations for future AI advancement and clinical translation in musculoskeletal ultrasound are discussed.
Affiliation(s)
- Paul H Yi
- University of Maryland Medical Intelligent Imaging Center, University of Maryland School of Medicine, Baltimore, MD
- Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, MD
- Anna Hirschmann
- Imamed Radiology Nordwest, Basel, Switzerland
- Department of Radiology, University of Basel, Basel, Switzerland
- Jon A Jacobson
- Lenox Hill Radiology, New York, NY
- Department of Radiology, University of California, San Diego Medical Center, San Diego, CA
- Patrick Omoumi
- Department of Radiology, Lausanne University Hospital, Lausanne, Switzerland
- Department of Radiology, University of Lausanne, Lausanne, Switzerland
- Kangrok Oh
- Department of Radiology, Research Institute of Radiological Science and Center for Clinical Imaging Data Science, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, South Korea
- John R Zech
- Department of Radiology, Columbia University Irving Medical Center, New York-Presbyterian Hospital, New York, NY
- Young Han Lee
- Department of Radiology, Research Institute of Radiological Science and Center for Clinical Imaging Data Science, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, South Korea
11. Huang W, Wang J, Xu J, Guo G, Chen Z, Xue H. Multivariable machine learning models for clinical prediction of subsequent hip fractures in older people using the Chinese population database. Age Ageing 2024; 53:afae045. PMID: 38497235; DOI: 10.1093/ageing/afae045.
Abstract
PURPOSE This study aimed to develop and validate clinical prediction models using machine learning (ML) algorithms for reliable prediction of subsequent hip fractures in older individuals who had previously sustained a first hip fracture, to facilitate early prevention and diagnosis and thereby help manage rapidly rising healthcare costs in China. METHODS Data were obtained from Grade A tertiary hospitals for older patients (age ≥ 60 years) diagnosed with hip fractures in southwest China between 1 January 2009 and 1 April 2020. The database was built by collecting clinical and administrative data from outpatients and inpatients nationwide. Data were randomly split into training (80%) and testing (20%) datasets, and six ML-based prediction models were developed using 19 variables to predict a subsequent hip fracture within 2 years of the first fracture. RESULTS A total of 40,237 patients with a median age of 66.0 years, who were admitted to acute-care hospitals for hip fractures, were randomly split into a training dataset (32,189 patients) and a testing dataset (8,048 patients). Three of our ML-based models delivered excellent prediction of subsequent hip fracture outcomes (area under the receiver operating characteristic curve: 0.92 (0.91-0.92), 0.92 (0.92-0.93), and 0.92 (0.92-0.93)), outperforming previous prediction models based on claims and cohort data. CONCLUSIONS Our prediction models identify older Chinese people at high risk of subsequent hip fractures using specific baseline clinical and demographic variables such as length of hospital stay. These models might guide future targeted preventative treatments.
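The headline metric in the abstract above, area under the receiver operating characteristic curve (AUROC), can be computed directly from model scores via the rank (Mann-Whitney) formulation. A minimal pure-Python sketch on toy labels and scores, not the study's data:

```python
def auroc(labels, scores):
    """AUROC via the rank (Mann-Whitney U) formulation: the fraction of
    positive/negative pairs ranked correctly, counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: two subsequent-fracture cases (1) and two non-fracture cases (0).
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

A perfect ranking of every positive above every negative would give 1.0; the 0.92 reported above means roughly 92% of such pairs are ordered correctly.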
Affiliation(s)
- Wenbo Huang
- Department of Medicine, Beijing Municipal Welfare Medical Research Institute Ltd, Beijing 102400, China
- Jie Wang
- Department of Data Analytics, School of Information Studies (iSchool), Syracuse University, NY 13244, USA
- Jilai Xu
- Department of Rehabilitation Medicine, Graduate School of Medicine, Juntendo University, Bunkyo, Tokyo 113-8421, Japan
- Guinan Guo
- Aerospace Information Research Institute, Chinese Academy of Sciences, Guangzhou, Guangdong 100864, China
- Zhenlei Chen
- Department of Physical Education, School of Physical Education, Hubei University of Education, Wuhan, Hubei 430000, China
- Haolei Xue
- Department of Rehabilitation Medicine, Graduate School of Medicine, Juntendo University, Bunkyo, Tokyo 113-8421, Japan

12
Saab K, Tang S, Taha M, Lee-Messer C, Ré C, Rubin DL. Towards trustworthy seizure onset detection using workflow notes. NPJ Digit Med 2024; 7:42. [PMID: 38383884 PMCID: PMC10881468 DOI: 10.1038/s41746-024-01008-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2023] [Accepted: 01/10/2024] [Indexed: 02/23/2024] Open
Abstract
A major barrier to deploying healthcare AI is trustworthiness. One form of trustworthiness is a model's robustness across subgroups: while models may exhibit expert-level performance on aggregate metrics, they often rely on non-causal features, leading to errors in hidden subgroups. To take a step closer towards trustworthy seizure onset detection from EEG, we propose to leverage annotations produced by healthcare personnel in routine clinical workflows, which we refer to as workflow notes, and which include multiple event descriptions beyond seizures. Using workflow notes, we first show that by scaling training data to 68,920 EEG hours, seizure onset detection performance significantly improves by 12.3 AUROC (area under the receiver operating characteristic) points compared to relying on smaller training sets with gold-standard labels. Second, we reveal that our binary seizure onset detection model underperforms on clinically relevant subgroups (e.g., up to a margin of 6.5 AUROC points between pediatrics and adults), while having significantly higher false positive rates (FPRs) on EEG clips showing non-epileptiform abnormalities (+19 FPR points). To improve model robustness to hidden subgroups, we train a multilabel model that classifies 26 attributes other than seizures (e.g., spikes and movement artifacts), significantly improving overall performance (+5.9 AUROC points) while greatly improving performance among subgroups (up to +8.3 AUROC points) and decreasing false positives on non-epileptiform abnormalities (by 8 FPR points). Finally, we find that our multilabel model improves clinical utility (false positives per 24 EEG hours) by a factor of 2.
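The subgroup analysis described above hinges on computing false positive rates separately within clinically defined strata. A hedged sketch of such a per-subgroup FPR computation; the group names and toy data below are illustrative, not taken from the paper:

```python
from collections import defaultdict

def fpr_by_subgroup(labels, preds, groups):
    """False positive rate (FP / (FP + TN)) computed separately for each
    subgroup, using only the truly negative examples in that subgroup."""
    fp = defaultdict(int)
    neg = defaultdict(int)
    for y, yhat, g in zip(labels, preds, groups):
        if y == 0:                 # a true negative clip
            neg[g] += 1
            fp[g] += int(yhat == 1)
    return {g: fp[g] / neg[g] for g in neg}

# Toy EEG clips: label 0 = no seizure onset, 1 = seizure onset.
labels = [0, 0, 0, 0, 1, 0, 0]
preds  = [1, 0, 0, 0, 1, 1, 1]
groups = ["normal", "normal", "normal", "normal",
          "abnormal", "abnormal", "abnormal"]
print(fpr_by_subgroup(labels, preds, groups))
```

A gap between the per-group rates, as in the +19 FPR points reported above for non-epileptiform abnormalities, is exactly what aggregate metrics hide.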
Affiliation(s)
- Khaled Saab
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Siyi Tang
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Mohamed Taha
- Department of Neurology, Stanford University, Stanford, CA, USA
- Christopher Ré
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Daniel L Rubin
- Department of Biomedical Data Science, Radiology, and Medicine, Stanford University, Stanford, CA, USA

13
Xie Y, Li X, Chen F, Wen R, Jing Y, Liu C, Wang J. Artificial intelligence diagnostic model for multi-site fracture X-ray images of extremities based on deep convolutional neural networks. Quant Imaging Med Surg 2024; 14:1930-1943. [PMID: 38415122 PMCID: PMC10895109 DOI: 10.21037/qims-23-878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2023] [Accepted: 11/24/2023] [Indexed: 02/29/2024]
Abstract
Background The rapid and accurate diagnosis of fractures is crucial for the timely treatment of trauma patients. Deep learning, one of the most widely used forms of artificial intelligence (AI), is now commonly employed in medical imaging for fracture detection. This study aimed to construct a deep learning model using big data to recognize multi-site fracture X-ray images of extremity bones. Methods Radiographic imaging data of extremities were retrospectively collected from five hospitals between January 2017 and September 2020, comprising 25,635 patients and 26,098 images in total. After the lesions were labeled, 90% of the data were randomly assigned to the training set to develop the fracture detection model, and the remaining 10% served as the validation set to verify the model. The faster region-based convolutional neural network (R-CNN) algorithm was adopted to construct the detection models. The Dice coefficient was used to evaluate image segmentation accuracy, and the detection models were evaluated with sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). Results The free-response receiver operating characteristic (FROC) curve value was 0.886 for the detection of single fractures and 0.843 for multiple fractures. The effective identification AUC for all sites was higher than 0.920; notably, the AUC for wrist fractures reached 0.952. The average accuracy in detecting bone fracture regions in the extremities was 0.865. At the patient level, sensitivity was 0.957 for patients with multiple lesions and 0.852 for those with single lesions. In the segmentation task, the Dice coefficient reached 0.996 in the training set and 0.975 in the validation set.
Conclusions The faster R-CNN training algorithm exhibits excellent performance in simultaneously identifying fractures in the hands, feet, wrists, ankles, radius and ulna, and tibia and fibula on X-ray images. It demonstrates high accuracy, low false-negative rates, and controllable false-positive rates. It can serve as a valuable screening tool.
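The Dice coefficient used above to score segmentation overlap has a simple closed form, 2|A∩B| / (|A| + |B|). A minimal sketch on flattened binary masks (toy data, not the study's images):

```python
def dice(mask_a, mask_b):
    """Dice coefficient for two binary masks of equal length:
    2 * |intersection| / (|A| + |B|). Two empty masks count as
    perfect agreement (1.0)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same shape")
    inter = sum(a and b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2.0 * inter / total if total else 1.0

# Toy flattened masks: model prediction vs. ground-truth annotation.
print(dice([1, 1, 0, 0], [1, 0, 1, 0]))  # → 0.5
```

Values near the study's 0.975 validation score mean the predicted fracture region almost exactly overlaps the annotated one.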
Affiliation(s)
- Yanling Xie
- Department of Radiology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, China
- Xiaoming Li
- Department of Radiology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, China
- Fengxi Chen
- Department of Radiology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, China
- Ru Wen
- Department of Radiology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, China
- Yang Jing
- Huiying Medical Technology Co., Ltd., Beijing, China
- Chen Liu
- Department of Radiology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, China
- Jian Wang
- Department of Radiology, Southwest Hospital, Army Medical University (Third Military Medical University), Chongqing, China

14
Russe MF, Rebmann P, Tran PH, Kellner E, Reisert M, Bamberg F, Kotter E, Kim S. AI-based X-ray fracture analysis of the distal radius: accuracy between representative classification, detection and segmentation deep learning models for clinical practice. BMJ Open 2024; 14:e076954. [PMID: 38262641 PMCID: PMC10823998 DOI: 10.1136/bmjopen-2023-076954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/22/2023] [Accepted: 12/21/2023] [Indexed: 01/25/2024] Open
Abstract
OBJECTIVES To aid in selecting the optimal artificial intelligence (AI) solution for clinical application, we directly compared the performance of representative custom-trained and commercial classification, detection and segmentation models for fracture detection on musculoskeletal radiographs of the distal radius by aligning their outputs. DESIGN AND SETTING This single-centre retrospective study was conducted on a random subset of emergency department radiographs of the distal radius acquired from 2008 to 2018 in Germany. MATERIALS AND METHODS An image set compatible with training and testing both classification and segmentation models was created by annotating examinations for fractures and overlaying fracture masks, where applicable. Representative classification and segmentation models were trained on 80% of the data. After output binarisation, their derived fracture detection performance, as well as that of a standard commercially available solution, was compared on the remaining 20% of the X-rays using mainly accuracy and area under the receiver operating characteristic curve (AUROC). RESULTS A total of 2856 examinations with 712 (24.9%) fractures were included in the analysis. Accuracies reached up to 0.97 for the classification model, 0.94 for the segmentation model and 0.95 for BoneView. Cohen's kappa was at least 0.80 in pairwise comparisons, while Fleiss' kappa was 0.83 for all models. Fracture predictions were visualised with all three methods at different levels of detail, ranging from a downsampled image region for classification, through a bounding box for detection, to single pixel-level delineation for segmentation. CONCLUSIONS All three investigated approaches reached high performance for detection of distal radius fractures with simple preprocessing and postprocessing protocols on the custom-trained models.
Despite their underlying structural differences, selection of a fracture analysis AI tool in the frame of this study reduces to the desired flavour of automation: automated classification, AI-assisted manual fracture reading or minimised false negatives.
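Comparing a segmentation model against a classifier, as done above, requires collapsing the pixel-probability map to a single binary decision. A sketch of one plausible binarisation rule; the threshold and minimum-pixel-count parameters are illustrative assumptions, not those reported by the study:

```python
def image_level_from_mask(prob_mask, pixel_thr=0.5, min_pixels=1):
    """Binarise a pixel-probability mask into an image-level fracture
    call: positive if at least `min_pixels` pixels exceed `pixel_thr`."""
    hot = sum(p >= pixel_thr for row in prob_mask for p in row)
    return int(hot >= min_pixels)

def image_level_from_score(prob, thr=0.5):
    """Binarise a classification model's image-level probability."""
    return int(prob >= thr)

mask = [[0.1, 0.2],
        [0.9, 0.3]]                       # one confident fracture pixel
print(image_level_from_mask(mask))        # → 1
print(image_level_from_score(0.12))       # → 0
```

Once both model families emit the same 0/1 output, accuracy, AUROC, and the kappa agreement statistics quoted above can be computed on a common footing.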
Affiliation(s)
- Maximilian Frederik Russe
- Department of Diagnostic and Interventional Radiology, Universitätsklinikum Freiburg Medizinische Universitätsklinik, Freiburg im Breisgau, Germany
- Philipp Rebmann
- Department of Diagnostic and Interventional Radiology, Universitätsklinikum Freiburg Medizinische Universitätsklinik, Freiburg im Breisgau, Germany
- Phuong Hien Tran
- Department of Diagnostic and Interventional Radiology, Universitätsklinikum Freiburg Medizinische Universitätsklinik, Freiburg im Breisgau, Germany
- Elias Kellner
- Department of Medical Physics, Universitätsklinikum Freiburg Medizinische Universitätsklinik, Freiburg im Breisgau, Germany
- Marco Reisert
- Department of Medical Physics, Universitätsklinikum Freiburg Medizinische Universitätsklinik, Freiburg im Breisgau, Germany
- Fabian Bamberg
- Department of Diagnostic and Interventional Radiology, Universitätsklinikum Freiburg Medizinische Universitätsklinik, Freiburg im Breisgau, Germany
- Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Universitätsklinikum Freiburg Medizinische Universitätsklinik, Freiburg im Breisgau, Germany
- Suam Kim
- Department of Diagnostic and Interventional Radiology, Universitätsklinikum Freiburg Medizinische Universitätsklinik, Freiburg im Breisgau, Germany

15
Wang LX, Zhu ZH, Chen QC, Jiang WB, Wang YZ, Sun NK, Hu BS, Rui G, Wang LS. Development and validation of a deep-learning model for the detection of non-displaced femoral neck fractures with anteroposterior and lateral hip radiographs. Quant Imaging Med Surg 2024; 14:527-539. [PMID: 38223105 PMCID: PMC10784052 DOI: 10.21037/qims-23-814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2023] [Accepted: 10/24/2023] [Indexed: 01/16/2024]
Abstract
Background Hip fractures, including femoral neck fractures, are a significant cause of morbidity and mortality in the elderly population and are typically diagnosed using plain radiography. However, diagnosing non-displaced femoral neck fractures can be challenging due to their subtle appearance on hip radiographs. Previous deep-learning models have shown low accuracy in identifying these fractures on anteroposterior (AP) radiographs, and no studies have used lateral radiographs. This study aimed to evaluate the potential of deep learning with both AP and lateral hip radiographs to automatically identify non-displaced femoral neck fractures. Methods We conducted a retrospective analysis of patients with femoral neck fractures at The First Affiliated Hospital of Xiamen University. All hip radiographs were reviewed, and cases of non-displaced femoral neck fractures were included in the study, along with 439 participants with normal hip radiographs. A vision transformer (ViT) model was developed using 1,536 AP and lateral hip radiographs. The model's performance was compared with that of two groups of human observers: an expert group comprising orthopedic surgeons and radiologists, and a non-expert group comprising emergency physicians and general practice doctors. We also carried out external validation on two additional datasets to assess the generalizability of the model. Results The ViT model showed exceptional performance in detecting non-displaced femoral neck fractures on paired AP and lateral hip radiographs, achieving a binary accuracy of 95.8% [95% confidence interval (CI): 94.9%, 96.8%] and an area under the curve (AUC) of 0.988. Compared with the human observers, the model had a higher accuracy of 96.7% (95% CI: 93.9%, 99.5%) on the paired AP and lateral hip radiographs, while the accuracy of the expert group was 90.5% (95% CI: 85.7%, 95.2%).
The model maintained good performance during external validation, with an AUC of 0.959 on the paired AP and lateral views. Conclusions Our ViT model showed expert-level performance in identifying non-displaced femoral neck fractures on paired AP and lateral hip radiographs. This model has the potential to enhance diagnostic accuracy and improve patient outcomes by reducing the need for additional examinations and preoperative time.
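One simple way to exploit paired AP and lateral views, as in the two-view setting above, is late fusion: score each view independently and combine the probabilities. The averaging rule below is an illustrative assumption for exposition, not the paper's actual architecture:

```python
def fuse_views(p_ap, p_lateral, thr=0.5):
    """Late fusion of two per-view fracture probabilities: average the
    AP and lateral predictions, then threshold for a binary call."""
    p = (p_ap + p_lateral) / 2.0
    return p, int(p >= thr)

# A fracture subtle on the AP view (0.25) but clearer on the lateral view (0.75).
prob, label = fuse_views(0.25, 0.75)
print(prob, label)  # → 0.5 1
```

The appeal of using both views is visible even in this toy rule: a case the AP-only model would miss can still cross the decision threshold when the lateral view is confident.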
Affiliation(s)
- Lian-Xin Wang
- Department of Orthopedics, The First Affiliated Hospital of Xiamen University, Xiamen, China
- Zhong-Hang Zhu
- Department of Computer Science, Xiamen University, Xiamen, China
- Qi-Chang Chen
- Department of Computer Science, Xiamen University, Xiamen, China
- Wei-Bo Jiang
- Department of Orthopedics, The Second Affiliated Hospital of Jilin University, Changchun, China
- Yao-Zong Wang
- Department of Orthopedics, Zhongshan Hospital of Xiamen University, Xiamen, China
- Nai-Kun Sun
- Department of Orthopedics, The First Affiliated Hospital of Xiamen University, Xiamen, China
- Bao-Shan Hu
- Department of Orthopedics, The First Affiliated Hospital of Xiamen University, Xiamen, China
- Gang Rui
- Department of Orthopedics, The First Affiliated Hospital of Xiamen University, Xiamen, China
- Lian-Sheng Wang
- Department of Computer Science, Xiamen University, Xiamen, China

16
O'Shea R, Manickavasagar T, Horst C, Hughes D, Cusack J, Tsoka S, Cook G, Goh V. Weakly supervised segmentation models as explainable radiological classifiers for lung tumour detection on CT images. Insights Imaging 2023; 14:195. [PMID: 37980637 PMCID: PMC10657919 DOI: 10.1186/s13244-023-01542-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2023] [Accepted: 10/13/2023] [Indexed: 11/21/2023] Open
Abstract
PURPOSE Interpretability is essential for reliable convolutional neural network (CNN) image classifiers in radiological applications. We describe a weakly supervised segmentation model that learns to delineate the target object, trained with only image-level labels ("image contains object" or "image does not contain object"), presenting a different approach towards explainable object detectors for radiological imaging tasks. METHODS A weakly supervised Unet architecture (WSUnet) was trained to learn lung tumour segmentation from image-level labelled data. WSUnet generates voxel probability maps with a Unet and then constructs an image-level prediction by global max-pooling, thereby facilitating image-level training. WSUnet's voxel-level predictions were compared to traditional model interpretation techniques (class activation mapping, integrated gradients and occlusion sensitivity) in CT data from three institutions (training/validation: n = 412; testing: n = 142). Methods were compared using voxel-level discrimination metrics and clinical value was assessed with a clinician preference survey on data from external institutions. RESULTS Despite the absence of voxel-level labels in training, WSUnet's voxel-level predictions localised tumours precisely in both validation (precision: 0.77, 95% CI: [0.76-0.80]; dice: 0.43, 95% CI: [0.39-0.46]), and external testing (precision: 0.78, 95% CI: [0.76-0.81]; dice: 0.33, 95% CI: [0.32-0.35]). WSUnet's voxel-level discrimination outperformed the best comparator in validation (area under precision recall curve (AUPR): 0.55, 95% CI: [0.49-0.56] vs. 0.23, 95% CI: [0.21-0.25]) and testing (AUPR: 0.40, 95% CI: [0.38-0.41] vs. 0.36, 95% CI: [0.34-0.37]). Clinicians preferred WSUnet predictions in most instances (clinician preference rate: 0.72 95% CI: [0.68-0.77]). CONCLUSION Weakly supervised segmentation is a viable approach by which explainable object detection models may be developed for medical imaging. 
CRITICAL RELEVANCE STATEMENT WSUnet learns to segment images at voxel level, training only with image-level labels. A Unet backbone first generates a voxel-level probability map and then extracts the maximum voxel prediction as the image-level prediction. Thus, training uses only image-level annotations, reducing human workload. WSUnet's voxel-level predictions provide a causally verifiable explanation for its image-level prediction, improving interpretability. KEY POINTS • Explainability and interpretability are essential for reliable medical image classifiers. • This study applies weakly supervised segmentation to generate explainable image classifiers. • The weakly supervised Unet inherently explains its image-level predictions at voxel level.
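The core trick described above (a Unet produces a voxel probability map, and global max-pooling collapses it to an image-level prediction so that only image-level labels are needed for training) can be sketched in a few lines. The toy map below stands in for a real network's output:

```python
def global_max_pool(voxel_probs):
    """Image-level probability = maximum voxel probability: one confident
    'tumour' voxel makes the whole image positive. In WSUnet-style
    training, the image-level loss back-propagates through this max into
    the voxel map, which is how localisation is learned from image-level
    labels alone."""
    return max(max(row) for row in voxel_probs)

voxel_map = [[0.05, 0.10],
             [0.92, 0.07]]         # Unet-style voxel probability map
print(global_max_pool(voxel_map))  # → 0.92
```

Because the image-level prediction is literally one voxel of the map, the voxel map is a causally verifiable explanation of the classification, which is the interpretability claim made above.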
Affiliation(s)
- Robert O'Shea
- Department of Cancer Imaging, King's College London, London, UK
- Carolyn Horst
- Department of Radiology, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Daniel Hughes
- Department of Cancer Imaging, King's College London, London, UK
- James Cusack
- Department of Radiology, Liverpool University Hospitals NHS Foundation Trust, Liverpool, UK
- Sophia Tsoka
- Department of Natural and Mathematical Sciences, King's College London, London, UK
- Gary Cook
- King's College London & Guy's and St Thomas' PET Centre, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Vicky Goh
- Department of Radiology, Guy's and St Thomas' NHS Foundation Trust, London, UK

17
Khosravi B, Mickley JP, Rouzrokh P, Taunton MJ, Larson AN, Erickson BJ, Wyles CC. Anonymizing Radiographs Using an Object Detection Deep Learning Algorithm. Radiol Artif Intell 2023; 5:e230085. [PMID: 38074777 PMCID: PMC10698585 DOI: 10.1148/ryai.230085] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2023] [Revised: 08/11/2023] [Accepted: 08/25/2023] [Indexed: 02/02/2024]
Abstract
Radiographic markers contain protected health information that must be removed before public release. This work presents a deep learning algorithm that localizes radiographic markers and selectively removes them to enable de-identified data sharing. The authors annotated 2000 hip and pelvic radiographs to train an object detection computer vision model. Data were split into training, validation, and test sets at the patient level. Extracted markers were then characterized using an image processing algorithm, and potentially useful markers (eg, "L" and "R") without identifying information were retained. The model achieved an area under the precision-recall curve of 0.96 on the internal test set. The de-identification accuracy was 100% (400 of 400), with a de-identification false-positive rate of 1% (eight of 632) and a retention accuracy of 93% (359 of 386) for laterality markers. The algorithm was further validated on an external dataset of chest radiographs, achieving a de-identification accuracy of 96% (221 of 231). After fine-tuning the model on 20 images from the external dataset to investigate the potential for improvement, a 99.6% (230 of 231, P = .04) de-identification accuracy and decreased false-positive rate of 5% (26 of 512) were achieved. These results demonstrate the effectiveness of a two-pass approach in image de-identification. Keywords: Conventional Radiography, Skeletal-Axial, Thorax, Experimental Investigations, Supervised Learning, Transfer Learning, Convolutional Neural Network (CNN) Supplemental material is available for this article. © RSNA, 2023 See also the commentary by Chang and Li in this issue.
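The selective removal step above (redact identifying markers while retaining useful laterality markers such as "L" and "R") can be approximated by a whitelist filter over the texts recognised in each detected marker region. The whitelist contents and function names here are hypothetical illustrations, not the paper's implementation:

```python
# Hypothetical whitelist of marker texts safe to keep (laterality only).
SAFE_MARKERS = {"L", "R", "LEFT", "RIGHT"}

def triage_markers(detected_texts):
    """Split recognised marker texts from an object detector into markers
    to keep (laterality) and markers to redact (potential PHI)."""
    keep, redact = [], []
    for text in detected_texts:
        (keep if text.strip().upper() in SAFE_MARKERS else redact).append(text)
    return keep, redact

keep, redact = triage_markers(["L", "SMITH J", "R", "MRN 123"])
print(keep)    # → ['L', 'R']
print(redact)  # → ['SMITH J', 'MRN 123']
```

The trade-off quoted in the abstract (100% de-identification with 93% laterality retention) is exactly the tension such a filter must balance: a conservative whitelist redacts more PHI but discards more useful markers.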
Affiliation(s)
- Pouria Rouzrokh
- From the Orthopedic Surgery Artificial Intelligence Laboratory, Department of Orthopedic Surgery (B.K., J.P.M., P.R., M.J.T., A.N.L., C.C.W.), Radiology Informatics Laboratory, Department of Radiology (B.K., P.R., B.J.E.), Department of Orthopedic Surgery (M.J.T., A.N.L., C.C.W.), and Department of Clinical Anatomy (C.C.W.), Mayo Clinic, 200 1st St SW, Rochester, MN 55905
- Michael J. Taunton
- A. Noelle Larson
- Bradley J. Erickson
- Cody C. Wyles

18
Su Z, Adam A, Nasrudin MF, Ayob M, Punganan G. Skeletal Fracture Detection with Deep Learning: A Comprehensive Review. Diagnostics (Basel) 2023; 13:3245. [PMID: 37892066 PMCID: PMC10606060 DOI: 10.3390/diagnostics13203245] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2023] [Revised: 10/12/2023] [Accepted: 10/13/2023] [Indexed: 10/29/2023] Open
Abstract
Deep learning models have shown great promise in diagnosing skeletal fractures from X-ray images. However, challenges remain that hinder progress in this field. Firstly, a lack of clear definitions for recognition, classification, detection, and localization tasks hampers the consistent development and comparison of methodologies. The existing reviews often lack technical depth or have limited scope. Additionally, the absence of explainable facilities undermines the clinical application and expert confidence in results. To address these issues, this comprehensive review analyzes and evaluates 40 out of 337 recent papers identified in prestigious databases, including WOS, Scopus, and EI. The objectives of this review are threefold. Firstly, precise definitions are established for the bone fracture recognition, classification, detection, and localization tasks within deep learning. Secondly, each study is summarized based on key aspects such as the bones involved, research objectives, dataset sizes, methods employed, results obtained, and concluding remarks. This process distills the diverse approaches into a generalized processing framework or workflow. Moreover, this review identifies the crucial areas for future research in deep learning models for bone fracture diagnosis. These include enhancing the network interpretability, integrating multimodal clinical information, providing therapeutic schedule recommendations, and developing advanced visualization methods for clinical application. By addressing these challenges, deep learning models can be made more intelligent and specialized in this domain. In conclusion, this review fills the gap in precise task definitions within deep learning for bone fracture diagnosis and provides a comprehensive analysis of the recent research. The findings serve as a foundation for future advancements, enabling improved interpretability, multimodal integration, clinical decision support, and advanced visualization techniques.
Affiliation(s)
- Zhihao Su
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Afzan Adam
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Mohammad Faidzul Nasrudin
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Masri Ayob
- Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Gauthamen Punganan
- Department of Orthopedics and Traumatology, Hospital Raja Permaisuri Bainun, Ipoh 30450, Perak, Malaysia

19
Liu Y, Liu W, Chen H, Xie S, Wang C, Liang T, Yu Y, Liu X. Artificial intelligence versus radiologist in the accuracy of fracture detection based on computed tomography images: a multi-dimensional, multi-region analysis. Quant Imaging Med Surg 2023; 13:6424-6433. [PMID: 37869340 PMCID: PMC10585498 DOI: 10.21037/qims-23-428] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2023] [Accepted: 08/18/2023] [Indexed: 10/24/2023]
Abstract
Background Extremity fractures are a leading cause of death and disability, especially in the elderly, and avulsion fractures are among the most commonly missed diagnoses, with delayed diagnosis leading to higher litigation rates. This study therefore evaluates the diagnostic efficiency of an artificial intelligence (AI) model before and after optimization based on computed tomography (CT) images and compares it with that of radiologists, especially for avulsion fractures. Methods Digital radiography (DR) and CT images of adult limb trauma in our hospital from 2017 to 2020 were retrospectively collected, with or without one or more fractures of the shoulder, elbow, wrist, hand, hip, knee, ankle, and foot. Fracture labeling was based on visualization of the fracture on the corresponding CT images. After training the pre-optimized AI model, the diagnostic performance of the pre-optimized AI model, the optimized AI model, and the initial radiological reports was evaluated. At the lesion level, the detection rates of avulsion and non-avulsion fractures were analyzed, whereas at the case level, accuracy, sensitivity, and specificity were compared. Results The total dataset (1,035 cases) was divided into a training set (n=675), a validation set (n=169), and a test set (n=191) in a balanced joint distribution. At the lesion level, the detection rates of avulsion fractures (57.89% vs. 35.09%, P=0.004) and non-avulsion fractures (85.64% vs. 71.29%, P<0.001) by the optimized AI model were significantly higher than those by the pre-optimized AI model. The average precision (AP) of the optimized AI model for all lesions was higher than that of the pre-optimized AI model (0.582 vs. 0.425). The detection rate of avulsion fractures by the optimized AI model was significantly higher than that by radiologists (57.89% vs. 29.82%, P=0.002).
For non-avulsion fractures, there was no significant difference in detection rate between the optimized AI model and radiologists (P=0.853). At the case level, the accuracy (86.40% vs. 71.93%, P<0.001) and sensitivity (87.29% vs. 73.48%, P<0.001) of the optimized AI model were significantly higher than those of the pre-optimized AI model. There was no statistically significant difference in accuracy, sensitivity, or specificity between the optimized AI model and the radiologists (P>0.05). Conclusions The optimized AI model improves diagnostic efficacy in detecting extremity fractures on radiographs, and it is significantly better than radiologists at detecting avulsion fractures, which may be helpful in the clinical practice of orthopedic emergency care.
Affiliation(s)
- Yunxia Liu
- Department of Radiology, The Third Medical Center of Chinese PLA General Hospital, Beijing, China
- Weifang Liu
- Department of Radiology, Civil Aviation General Hospital, Beijing, China
- Sheng Xie
- Department of Radiology, China-Japan Friendship Hospital, Beijing, China
- Ce Wang
- Department of Radiology, China-Japan Friendship Hospital, Beijing, China
- Tian Liang
- Department of Radiology, China-Japan Friendship Hospital, Beijing, China
20
Lonsdale H, Gray GM, Ahumada LM, Matava CT. Machine Vision and Image Analysis in Anesthesia: Narrative Review and Future Prospects. Anesth Analg 2023; 137:830-840. [PMID: 37712476] [DOI: 10.1213/ane.0000000000006679]
Abstract
Machine vision describes the use of artificial intelligence to interpret, analyze, and derive predictions from image or video data. Machine vision-based techniques are already in clinical use in radiology, ophthalmology, and dermatology, where some applications currently equal or exceed the performance of specialty physicians in areas of image interpretation. While machine vision in anesthesia has many potential applications, its development remains in its infancy in our specialty. Early research for machine vision in anesthesia has focused on automated recognition of anatomical structures during ultrasound-guided regional anesthesia or line insertion; recognition of the glottic opening and vocal cords during video laryngoscopy; prediction of the difficult airway using facial images; and clinical alerts for endobronchial intubation detected on chest radiograph. Current machine vision applications measuring the distance between endotracheal tube tip and carina have demonstrated noninferior performance compared to board-certified physicians. The performance and potential uses of machine vision for anesthesia will only grow with advances in the underlying machine vision algorithms developed outside of medicine, such as convolutional neural networks and transfer learning. This article summarizes recently published works of interest, provides a brief overview of techniques used to create machine vision applications, explains frequently used terms, and discusses challenges the specialty will encounter as we embrace the advantages that this technology may bring to future clinical practice and patient care. As machine vision emerges onto the clinical stage, it is critically important that anesthesiologists are prepared to confidently assess which of these devices are safe, appropriate, and bring added value to patient care.
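As a primer on the convolutional neural networks this review mentions, the core operation of a CNN layer is a small sliding dot product between a kernel and image patches. A self-contained sketch (valid-mode cross-correlation on a toy image; all values invented):

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation: the building block of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw)
            )
    return out

# A horizontal-difference kernel responding to the intensity edge in a toy 4x4 "image"
image = [[0, 0, 1, 1]] * 4
edge = conv2d(image, [[1, -1]])  # strongest response where the intensity jumps
```

A trained CNN learns many such kernels from data rather than hand-specifying them.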
Affiliation(s)
- Hannah Lonsdale
- Division of Pediatric Anesthesiology, Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, Tennessee
- Geoffrey M Gray
- Center for Pediatric Data Science and Analytics Methodology, Johns Hopkins All Children's Hospital, St Petersburg, Florida
- Luis M Ahumada
- Center for Pediatric Data Science and Analytics Methodology, Johns Hopkins All Children's Hospital, St Petersburg, Florida
- Clyde T Matava
- Department of Anesthesia and Pain Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Anesthesiology and Pain Medicine, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
21
Hsieh C, Nobre IB, Sousa SC, Ouyang C, Brereton M, Nascimento JC, Jorge J, Moreira C. MDF-Net for abnormality detection by fusing X-rays with clinical data. Sci Rep 2023; 13:15873. [PMID: 37741833] [PMCID: PMC10517966] [DOI: 10.1038/s41598-023-41463-0]
Abstract
This study investigates the effects of including patients' clinical information on the performance of deep learning (DL) classifiers for disease location in chest X-ray images. Although current classifiers achieve high performance using chest X-ray images alone, consultations with practicing radiologists indicate that clinical data is highly informative and essential for interpreting medical images and making proper diagnoses. In this work, we propose a novel architecture consisting of two fusion methods that enable the model to simultaneously process patients' clinical data (structured data) and chest X-rays (image data). Since these data modalities are in different dimensional spaces, we propose a spatial arrangement strategy, spatialization, to facilitate the multimodal learning process in a Mask R-CNN model. We performed an extensive experimental evaluation using MIMIC-Eye, a dataset comprising different modalities: MIMIC-CXR (chest X-ray images), MIMIC IV-ED (patients' clinical data), and REFLACX (annotations of disease locations in chest X-rays). Results show that incorporating patients' clinical data in a DL model together with the proposed fusion methods improves the disease localization in chest X-rays by 12% in terms of Average Precision compared to a standard Mask R-CNN using chest X-rays alone. Further ablation studies also emphasize the importance of multimodal DL architectures and the incorporation of patients' clinical data in disease localization. In the interest of fostering scientific reproducibility, the architecture proposed within this investigation has been made publicly accessible (https://github.com/ChihchengHsieh/multimodal-abnormalities-detection).
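The fusion idea, processing image-derived features jointly with structured clinical data, can be illustrated at its simplest as late fusion by concatenation followed by a linear scorer. A toy sketch with invented feature vectors and weights; this is not the paper's Mask R-CNN architecture:

```python
import math

def fuse_and_score(image_feats, clinical_feats, weights, bias):
    """Late fusion: concatenate the two modality vectors, then apply a linear
    scorer and a sigmoid to obtain an abnormality probability."""
    fused = list(image_feats) + list(clinical_feats)
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Toy example: 3 image features plus 2 (scaled) clinical features; all numbers invented
p = fuse_and_score([0.2, -0.1, 0.4], [1.0, 0.5],
                   weights=[0.5, 0.1, 0.3, 0.8, -0.2], bias=-0.5)
```

Real multimodal models learn the fusion jointly; the paper's spatialization strategy goes further by arranging the clinical data so a detection network can consume it.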
Affiliation(s)
- Chun Ouyang
- Queensland University of Technology, Brisbane, Australia
- Jacinto C Nascimento
- Institute for Systems and Robotics, Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal
- Joaquim Jorge
- Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal
- Catarina Moreira
- Queensland University of Technology, Brisbane, Australia
- Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal
- Human Technology Institute, University of Technology Sydney, Ultimo, Australia
- INESC-ID, Lisbon, Portugal
22
Horry MJ, Chakraborty S, Pradhan B, Paul M, Zhu J, Loh HW, Barua PD, Acharya UR. Development of Debiasing Technique for Lung Nodule Chest X-ray Datasets to Generalize Deep Learning Models. Sensors (Basel) 2023; 23:6585. [PMID: 37514877] [PMCID: PMC10385599] [DOI: 10.3390/s23146585]
Abstract
Screening programs for early lung cancer diagnosis are uncommon, primarily due to the challenge of reaching at-risk patients located in rural areas far from medical facilities. To overcome this obstacle, a comprehensive approach is needed that combines mobility, low cost, speed, accuracy, and privacy. One potential solution lies in combining the chest X-ray imaging mode with federated deep learning, ensuring that no single data source can bias the model adversely. This study presents a pre-processing pipeline designed to debias chest X-ray images, thereby enhancing internal classification and external generalization. The pipeline employs a pruning mechanism to train a deep learning model for nodule detection, utilizing the most informative images from a publicly available lung nodule X-ray dataset. Histogram equalization is used to remove systematic differences in image brightness and contrast. Model training is then performed using combinations of lung field segmentation, close cropping, and rib/bone suppression. The resulting deep learning models, generated through this pre-processing pipeline, demonstrate successful generalization on an independent lung nodule dataset. By eliminating confounding variables in chest X-ray images and suppressing signal noise from the bone structures, the proposed deep learning lung nodule detection algorithm achieves an external generalization accuracy of 89%. This approach paves the way for the development of a low-cost and accessible deep learning-based clinical system for lung cancer screening.
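Histogram equalization, used in the pipeline above to remove systematic brightness and contrast differences between images, remaps each intensity through the normalized cumulative histogram. A minimal pure-Python sketch for a small grayscale array:

```python
def equalize(gray, levels=256):
    """Histogram-equalize a 2D grayscale image (values in 0..levels-1) by
    remapping intensities through the normalized cumulative histogram."""
    flat = [v for row in gray for v in row]
    n = len(flat)
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    # Cumulative histogram
    cdf, running = [], 0
    for c in hist:
        running += c
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    # Lookup table stretching the occupied intensity range to 0..levels-1
    lut = [round((c - cdf_min) / max(n - cdf_min, 1) * (levels - 1)) for c in cdf]
    return [[lut[v] for v in row] for row in gray]

# A low-contrast 2x2 toy image is stretched across the full intensity range
out = equalize([[52, 55], [61, 59]])
```

Production pipelines typically use a library routine (e.g., OpenCV's `equalizeHist`), but the mapping is the same idea.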
Affiliation(s)
- Michael J Horry
- Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia
- IBM Australia Limited, Sydney, NSW 2000, Australia
- Subrata Chakraborty
- Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Faculty of Science, Agriculture, Business and Law, University of New England, Armidale, NSW 2351, Australia
- Biswajeet Pradhan
- Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Earth Observation Center, Institute of Climate Change, Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
- Manoranjan Paul
- Machine Vision and Digital Health (MaViDH), School of Computing and Mathematics, Charles Sturt University, Bathurst, NSW 2795, Australia
- Jing Zhu
- Department of Radiology, Westmead Hospital, Westmead, NSW 2145, Australia
- Hui Wen Loh
- School of Science and Technology, Singapore University of Social Sciences, Singapore 599494, Singapore
- Prabal Datta Barua
- Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Faculty of Science, Agriculture, Business and Law, University of New England, Armidale, NSW 2351, Australia
- Cogninet Brain Team, Cogninet Australia, Sydney, NSW 2010, Australia
- School of Business (Information Systems), Faculty of Business, Education, Law & Arts, University of Southern Queensland, Toowoomba, QLD 4350, Australia
- U Rajendra Acharya
- School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, QLD 4300, Australia
23
Chen H, Liu Y, Balabani S, Hirayama R, Huang J. Machine Learning in Predicting Printable Biomaterial Formulations for Direct Ink Writing. Research (Wash DC) 2023; 6:0197. [PMID: 37469394] [PMCID: PMC10353544] [DOI: 10.34133/research.0197]
Abstract
Three-dimensional (3D) printing is emerging as a transformative technology for biomedical engineering. The 3D printed product can be patient-specific by allowing customizability and direct control of the architecture. The trial-and-error approach currently used for developing the composition of printable inks is time- and resource-consuming due to the increasing number of variables requiring expert knowledge. Artificial intelligence has the potential to reshape the ink development process by forming a predictive model for printability from experimental data. In this paper, we constructed machine learning (ML) algorithms including decision tree, random forest (RF), and deep learning (DL) to predict the printability of biomaterials. A total of 210 formulations including 16 different bioactive and smart materials and 4 solvents were 3D printed, and their printability was assessed. All ML methods were able to learn and predict the printability of a variety of inks based on their biomaterial formulations. In particular, the RF algorithm has achieved the highest accuracy (88.1%), precision (90.6%), and F1 score (87.0%), indicating the best overall performance out of the 3 algorithms, while DL has the highest recall (87.3%). Furthermore, the ML algorithms have predicted the printability window of biomaterials to guide the ink development. The printability map generated with DL has finer granularity than other algorithms. ML has proven to be an effective and novel strategy for developing biomaterial formulations with desired 3D printability for biomedical engineering applications.
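A printability map of the kind described above can be generated by sweeping a trained classifier over a grid of formulation parameters. A toy sketch in which a hand-written rule with made-up thresholds stands in for the trained RF/DL model:

```python
def printability_map(conc_a_values, conc_b_values, is_printable):
    """Evaluate a printability predicate over a 2D formulation grid, returning
    a map of 'P' (printable) / '.' (not printable) cells."""
    return [
        ["P" if is_printable(a, b) else "." for b in conc_b_values]
        for a in conc_a_values
    ]

# Hypothetical rule standing in for a trained classifier: printable when the
# combined solids loading falls inside a window (thresholds are invented).
rule = lambda a, b: 8.0 <= a + 2.0 * b <= 14.0
grid = printability_map([2, 4, 6, 8], [1, 2, 3, 4], rule)
```

In the paper, `rule` would be the prediction function of the trained decision tree, RF, or DL model, and the grid resolution controls the granularity of the resulting printability window.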
Affiliation(s)
- Hongyi Chen
- Department of Mechanical Engineering, University College London, London, UK
- Department of Computer Science, University College London, London, UK
- Yuanchang Liu
- Department of Mechanical Engineering, University College London, London, UK
- Stavroula Balabani
- Department of Mechanical Engineering, University College London, London, UK
- Wellcome-EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London, London, UK
- Ryuji Hirayama
- Department of Computer Science, University College London, London, UK
- Jie Huang
- Department of Mechanical Engineering, University College London, London, UK
24
Zhang LH, Ranganath R. Robustness to Spurious Correlations Improves Semantic Out-of-Distribution Detection. Proceedings of the AAAI Conference on Artificial Intelligence 2023; 37:15305-15312. [PMID: 38464961] [PMCID: PMC10923583] [DOI: 10.1609/aaai.v37i12.26785]
Abstract
Methods which utilize the outputs or feature representations of predictive models have emerged as promising approaches for out-of-distribution (ood) detection of image inputs. However, these methods struggle to detect ood inputs that share nuisance values (e.g. background) with in-distribution inputs. The detection of shared-nuisance out-of-distribution (sn-ood) inputs is particularly relevant in real-world applications, as anomalies and in-distribution inputs tend to be captured in the same settings during deployment. In this work, we provide a possible explanation for sn-ood detection failures and propose nuisance-aware ood detection to address them. Nuisance-aware ood detection substitutes a classifier trained via Empirical Risk Minimization (erm) and cross-entropy loss with one that (1) is trained under a distribution where the nuisance-label relationship is broken and (2) yields representations that are independent of the nuisance under this distribution, both marginally and conditioned on the label. We can train a classifier to achieve these objectives using Nuisance-Randomized Distillation (NURD), an algorithm developed for ood generalization under spurious correlations. Output- and feature-based nuisance-aware ood detection perform substantially better than their original counterparts, succeeding even when detection based on domain generalization algorithms fails to improve performance.
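Output-based OOD detection, the family of methods this work builds on, is commonly implemented as a threshold on the maximum softmax probability: confidently classified inputs are treated as in-distribution, flat predictive distributions as OOD. A minimal sketch (the threshold value is arbitrary):

```python
import math

def max_softmax_score(logits):
    """Maximum softmax probability, a common output-based OOD score:
    low scores suggest the input may be out-of-distribution."""
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    return max(exps) / sum(exps)

def is_ood(logits, threshold=0.7):
    """Flag an input as OOD when the classifier's top-class probability is low."""
    return max_softmax_score(logits) < threshold

confident = [4.0, 0.1, -1.0]   # peaked distribution -> likely in-distribution
uncertain = [0.2, 0.1, 0.0]    # nearly flat -> flagged as OOD
```

The paper's point is that such scores inherit the classifier's reliance on nuisances; nuisance-aware training changes the classifier the score is computed from, not the scoring rule itself.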
Affiliation(s)
- Rajesh Ranganath
- Center for Data Science, New York University
- Courant Institute of Mathematical Sciences, New York University
25
Kocak B, Baessler B, Bakas S, Cuocolo R, Fedorov A, Maier-Hein L, Mercaldo N, Müller H, Orlhac F, Pinto Dos Santos D, Stanzione A, Ugga L, Zwanenburg A. CheckList for EvaluAtion of Radiomics research (CLEAR): a step-by-step reporting guideline for authors and reviewers endorsed by ESR and EuSoMII. Insights Imaging 2023; 14:75. [PMID: 37142815] [PMCID: PMC10160267] [DOI: 10.1186/s13244-023-01415-8]
Abstract
Even though radiomics can hold great potential for supporting clinical decision-making, its current use is mostly limited to academic research, without applications in routine clinical practice. The workflow of radiomics is complex due to several methodological steps and nuances, which often leads to inadequate reporting and evaluation, and poor reproducibility. Available reporting guidelines and checklists for artificial intelligence and predictive modeling include relevant good practices, but they are not tailored to radiomic research. There is a clear need for a complete radiomics checklist for study planning, manuscript writing, and evaluation during the review process to facilitate the repeatability and reproducibility of studies. We here present a documentation standard for radiomic research that can guide authors and reviewers. Our motivation is to improve the quality and reliability and, in turn, the reproducibility of radiomic research. We name the checklist CLEAR (CheckList for EvaluAtion of Radiomics research), to convey the idea of being more transparent. With its 58 items, the CLEAR checklist should be considered a standardization tool providing the minimum requirements for presenting clinical radiomics research. In addition to a dynamic online version of the checklist, a public repository has also been set up to allow the radiomics community to comment on the checklist items and adapt the checklist for future versions. Prepared and revised by an international group of experts using a modified Delphi method, we hope the CLEAR checklist will serve well as a single and complete scientific documentation tool for authors and reviewers to improve the radiomics literature.
Affiliation(s)
- Burak Kocak
- Department of Radiology, University of Health Sciences, Basaksehir Cam and Sakura City Hospital, Basaksehir, Istanbul, 34480, Turkey
- Bettina Baessler
- Institute of Diagnostic and Interventional Radiology, University Hospital Würzburg, Würzburg, Germany
- Spyridon Bakas
- Center for Artificial Intelligence for Integrated Diagnostics (AI2D) & Center for Biomedical Image Computing & Analytics (CBICA), University of Pennsylvania, Philadelphia, PA, USA
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Renato Cuocolo
- Department of Medicine, Surgery, and Dentistry, University of Salerno, Baronissi, Italy
- Andrey Fedorov
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Lena Maier-Hein
- Division of Intelligent Medical Systems, German Cancer Research Center, Heidelberg, Germany
- National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Nathaniel Mercaldo
- Institute for Technology Assessment, Massachusetts General Hospital, Boston, MA, USA
- Department of Radiology, Massachusetts General Hospital, Boston, MA, USA
- Henning Müller
- University of Applied Sciences of Western Switzerland (HES-SO Valais), Valais, Switzerland
- Department of Radiology and Medical Informatics, University of Geneva (UniGe), Geneva, Switzerland
- Fanny Orlhac
- Laboratoire d'Imagerie Translationnelle en Oncologie (LITO)-U1288, Institut Curie, Inserm, Université PSL, Orsay, France
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Institute for Diagnostic and Interventional Radiology, Goethe-University Frankfurt Am Main, Frankfurt, Germany
- Arnaldo Stanzione
- Department of Advanced Biomedical Sciences, University of Naples "Federico II", Naples, Italy
- Lorenzo Ugga
- Department of Advanced Biomedical Sciences, University of Naples "Federico II", Naples, Italy
- Alex Zwanenburg
- OncoRay-National Center for Radiation Research in Oncology, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden-Rossendorf, Dresden, Germany
- National Center for Tumor Diseases (NCT), Partner Site Dresden, Dresden, Germany
- German Cancer Research Center (DKFZ), Heidelberg, Germany
26
Ahlquist KD, Sugden LA, Ramachandran S. Enabling interpretable machine learning for biological data with reliability scores. PLoS Comput Biol 2023; 19:e1011175. [PMID: 37235578] [PMCID: PMC10249903] [DOI: 10.1371/journal.pcbi.1011175]
Abstract
Machine learning tools have proven useful across biological disciplines, allowing researchers to draw conclusions from large datasets, and opening up new opportunities for interpreting complex and heterogeneous biological data. Alongside the rapid growth of machine learning, there have also been growing pains: some models that appear to perform well have later been revealed to rely on features of the data that are artifactual or biased; this feeds into the general criticism that machine learning models are designed to optimize model performance over the creation of new biological insights. A natural question arises: how do we develop machine learning models that are inherently interpretable or explainable? In this manuscript, we describe the SWIF(r) reliability score (SRS), a method building on the SWIF(r) generative framework that reflects the trustworthiness of the classification of a specific instance. The concept of the reliability score has the potential to generalize to other machine learning methods. We demonstrate the utility of the SRS when faced with common challenges in machine learning including: 1) an unknown class present in testing data that was not present in training data, 2) systemic mismatch between training and testing data, and 3) instances of testing data that have missing values for some attributes. We explore these applications of the SRS using a range of biological datasets, from agricultural data on seed morphology, to 22 quantitative traits in the UK Biobank, and population genetic simulations and 1000 Genomes Project data. With each of these examples, we demonstrate how the SRS can allow researchers to interrogate their data and training approach thoroughly, and to pair their domain-specific knowledge with powerful machine-learning frameworks. We also compare the SRS to related tools for outlier and novelty detection, and find that it has comparable performance, with the advantage of being able to operate when some data are missing. 
The SRS, and the broader discussion of interpretable scientific machine learning, will aid researchers in the biological machine learning space as they seek to harness the power of machine learning without sacrificing rigor and biological insight.
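The idea of a generative reliability score can be sketched with a deliberately simplified stand-in: score an instance by its best log-likelihood under per-class generative models, skipping attributes with missing values. The Gaussian class models below are an assumption for illustration, not the actual SWIF(r)-based SRS:

```python
import math

def gaussian_log_pdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std**2) - (x - mean)**2 / (2 * std**2)

def reliability_score(instance, class_models):
    """Reliability-score-style check: the instance's best log-likelihood under
    the per-class generative models (independent Gaussians per attribute).
    Missing attributes (None) are skipped, which is what lets a generative
    score still operate when some data are missing."""
    best = -math.inf
    for params in class_models.values():
        ll = sum(
            gaussian_log_pdf(x, m, s)
            for x, (m, s) in zip(instance, params)
            if x is not None
        )
        best = max(best, ll)
    return best

# Toy class models: two classes, two attributes each (all parameters invented)
models = {"class_a": [(0.0, 1.0), (5.0, 1.0)], "class_b": [(3.0, 1.0), (0.0, 1.0)]}
typical = reliability_score([0.1, 5.2], models)    # fits class_a well
outlier = reliability_score([40.0, -40.0], models) # fits no class -> low score
```

A low score flags the classification as untrustworthy, covering the failure modes the paper studies: unknown classes, train/test mismatch, and missing values.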
Affiliation(s)
- K. D. Ahlquist
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, Rhode Island, United States of America
- Lauren A. Sugden
- Department of Mathematics and Computer Science, Duquesne University, Pittsburgh, Pennsylvania, United States of America
- Sohini Ramachandran
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, Rhode Island, United States of America
- Data Science Initiative, Brown University, Providence, Rhode Island, United States of America
27
Van Calster B, Steyerberg EW, Wynants L, van Smeden M. There is no such thing as a validated prediction model. BMC Med 2023; 21:70. [PMID: 36829188] [PMCID: PMC9951847] [DOI: 10.1186/s12916-023-02779-w]
Abstract
BACKGROUND Clinical prediction models should be validated before implementation in clinical practice. But is favorable performance at internal validation or one external validation sufficient to claim that a prediction model works well in the intended clinical context? MAIN BODY We argue to the contrary because (1) patient populations vary, (2) measurement procedures vary, and (3) populations and measurements change over time. Hence, we have to expect heterogeneity in model performance between locations and settings, and across time. It follows that prediction models are never truly validated. This does not imply that validation is not important. Rather, the current focus on developing new models should shift to a focus on more extensive, well-conducted, and well-reported validation studies of promising models. CONCLUSION Principled validation strategies are needed to understand and quantify heterogeneity, monitor performance over time, and update prediction models when appropriate. Such strategies will help to ensure that prediction models stay up-to-date and safe to support clinical decision-making.
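One concrete quantity that the validation and monitoring studies argued for here routinely check is calibration-in-the-large: the gap between the observed event rate and the mean predicted risk in a new setting. A minimal sketch with invented data:

```python
def calibration_in_the_large(predicted_risks, outcomes):
    """Observed event rate minus mean predicted risk on new data. A value near 0
    means the model is calibrated on average in this setting; a positive value
    means the model underestimates risk here (heterogeneity in action)."""
    observed = sum(outcomes) / len(outcomes)
    expected = sum(predicted_risks) / len(predicted_risks)
    return observed - expected

# Hypothetical new-setting data: the model predicts ~10% risk on average,
# but 2 of these 10 patients actually had the event.
gap = calibration_in_the_large([0.1] * 10, [1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
```

Tracking this gap over time and across sites is one simple instance of the monitoring and updating strategies the authors call for.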
Affiliation(s)
- Ben Van Calster
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- EPI-Center, KU Leuven, Leuven, Belgium
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, Netherlands
- Laure Wynants
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- EPI-Center, KU Leuven, Leuven, Belgium
- Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
- Maarten van Smeden
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Universiteitsweg 100, 3584 CG, Utrecht, Netherlands
28
Hongbiao S, Shaochun X, Xiang W, YuRun T, Yang L, Mingzi Z, Hua Y, Keyang Z, Chi-Cheng F, Qu F, Pengchen G, Yi X, Shiyuan L. Comparison and verification of two deep learning models for the detection of chest CT rib fractures. Acta Radiol 2023; 64:542-551. [PMID: 35300519] [DOI: 10.1177/02841851221083519]
Abstract
BACKGROUND A high false-positive rate remains a technical obstacle that keeps deep-learning-based diagnostic tools from assisting in the diagnosis of rib fractures in routine radiological practice. PURPOSE To examine the performance of two versions of deep-learning-based software tools in aiding radiologists in diagnosing rib fractures on chest computed tomography (CT) images. MATERIAL AND METHODS In total, 123 patients (708 rib fractures) were included in this retrospective study. Two groups of radiologists with different experience levels retrospectively reviewed images for rib fractures in the concurrent mode aided by RibFrac-High Sensitivity (HS) and RibFrac-High Precision (HP). We compared their diagnostic performance against the reference standard in terms of sensitivity and positive predictive value (PPV). RESULTS On a per-patient basis, RibFrac-HS exhibited a higher sensitivity compared with RibFrac-HP (mean difference=0.051, 95% CI=0.012-0.090; P=0.011), whereas the latter significantly outperformed the former in terms of the PPV (mean difference=0.273, 95% CI=0.238-0.308; P<0.0001). The use of RibFrac-HP significantly improved the junior and the senior groups' sensitivities by 0.058 (95% CI=0.033-0.083; P<0.0001) and 0.058 (95% CI=0.034-0.081; P<0.0001), respectively, and decreased the diagnosis time by 206 s (95% CI=191-220; P<0.0001) and 79 s (95% CI=67-92; P<0.0001), respectively, when compared with no software assistance. CONCLUSION The sensitivity and efficiency of radiologists in identifying rib fractures can be improved by using RibFrac-HS and/or RibFrac-HP. With an added module for false-positive suppression, RibFrac-HP maintains the sensitivity and increases the PPV in fracture detection compared with RibFrac-HS.
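The HS/HP contrast above is essentially an operating-point choice: suppressing low-confidence detections raises PPV at some cost in sensitivity. A toy sketch in which all detections and thresholds are invented:

```python
def detection_metrics(scored_detections, n_true_fractures, threshold):
    """Sensitivity and PPV of a detector at a given confidence threshold.
    Each detection is (confidence, is_true_positive). Raising the threshold
    suppresses false positives (PPV up) at the cost of sensitivity."""
    kept = [d for d in scored_detections if d[0] >= threshold]
    tp = sum(1 for _, is_tp in kept if is_tp)
    fp = len(kept) - tp
    sensitivity = tp / n_true_fractures
    ppv = tp / (tp + fp) if kept else 0.0
    return sensitivity, ppv

# Hypothetical candidate detections: (confidence, is_true_positive)
dets = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False), (0.4, False)]
lo_sens, lo_ppv = detection_metrics(dets, n_true_fractures=4, threshold=0.3)   # "HS"-like
hi_sens, hi_ppv = detection_metrics(dets, n_true_fractures=4, threshold=0.75)  # "HP"-like
```

A dedicated false-positive-suppression module, as in RibFrac-HP, aims to shift this curve rather than just move along it, raising PPV without the full sensitivity penalty.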
Affiliation(s)
- Sun Hongbiao
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai, PR China
- Xu Shaochun
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai, PR China
- Wang Xiang
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai, PR China
- Tang YuRun
- Company 13, College of Basic Medical Sciences, Naval Medical University, Shanghai, PR China
- Lu Yang
- Shanghai Aitrox Technology Corporation Limited, Shanghai, PR China
- Zhang Mingzi
- Shanghai Aitrox Technology Corporation Limited, Shanghai, PR China
- Yang Hua
- Shanghai Aitrox Technology Corporation Limited, Shanghai, PR China
- Zhao Keyang
- Shanghai Aitrox Technology Corporation Limited, Shanghai, PR China
- Fu Chi-Cheng
- Shanghai Aitrox Technology Corporation Limited, Shanghai, PR China
- Fang Qu
- Shanghai Aitrox Technology Corporation Limited, Shanghai, PR China
- Gu Pengchen
- Shanghai Aitrox Technology Corporation Limited, Shanghai, PR China
- Xiao Yi
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai, PR China
- Liu Shiyuan
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai, PR China
29
Geng EA, Cho BH, Valliani AA, Arvind V, Patel AV, Cho SK, Kim JS, Cagle PJ. Development of a machine learning algorithm to identify total and reverse shoulder arthroplasty implants from X-ray images. J Orthop 2023; 35:74-78. [PMID: 36411845] [PMCID: PMC9674869] [DOI: 10.1016/j.jor.2022.11.004]
Abstract
Introduction Demand for total shoulder arthroplasty (TSA) has risen significantly and is projected to continue growing. From 2012 to 2017, the incidence of reverse total shoulder arthroplasty (rTSA) rose from 7.3 cases per 100,000 to 19.3 per 100,000, while anatomical TSA grew from 9.5 cases per 100,000 to 12.5 per 100,000. Failure to identify implants in a timely manner can increase operative time, cost, and risk of complications. Several machine learning models have been developed to perform medical image analysis, but they have not been widely applied in shoulder surgery. The authors developed a machine learning model to identify shoulder implant manufacturer and type from anterior-posterior X-ray images. Methods The model deployed was a convolutional neural network (CNN), an architecture widely used in computer vision tasks. A total of 696 radiographs were obtained from a single institution; 70% were used to train the model, and evaluation was performed on the remaining 30%. Results On the evaluation set, the model performed with an overall accuracy of 93.9%, with positive predictive value, sensitivity, and F1 scores of 94% across 10 different implant types (4 reverse, 6 anatomical). Average identification time was 0.110 s per implant. Conclusion This proof-of-concept study demonstrates that machine learning can assist with preoperative planning and improve cost-efficiency in shoulder surgery.
Affiliation(s)
- Eric A. Geng
- Department of Orthopaedic Surgery, Mount Sinai Health System, New York, NY, 10029, USA
- Brian H. Cho
- Department of Orthopaedic Surgery, Mount Sinai Health System, New York, NY, 10029, USA
- Aly A. Valliani
- Department of Orthopaedic Surgery, Mount Sinai Health System, New York, NY, 10029, USA
- Varun Arvind
- Department of Orthopaedic Surgery, Mount Sinai Health System, New York, NY, 10029, USA
- Akshar V. Patel
- Department of Orthopaedic Surgery, Mount Sinai Health System, New York, NY, 10029, USA
- Samuel K. Cho
- Department of Orthopaedic Surgery, Mount Sinai Health System, New York, NY, 10029, USA
- Jun S. Kim
- Department of Orthopaedic Surgery, Mount Sinai Health System, New York, NY, 10029, USA
- Paul J. Cagle
- Department of Orthopaedic Surgery, Mount Sinai Health System, New York, NY, 10029, USA
Collapse
|
30
|
Hamdan S, Love BC, von Polier GG, Weis S, Schwender H, Eickhoff SB, Patil KR. Confound-leakage: confound removal in machine learning leads to leakage. Gigascience 2022; 12:giad071. [PMID: 37776368 PMCID: PMC10541796 DOI: 10.1093/gigascience/giad071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2023] [Revised: 06/01/2023] [Accepted: 08/17/2023] [Indexed: 10/02/2023] Open
Abstract
BACKGROUND Machine learning (ML) approaches are a crucial component of modern data analysis in many fields, including epidemiology and medicine. Nonlinear ML methods often achieve accurate predictions, for instance, in personalized medicine, as they are capable of modeling complex relationships between features and the target. Problematically, ML models and their predictions can be biased by confounding information present in the features. To remove this spurious signal, researchers often employ featurewise linear confound regression (CR). While this is considered a standard approach for dealing with confounding, possible pitfalls of using CR in ML pipelines are not fully understood. RESULTS We provide new evidence that, contrary to general expectations, linear confound regression can increase the risk of confounding when combined with nonlinear ML approaches. Using a simple framework that uses the target as a confound, we show that information leaked via CR can inflate null or moderate effects to near-perfect prediction. By shuffling the features, we provide evidence that this increase is indeed due to confound-leakage and not due to revealing of information. We then demonstrate the danger of confound-leakage in a real-world clinical application, where the accuracy of predicting attention-deficit/hyperactivity disorder from speech-derived features is overestimated when depression is used as a confound. CONCLUSIONS As shown, mishandling or even amplifying confounding effects when building ML models due to confound-leakage can lead to untrustworthy, biased, and unfair predictions. Our exposé of the confound-leakage pitfall, together with the guidelines we provide for dealing with it, can help researchers build more robust and trustworthy ML models.
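Featurewise linear confound regression, the standard procedure this paper analyzes, residualizes each feature on the confound. The residuals are exactly linearly uncorrelated with the confound, yet, as the paper shows, they can still leak confound information to a nonlinear learner. A minimal NumPy sketch on synthetic data (not the study's pipeline):

```python
import numpy as np

def confound_regression(X, c):
    """Residualize each column of X on confound c (with intercept):
    the standard featurewise linear CR step."""
    C = np.column_stack([np.ones_like(c), c])      # design matrix [1, c]
    beta, *_ = np.linalg.lstsq(C, X, rcond=None)   # featurewise OLS fits
    return X - C @ beta                            # residuals

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))   # pure-noise features
c = rng.normal(size=200)        # confound
Xr = confound_regression(X, c)

# Linear association with the confound is removed exactly...
corrs = [np.corrcoef(Xr[:, j], c)[0, 1] for j in range(X.shape[1])]
# ...but if c were the target itself, a nonlinear model fit on Xr could
# still recover it -- the confound-leakage pitfall described above.
```

The zero-correlation guarantee is a property of least squares (residuals are orthogonal to the design matrix), which is exactly why the leakage is counterintuitive: the linear check passes while nonlinear structure remains.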
Collapse
Affiliation(s)
- Sami Hamdan
- Institute of Neuroscience and Medicine, Brain and Behaviour (INM-7), Forschungszentrum Jülich, 52428 Jülich, Germany
- Institute of Systems Neuroscience, Medical Faculty, Heinrich-Heine University Düsseldorf, 40225 Düsseldorf, Germany
| | - Bradley C Love
- Department of Experimental Psychology, University College London, WC1H 0AP London, UK
- The Alan Turing Institute, London NW1 2DB, UK
- European Lab for Learning & Intelligent Systems (ELLIS), WC1E 6BT, London, UK
| | - Georg G von Polier
- Institute of Neuroscience and Medicine, Brain and Behaviour (INM-7), Forschungszentrum Jülich, 52428 Jülich, Germany
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, University Hospital Frankfurt, 60528 Frankfurt, Germany
- Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, RWTH Aachen University, 52074 Aachen, Germany
| | - Susanne Weis
- Institute of Neuroscience and Medicine, Brain and Behaviour (INM-7), Forschungszentrum Jülich, 52428 Jülich, Germany
- Institute of Systems Neuroscience, Medical Faculty, Heinrich-Heine University Düsseldorf, 40225 Düsseldorf, Germany
| | - Holger Schwender
- Institute of Mathematics, Heinrich-Heine University Düsseldorf, 40225 Düsseldorf, Germany
| | - Simon B Eickhoff
- Institute of Neuroscience and Medicine, Brain and Behaviour (INM-7), Forschungszentrum Jülich, 52428 Jülich, Germany
- Institute of Systems Neuroscience, Medical Faculty, Heinrich-Heine University Düsseldorf, 40225 Düsseldorf, Germany
| | - Kaustubh R Patil
- Institute of Neuroscience and Medicine, Brain and Behaviour (INM-7), Forschungszentrum Jülich, 52428 Jülich, Germany
- Institute of Systems Neuroscience, Medical Faculty, Heinrich-Heine University Düsseldorf, 40225 Düsseldorf, Germany
| |
Collapse
|
31
|
Artificial Intelligence (AI) for Fracture Diagnosis: An Overview of Current Products and Considerations for Clinical Adoption, From the AJR Special Series on AI Applications. AJR Am J Roentgenol 2022; 219:869-878. [PMID: 35731103 DOI: 10.2214/ajr.22.27873] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
Fractures are common injuries that can be difficult to diagnose, with missed fractures accounting for most misdiagnoses in the emergency department. Artificial intelligence (AI) and, specifically, deep learning have shown a strong ability to accurately detect fractures and augment the performance of radiologists in proof-of-concept research settings. Although the number of real-world AI products available for clinical use continues to increase, guidance for practicing radiologists in the adoption of this new technology is limited. This review describes how AI and deep learning algorithms can help radiologists to better diagnose fractures. The article also provides an overview of commercially available U.S. FDA-cleared AI tools for fracture detection as well as considerations for the clinical adoption of these tools by radiology practices.
Collapse
|
32
|
Yang L, Gao S, Li P, Shi J, Zhou F. Recognition and Segmentation of Individual Bone Fragments with a Deep Learning Approach in CT Scans of Complex Intertrochanteric Fractures: A Retrospective Study. J Digit Imaging 2022; 35:1681-1689. [PMID: 35711073 PMCID: PMC9712885 DOI: 10.1007/s10278-022-00669-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2021] [Revised: 05/04/2022] [Accepted: 06/07/2022] [Indexed: 10/18/2022] Open
Abstract
The characteristics of bone fragments are the main factors influencing the choice of treatment in intertrochanteric fractures. This study aimed to develop a deep learning algorithm for recognizing and segmenting individual fragments in CT images of complex intertrochanteric fractures for orthopedic surgeons. This retrospective study was based on 160 hip CT scans (43,510 images) of complex fractures of three types under the Evans-Jensen classification: 40 cases of type 3 (IIA) fractures, 80 cases of type 4 (IIB) fractures, and 40 cases of type 5 (III) fractures. The images were randomly split into a training set of 120 CT scans (32,045 images) and a testing set of 40 CT scans (11,465 images). A deep learning model was built as a cascaded architecture composed of one convolutional neural network (CNN) to locate the fracture region of interest (ROI) and another CNN to recognize and segment individual fragments within the ROI. The accuracy of object detection and the Dice coefficient of segmentation of individual fragments were used to evaluate model performance. The model yielded an average accuracy of 89.4% for individual fragment recognition and an average Dice coefficient of 90.5% for segmentation in CT images. The results demonstrate the feasibility of recognizing and segmenting individual fragments in complex intertrochanteric fractures with a deep learning approach. Altogether, these promising results suggest the potential of the model to be applied to many clinical scenarios.
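The Dice coefficient used to score the segmentations measures the overlap between a predicted mask and the ground-truth mask. A minimal sketch with toy masks (not the study's data):

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A intersect B| / (|A| + |B|) for two binary masks
    given as flat sequences of 0/1 values."""
    inter = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * inter / total if total else 1.0  # two empty masks agree perfectly

# Toy 1D masks: 2 overlapping pixels, each mask has 3 -> Dice = 2*2/(3+3)
pred  = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
score = dice_coefficient(pred, truth)
```

A Dice of 90.5%, as reported above, means predicted fragment masks overlapped heavily with the reference masks on average.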
Collapse
Affiliation(s)
- Lv Yang
- Department of Orthopedics, Peking University Third Hospital, Beijing, China
| | - Shan Gao
- Department of Orthopedics, Peking University Third Hospital, Beijing, China
| | - Pengfei Li
- Department of Orthopedics, Peking University Third Hospital, Beijing, China
| | - Jiancheng Shi
- Department of Radiology, Peking University Third Hospital, Yanqing Hospital, Beijing, China
| | - Fang Zhou
- Department of Orthopedics, Peking University Third Hospital, Beijing, China.
| |
Collapse
|
33
|
Ashkani-Esfahani S, Mojahed Yazdi R, Bhimani R, Kerkhoffs GM, Maas M, DiGiovanni CW, Lubberts B, Guss D. Detection of ankle fractures using deep learning algorithms. Foot Ankle Surg 2022; 28:1259-1265. [PMID: 35659710 DOI: 10.1016/j.fas.2022.05.005] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/26/2021] [Revised: 03/27/2022] [Accepted: 05/19/2022] [Indexed: 02/04/2023]
Abstract
BACKGROUND Early and accurate detection of ankle fractures is crucial for optimizing treatment and thus reducing future complications. Radiographs are the most widely used imaging technique for assessing fractures. Deep learning (DL) methods, through adequately trained deep convolutional neural networks (DCNNs), have previously been shown to analyze radiographic images quickly and accurately without human intervention. Herein, we aimed to assess the performance of two different DCNNs in detecting ankle fractures on radiographs compared to the ground truth. METHODS In this retrospective case-control study, our DCNNs were trained using radiographs obtained from 1050 patients with ankle fracture and the same number of individuals with otherwise healthy ankles. Pretrained Inception V3 and ResNet-50 models were used in our algorithms, and the Danis-Weber classification was applied. Of the 1050 fracture cases, 72 were labeled as occult fractures because they were not detected in the primary radiographic assessment. Single-view (anteroposterior) radiographs were compared with three-view (anteroposterior, mortise, lateral) radiographs for training the DCNNs. RESULTS Our DCNNs performed better with three-view images than with a single view, based on greater accuracy, F-score, and area under the curve (AUC). The highest sensitivity was 98.7% and specificity 98.6% for detecting ankle fractures using three views with Inception V3; this model missed only one fracture on radiographs. CONCLUSION The performance of our DCNNs shows that they could be integrated into current image interpretation programs or deployed as standalone assistive tools to help clinicians detect ankle fractures faster and more precisely. LEVEL OF EVIDENCE III.
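The AUC used to compare the single-view and three-view models can be computed directly from predicted scores via the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen fracture case scores higher than a randomly chosen non-fracture case. A minimal sketch with toy scores (not the study's data):

```python
def roc_auc(scores, labels):
    """AUC as P(score_pos > score_neg), counting ties as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: one negative case outranks one positive case -> AUC = 5/6
scores = [0.9, 0.8, 0.35, 0.3, 0.1]
labels = [1,   1,   0,    1,   0]
auc = roc_auc(scores, labels)
```

This quadratic-time version is fine for illustration; production code would use a sorted-rank implementation for large test sets.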
Collapse
Affiliation(s)
- Soheil Ashkani-Esfahani
- Foot & Ankle Research and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, USA; Department of Orthopaedic Surgery, Amsterdam University Medical Center, University of Amsterdam, Amsterdam Movement Sciences, Amsterdam, the Netherlands; Foot & Ankle Service, Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, USA.
| | - Reza Mojahed Yazdi
- Foot & Ankle Research and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, USA.
| | - Rohan Bhimani
- Foot & Ankle Research and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, USA.
| | - Gino M Kerkhoffs
- Department of Orthopaedic Surgery, Amsterdam University Medical Center, University of Amsterdam, Amsterdam Movement Sciences, Amsterdam, the Netherlands.
| | - Mario Maas
- Department of Radiology, Amsterdam University Medical Center, University of Amsterdam, Amsterdam Movement Sciences, Amsterdam, the Netherlands.
| | - Christopher W DiGiovanni
- Foot & Ankle Research and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, USA; Foot & Ankle Service, Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, USA.
| | - Bart Lubberts
- Foot & Ankle Research and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, USA.
| | - Daniel Guss
- Foot & Ankle Research and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, USA; Foot & Ankle Service, Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston 02114, MA, USA.
| |
Collapse
|
34
|
Prijs J, Liao Z, To MS, Verjans J, Jutte PC, Stirler V, Olczak J, Gordon M, Guss D, DiGiovanni CW, Jaarsma RL, IJpma FFA, Doornberg JN. Development and external validation of automated detection, classification, and localization of ankle fractures: inside the black box of a convolutional neural network (CNN). Eur J Trauma Emerg Surg 2022; 49:1057-1069. [PMID: 36374292 PMCID: PMC10175446 DOI: 10.1007/s00068-022-02136-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2022] [Accepted: 10/10/2022] [Indexed: 11/16/2022]
Abstract
Purpose
Convolutional neural networks (CNNs) are increasingly being developed for automated fracture detection in orthopaedic trauma surgery. Studies to date, however, are limited to providing classification based on the entire image—and only produce heatmaps for approximate fracture localization instead of delineating exact fracture morphology. Therefore, we aimed to answer (1) what is the performance of a CNN that detects, classifies, localizes, and segments an ankle fracture, and (2) would this be externally valid?
Methods
The training set included 326 isolated fibula fractures and 423 non-fracture radiographs. The Detectron2 implementation of the Mask R-CNN was trained with labelled and annotated radiographs. The internal validation (or ‘test set’) and external validation sets consisted of 300 and 334 radiographs, respectively. Consensus agreement between three experienced fellowship-trained trauma surgeons was defined as the ground truth label. Diagnostic accuracy and area under the receiver operator characteristic curve (AUC) were used to assess classification performance. The Intersection over Union (IoU) was used to quantify accuracy of the segmentation predictions by the CNN, where a value of 0.5 is generally considered an adequate segmentation.
Results
The final CNN was able to classify fibula fractures according to four classes (Danis-Weber A, B, C and No Fracture) with AUC values ranging from 0.93 to 0.99. Diagnostic accuracy was 89% on the test set with average sensitivity of 89% and specificity of 96%. External validity was 89–90% accurate on a set of radiographs from a different hospital. Accuracies/AUCs observed were 100/0.99 for the ‘No Fracture’ class, 92/0.99 for ‘Weber B’, 88/0.93 for ‘Weber C’, and 76/0.97 for ‘Weber A’. For the fracture bounding box prediction by the CNN, a mean IoU of 0.65 (SD ± 0.16) was observed. The fracture segmentation predictions by the CNN resulted in a mean IoU of 0.47 (SD ± 0.17).
Conclusions
This study presents a look into the ‘black box’ of CNNs and represents the first automated delineation (segmentation) of fracture lines on (ankle) radiographs. The AUC values presented in this paper indicate good discriminatory capability of the CNN and substantiate further study of CNNs in detecting and classifying ankle fractures.
Level of evidence
II, Diagnostic imaging study.
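The Intersection over Union used above to score bounding-box and segmentation predictions can be sketched for axis-aligned boxes; the coordinates below are toy values, not the study's data:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # overlap area, 0 if disjoint
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1)
iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Against the 0.5 adequacy threshold the paper cites, this toy overlap (about 0.14) would count as a poor localization, while the study's mean bounding-box IoU of 0.65 would count as adequate.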
Collapse
Affiliation(s)
- Jasper Prijs
- Department of Orthopaedic Surgery, Groningen University Medical Centre, Groningen, The Netherlands.
- Department of Surgery, Groningen University Medical Centre, Groningen, The Netherlands.
- Department of Orthopaedic & Trauma Surgery, Flinders Medical Centre, Flinders University, Adelaide, Australia.
| | - Zhibin Liao
- Australian Institute for Machine Learning, Adelaide, Australia
| | - Minh-Son To
- College of Medicine and Public Health, Flinders University, Adelaide, Australia
- Department of Neurosurgery, Flinders Medical Center, Adelaide, Australia
| | - Johan Verjans
- Australian Institute for Machine Learning, Adelaide, Australia
| | - Paul C Jutte
- Department of Orthopaedic Surgery, Groningen University Medical Centre, Groningen, The Netherlands
| | - Vincent Stirler
- Department of Orthopaedic Surgery, Groningen University Medical Centre, Groningen, The Netherlands
| | - Jakub Olczak
- Institute of Clinical Sciences, Danderyd University Hospital, Karolinska Institute, Solna, Sweden
| | - Max Gordon
- Institute of Clinical Sciences, Danderyd University Hospital, Karolinska Institute, Solna, Sweden
| | - Daniel Guss
- Massachusetts General Hospital, Boston, USA
- Harvard Medical School, Boston, USA
| | | | - Ruurd L Jaarsma
- Department of Orthopaedic & Trauma Surgery, Flinders Medical Centre, Flinders University, Adelaide, Australia
| | - Frank F A IJpma
- Department of Orthopaedic Surgery, Groningen University Medical Centre, Groningen, The Netherlands
| | - Job N Doornberg
- Department of Orthopaedic Surgery, Groningen University Medical Centre, Groningen, The Netherlands
- Department of Orthopaedic & Trauma Surgery, Flinders Medical Centre, Flinders University, Adelaide, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, Australia
| |
Collapse
|
35
|
Monteith S, Glenn T, Geddes J, Whybrow PC, Achtyes E, Bauer M. Expectations for Artificial Intelligence (AI) in Psychiatry. Curr Psychiatry Rep 2022; 24:709-721. [PMID: 36214931 PMCID: PMC9549456 DOI: 10.1007/s11920-022-01378-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 09/15/2022] [Indexed: 01/29/2023]
Abstract
PURPOSE OF REVIEW Artificial intelligence (AI) is often presented as a transformative technology for clinical medicine even though the current technology maturity of AI is low. The purpose of this narrative review is to describe the complex reasons for the low technology maturity and set realistic expectations for the safe, routine use of AI in clinical medicine. RECENT FINDINGS For AI to be productive in clinical medicine, many diverse factors that contribute to the low maturity level need to be addressed. These include technical problems such as data quality, dataset shift, black-box opacity, validation and regulatory challenges, and human factors such as a lack of education in AI, workflow changes, automation bias, and deskilling. There will also be new and unanticipated safety risks with the introduction of AI. The solutions to these issues are complex and will take time to discover, develop, validate, and implement. However, addressing the many problems in a methodical manner will expedite the safe and beneficial use of AI to augment medical decision making in psychiatry.
Collapse
Affiliation(s)
- Scott Monteith
- Michigan State University College of Human Medicine, Traverse City Campus, Traverse City, MI, 49684, USA.
| | - Tasha Glenn
- ChronoRecord Association, Fullerton, CA, USA
| | - John Geddes
- Department of Psychiatry, University of Oxford, Warneford Hospital, Oxford, UK
| | - Peter C Whybrow
- Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles (UCLA), Los Angeles, CA, USA
| | - Eric Achtyes
- Michigan State University College of Human Medicine, Grand Rapids, MI, 49684, USA
- Network180, Grand Rapids, MI, USA
| | - Michael Bauer
- Department of Psychiatry and Psychotherapy, University Hospital Carl Gustav Carus Medical Faculty, Technische Universität Dresden, Dresden, Germany
| |
Collapse
|
36
|
Benchmarking saliency methods for chest X-ray interpretation. NAT MACH INTELL 2022. [DOI: 10.1038/s42256-022-00536-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Saliency methods, which produce heat maps that highlight the areas of the medical image that influence model prediction, are often presented to clinicians as an aid in diagnostic decision-making. However, rigorous investigation of the accuracy and reliability of these strategies is necessary before they are integrated into the clinical setting. In this work, we quantitatively evaluate seven saliency methods, including Grad-CAM, across multiple neural network architectures using two evaluation metrics. We establish the first human benchmark for chest X-ray segmentation in a multilabel classification set-up, and examine under what clinical conditions saliency maps might be more prone to failure in localizing important pathologies compared with a human expert benchmark. We find that (1) while Grad-CAM generally localized pathologies better than the other evaluated saliency methods, all seven performed significantly worse compared with the human benchmark, (2) the gap in localization performance between Grad-CAM and the human benchmark was largest for pathologies that were smaller in size and had shapes that were more complex, and (3) model confidence was positively correlated with Grad-CAM localization performance. Our work demonstrates that several important limitations of saliency methods must be addressed before we can rely on them for deep learning explainability in medical imaging.
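One simple way to quantify how well a saliency map localizes pathology against a human segmentation, in the spirit of the evaluation above, is the "pointing game": a hit is scored when the map's hottest pixel falls inside the ground-truth mask. A minimal NumPy sketch with toy arrays (not the paper's benchmark data or metrics):

```python
import numpy as np

def pointing_game_hit(saliency, mask):
    """True if the saliency map's maximum lies inside the ground-truth
    segmentation mask (both 2D arrays; mask is binary)."""
    r, c = np.unravel_index(np.argmax(saliency), saliency.shape)
    return bool(mask[r, c])

saliency = np.array([[0.1, 0.2, 0.1],
                     [0.2, 0.9, 0.3],   # peak at (1, 1)
                     [0.1, 0.2, 0.1]])
mask = np.zeros((3, 3), dtype=int)
mask[1, 1] = 1                          # annotated pathology region
hit = pointing_game_hit(saliency, mask)
```

Averaging hits over a test set gives a localization accuracy that can be compared between saliency methods, or against a human expert benchmark as done above.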
Collapse
|
37
|
Momtazmanesh S, Nowroozi A, Rezaei N. Artificial Intelligence in Rheumatoid Arthritis: Current Status and Future Perspectives: A State-of-the-Art Review. Rheumatol Ther 2022; 9:1249-1304. [PMID: 35849321 PMCID: PMC9510088 DOI: 10.1007/s40744-022-00475-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2022] [Accepted: 06/24/2022] [Indexed: 11/23/2022] Open
Abstract
Investigation of the potential applications of artificial intelligence (AI), including machine learning (ML) and deep learning (DL) techniques, is an exponentially growing field in medicine and healthcare. These methods can be critical in providing high-quality care to patients with chronic rheumatological diseases lacking an optimal treatment, like rheumatoid arthritis (RA), which is the second most prevalent autoimmune disease. Herein, after reviewing the basic concepts of AI, we summarize the advances in its applications in RA clinical practice and research. We also provide directions for future investigations in this field after reviewing the current knowledge gaps and the technical and ethical challenges in applying AI. Automated models have been widely used to improve RA diagnosis since the early 2000s, drawing on a variety of techniques, e.g., support vector machines, random forests, and artificial neural networks. AI algorithms can facilitate screening and identification of susceptible groups; diagnosis using omics, imaging, clinical, and sensor data; patient detection within the electronic health record (EHR), i.e., phenotyping; treatment response assessment; monitoring of disease course; determination of prognosis; novel drug discovery; and enhancement of basic science research. They can also aid in risk assessment for the incidence of comorbidities, e.g., cardiovascular diseases, in patients with RA. However, the proposed models may vary significantly in their performance and reliability. Despite the promising results achieved by AI models in enhancing early diagnosis and management of patients with RA, they are not fully ready to be incorporated into clinical practice. Future investigations are required to ensure the development of reliable and generalizable algorithms while carefully examining potential sources of bias or misconduct.
We showed that a growing body of evidence supports the potential role of AI in revolutionizing screening, diagnosis, and management of patients with RA. However, multiple obstacles hinder clinical applications of AI models. Incorporating the machine and/or deep learning algorithms into real-world settings would be a key step in the progress of AI in medicine.
Collapse
Affiliation(s)
- Sara Momtazmanesh
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
- Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran
| | - Ali Nowroozi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
| | - Nima Rezaei
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran.
- Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran.
- Department of Immunology, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran.
| |
Collapse
|
38
|
Alsoof D, McDonald CL, Kuris EO, Daniels AH. Machine Learning for the Orthopaedic Surgeon: Uses and Limitations. J Bone Joint Surg Am 2022; 104:1586-1594. [PMID: 35383655 DOI: 10.2106/jbjs.21.01305] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
➤ Machine learning is a subset of artificial intelligence in which computer algorithms are trained to make classifications and predictions based on patterns in data. The utilization of these techniques is rapidly expanding in the field of orthopaedic research. ➤ There are several domains in which machine learning has application to orthopaedics, including radiographic diagnosis, gait analysis, implant identification, and patient outcome prediction. ➤ Several limitations prevent the widespread use of machine learning in the daily clinical environment. However, future work can overcome these issues and enable machine learning tools to be a useful adjunct for orthopaedic surgeons in their clinical decision-making.
Collapse
Affiliation(s)
- Daniel Alsoof
- Department of Orthopedic Surgery, Warren Alpert Medical School of Brown University, Providence, Rhode Island
| | | | | | | |
Collapse
|
39
|
Faghani S, Khosravi B, Zhang K, Moassefi M, Jagtap JM, Nugen F, Vahdati S, Kuanar SP, Rassoulinejad-Mousavi SM, Singh Y, Vera Garcia DV, Rouzrokh P, Erickson BJ. Mitigating Bias in Radiology Machine Learning: 3. Performance Metrics. Radiol Artif Intell 2022; 4:e220061. [PMID: 36204539 PMCID: PMC9530766 DOI: 10.1148/ryai.220061] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2022] [Revised: 08/16/2022] [Accepted: 08/17/2022] [Indexed: 05/31/2023]
Abstract
The increasing use of machine learning (ML) algorithms in clinical settings raises concerns about bias in ML models. Bias can arise at any step of ML creation, including data handling, model development, and performance evaluation. Potential biases in the ML model can be minimized by implementing these steps correctly. This report focuses on performance evaluation and discusses model fitness, as well as a set of performance evaluation toolboxes: namely, performance metrics, performance interpretation maps, and uncertainty quantification. By discussing the strengths and limitations of each toolbox, our report highlights strategies and considerations to mitigate and detect biases during performance evaluations of radiology artificial intelligence models. Keywords: Segmentation, Diagnosis, Convolutional Neural Network (CNN) © RSNA, 2022.
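Uncertainty quantification for a performance metric, one of the evaluation toolboxes discussed above, is commonly done with a nonparametric bootstrap over the test set. A minimal NumPy sketch with toy predictions (not from this report):

```python
import numpy as np

def bootstrap_accuracy_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for classification accuracy."""
    rng = np.random.default_rng(seed)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = len(y_true)
    accs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)             # resample cases with replacement
        accs[b] = np.mean(y_true[idx] == y_pred[idx])
    lo, hi = np.quantile(accs, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1] * 10   # toy labels, n = 100
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1] * 10   # 80% of predictions correct
lo, hi = bootstrap_accuracy_ci(y_true, y_pred)
```

Reporting such an interval alongside the point estimate makes clear how much of a model's apparent advantage could be test-set sampling noise.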
Collapse
Affiliation(s)
- Shahriar Faghani
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Bardia Khosravi
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Kuan Zhang
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Mana Moassefi
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Jaidip Manikrao Jagtap
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Fred Nugen
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Sanaz Vahdati
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Shiba P. Kuanar
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | | | - Yashbir Singh
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Diana V. Vera Garcia
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Pouria Rouzrokh
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| | - Bradley J. Erickson
- From the Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905
| |
Collapse
|
40
|
Luo L, Chen H, Xiao Y, Zhou Y, Wang X, Vardhanabhuti V, Wu M, Han C, Liu Z, Fang XHB, Tsougenis E, Lin H, Heng PA. Rethinking Annotation Granularity for Overcoming Shortcuts in Deep Learning-based Radiograph Diagnosis: A Multicenter Study. Radiol Artif Intell 2022; 4:e210299. [PMID: 36204545 PMCID: PMC9530769 DOI: 10.1148/ryai.210299] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2021] [Revised: 06/17/2022] [Accepted: 07/07/2022] [Indexed: 06/16/2023]
Abstract
PURPOSE To evaluate the ability of fine-grained annotations to overcome shortcut learning in deep learning (DL)-based diagnosis using chest radiographs. MATERIALS AND METHODS Two DL models were developed using radiograph-level annotations (disease present: yes or no) and fine-grained lesion-level annotations (lesion bounding boxes), respectively named CheXNet and CheXDet. A total of 34 501 chest radiographs obtained from January 2005 to September 2019 were retrospectively collected and annotated regarding cardiomegaly, pleural effusion, mass, nodule, pneumonia, pneumothorax, tuberculosis, fracture, and aortic calcification. The internal classification performance and lesion localization performance of the models were compared on a testing set (n = 2922); external classification performance was compared on National Institutes of Health (NIH) Google (n = 4376) and PadChest (n = 24 536) datasets; and external lesion localization performance was compared on the NIH ChestX-ray14 dataset (n = 880). The models were also compared with radiologist performance on a subset of the internal testing set (n = 496). Performance was evaluated using receiver operating characteristic (ROC) curve analysis. RESULTS Given sufficient training data, both models performed similarly to radiologists. CheXDet achieved significant improvement for external classification, such as classifying fracture on NIH Google (CheXDet area under the ROC curve [AUC], 0.67; CheXNet AUC, 0.51; P < .001) and PadChest (CheXDet AUC, 0.78; CheXNet AUC, 0.55; P < .001). CheXDet achieved higher lesion detection performance than CheXNet for most abnormalities on all datasets, such as detecting pneumothorax on the internal set (CheXDet jackknife alternative free-response ROC [JAFROC] figure of merit [FOM], 0.87; CheXNet JAFROC FOM, 0.13; P < .001) and NIH ChestX-ray14 (CheXDet JAFROC FOM, 0.55; CheXNet JAFROC FOM, 0.04; P < .001). 
CONCLUSION Fine-grained annotations overcame shortcut learning and enabled DL models to identify correct lesion patterns, improving the generalizability of the models. Keywords: Computer-aided Diagnosis, Conventional Radiography, Convolutional Neural Network (CNN), Deep Learning Algorithms, Machine Learning Algorithms, Localization. Supplemental material is available for this article. © RSNA, 2022.
Collapse
|
41
|
Van Calster B, Timmerman S, Geysels A, Verbakel JY, Froyman W. A deep-learning-enabled diagnosis of ovarian cancer. Lancet Digit Health 2022; 4:e630. [PMID: 36028287 DOI: 10.1016/s2589-7500(22)00130-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Accepted: 06/28/2022] [Indexed: 06/15/2023]
Affiliation(s)
- Ben Van Calster
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium; EPI-Centre, Department of Public Health and Primary Care, KU Leuven, Leuven, Belgium; Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, Netherlands
| | - Stefan Timmerman
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Obstetrics and Gynaecology, University Hospitals Leuven, Leuven 3000, Belgium
| | - Axel Geysels
- Department of Electrical Engineering, STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, Belgium
| | - Jan Y Verbakel
- EPI-Centre, Department of Public Health and Primary Care, KU Leuven, Leuven, Belgium; Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
| | - Wouter Froyman
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Obstetrics and Gynaecology, University Hospitals Leuven, Leuven 3000, Belgium.
| |
Collapse
|
42
|
Assessment of performances of a deep learning algorithm for the detection of limbs and pelvic fractures, dislocations, focal bone lesions, and elbow effusions on trauma X-rays. Eur J Radiol 2022; 154:110447. [DOI: 10.1016/j.ejrad.2022.110447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2022] [Revised: 04/29/2022] [Accepted: 07/19/2022] [Indexed: 11/23/2022]
|
43
|
Inferring pediatric knee skeletal maturity from MRI using deep learning. Skeletal Radiol 2022; 51:1671-1677. [PMID: 35184211 DOI: 10.1007/s00256-022-04010-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/10/2021] [Revised: 01/29/2022] [Accepted: 02/04/2022] [Indexed: 02/02/2023]
Abstract
PURPOSE Many children who undergo MR of the knee to evaluate traumatic injury may not undergo a separate dedicated evaluation of their skeletal maturity, and we wished to investigate how accurately skeletal maturity could be automatically inferred from knee MRI using deep learning to offer this additional information to clinicians. MATERIALS AND METHODS Retrospective data from 894 studies from 783 patients were obtained (mean age 13.1 years, 47% female). Coronal and sagittal sequences that were T1/PD-weighted were included and resized to 224 × 224 pixels. Data were divided into train (n = 673), tune (n = 48), and test (n = 173) sets, and children were separated across sets. The chronologic age was predicted using deep learning approaches based on a long short-term memory (LSTM) model, which took as input DenseNet-121-extracted features from all T1/PD coronal and sagittal slices. Each test case was manually assigned a bone age by two radiology residents using a reference atlas provided by Pennock and Bomar. The patient's age served as ground truth. RESULTS The error of the model's predictions for chronological age was not significantly different from that of radiology residents (model M.S.E. 1.30 vs. resident 0.99, paired t-test = 1.47, p = 0.14). Pearson correlation between model and resident prediction of chronologic age was 0.96 (p < 0.001). CONCLUSION A deep learning-based approach demonstrated an ability to infer skeletal maturity from knee MR sequences that was not significantly different from resident performance, and did so in less than 2% of the time required by a human expert. This may offer a method for automatically evaluating lower extremity skeletal maturity as part of every MR examination.
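The slice-aggregation approach described above (per-slice DenseNet-121 features fed to an LSTM that regresses age) can be sketched as follows. This is a minimal, hypothetical reconstruction in PyTorch: the feature dimension, hidden size, and class name are assumptions, and the DenseNet features are treated as precomputed inputs rather than extracted from images.

```python
import torch
import torch.nn as nn

class SliceAgeRegressor(nn.Module):
    """Aggregate per-slice CNN features with an LSTM and regress age.

    Hypothetical sketch: the study extracts DenseNet-121 features from
    every T1/PD coronal and sagittal slice; here those features are
    assumed to be precomputed and stacked along a sequence dimension.
    """

    def __init__(self, feat_dim=1024, hidden_dim=256):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # predicted age in years

    def forward(self, slice_feats):
        # slice_feats: (batch, n_slices, feat_dim), one row per MR slice
        _, (h_n, _) = self.lstm(slice_feats)
        # Use the final hidden state as a whole-study summary
        return self.head(h_n[-1]).squeeze(-1)  # (batch,)

model = SliceAgeRegressor()
feats = torch.randn(2, 24, 1024)  # e.g., 2 studies, 24 slices each
pred_age = model(feats)           # one age estimate per study
```

In training, the prediction would be compared with chronological age (the ground truth used in the study) under a regression loss such as MSE.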
Collapse
|
44
|
Hornung AL, Hornung CM, Mallow GM, Barajas JN, Espinoza Orías AA, Galbusera F, Wilke HJ, Colman M, Phillips FM, An HS, Samartzis D. Artificial intelligence and spine imaging: limitations, regulatory issues and future direction. EUROPEAN SPINE JOURNAL : OFFICIAL PUBLICATION OF THE EUROPEAN SPINE SOCIETY, THE EUROPEAN SPINAL DEFORMITY SOCIETY, AND THE EUROPEAN SECTION OF THE CERVICAL SPINE RESEARCH SOCIETY 2022; 31:2007-2021. [PMID: 35084588 DOI: 10.1007/s00586-021-07108-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/22/2021] [Revised: 11/29/2021] [Accepted: 12/30/2021] [Indexed: 01/20/2023]
Abstract
BACKGROUND As big data and artificial intelligence (AI) in spine care, and in medicine as a whole, continue to be at the forefront of research, careful consideration of the quality of data and of the techniques utilized is necessary. Predictive modeling, data science, and deep analytics have taken center stage. Within that space, AI and machine learning (ML) approaches to spine imaging have gathered considerable attention in the past decade. Although such applications offer several benefits, limitations are also present and need to be considered. PURPOSE The following narrative review presents the current status of AI, in particular ML, in the field of spinal research, with special regard to imaging studies. METHODS A multi-database assessment of the literature addressing AI as it relates to imaging of the spine was conducted up to September 1, 2021. Articles written in English were selected and critically assessed. RESULTS Overall, the review discusses the limitations, data quality, and applications of ML models in the context of spine imaging. In particular, we address data quality and ML algorithms in spine imaging research by describing preliminary results from a widely accessible imaging algorithm that spine specialists can currently reference for information on the severity of spine disease and degeneration, which may ultimately alter clinical decision-making. In addition, we raise awareness of the current, under-recognized regulation surrounding the execution of ML for spine imaging. CONCLUSIONS Recommendations are provided for conducting high-quality, standardized AI applications for spine imaging.
Collapse
Affiliation(s)
- Alexander L Hornung
- Department of Orthopaedic Surgery, Rush University Medical Center, Orthopaedic Building, Suite 204-G, 1611 W. Harrison Street, Chicago, IL, 60612, USA
| | | | - G Michael Mallow
- Department of Orthopaedic Surgery, Rush University Medical Center, Orthopaedic Building, Suite 204-G, 1611 W. Harrison Street, Chicago, IL, 60612, USA
| | - J Nicolas Barajas
- Department of Orthopaedic Surgery, Rush University Medical Center, Orthopaedic Building, Suite 204-G, 1611 W. Harrison Street, Chicago, IL, 60612, USA
| | - Alejandro A Espinoza Orías
- Department of Orthopaedic Surgery, Rush University Medical Center, Orthopaedic Building, Suite 204-G, 1611 W. Harrison Street, Chicago, IL, 60612, USA
| | | | - Hans-Joachim Wilke
- Institute of Orthopaedic Research and Biomechanics, Trauma Research Center Ulm, Ulm University, Ulm, Germany
| | - Matthew Colman
- Department of Orthopaedic Surgery, Rush University Medical Center, Orthopaedic Building, Suite 204-G, 1611 W. Harrison Street, Chicago, IL, 60612, USA
| | - Frank M Phillips
- Department of Orthopaedic Surgery, Rush University Medical Center, Orthopaedic Building, Suite 204-G, 1611 W. Harrison Street, Chicago, IL, 60612, USA
| | - Howard S An
- Department of Orthopaedic Surgery, Rush University Medical Center, Orthopaedic Building, Suite 204-G, 1611 W. Harrison Street, Chicago, IL, 60612, USA
| | - Dino Samartzis
- Department of Orthopaedic Surgery, Rush University Medical Center, Orthopaedic Building, Suite 204-G, 1611 W. Harrison Street, Chicago, IL, 60612, USA.
| |
Collapse
|
45
|
Feng C, Zhou X, Wang H, He Y, Li Z, Tu C. Research hotspots and emerging trends of deep learning applications in orthopedics: A bibliometric and visualized study. Front Public Health 2022; 10:949366. [PMID: 35928480 PMCID: PMC9343683 DOI: 10.3389/fpubh.2022.949366] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2022] [Accepted: 06/27/2022] [Indexed: 11/13/2022] Open
Abstract
Background As a research hotspot, deep learning has been continuously combined with various fields of medicine, and there is a growing body of deep learning-based research in orthopedics. This bibliometric analysis aimed to identify the hotspots of deep learning applications in orthopedics in recent years and to infer future research trends. Methods We screened global publications on deep learning applications in orthopedics by accessing the Web of Science Core Collection. Articles and reviews were collected without language or time restrictions. CiteSpace was applied to conduct the bibliometric analysis of the publications. Results A total of 822 articles and reviews were retrieved. Based on annual publication counts, the analysis showed that the application of deep learning in orthopedics has great prospects for development. The most prolific country is the USA, followed by China. The University of California San Francisco and Skeletal Radiology are the most prolific institution and journal, respectively. LeCun Y is the most frequently cited author, and Nature has the highest impact factor among the cited journals. The current hot keywords are convolutional neural network, classification, segmentation, diagnosis, image, fracture, and osteoarthritis. The burst keywords are risk factor, identification, localization, and surgery. The timeline viewer showed two recent research directions: bone tumors and osteoporosis. Conclusion Publications on deep learning applications in orthopedics have increased in recent years, with the USA being the most prolific country. Current research has mainly focused on classification, diagnosis, and risk prediction in osteoarthritis and fractures from medical images. Future research may emphasize reducing intraoperative risk, predicting postoperative complications, screening for osteoporosis, and identifying and classifying bone tumors from conventional imaging.
Collapse
Affiliation(s)
- Chengyao Feng
- The Department of Orthopaedics, The Second Xiangya Hospital of Central South University, Changsha, China
- Hunan Key Laboratory of Tumor Models and Individualized Medicine, The Second Xiangya Hospital of Central South University, Changsha, China
| | - Xiaowen Zhou
- Xiangya School of Medicine, Central South University, Changsha, China
| | - Hua Wang
- Xiangya School of Medicine, Central South University, Changsha, China
| | - Yu He
- The Department of Radiology, The Second Xiangya Hospital of Central South University, Changsha, China
| | - Zhihong Li
- The Department of Orthopaedics, The Second Xiangya Hospital of Central South University, Changsha, China
- Hunan Key Laboratory of Tumor Models and Individualized Medicine, The Second Xiangya Hospital of Central South University, Changsha, China
| | - Chao Tu
- The Department of Orthopaedics, The Second Xiangya Hospital of Central South University, Changsha, China
- Hunan Key Laboratory of Tumor Models and Individualized Medicine, The Second Xiangya Hospital of Central South University, Changsha, China
- *Correspondence: Chao Tu
| |
Collapse
|
46
|
Wardlaw JM, Mair G, von Kummer R, Williams MC, Li W, Storkey AJ, Trucco E, Liebeskind DS, Farrall A, Bath PM, White P. Accuracy of Automated Computer-Aided Diagnosis for Stroke Imaging: A Critical Evaluation of Current Evidence. Stroke 2022; 53:2393-2403. [PMID: 35440170 DOI: 10.1161/strokeaha.121.036204] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
There is increasing interest in computer applications, using artificial intelligence methodologies, to perform health care tasks previously performed by humans, particularly in medical imaging for diagnosis. In stroke, there are now commercial artificial intelligence software for use with computed tomography or MR imaging to identify acute ischemic brain tissue pathology, arterial obstruction on computed tomography angiography or as hyperattenuated arteries on computed tomography, brain hemorrhage, or size of perfusion defects. A rapid, accurate diagnosis may aid treatment decisions for individual patients and could improve outcome if it leads to effective and safe treatment; or conversely, to disaster if a delayed or incorrect diagnosis results in inappropriate treatment. Despite this potential clinical impact, diagnostic tools including artificial intelligence methods are not subjected to the same clinical evaluation standards as are mandatory for drugs. Here, we provide an evidence-based review of the pros and cons of commercially available automated methods for medical imaging diagnosis, including those based on artificial intelligence, to diagnose acute brain pathology on computed tomography or magnetic resonance imaging in patients with stroke.
Collapse
Affiliation(s)
- Joanna M Wardlaw
- Centre for Clinical Brain Sciences, UK Dementia Research Institute Centre at the University of Edinburgh, Little France, United Kingdom (J.M.W., G.M., W.L., A.F.)
| | - Grant Mair
- Centre for Clinical Brain Sciences, UK Dementia Research Institute Centre at the University of Edinburgh, Little France, United Kingdom (J.M.W., G.M., W.L., A.F.)
| | - Rüdiger von Kummer
- Institute of Diagnostic and Interventional Neuroradiology, Universitätsklinikum Carl Gustav Carus, Dresden, Germany (R.v.K.)
| | - Michelle C Williams
- Centre for Cardiovascular Science, University of Edinburgh, Little France, United Kingdom (M.C.W.)
| | - Wenwen Li
- Centre for Clinical Brain Sciences, UK Dementia Research Institute Centre at the University of Edinburgh, Little France, United Kingdom (J.M.W., G.M., W.L., A.F.)
| | | | - Emanuel Trucco
- VAMPIRE project, Computing, School of Science and Engineering, University of Dundee (E.T.)
| | | | - Andrew Farrall
- Centre for Clinical Brain Sciences, UK Dementia Research Institute Centre at the University of Edinburgh, Little France, United Kingdom (J.M.W., G.M., W.L., A.F.)
| | - Philip M Bath
- Stroke Trials Unit, Mental Health & Clinical Neuroscience, University of Nottingham, Queen's Medical Centre campus, United Kingdom (P.M.B.)
| | - Philip White
- Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne and Newcastle upon Tyne Hospitals NHS Trust, United Kingdom (P.W.)
| |
Collapse
|
47
|
Werder K, Ramesh B, Zhang R(S). Establishing Data Provenance for Responsible Artificial Intelligence Systems. ACM TRANSACTIONS ON MANAGEMENT INFORMATION SYSTEMS 2022. [DOI: 10.1145/3503488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
Data provenance, a record that describes the origins and processing of data, offers new promises in the increasingly important role of artificial intelligence (AI)-based systems in guiding human decision making. To avoid disastrous outcomes that can result from bias-laden AI systems, responsible AI builds on four important characteristics: fairness, accountability, transparency, and explainability. To stimulate further research on data provenance that enables responsible AI, this study outlines existing biases and discusses possible implementations of data provenance to mitigate them. We first review biases stemming from the data's origins and pre-processing. We then discuss the current state of practice, the challenges it presents, and corresponding recommendations to address them. We present a summary highlighting how our recommendations can help establish data provenance and thereby mitigate biases stemming from the data's origins and pre-processing to realize responsible AI-based systems. We conclude with a research agenda suggesting further research avenues.
Collapse
Affiliation(s)
- Karl Werder
- Cologne Institute for Information Systems, University of Cologne, Albertus-Magnus-Platz, Köln, Germany
| | | | | |
Collapse
|
48
|
Lin KY, Li YT, Han JY, Wu CC, Chu CM, Peng SY, Yeh TT. Deep Learning to Detect Triangular Fibrocartilage Complex Injury in Wrist MRI: Retrospective Study with Internal and External Validation. J Pers Med 2022; 12:jpm12071029. [PMID: 35887524 PMCID: PMC9322609 DOI: 10.3390/jpm12071029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2022] [Revised: 06/15/2022] [Accepted: 06/21/2022] [Indexed: 11/16/2022] Open
Abstract
Objective: To use deep learning to predict the probability of triangular fibrocartilage complex (TFCC) injury in patients’ MRI scans. Methods: We retrospectively studied medical records over 11 years and 2 months (1 January 2009–29 February 2019), collecting 332 contrast-enhanced hand MRI scans showing TFCC injury (143 scans) or not (189 scans) from a general hospital. We employed two convolutional neural networks with the MRNet (Algorithm 1) and ResNet50 (Algorithm 2) framework for deep learning. Explainable artificial intelligence was used for heatmap analysis. We tested deep learning using an external dataset containing the MRI scans of 12 patients with TFCC injuries and 38 healthy subjects. Results: In the internal dataset, Algorithm 1 had an AUC of 0.809 (95% confidence interval [CI]: 0.670–0.947) for TFCC injury detection as well as an accuracy, sensitivity, and specificity of 75.6% (95% CI: 0.613–0.858), 66.7% (95% CI: 0.438–0.837), and 81.5% (95% CI: 0.633–0.918), respectively, and an F1 score of 0.686. Algorithm 2 had an AUC of 0.871 (95% CI: 0.747–0.995) for TFCC injury detection and an accuracy, sensitivity, and specificity of 90.7% (95% CI: 0.787–0.962), 88.2% (95% CI: 0.664–0.966), and 92.3% (95% CI: 0.763–0.978), respectively, and an F1 score of 0.882. The accuracy, sensitivity, and specificity for radiologist 1 were 88.9, 94.4 and 85.2%, respectively, and for radiologist 2, they were 71.1, 100 and 51.9%, respectively. Conclusions: A modified MRNet framework enables the detection of TFCC injury and guides accurate diagnosis.
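For reference, the accuracy, sensitivity, specificity, and F1 figures quoted in abstracts like this one all derive from the four confusion-matrix counts. A small self-contained sketch (the counts below are illustrative and hypothetical, not taken from the study):

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard diagnostic-test metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # a.k.a. recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)            # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "precision": precision, "f1": f1}

# Illustrative counts chosen so the derived metrics resemble the
# magnitudes reported above (not the study's actual confusion matrix):
m = binary_metrics(tp=15, fp=2, tn=24, fn=2)
```

Note that F1 ignores true negatives, which is why a model can have a high F1 yet mediocre specificity (and vice versa).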
Collapse
Affiliation(s)
- Kun-Yi Lin
- Department of Orthopedics, Tri-Service General Hospital, National Defense Medical Center, No. 325, Sec. 2, Chenggong Rd., Neihu District, Taipei 11490, Taiwan; (K.-Y.L.); (C.-C.W.)
| | - Yuan-Ta Li
- Department of Surgery, Tri-Service General Hospital Penghu Branch, National Defense Medical Center, Penghu 88056, Taiwan;
| | - Juin-Yi Han
- Graduate Institute of Technology, Innovation and Intellectual Property Management, National Cheng Chi University, Taipei 11605, Taiwan;
| | - Chia-Chun Wu
- Department of Orthopedics, Tri-Service General Hospital, National Defense Medical Center, No. 325, Sec. 2, Chenggong Rd., Neihu District, Taipei 11490, Taiwan; (K.-Y.L.); (C.-C.W.)
| | - Chi-Min Chu
- School of Public Health, National Defense Medical Center, Taipei 11490, Taiwan;
| | - Shao-Yu Peng
- Department of Animal Science, National Pingtung University of Science and Technology, Pingtung 91201, Taiwan;
| | - Tsu-Te Yeh
- Department of Orthopedics, Tri-Service General Hospital, National Defense Medical Center, No. 325, Sec. 2, Chenggong Rd., Neihu District, Taipei 11490, Taiwan; (K.-Y.L.); (C.-C.W.)
- Correspondence: ; Tel.: +886-2-87923311 or +886-2-87927185; Fax: +886-2-87927186
| |
Collapse
|
49
|
Bellamy D, Hernán MA, Beam A. A structural characterization of shortcut features for prediction. Eur J Epidemiol 2022; 37:563-568. [PMID: 35792990 PMCID: PMC9256901 DOI: 10.1007/s10654-022-00892-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Accepted: 06/19/2022] [Indexed: 11/26/2022]
Abstract
With the rising use of machine learning for healthcare applications, practitioners are increasingly confronted with the limitations of prediction models that are trained in one setting but meant to be deployed in several others. One recently identified limitation is so-called shortcut learning, whereby a model learns to associate features with the prediction target that do not maintain their relationship across settings. Famously, the watermark on chest x-rays has been demonstrated to be an instance of a shortcut feature. In this viewpoint, we attempt to give a structural characterization of shortcut features in terms of causal DAGs. This is the first attempt at defining shortcut features in terms of their causal relationship with a model's prediction target.
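The failure mode described above can be demonstrated with a toy simulation: a classifier that relies on a shortcut feature (a "watermark") looks accurate in the setting where the shortcut happens to correlate with the label, and collapses to chance in a setting where it does not. All numbers below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

def make_setting(p_agree):
    """Label y is the true disease state; 'watermark' is a shortcut
    feature that agrees with y with probability p_agree, which differs
    between settings (hospitals)."""
    y = rng.integers(0, 2, n)
    agree = rng.random(n) < p_agree
    watermark = np.where(agree, y, 1 - y)
    return watermark, y

# Source hospital: watermark agrees with the label 95% of the time.
wm_src, y_src = make_setting(0.95)
# Deployment hospital: watermark is independent of the label.
wm_tgt, y_tgt = make_setting(0.50)

# A "model" that learned only the shortcut: predict label = watermark.
acc_source = (wm_src == y_src).mean()  # looks excellent in-distribution
acc_target = (wm_tgt == y_tgt).mean()  # collapses to ~chance elsewhere
```

In causal-DAG terms, the point of the paper is that such features are associated with the target only through a relationship that is not stable across settings, which is exactly what the simulation breaks.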
Collapse
Affiliation(s)
- David Bellamy
- CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Miguel A Hernán
- CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Andrew Beam
- CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
| |
Collapse
|
50
|
Oosterhoff JHF, Savelberg ABMC, Karhade AV, Gravesteijn BY, Doornberg JN, Schwab JH, Heng M. Development and internal validation of a clinical prediction model using machine learning algorithms for 90 day and 2 year mortality in femoral neck fracture patients aged 65 years or above. Eur J Trauma Emerg Surg 2022; 48:4669-4682. [DOI: 10.1007/s00068-022-01981-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2021] [Accepted: 04/16/2022] [Indexed: 12/01/2022]
Abstract
Purpose
Preoperative prediction of mortality in femoral neck fracture patients aged 65 years or above may be valuable in treatment decision-making. A preoperative clinical prediction model can aid surgeons and patients in the shared decision-making process and optimize care for elderly femoral neck fracture patients. This study aimed to develop and internally validate a clinical prediction model using machine learning (ML) algorithms for 90 day and 2 year mortality in femoral neck fracture patients aged 65 years or above.
Methods
A retrospective cohort study at two trauma level I centers and three (non-level I) community hospitals was conducted to identify patients undergoing surgical fixation for a femoral neck fracture. Five different ML algorithms were developed and internally validated and assessed by discrimination, calibration, Brier score and decision curve analysis.
Results
In total, 2478 patients were included with 90 day and 2 year mortality rates of 9.1% (n = 225) and 23.5% (n = 582) respectively. The models included patient characteristics, comorbidities and laboratory values. The stochastic gradient boosting algorithm had the best performance for 90 day mortality prediction, with good discrimination (c-statistic = 0.74), calibration (intercept = − 0.05, slope = 1.11) and Brier score (0.078). The elastic-net penalized logistic regression algorithm had the best performance for 2 year mortality prediction, with good discrimination (c-statistic = 0.70), calibration (intercept = − 0.03, slope = 0.89) and Brier score (0.16). The models were incorporated into a freely available web-based application, including individual patient explanations that allow users to understand how the model arrived at a given prediction: https://sorg-apps.shinyapps.io/hipfracturemortality/
Conclusions
The clinical prediction models show promise for estimating mortality in elderly femoral neck fracture patients. External and prospective validation of the models may further support surgeons in the treatment decision-making process.
Level of evidence
Prognostic Level II.
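The calibration (intercept/slope) and Brier-score metrics reported in the abstract above can be computed from predicted risks and observed outcomes. A minimal NumPy sketch on synthetic, hypothetical data (not the study's): the Brier score is the mean squared error of the predicted risk, and the calibration intercept and slope come from refitting a logistic model on the logit of the predictions, with perfect calibration corresponding to intercept 0 and slope 1.

```python
import numpy as np

def brier_score(y, p):
    """Mean squared difference between predicted risk and outcome."""
    return np.mean((p - y) ** 2)

def calibration_intercept_slope(y, p, iters=50):
    """Fit y ~ logistic(a + b * logit(p)) by Newton-Raphson.
    Perfect calibration gives intercept a = 0 and slope b = 1."""
    lp = np.log(p / (1 - p))                 # logit of predicted risk
    X = np.column_stack([np.ones_like(lp), lp])
    beta = np.zeros(2)
    for _ in range(iters):
        mu = 1 / (1 + np.exp(-(X @ beta)))   # current fitted probabilities
        W = mu * (1 - mu)                    # IRLS weights
        grad = X.T @ (y - mu)
        H = X.T @ (X * W[:, None])
        beta += np.linalg.solve(H, grad)
    return beta  # (intercept, slope)

# Synthetic, well-calibrated example:
rng = np.random.default_rng(1)
p = rng.uniform(0.05, 0.95, 5000)            # predicted risks
y = (rng.random(5000) < p).astype(float)     # outcomes drawn at those risks
a, b = calibration_intercept_slope(y, p)
bs = brier_score(y, p)
```

A slope below 1 (as in the 2 year model above, 0.89) indicates predictions that are slightly too extreme; an intercept away from 0 indicates systematic over- or under-estimation of risk.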
Collapse
|