1
Bai X, Wang H, Qin Y, Han J, Yu N. SparseMorph: A weakly-supervised lightweight sparse transformer for mono- and multi-modal deformable image registration. Comput Biol Med 2024;182:109205. [PMID: 39332116] [DOI: 10.1016/j.compbiomed.2024.109205]
Abstract
PURPOSE Deformable image registration (DIR) is crucial for improving the precision of clinical diagnosis. Recent Transformer-based DIR methods have shown promising performance by capturing long-range dependencies, but they still grapple with high computational complexity. This work aims to improve DIR in both computational efficiency and registration accuracy. METHODS We proposed a weakly-supervised lightweight Transformer model, named SparseMorph. To reduce computational complexity without compromising the ability to capture representative features, we designed a sparse multi-head self-attention (SMHA) mechanism. To accumulate representative features while preserving high computational efficiency, we constructed a multi-branch multi-layer perceptron (MMLP) module. Additionally, we developed an anatomically-constrained weakly-supervised strategy to guide the alignment of regions-of-interest in mono- and multi-modal images. RESULTS We assessed SparseMorph in terms of registration accuracy and computational complexity. On the mono-modal brain datasets IXI and OASIS, SparseMorph outperforms the state-of-the-art method TransMatch with improvements of 3.2% and 2.9% in DSC scores for MRI-to-MRI registration tasks, respectively. Moreover, on the multi-modal cardiac dataset MMWHS, SparseMorph shows DSC score improvements of 9.7% and 11.4% over TransMatch in MRI-to-CT and CT-to-MRI registration tasks, respectively. Notably, SparseMorph attains these performance advantages while using only 33.33% of the parameters of TransMatch. CONCLUSIONS The proposed weakly-supervised deformable image registration model, SparseMorph, is efficient in both mono- and multi-modal registration tasks, exhibits superior performance compared to state-of-the-art algorithms, and constitutes an effective DIR method for clinical applications.
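For readers unfamiliar with sparsified attention, the sketch below illustrates one generic way a multi-head self-attention layer can be made cheaper: each query attends only to a strided subset of the keys and values. It is a minimal PyTorch illustration under that assumption; the class name, stride scheme, and dimensions are hypothetical and do not reproduce the SMHA design published in SparseMorph.

```python
# Minimal sketch of a "sparse" multi-head self-attention: queries attend only to
# every k-th token, reducing the attention cost from O(n^2) to O(n * n/stride).
# Illustrative assumption only, not the SparseMorph implementation.
import torch
import torch.nn as nn

class StridedSparseAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, stride: int = 4):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.stride = stride                      # keep 1/stride of the keys/values
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim), e.g. flattened feature-map voxels
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Sub-sample keys/values to sparsify the attention matrix
        k, v = k[:, ::self.stride], v[:, ::self.stride]
        def split(t):
            return t.view(b, -1, self.num_heads, self.head_dim).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        out = attn.softmax(dim=-1) @ v            # (b, heads, n, head_dim)
        out = out.transpose(1, 2).reshape(b, n, d)
        return self.proj(out)
```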
Affiliation(s)
- Xinhao Bai
- College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China
- Hongpeng Wang
- College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China
- Yanding Qin
- College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China
- Jianda Han
- College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China
- Ningbo Yu
- College of Artificial Intelligence, Nankai University, Tianjin, 300350, China; Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, Nankai University, Tianjin, 300350, China; Institute of Intelligence Technology and Robotic Systems, Shenzhen Research Institute of Nankai University, Shenzhen, 518083, China.
2
Wang L, Zhang X, Chen P, Zhou D. Doctor simulator: Delta-Age-Sex-AdaIn enhancing bone age assessment through AdaIn style transfer. Pediatr Radiol 2024;54:1704-1712. [PMID: 39060414] [DOI: 10.1007/s00247-024-06000-9]
Abstract
BACKGROUND Bone age assessment assists physicians in evaluating the growth and development of children. However, current deep learning methods for bone age estimation do not incorporate the differential features obtained by comparison against a bone atlas. OBJECTIVE To propose a more accurate bone age assessment method, Delta-Age-Sex-AdaIn (DASA-net), which combines age and sex distributions through adaptive instance normalization (AdaIN) and style transfer, simulating the process of visually comparing hand radiographs with a standard bone atlas to determine bone age. MATERIALS AND METHODS The proposed DASA-net consists of four modules: BoneEncoder, Binary code distribution, Delta-Age-Sex-AdaIn, and AgeDecoder. It is compared with state-of-the-art methods on both the public Radiological Society of North America (RSNA) pediatric bone age prediction dataset (14,236 hand radiographs, ranging from 1 to 228 months) and a private bone age prediction dataset from Zigong Fourth People's Hospital (474 hand radiographs, ranging from 12 to 218 months, 268 male). Ablation experiments were designed to demonstrate the necessity of incorporating the age and sex distributions. RESULTS The DASA-net model achieved a mean absolute deviation (MAD) of 3.52 months on the RSNA dataset, outperforming BoneXpert, Deeplasia, BoNet, and other deep learning based methods. On the private dataset, DASA-net obtained a MAD of 3.82 months, again superior to the other methods. CONCLUSION By integrating age and sex distributions into style transfer, the proposed DASA-net model learns the distinctive characteristics of hand bones across ages and for both sexes.
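The AdaIN operation the abstract builds on is a standard, well-documented transform: content features are re-normalized with the per-channel statistics of a style input. A minimal sketch is given below; how DASA-net derives its age- and sex-conditioned style statistics is the paper's contribution and is not reproduced here.

```python
import torch

def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Adaptive instance normalization: shift/scale content features with the
    per-channel mean and std of the style features. Generic AdaIN sketch only."""
    # content, style: (batch, channels, H, W)
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean
```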
Affiliation(s)
- Liping Wang
- Department of Computer Center, Zigong Fourth People's Hospital, Zigong, 643000, Sichuan, China.
- Xingpeng Zhang
- School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu, 610500, Sichuan, China
- Dehao Zhou
- Department of Computer Center, Zigong Fourth People's Hospital, Zigong, 643000, Sichuan, China
3
Sistaninejhad B, Rasi H, Nayeri P. A Review Paper about Deep Learning for Medical Image Analysis. Comput Math Methods Med 2023;2023:7091301. [PMID: 37284172] [PMCID: PMC10241570] [DOI: 10.1155/2023/7091301]
Abstract
Medical imaging refers to the process of obtaining images of internal organs for therapeutic purposes such as discovering or studying diseases. The primary objective of medical image analysis is to improve the efficacy of clinical research and treatment options. Deep learning has revamped medical image analysis, yielding excellent results in image processing tasks such as registration, segmentation, feature extraction, and classification. The prime drivers of this progress are the availability of computational resources and the resurgence of deep convolutional neural networks. Deep learning techniques are good at uncovering hidden patterns in images and support clinicians in reaching more accurate diagnoses. They have proven highly effective for organ segmentation, cancer detection, disease categorization, and computer-assisted diagnosis. Many deep learning approaches have been published for analyzing medical images for various diagnostic purposes. In this paper, we review work exploiting current state-of-the-art deep learning approaches in medical image processing. We begin the survey with a synopsis of research in medical imaging based on convolutional neural networks. Second, we discuss popular pretrained models and generative adversarial networks that help improve convolutional networks' performance. Finally, to ease direct evaluation, we compile the performance metrics of deep learning models focusing on COVID-19 detection and child bone age prediction.
Affiliation(s)
- Habib Rasi
- Sahand University of Technology, East Azerbaijan, New City of Sahand, Iran
- Parisa Nayeri
- Khoy University of Medical Sciences, West Azerbaijan, Khoy, Iran
4
Abstract
Bone age is commonly used to reflect growth and development trends in children, predict adult height, and diagnose endocrine disorders. Nevertheless, existing automated bone age assessment (BAA) models do not simultaneously consider the nonlinearity and continuity of hand bone development. In addition, most existing BAA models are based on datasets of European and American children and may not fit the developmental characteristics of Chinese children. This work therefore proposes a cascade model that fuses prior knowledge. Specifically, a novel bone age representation is defined, which incorporates the nonlinear and continuous features of skeletal development and is implemented by a cascade model. Moreover, corresponding regions of interest (RoIs) based on RUS-CHN were extracted by YOLO v5 as prior-knowledge inputs to the model. In addition, based on MobileNet v2, an improved feature extractor was proposed by introducing the Convolutional Block Attention Module and enlarging the receptive field to improve evaluation accuracy. The experimental results show a mean absolute error (MAE) of 4.44 months, a significant correlation with the reference bone age (r = 0.994, p < 0.01), and 94.04% of predictions within ±1 year of the ground truth. Overall, the model design adequately considers hand bone development features, achieves high accuracy and consistency, and also shows some applicability to public datasets, indicating potential for practical and clinical applications.
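As a point of reference, the figures quoted above correspond to standard regression metrics. The snippet below shows how the MAE (in months), the Pearson correlation, and the accuracy within ±1 year could be computed; the array names and example values are placeholders, not data from the study.

```python
import numpy as np

def bone_age_metrics(pred_months: np.ndarray, ref_months: np.ndarray) -> dict:
    """Illustrative computation of MAE, Pearson r, and accuracy within +/-1 year."""
    err = pred_months - ref_months
    return {
        "mae_months": float(np.mean(np.abs(err))),
        "pearson_r": float(np.corrcoef(pred_months, ref_months)[0, 1]),
        "acc_within_1yr": float(np.mean(np.abs(err) <= 12.0)),
    }

# Example with placeholder values (months)
print(bone_age_metrics(np.array([100.0, 55.0, 130.0]), np.array([96.0, 60.0, 131.0])))
```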
5
Loraksa C, Mongkolsomlit S, Nimsuk N, Uscharapong M, Kiatisevi P. Effectiveness of Learning Systems from Common Image File Types to Detect Osteosarcoma Based on Convolutional Neural Networks (CNNs) Models. J Imaging 2021;8(1):2. [PMID: 35049843] [PMCID: PMC8779891] [DOI: 10.3390/jimaging8010002]
Abstract
Osteosarcoma is a rare bone cancer which is more common in children than in adults and has a high chance of metastasizing to the patient's lungs. Because cases are rare, the disease is difficult to diagnose, and lung nodules are hard to detect at an early stage. Convolutional Neural Networks (CNNs) can be applied effectively to early-stage detection from CT-scanned images. Transferring patients from small hospitals to the cancer-specialized Lerdsin Hospital poses difficulties in information sharing because of privacy and safety regulations; CD-ROM media was allowed for transferring patients' data to Lerdsin Hospital. Digital Imaging and Communications in Medicine (DICOM) files cannot be stored on a CD-ROM, so DICOM must be converted into other common image formats, such as BMP, JPG, and PNG. Image quality can affect the accuracy of CNN models, so this research studies and experiments with the effect of different image formats. Three popular medical CNN models, VGG-16, ResNet-50, and MobileNet-V2, are used for osteosarcoma detection. The positive- and negative-class images were collected from Lerdsin Hospital; 80% of all images are used as the training dataset, while the rest are used to validate the trained models. Limited training images are simulated by reducing the images in the training dataset. Each model is trained and validated with three different image formats, resulting in 54 testing cases. F1-score and accuracy are calculated and compared across the models. VGG-16 is the most robust model across all formats. PNG is the most preferred image format, followed by BMP and JPG.
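The DICOM-to-common-format conversion described here is a routine preprocessing step. The sketch below shows one minimal way to export a DICOM slice as an 8-bit PNG with pydicom and Pillow; the file paths are placeholders, and a clinical pipeline would normally apply the modality LUT or windowing stored in the DICOM header rather than plain min-max scaling.

```python
import numpy as np
import pydicom
from PIL import Image

def dicom_to_png(dicom_path: str, png_path: str) -> None:
    """Convert a DICOM slice to an 8-bit PNG by min-max scaling (sketch only)."""
    ds = pydicom.dcmread(dicom_path)
    arr = ds.pixel_array.astype(np.float32)
    # Rescale pixel values to 0-255; a real pipeline would use proper windowing
    arr = (arr - arr.min()) / max(arr.max() - arr.min(), 1e-6) * 255.0
    Image.fromarray(arr.astype(np.uint8)).save(png_path)

# dicom_to_png("slice001.dcm", "slice001.png")  # placeholder file names
```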
Affiliation(s)
- Chanunya Loraksa
- Medical Engineering, Faculty of Engineering, Thammasat University, Pathum Thani 12121, Thailand;
- Correspondence: ; Tel.: +66-(0)63-241-5888
- Nitikarn Nimsuk
- Medical Engineering, Faculty of Engineering, Thammasat University, Pathum Thani 12121, Thailand
- Meenut Uscharapong
- Department of Medical Services, Lerdsin Hospital, Ministry of Public Health in Thailand, Bangkok 10500, Thailand
- Piya Kiatisevi
- Department of Medical Services, Lerdsin Hospital, Ministry of Public Health in Thailand, Bangkok 10500, Thailand
6
Zulkifley MA, Mohamed NA, Abdani SR, Kamari NAM, Moubark AM, Ibrahim AA. Intelligent Bone Age Assessment: An Automated System to Detect a Bone Growth Problem Using Convolutional Neural Networks with Attention Mechanism. Diagnostics (Basel) 2021;11:765. [PMID: 33923215] [PMCID: PMC8146101] [DOI: 10.3390/diagnostics11050765]
Abstract
Skeletal bone age assessment using X-ray images is a standard clinical procedure to detect any anomaly in bone growth among children and infants. The assessed bone age indicates the actual level of growth, whereby a large discrepancy between the assessed and chronological age may point to a growth disorder. Hence, skeletal bone age assessment is used to screen for possible growth abnormalities, genetic problems, and endocrine disorders. Usually, the manual screening is performed on X-ray images of the non-dominant hand using the Greulich-Pyle (GP) or Tanner-Whitehouse (TW) approach. GP uses a standard hand atlas as the reference for predicting a patient's bone age, while TW uses a scoring mechanism that assesses the bone age from several regions of interest. However, both approaches depend heavily on individual domain knowledge and expertise, which is prone to high bias in inter- and intra-observer results. Hence, an automated bone age assessment system, referred to as the Attention-Xception Network (AXNet), is proposed to predict the bone age accurately. The proposed AXNet consists of two parts: an image normalization module and a bone age regression module. The image normalization module transforms each X-ray image into a standardized form so that the regressor network can be trained on better input images. This module first extracts the hand region from the background, which is then rotated to an upright position using the angle calculated from four key points of interest. The masked and rotated hand image is then aligned so that it is positioned in the middle of the image. Both the masked and rotated images are obtained through existing state-of-the-art deep learning methods. The last module then predicts the bone age through the Attention-Xception network, which incorporates multiple layers of a spatial-attention mechanism to emphasize the important features for more accurate bone age prediction. In the experiments, the proposed AXNet achieves the lowest mean absolute error and mean squared error of 7.699 months and 108.869 months², respectively. Therefore, the proposed AXNet has demonstrated its potential for practical clinical use, with an error of less than one year, to assist experts or radiologists in evaluating bone age objectively.
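The spatial-attention mechanism mentioned above follows a widely used pattern: pool the feature map across channels, predict a per-pixel weight map, and rescale the features. The PyTorch sketch below shows that generic pattern (as popularized by CBAM); it is an illustrative assumption, not the exact layer configuration of AXNet.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Generic spatial-attention gate: channel-wise pooling -> conv -> sigmoid mask."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W)
        avg_pool = x.mean(dim=1, keepdim=True)          # (b, 1, H, W)
        max_pool = x.max(dim=1, keepdim=True).values    # (b, 1, H, W)
        weights = torch.sigmoid(self.conv(torch.cat([avg_pool, max_pool], dim=1)))
        return x * weights                               # emphasize informative regions
```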
7
Adaptive Multi-View Image Mosaic Method for Conveyor Belt Surface Fault Online Detection. Appl Sci (Basel) 2021;11:2564. [DOI: 10.3390/app11062564]
Abstract
In order to improve the accuracy and speed of image mosaicking, enable online multi-view detection of conveyor belt surface faults, and address the problem of longitudinal conveyor belt tears, this paper proposes an adaptive multi-view image mosaic (AMIM) method that combines grayscale-based and feature-based registration. Firstly, the overlapping region of two adjacent images is preliminarily estimated by establishing an overlapping-region estimation model, and the grayscale-based method is used to register the overlapping region. Secondly, an image-of-interest (IOI) detection algorithm is used to separate the IOIs from the non-IOIs. Thirdly, only for the IOIs, a feature-based partition-and-block registration method is used to register the images more accurately: the overlapping region is adaptively segmented, the speeded up robust features (SURF) algorithm is used to extract feature points, and the random sample consensus (RANSAC) algorithm is used to achieve accurate registration. Finally, an improved weighted smoothing algorithm is used to fuse the two adjacent images. The experimental results showed that the registration rate reached 97.67% and the average stitching time was less than 500 ms. The method is accurate and fast, and is suitable for online detection of conveyor belt surface faults.
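The feature-based registration step (keypoint matching plus RANSAC) can be sketched with stock OpenCV. Because SURF is only available in non-free opencv-contrib builds, the example below substitutes ORB for keypoint extraction; the calls shown are standard OpenCV functions, and the paper's adaptive partitioning and fusion steps are not reproduced.

```python
import cv2
import numpy as np

def register_pair(img1: np.ndarray, img2: np.ndarray) -> np.ndarray:
    """Estimate the homography aligning img2 to img1 from matched keypoints,
    rejecting outliers with RANSAC. ORB stands in for the SURF detector here."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)   # img1, img2: grayscale uint8
    kp2, des2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(pts2, pts1, cv2.RANSAC, 5.0)
    return H  # warp img2 with cv2.warpPerspective(img2, H, ...) before blending
```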