1
Zhang C, Zhao M, Xie Y, Ding R, Ma M, Guo K, Jiang H, Xi W, Xia L. TL-MSE2-Net: Transfer learning based nested model for cerebrovascular segmentation with aneurysms. Comput Biol Med 2023; 167:107609. [PMID: 37883854 DOI: 10.1016/j.compbiomed.2023.107609] [Received: 12/17/2022] [Revised: 10/11/2023] [Accepted: 10/17/2023] [Indexed: 10/28/2023]
Abstract
Cerebrovascular (i.e., cerebral vessel) segmentation is essential for diagnosing and treating brain diseases. Convolutional neural network models, such as U-Net, are commonly used for this purpose. Unfortunately, such models may not be entirely satisfactory in dealing with cerebrovascular segmentation with tumors due to the following issues: (1) Relatively small number of clinical datasets from patients obtained through different modalities such as computed tomography (CT) and magnetic resonance imaging (MRI), leading to inadequate training and lack of transferability in the modeling; (2) Insufficient feature extraction caused by less attention to both convolution sizes and cerebral vessel edges. Inspired by the existence of similar features on cerebral vessels between normal subjects and patients, we propose a transfer learning strategy based on a pre-trained nested model called TL-MSE2-Net. This model uses one of the publicly available datasets for cerebrovascular segmentation with aneurysms. To address issue (1), our transfer learning strategy leverages a pre-trained model that uses a large number of datasets from normal subjects, providing a potential solution to the lack of sufficient clinical datasets. To tackle issue (2), we structure the pre-trained model based on 3D U-Net, comprising three blocks: ResMul, DeRes, and REAM. The ResMul and DeRes blocks enhance feature extraction by utilizing multiple convolution sizes to capture multiscale features, and the REAM block increases the weight of the voxels on the edges of the given 3D volume. We evaluated the proposed model on one small private clinical dataset and two publicly available datasets. The experimental results demonstrated that our MSE2-Net framework achieved an average Dice score of 70.81 % and 89.08 % on the two publicly available datasets, outperforming other state-of-the-art methods. Ablation studies were also conducted to validate the effectiveness of each block. 
The proposed TL-MSE2-Net yielded better results than MSE2-Net on a small private clinical dataset, with increases of 5.52 %, 3.37 %, 6.71 %, and 0.85 % for the Dice score, sensitivity, Jaccard index, and precision, respectively.
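The four metrics named in the abstract (Dice score, sensitivity, Jaccard index, precision) can be illustrated with a minimal sketch. It assumes binary masks flattened into plain Python lists of 0/1 voxels; the function names `confusion_counts` and `segmentation_metrics` and the toy masks are hypothetical and are not taken from the paper.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives
    for two equally long binary masks (flattened voxel lists)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t and not p:
            fn += 1
    return tp, fp, fn


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(pred, truth):
    """Standard overlap metrics for a predicted vs. reference mask."""
    tp, fp, fn = confusion_counts(pred, truth)
    return {
        "dice": _safe_div(2 * tp, 2 * tp + fp + fn),
        "jaccard": _safe_div(tp, tp + fp + fn),
        "sensitivity": _safe_div(tp, tp + fn),
        "precision": _safe_div(tp, tp + fp),
    }


# Toy example (made-up masks, not data from the paper).
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
metrics = segmentation_metrics(pred, truth)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For any pair of masks, the Dice score D and Jaccard index J are related by D = 2J / (1 + J), so the two metrics always rank segmentations in the same order even though their absolute values differ.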
Affiliation(s)
- Chaoran Zhang
  - Laboratory of Neural Computing and Intelligent Perception (NCIP), Capital Normal University, Beijing, 100048, China
- Ming Zhao
  - Department of Neurosurgery, First Medical Center, Chinese PLA General Hospital, Beijing, 100853, China
- Yixuan Xie
  - Laboratory of Neural Computing and Intelligent Perception (NCIP), Capital Normal University, Beijing, 100048, China
- Rui Ding
  - Laboratory of Neural Computing and Intelligent Perception (NCIP), Capital Normal University, Beijing, 100048, China
- Ming Ma
  - Department of Computer Science, Winona State University, Winona, MN, 55987, USA
- Kaiwen Guo
  - Laboratory of Neural Computing and Intelligent Perception (NCIP), Capital Normal University, Beijing, 100048, China
- Hongzhen Jiang
  - Department of Neurosurgery, First Medical Center, Chinese PLA General Hospital, Beijing, 100853, China
- Wei Xi
  - Department of Radiology, Fourth Medical Center, Chinese PLA General Hospital, Beijing, 100048, China
- Likun Xia
  - Laboratory of Neural Computing and Intelligent Perception (NCIP), Capital Normal University, Beijing, 100048, China
2
Zhou S, Xu H, Bai Z, Du Z, Zeng J, Wang Y, Wang Y, Li S, Wang M, Li Y, Li J, Xu J. A multidimensional feature fusion network based on MGSE and TAAC for video-based human action recognition. Neural Netw 2023; 168:496-507. [PMID: 37827068 DOI: 10.1016/j.neunet.2023.09.031] [Received: 04/05/2023] [Revised: 09/14/2023] [Accepted: 09/18/2023] [Indexed: 10/14/2023]
Abstract
With the maturity of intelligent technology such as human-computer interaction, human action recognition (HAR) technology has been widely used in virtual reality, video surveillance, and other fields. However, the current video-based HAR methods still cannot fully extract abstract action features, and there is still a lack of action collection and recognition for special personnel such as prisoners and elderly people living alone. To solve the above problems, this paper proposes a multidimensional feature fusion network, called P-MTSC3D, a parallel network based on context modeling and a temporal adaptive attention module. It consists of three branches. The first branch serves as the basic network branch, which extracts basic feature information. The second branch consists of a feature pre-extraction layer and two multiscale-convolution-based global context modeling combined squeeze and excitation (MGSE) modules, which can extract spatial and channel features. The third branch consists of two temporal adaptive attention units based on convolution (TAAC) to extract temporal dimension features. In order to verify the validity of the proposed network, this paper conducts experiments on the University of Central Florida (UCF) 101 dataset and the human motion database (HMDB) 51 dataset. The recognition accuracy of the proposed P-MTSC3D network is 97.92% on the UCF101 dataset and 75.59% on the HMDB51 dataset. The FLOPs of the P-MTSC3D network are 30.85G, and the test time is 2.83 s/16 samples on the UCF101 dataset. The experimental results demonstrate that the P-MTSC3D network has better overall performance than the state-of-the-art networks. In addition, a prison action (PA) dataset is constructed in this paper to verify the application effect of the proposed network in actual scenarios.
Affiliation(s)
- Shuang Zhou
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Hongji Xu
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Zhiquan Bai
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Zhengfeng Du
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Jiaqi Zeng
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Yang Wang
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Yuhao Wang
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Shijie Li
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Mengmeng Wang
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Yiran Li
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Jianjun Li
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China
- Jie Xu
  - School of Information Science and Engineering, Shandong University, 72 Binhai Road, Qingdao, 266237, Shandong, China