1
Yan Z, Chang C, Kang Z, Chen C, Lv X, Chen C. Application of one-dimensional hierarchical network assisted screening for cervical cancer based on Raman spectroscopy combined with attention mechanism. Photodiagnosis Photodyn Ther 2024:104086. [PMID: 38608802 DOI: 10.1016/j.pdpdt.2024.104086] [Received: 02/27/2024] [Revised: 04/07/2024] [Accepted: 04/10/2024] [Indexed: 04/14/2024]
Abstract
Cervical cancer is one of the most common malignant tumors among women, and its pathological progression is relatively slow. Timely detection and proper treatment can effectively reduce its incidence and mortality, so early screening is particularly critical. In this paper, we used Raman spectroscopy to collect tissue sample data from patients with cervicitis, low-grade squamous intraepithelial lesion, high-grade squamous intraepithelial lesion, well-differentiated squamous cell carcinoma, moderately differentiated squamous cell carcinoma, poorly differentiated squamous cell carcinoma, and cervical adenocarcinoma. A one-dimensional hierarchical convolutional neural network based on an attention mechanism was constructed to classify the seven types of tissue samples. The Efficient Channel Attention (ECA) module and the Squeeze-and-Excitation (SE) module were combined with the established one-dimensional convolutional hierarchical network model, and the results showed that the combined model had better diagnostic performance. After 5-fold cross-validation, the average accuracy, F1, and AUC of the Principal Component Analysis-Squeeze and Excitation-hierarchical network model reached 96.49%±2.12%, 0.9663±0.0253, and 0.9815±0.0224, respectively, which were 1.58%, 0.0140, and 0.008 higher than those of the hierarchical network. The recall of the Principal Component Analysis-Efficient Channel Attention-hierarchical network model reached 96.78%±2.85%, which is 1.47% higher than that of the hierarchical network. Compared with the classification results of a traditional CNN and ResNet for the seven types of cervical cancer staging, the accuracy of the Principal Component Analysis-Squeeze and Excitation-hierarchical network model is 3.33% and 11.05% higher, respectively.
The experimental results indicate that the model established in this study is easy to operate and has high accuracy. It has good reference value for rapid screening of cervical cancer, laying a foundation for further research on Raman spectroscopy as a clinical diagnostic method for cervical cancer.
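The squeeze-and-excitation idea named in the abstract can be illustrated with a minimal sketch. This is a generic SE channel-attention step on a one-dimensional feature map, not the authors' network: the function name `se_attention_1d`, the weight matrices, and the toy feature values are all hypothetical and chosen only for illustration.

```python
import math

def se_attention_1d(feature_map, w1, w2):
    """Generic squeeze-and-excitation on a C x L feature map (nested lists).

    w1 has shape (C // r) x C and w2 has shape C x (C // r); both are
    hypothetical weights, where r is the channel reduction ratio.
    """
    # Squeeze: global average pooling along the length axis, one value per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C channels and apply a sigmoid gate.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: reweight every channel by its gate value.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, feature_map)]
    return scaled, gates

# Toy input: 4 channels, 3 positions each (made-up numbers).
features = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [4.0, 4.0, 4.0], [-1.0, 1.0, 0.0]]
w1 = [[0.5, 0.0, 0.25, 0.0], [0.0, 0.5, 0.0, 0.5]]          # 4 channels -> 2
w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]     # 2 -> 4 channels
scaled, gates = se_attention_1d(features, w1, w2)
print([round(g, 3) for g in gates])
```

Each gate lies strictly between 0 and 1, so the module can only attenuate or preserve a channel; in a trained network the two weight matrices are learned jointly with the convolutional layers.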
Affiliation(s)
- Ziwei Yan
  - College of Software, Xinjiang University, Urumqi, China
- Chenjie Chang
  - School of Computer Science and Technology, Xinjiang University, Urumqi, China
- Zhenping Kang
  - School of Computer Science and Technology, Xinjiang University, Urumqi, China
- Chen Chen
  - College of Software, Xinjiang University, Urumqi, China
- Xiaoyi Lv
  - School of Computer Science and Technology, Xinjiang University, Urumqi, China
- Cheng Chen
  - College of Software, Xinjiang University, Urumqi, China

2
Li X, Huang Y, Ning Y, Wang M, Cai W. Multi-branch myocardial infarction detection and localization framework based on multi-instance learning and domain knowledge. Physiol Meas 2024. [PMID: 38599223 DOI: 10.1088/1361-6579/ad3d25] [Indexed: 04/12/2024]
Abstract
OBJECTIVE
Myocardial infarction (MI) is a serious cardiovascular disease that can cause irreversible damage to the heart, making early identification and treatment crucial. However, automatic MI detection and localization from an electrocardiogram (ECG) remain challenging. In this study, we propose two models, MFB-SENET and MFB-DMIL, for MI detection and localization, respectively.
APPROACH
The MFB-SENET model is designed to detect MI, while the MFB-DMIL model is designed to localize MI. The MI localization model employs a specialized attention mechanism to integrate multi-instance learning with domain knowledge. This approach incorporates handcrafted features and introduces a new loss function, called lead-loss, to improve MI localization. Grad-CAM is employed to visualize the decision-making process.
MAIN RESULTS
The proposed method was evaluated on the PTB and PTB-XL databases. Under the inter-patient scheme, the accuracy of MI detection and localization on the PTB database reached 93.88% and 67.17%, respectively. On the PTB-XL database, the accuracy of MI detection and localization was 94.89% and 85.83%, respectively.
SIGNIFICANCE
Our method achieved comparable or better performance than other state-of-the-art algorithms. The proposed method, which combines deep learning with medical domain knowledge, demonstrates effectiveness and reliability and holds promise as an efficient MI diagnostic tool to assist physicians in formulating accurate diagnoses.
Affiliation(s)
- Xinyue Li
  - University of Shanghai for Science and Technology, 516 Jungong Road, Yangpu District, Shanghai, 200093, China
- Yangcheng Huang
  - University of Shanghai for Science and Technology, 516 Jungong Road, Yangpu District, Shanghai, 200093, China
- Yixin Ning
  - University of Shanghai for Science and Technology, 516 Jungong Road, Yangpu District, Shanghai, 200093, China
- Mingjie Wang
  - Fudan University School of Basic Medical Sciences, 130 Dongan Road, Xuhui District, Shanghai, 200032, China
- Wenjie Cai
  - University of Shanghai for Science and Technology, 516 Jungong Road, Yangpu District, Shanghai, 200093, China

3
Wang Y, Zhang P, Tian S. Tomato leaf disease detection based on attention mechanism and multi-scale feature fusion. Front Plant Sci 2024; 15:1382802. [PMID: 38654901 PMCID: PMC11035761 DOI: 10.3389/fpls.2024.1382802] [Received: 02/06/2024] [Accepted: 03/25/2024] [Indexed: 04/26/2024]
Abstract
When detecting tomato leaf diseases in natural environments, factors such as changes in lighting, occlusion, and the small size of leaf lesions pose challenges to detection accuracy. Therefore, this study proposes a tomato leaf disease detection method based on attention mechanisms and multi-scale feature fusion. Firstly, the Convolutional Block Attention Module (CBAM) is introduced into the backbone feature extraction network to enhance the ability to extract lesion features and suppress the effects of environmental interference. Secondly, shallow feature maps are introduced into the re-parameterized generalized feature pyramid network (RepGFPN), constructing a new multi-scale re-parameterized generalized feature fusion module (BiRepGFPN) to enhance feature fusion expression and improve the localization ability for small lesion features. Finally, the BiRepGFPN replaces the Path Aggregation Feature Pyramid Network (PAFPN) in the YOLOv6 model to achieve effective fusion of deep semantic and shallow spatial information. Experimental results indicate that, when evaluated on the publicly available PlantDoc dataset, the model's mean average precision (mAP) showed improvements of 7.7%, 11.8%, 3.4%, 5.7%, 4.3%, and 2.6% compared to YOLOX, YOLOv5, YOLOv6, YOLOv6-s, YOLOv7, and YOLOv8, respectively. When evaluated on the tomato leaf disease dataset, the model demonstrated a precision of 92.9%, a recall rate of 95.2%, an F1 score of 94.0%, and a mean average precision (mAP) of 93.8%, showing improvements of 2.3%, 4.0%, 3.1%, and 2.7% respectively compared to the baseline model. These results indicate that the proposed detection method possesses significant detection performance and generalization capabilities.
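As a quick consistency check on the reported metrics, the F1 score is the harmonic mean of precision and recall. The sketch below plugs in the precision (92.9%) and recall (95.2%) quoted in the abstract; the helper name `f1_score` is an illustrative assumption, not something taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported for the tomato leaf disease dataset.
f1 = f1_score(0.929, 0.952)
print(f"F1 = {f1:.3f}")  # F1 = 0.940
```

The result agrees with the 94.0% F1 score stated in the abstract.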
Affiliation(s)
- Yong Wang
  - School of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China

4
Tao W, Wang X, Yan T, Liu Z, Wan S. ESF-YOLO: an accurate and universal object detector based on neural networks. Front Neurosci 2024; 18:1371418. [PMID: 38650621 PMCID: PMC11033406 DOI: 10.3389/fnins.2024.1371418] [Received: 01/16/2024] [Accepted: 03/28/2024] [Indexed: 04/25/2024]
Abstract
As a single-stage object detector based on neural networks, YOLOv5 has found extensive application in the industrial domain; however, it still exhibits certain design limitations. To address these issues, this paper proposes Efficient Scale Fusion YOLO (ESF-YOLO). Firstly, the Multi-Sampling Conv Module (MSCM) is designed, which enhances the backbone network's learning capability for low-level features through multi-scale receptive fields and cross-scale feature fusion. Secondly, to tackle occlusion issues, a new Block-wise Channel Attention Module (BCAM) is designed, assigning greater weights to channels corresponding to critical information. Next, a lightweight Decoupled Head (LD-Head) is devised. Additionally, the loss function is redesigned to address asynchrony between labels and confidences, alleviating the imbalance between positive and negative samples during training. Finally, an adaptive scale factor for Intersection over Union (IoU) calculation is proposed, adjusting bounding box sizes adaptively to accommodate targets of different sizes in the dataset. Experimental results on the SODA10M and CBIA8K datasets demonstrate that ESF-YOLO increases Average Precision at 0.50 IoU (AP50) by 3.93% and 2.24%, Average Precision at 0.75 IoU (AP75) by 4.77% and 4.85%, and mean Average Precision (mAP) by 4% and 5.39%, respectively, validating the model's broad applicability.
Affiliation(s)
- Wenguang Tao
  - Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
- Xiaotian Wang
  - Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
- Tian Yan
  - Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
- Zhengzhuo Liu
  - Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
- Shizheng Wan
  - Shanghai Electro-Mechanical Engineering Institute, Shanghai, China

5
Xu N, Ma Z, Xia Y, Dong Y, Zi J, Xu D, Xu F, Su X, Zhang H, Chen F. A Serial Multi-Scale Feature Fusion and Enhancement Network for Amur Tiger Re-Identification. Animals (Basel) 2024; 14:1106. [PMID: 38612345 PMCID: PMC11011027 DOI: 10.3390/ani14071106] [Received: 03/12/2024] [Revised: 03/26/2024] [Accepted: 04/02/2024] [Indexed: 04/14/2024]
Abstract
The Amur tiger is an important endangered species, and its re-identification (re-ID) plays an important role in regional biodiversity assessment and wildlife resource statistics. This paper focuses on Amur tiger re-ID based on visible light images taken from surveillance video screenshots or camera traps, aiming to solve the problem of low accuracy caused by camera perspective, background noise, changes in motion posture, and deformation of Amur tiger body patterns during the re-ID process. To overcome this challenge, we propose a serial multi-scale feature fusion and enhancement re-ID network for the Amur tiger, in which global and local branches are constructed. Specifically, we design a global inverted pyramid multi-scale feature fusion method in the global branch to effectively fuse multi-scale global features and preserve high-level, fine-grained, and deep semantic features. We also design a local dual-domain attention feature enhancement method in the local branch, further enhancing local feature extraction and fusion by dividing local feature blocks. Based on this model structure, we evaluated the effectiveness and feasibility of the model on the public Amur Tiger Re-identification in the Wild (ATRW) dataset and achieved good results on mAP, Rank-1, and Rank-5, demonstrating its competitiveness. In addition, since our proposed model does not require additional expensive annotation information and does not incorporate other pre-training modules, it has important advantages such as strong transferability and simple training.
Affiliation(s)
- Nuo Xu
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
- Zhibin Ma
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
- Yi Xia
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
- Yanqi Dong
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
- Jiali Zi
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
- Delong Xu
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
- Fu Xu
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  - Engineering Research Center for Forestry-Oriented Intelligent Information Processing, National Forestry and Grassland Administration, Beijing 100083, China
- Xiaohui Su
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  - Engineering Research Center for Forestry-Oriented Intelligent Information Processing, National Forestry and Grassland Administration, Beijing 100083, China
- Haiyan Zhang
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  - Engineering Research Center for Forestry-Oriented Intelligent Information Processing, National Forestry and Grassland Administration, Beijing 100083, China
- Feixiang Chen
  - School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  - Engineering Research Center for Forestry-Oriented Intelligent Information Processing, National Forestry and Grassland Administration, Beijing 100083, China

6
Xu K, Zhang F, Huang Y, Huang X. 2.5D UNet with context-aware feature sequence fusion for accurate esophageal tumor semantic segmentation. Phys Med Biol 2024; 69:085002. [PMID: 38484399 DOI: 10.1088/1361-6560/ad3419] [Received: 11/17/2023] [Accepted: 03/14/2024] [Indexed: 04/04/2024]
Abstract
Segmenting esophageal tumors from computed tomography (CT) sequence images can assist doctors in diagnosing and treating patients with this malignancy. However, accurately extracting esophageal tumor features from CT images is often challenging because of the tumors' small area, variable position and shape, and low contrast with surrounding tissues. As a result, current methods do not achieve the level of accuracy required for practical applications. To address this problem, we propose a 2.5D context-aware feature sequence fusion UNet (2.5D CFSF-UNet) model for esophageal tumor segmentation in CT sequence images. Specifically, we embed intra-slice multiscale attention feature fusion (Intra-slice MAFF) in each skip connection of UNet to improve feature learning capabilities, better expressing the differences between anatomical structures within CT sequence images. Additionally, the inter-slice context fusion block (Inter-slice CFB) is utilized in the center bridge of UNet to enhance the depiction of context features between CT slices, thereby preventing the loss of structural information between slices. Experiments are conducted on a dataset of 430 esophageal tumor patients. The results show an 87.13% Dice similarity coefficient, a 79.71% intersection over union, and a 2.4758 mm Hausdorff distance, which demonstrates that our approach can improve contouring consistency and can be applied in clinical practice.
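The two overlap metrics reported above can be sketched for a single pair of binary masks as follows. This is a minimal illustration with made-up masks; the function name `dice_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions, and the paper's evaluation protocol (for example, how scores are averaged across patients) is not described in the abstract.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks given as equally sized nested lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    intersection = sum(a * b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    union = total - intersection
    return 2 * intersection / total, intersection / union

# Toy 2 x 3 masks (hypothetical values).
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
dice, iou = dice_and_iou(pred, truth)
print(round(dice, 4), round(iou, 4))
```

For a single mask pair the two scores are tied by Dice = 2·IoU / (1 + IoU); averages over many cases need not satisfy this identity.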
Affiliation(s)
- Kai Xu
  - School of the Internet, Anhui University, Anhui, 230039, People's Republic of China
- Feixiang Zhang
  - School of the Internet, Anhui University, Anhui, 230039, People's Republic of China
- Yong Huang
  - Department of Medical Oncology, The Second People's Hospital of Hefei, Hefei, 230011, People's Republic of China
- Xiaoyu Huang
  - Department of Chinese Integrative Medicine Oncology, The First Affiliated Hospital of Anhui Medical University, Hefei, 230022, People's Republic of China

7
Ru Y, Wei Z, An G, Chen H. Combining data augmentation and deep learning for improved epilepsy detection. Front Neurol 2024; 15:1378076. [PMID: 38633533 PMCID: PMC11021591 DOI: 10.3389/fneur.2024.1378076] [Received: 01/29/2024] [Accepted: 03/18/2024] [Indexed: 04/19/2024]
Abstract
Introduction: In recent years, the use of EEG signals for seizure detection has gained widespread academic attention. To address the overfitting of deep learning models caused by the small amount of EEG data available for epilepsy detection, this paper proposes an epilepsy detection method that combines data augmentation and deep learning. Methods: First, the Adversarial and Mixup Data Augmentation (AMDA) method is used to augment the data, which effectively increases the number of training samples. To further improve the classification accuracy and robustness of epilepsy detection, this paper proposes an attention-based network model that combines a one-dimensional convolutional neural network and a gated recurrent unit (AM-1D CNN-GRU). Results and discussion: The experimental results show that epilepsy detection performance improves significantly with augmented data: the accuracy, sensitivity, and area under the receiver operating characteristic curve reach 96.06%, 95.48%, and 0.9637, respectively. Compared with the non-augmented data, all indicators are increased by more than 6.2%. The detection performance is also significantly better than that of other epilepsy detection methods. The results of this research can provide a reference for the clinical application of epilepsy detection.
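The mixup half of the augmentation named above can be sketched as a convex combination of two training samples and their one-hot labels. This is the standard mixup recipe, not the paper's AMDA method (its adversarial component and hyperparameters are not given in the abstract); the function name `mixup`, the `alpha` value, and the toy signals are assumptions.

```python
import random

def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two samples and their one-hot labels with a Beta(alpha, alpha) weight."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam

# Two toy EEG-like segments with one-hot labels (hypothetical values).
seizure = [0.9, -1.2, 1.5, -0.7]
normal = [0.1, 0.0, -0.1, 0.2]
rng = random.Random(0)  # seeded generator so the sketch is repeatable
x_mix, y_mix, lam = mixup(seizure, [1.0, 0.0], normal, [0.0, 1.0], rng=rng)
print(lam, x_mix, y_mix)
```

Because the label is blended with the same weight as the signal, the mixed label remains a valid probability vector.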
Affiliation(s)
- Yandong Ru
  - School of Information Engineering, Zhejiang Ocean University, Zhoushan, China
  - Key Laboratory of Oceanographic Big Data Mining & Application of Zhejiang Province, Zhejiang Ocean University, Zhoushan, China
- Zheng Wei
  - School of Electronics and Information Engineering, Heilongjiang University of Science and Technology, Harbin, China
- Gaoyang An
  - School of Electronics and Information Engineering, Heilongjiang University of Science and Technology, Harbin, China
- Hongming Chen
  - School of Information Engineering, Zhejiang Ocean University, Zhoushan, China
  - Key Laboratory of Oceanographic Big Data Mining & Application of Zhejiang Province, Zhejiang Ocean University, Zhoushan, China

8
Liang X, Zhao H, Wang J. MA-PEP: A novel anticancer peptide prediction framework with multimodal feature fusion based on attention mechanism. Protein Sci 2024; 33:e4966. [PMID: 38532681 DOI: 10.1002/pro.4966] [Received: 09/30/2023] [Revised: 01/30/2024] [Accepted: 03/06/2024] [Indexed: 03/28/2024]
Abstract
Anticancer peptides (ACPs) have emerged as promising therapeutic agents for cancer treatment. The time-consuming and costly nature of wet-lab discriminatory methods has spurred the development of various machine learning and deep learning-based ACP classification methods. Nonetheless, current methods struggle to integrate features from different peptide modalities efficiently, which limits a more comprehensive understanding of ACPs and restricts further improvement of prediction performance. In this study, we introduce a novel ACP prediction method, MA-PEP, which leverages multiple attention mechanisms for feature enhancement and fusion to improve ACP prediction. By integrating the enhanced molecular-level chemical features and sequence information of peptides, MA-PEP demonstrates superior prediction performance across several benchmark datasets, highlighting its efficacy in ACP prediction. Moreover, the visual analysis and case studies further demonstrate MA-PEP's reliable feature extraction capability and its promise in the realm of ACP exploration. The code and datasets for MA-PEP are available at https://github.com/liangxiaodata/MA-PEP.
Affiliation(s)
- Xiao Liang
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, China
- Haochen Zhao
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, China

9
Chen L, Leng L, Yang Z, Teoh ABJ. Enhanced Multitask Learning for Hash Code Generation of Palmprint Biometrics. Int J Neural Syst 2024; 34:2450020. [PMID: 38414422 DOI: 10.1142/s0129065724500205] [Indexed: 02/29/2024]
Abstract
This paper presents a novel multitask learning framework for palmprint biometrics, which optimizes classification and hashing branches jointly. The classification branch within our framework facilitates the concurrent execution of three distinct tasks: identity recognition and classification of soft biometrics, encompassing gender and chirality. On the other hand, the hashing branch enables the generation of palmprint hash codes, optimizing for minimal storage as templates and efficient matching. The hashing branch derives the complementary information from these tasks by amalgamating knowledge acquired from the classification branch. This approach leads to superior overall performance compared to individual tasks in isolation. To enhance the effectiveness of multitask learning, two additional modules, an attention mechanism module and a customized gate control module, are introduced. These modules are vital in allocating higher weights to crucial channels and facilitating task-specific expert knowledge integration. Furthermore, an automatic weight adjustment module is incorporated to optimize the learning process further. This module fine-tunes the weights assigned to different tasks, improving performance. Integrating the three modules above has shown promising accuracies across various classification tasks and has notably improved authentication accuracy. The extensive experimental results validate the efficacy of our proposed framework.
Affiliation(s)
- Lin Chen
  - Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Nanchang Hangkong University, Nanchang, Jiangxi, P. R. China
- Lu Leng
  - Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition, Nanchang Hangkong University, Nanchang, Jiangxi, P. R. China
- Ziyuan Yang
  - College of Computer Science, Sichuan University, Chengdu, Sichuan, P. R. China
- Andrew Beng Jin Teoh
  - School of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, Republic of Korea

10
Hu X, Li X, Huang Z, Chen Q, Lin S. Detecting tea tree pests in complex backgrounds using a hybrid architecture guided by transformers and multi-scale attention mechanism. J Sci Food Agric 2024; 104:3570-3584. [PMID: 38150568 DOI: 10.1002/jsfa.13241] [Received: 08/05/2023] [Revised: 12/15/2023] [Accepted: 12/23/2023] [Indexed: 12/29/2023]
Abstract
BACKGROUND Tea pests pose a significant threat to tea leaf yield and quality, necessitating fast and accurate detection methods to improve pest control efficiency and reduce economic losses for tea farmers. However, in real tea gardens, some tea pests are small in size and easily camouflaged by complex backgrounds, making it challenging for farmers to promptly and accurately identify them. RESULTS To address this issue, we propose a real-time detection method based on TP-YOLOX for monitoring tea pests in complex backgrounds. Our approach incorporates the CSBLayer module, which combines convolution and multi-head self-attention mechanisms, to capture global contextual information from images and expand the network's perception field. Additionally, we integrate an efficient multi-scale attention module to enhance the model's ability to perceive fine details in small targets. To expedite model convergence and improve the precision of target localization, we employ the SIOU loss function as the bounding box regression function. Experimental results demonstrate that TP-YOLOX achieves a significant performance improvement with a relatively small additional computational cost (0.98 floating-point operations), resulting in a 4.50% increase in mean average precision (mAP) compared to the original YOLOX-s. When compared with existing object detection algorithms, TP-YOLOX outperforms them in terms of mAP performance. Moreover, the proposed method achieves a frame rate of 82.66 frames per second, meeting real-time requirements. CONCLUSION TP-YOLOX emerges as a proficient solution, capable of accurately and swiftly identifying tea pests amidst the complex backgrounds of tea gardens. This contribution not only offers valuable insights for tea pest monitoring but also serves as a reference for achieving precise pest control. © 2023 Society of Chemical Industry.
Affiliation(s)
- Xianming Hu
  - College of Mechanical and Electrical Engineering, Fujian Agriculture and Forestry University, Fuzhou, China
- Xinliang Li
  - College of Mechanical and Electrical Engineering, Fujian Agriculture and Forestry University, Fuzhou, China
- Ziyan Huang
  - College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, China
- Qibin Chen
  - College of Mechanical and Electrical Engineering, Fujian Agriculture and Forestry University, Fuzhou, China
- Shouying Lin
  - College of Mechanical and Electrical Engineering, Fujian Agriculture and Forestry University, Fuzhou, China

11
Yang J, Lei X, Zhang F. Identification of circRNA-disease associations via multi-model fusion and ensemble learning. J Cell Mol Med 2024; 28:e18180. [PMID: 38506066 PMCID: PMC10951890 DOI: 10.1111/jcmm.18180] [Received: 12/18/2023] [Revised: 01/21/2024] [Accepted: 02/05/2024] [Indexed: 03/21/2024]
Abstract
Circular RNA (circRNA) is a common non-coding RNA that plays an important role in the diagnosis and therapy of human diseases. Computational prediction of circRNA-disease associations can provide a new route to better clinical diagnosis. In this article, we propose a novel method for circRNA-disease association prediction based on ensemble learning, named ELCDA. First, a heterogeneous association network is constructed by collecting multiple sources of information on circRNAs and diseases, with multiple similarity measures adopted. Then, metapath, matrix factorization, and GraphSAGE-based models are used to extract node features from different views, and the final comprehensive features of circRNAs and diseases are obtained via ensemble learning. Finally, a soft voting ensemble strategy is used to integrate the predicted results of all classifiers. The performance of ELCDA is evaluated by fivefold cross-validation and compared with other state-of-the-art methods; the experimental results show that ELCDA outperforms them. Furthermore, three common diseases are used as case studies, which also demonstrate that ELCDA is an effective method for predicting circRNA-disease associations.
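Soft voting, the final integration step mentioned above, averages the class probabilities produced by each classifier and picks the class with the highest mean. The sketch below shows the generic technique only; the function name `soft_vote`, the optional weights, and the three probability vectors are hypothetical, and ELCDA's actual classifiers and weighting are not specified in the abstract.

```python
def soft_vote(probabilities, weights=None):
    """Average per-classifier class probabilities; return (predicted label, averaged probabilities)."""
    n_classifiers = len(probabilities)
    weights = weights or [1.0] * n_classifiers
    total = sum(weights)
    n_classes = len(probabilities[0])
    avg = [sum(w * p[c] for w, p in zip(weights, probabilities)) / total
           for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg

# Three hypothetical classifiers scoring one circRNA-disease pair:
# index 0 = "associated", index 1 = "not associated".
preds = [[0.90, 0.10], [0.40, 0.60], [0.45, 0.55]]
label, avg = soft_vote(preds)
print(label, [round(p, 4) for p in avg])
```

In this toy case two of the three classifiers individually favor class 1, so a hard majority vote would pick class 1, while soft voting picks class 0 because the first classifier is far more confident.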
Affiliation(s)
- Jing Yang
  - School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi, China
- Fa Zhang
  - School of Medical Technology, Beijing Institute of Technology, Beijing, China

12
Lee HS, Kim M, Jang S, Bae HB, Lee S. Multi-Granularity Aggregation with Spatiotemporal Consistency for Video-Based Person Re-Identification. Sensors (Basel) 2024; 24:2229. [PMID: 38610439 PMCID: PMC11014311 DOI: 10.3390/s24072229] [Received: 02/21/2024] [Revised: 03/25/2024] [Accepted: 03/27/2024] [Indexed: 04/14/2024]
Abstract
Video-based person re-identification (ReID) aims to exploit relevant features from spatial and temporal knowledge. Widely used methods include the part- and attention-based approaches for suppressing irrelevant spatial-temporal features. However, it is still challenging to overcome inconsistencies across video frames due to occlusion and imperfect detection. These mismatches make temporal processing ineffective and create an imbalance of crucial spatial information. To address these problems, we propose the Spatiotemporal Multi-Granularity Aggregation (ST-MGA) method, which is specifically designed to accumulate relevant features with spatiotemporally consistent cues. The proposed framework consists of three main stages: extraction, which extracts spatiotemporally consistent partial information; augmentation, which augments the partial information with different granularity levels; and aggregation, which effectively aggregates the augmented spatiotemporal information. We first introduce the consistent part-attention (CPA) module, which extracts spatiotemporally consistent and well-aligned attentive parts. Sub-parts derived from CPA provide temporally consistent semantic information, solving misalignment problems in videos due to occlusion or inaccurate detection, and maximize the efficiency of aggregation through uniform partial information. To enhance the diversity of spatial and temporal cues, we introduce the Multi-Attention Part Augmentation (MA-PA) block, which incorporates fine parts at various granular levels, and the Long-/Short-term Temporal Augmentation (LS-TA) block, designed to capture both long- and short-term temporal relations. Using densely separated part cues, ST-MGA fully exploits and aggregates the spatiotemporal multi-granular patterns by comparing relations between parts and scales. In the experiments, the proposed ST-MGA renders state-of-the-art performance on several video-based ReID benchmarks (i.e., MARS, DukeMTMC-VideoReID, and LS-VID).
Affiliation(s)
- Hean Sung Lee
- School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea; (H.S.L.); (M.K.); (S.J.)
- Minjung Kim
- School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea; (H.S.L.); (M.K.); (S.J.)
- Sungjun Jang
- School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea; (H.S.L.); (M.K.); (S.J.)
- Han Byeol Bae
- School of Computer Science and Engineering, Kunsan National University, 558 Daehak-ro, Gunsan-si 54150, Republic of Korea;
- Sangyoun Lee
- School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea; (H.S.L.); (M.K.); (S.J.)
|
13
|
Hui Y, You S, Hu X, Yang P, Zhao J. SEB-YOLO: An Improved YOLOv5 Model for Remote Sensing Small Target Detection. Sensors (Basel) 2024; 24:2193. [PMID: 38610404 PMCID: PMC11014141 DOI: 10.3390/s24072193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 03/14/2024] [Accepted: 03/27/2024] [Indexed: 04/14/2024]
Abstract
Limited semantic information for small objects and the difficulty of distinguishing similar targets make target detection in remote sensing scenarios challenging and lead to poor detection performance. This paper proposes an improved YOLOv5 algorithm for remote sensing image target detection, SEB-YOLO (SPD-Conv + ECSPP + Bi-FPN + YOLOv5). First, a module consisting of a space-to-depth (SPD) layer followed by a non-strided convolution (Conv) layer (SPD-Conv) was used to reconstruct the backbone network, which retained the global features and reduced feature loss. Meanwhile, a pooling module with an attention mechanism was designed for the final layer of the backbone network to help the network better identify and locate targets. Furthermore, a bidirectional feature pyramid network (Bi-FPN) with bilinear interpolation upsampling was added to improve bidirectional cross-scale connection and weighted feature fusion. Finally, a decoupled head was introduced to improve model convergence and resolve the conflict between the classification and regression tasks. Experimental results on the NWPU VHR-10 and RSOD datasets show that the mAP of the proposed algorithm reaches 93.5% and 93.9%, respectively, which is 4.0% and 5.3% higher than that of the original YOLOv5l algorithm. The proposed algorithm achieves better detection results for complex remote sensing images.
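As a rough illustration of the space-to-depth step that SPD-Conv builds on, the sketch below rearranges each 2 × 2 spatial block of a single-channel feature map into four channels, so the resolution halves without discarding any value. The function name `space_to_depth`, the block size of 2, and the nested-list layout are assumptions made for illustration, not the authors' implementation; the non-strided convolution that follows the SPD layer is omitted.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange block x block spatial patches into channels.

    feature_map: 2-D list (H x W) for one channel; H and W must be
    divisible by `block`. Returns block*block channels, each of size
    (H/block) x (W/block). Hypothetical helper for illustration only.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    channels = []
    for dy in range(block):
        for dx in range(block):
            # Take every `block`-th pixel starting at offset (dy, dx).
            channels.append([
                [feature_map[y][x] for x in range(dx, width, block)]
                for y in range(dy, height, block)
            ])
    return channels


fmap = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
out = space_to_depth(fmap)  # 4 channels of size 2 x 2
```

Every input value appears exactly once in the output, which is the property the abstract refers to as reduced feature loss compared with strided downsampling.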
Affiliation(s)
- Yan Hui
- School of Computer Science and Engineering, Xi’an Technological University, Xi’an 710021, China; (S.Y.); (X.H.); (P.Y.); (J.Z.)
- State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi’an 710021, China
- Shijie You
- School of Computer Science and Engineering, Xi’an Technological University, Xi’an 710021, China; (S.Y.); (X.H.); (P.Y.); (J.Z.)
- State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi’an 710021, China
- Xiuhua Hu
- School of Computer Science and Engineering, Xi’an Technological University, Xi’an 710021, China; (S.Y.); (X.H.); (P.Y.); (J.Z.)
- State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi’an 710021, China
- Panpan Yang
- School of Computer Science and Engineering, Xi’an Technological University, Xi’an 710021, China; (S.Y.); (X.H.); (P.Y.); (J.Z.)
- State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi’an 710021, China
- Jing Zhao
- School of Computer Science and Engineering, Xi’an Technological University, Xi’an 710021, China; (S.Y.); (X.H.); (P.Y.); (J.Z.)
- State and Provincial Joint Engineering Laboratory of Advanced Network, Monitoring and Control, Xi’an 710021, China
|
14
|
Wang K, Hou P, Xu X, Gao Y, Chen M, Lai B, An F, Ren Z, Li Y, Jia G, Hua Y. Automatic Identification of Pangolin Behavior Using Deep Learning Based on Temporal Relative Attention Mechanism. Animals (Basel) 2024; 14:1032. [PMID: 38612271 PMCID: PMC11011081 DOI: 10.3390/ani14071032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 03/25/2024] [Accepted: 03/25/2024] [Indexed: 04/14/2024]
Abstract
With declining populations in the wild, captive rescue and breeding have become one of the most important ways to protect pangolins from extinction. At present, the success rate of artificial breeding is low, due to the insufficient understanding of the breeding behavior characteristics of pangolins. The automatic recognition method based on machine vision not only monitors for 24 h but also reduces the stress response of pangolins. This paper aimed to establish a temporal relation and attention mechanism network (Pangolin breeding attention and transfer network, PBATn) to monitor and recognize pangolin behaviors, including breeding and daily behavior. There were 11,476 videos including breeding behavior and daily behavior that were divided into training, validation, and test sets. For the training set and validation set, the PBATn network model had an accuracy of 98.95% and 96.11%, and a loss function value of 0.1531 and 0.1852. The model is suitable for a 2.40 m × 2.20 m (length × width) pangolin cage area, with a nest box measuring 40 cm × 30 cm × 30 cm (length × width × height) positioned either on the left or right side inside the cage. A spherical night-vision monitoring camera was installed on the cage wall at a height of 2.50 m above the ground. For the test set, the mean Average Precision (mAP), average accuracy, average recall, average specificity, and average F1 score were found to be higher than SlowFast, X3D, TANet, TSN, etc., with values of 97.50%, 99.17%, 97.55%, 99.53%, and 97.48%, respectively. The recognition accuracies of PBATn were 94.00% and 98.50% for the chasing and mounting breeding behaviors, respectively. The results showed that PBATn outperformed the baseline methods in all aspects. This study shows that the deep learning system can accurately observe pangolin breeding behavior and it will be useful for analyzing the behavior of these animals.
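The evaluation metrics named in this abstract (accuracy, recall, specificity, F1) can all be derived from one-vs-rest confusion counts for a behavior class. The sketch below shows the standard formulas; the function name `binary_metrics` and the counts in the example are made up for illustration and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard one-vs-rest metrics from confusion counts.

    Assumes every denominator is non-zero (at least one positive and
    one negative sample, and at least one positive prediction).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for a single behavior class.
m = binary_metrics(tp=45, fp=5, tn=140, fn=10)
```

Averaging each metric over all behavior classes gives the "average" values of the kind the abstract reports.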
Affiliation(s)
- Kai Wang
- Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou 510520, China; (K.W.); (P.H.); (X.X.); (M.C.); (B.L.); (F.A.); (Z.R.); (Y.L.)
- Pengfei Hou
- Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou 510520, China; (K.W.); (P.H.); (X.X.); (M.C.); (B.L.); (F.A.); (Z.R.); (Y.L.)
- College of Engineering, Huazhong Agricultural University, Wuhan 430070, China;
- Xuelin Xu
- Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou 510520, China; (K.W.); (P.H.); (X.X.); (M.C.); (B.L.); (F.A.); (Z.R.); (Y.L.)
- Yun Gao
- College of Engineering, Huazhong Agricultural University, Wuhan 430070, China;
- Ming Chen
- Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou 510520, China; (K.W.); (P.H.); (X.X.); (M.C.); (B.L.); (F.A.); (Z.R.); (Y.L.)
- College of Engineering, Huazhong Agricultural University, Wuhan 430070, China;
- Binghua Lai
- Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou 510520, China; (K.W.); (P.H.); (X.X.); (M.C.); (B.L.); (F.A.); (Z.R.); (Y.L.)
- College of Engineering, Huazhong Agricultural University, Wuhan 430070, China;
- Fuyu An
- Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou 510520, China; (K.W.); (P.H.); (X.X.); (M.C.); (B.L.); (F.A.); (Z.R.); (Y.L.)
- Zhenyu Ren
- Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou 510520, China; (K.W.); (P.H.); (X.X.); (M.C.); (B.L.); (F.A.); (Z.R.); (Y.L.)
- Yongzheng Li
- Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou 510520, China; (K.W.); (P.H.); (X.X.); (M.C.); (B.L.); (F.A.); (Z.R.); (Y.L.)
- Guifeng Jia
- College of Engineering, Huazhong Agricultural University, Wuhan 430070, China;
- Yan Hua
- Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou 510520, China; (K.W.); (P.H.); (X.X.); (M.C.); (B.L.); (F.A.); (Z.R.); (Y.L.)
|
15
|
Chen Y, Sun X, Duan Y, Wang Y, Zhang J, Zhu Y. Lightweight semantic segmentation network for tumor cell nuclei and skin lesion. Front Oncol 2024; 14:1254705. [PMID: 38601757 PMCID: PMC11005060 DOI: 10.3389/fonc.2024.1254705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 03/04/2024] [Indexed: 04/12/2024]
Abstract
In the field of medical image segmentation, achieving fast and accurate semantic segmentation of tumor cell nuclei and skin lesions is of significant importance. However, the considerable variations in skin lesion forms and cell types pose challenges to attaining high network accuracy and robustness. Additionally, as network depth increases, the growing parameter size and computational complexity make practical implementation difficult. To address these issues, this paper proposes MD-UNet, a fast cell nucleus segmentation network that integrates Tokenized Multi-Layer Perceptron modules, attention mechanisms, and Inception structures. Firstly, tokenized MLP modules are employed to label and project convolutional features, reducing computational complexity. Secondly, the paper introduces Depthwise Attention blocks and Multi-layer Feature Extraction modules. The Depthwise Attention blocks eliminate irrelevant and noisy responses from coarse-scale extracted information, serving as alternatives to skip connections in the UNet architecture. The Multi-layer Feature Extraction modules capture a wider range of high-level and low-level semantic features during decoding and facilitate feature fusion. The proposed MD-UNet approach is evaluated on two datasets: the International Skin Imaging Collaboration (ISIC2018) dataset and the PanNuke dataset. The experimental results demonstrate that MD-UNet achieves the best performance on both datasets.
Affiliation(s)
- Yan Chen
- Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin, China
- Xiaoming Sun
- Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin, China
- Yan Duan
- Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin, China
- Yongliang Wang
- Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin, China
- Junkai Zhang
- Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin, China
- Yuemin Zhu
- INSA Lyon, University Claude Bernard Lyon 1, CNRS, Inserm, CREATIS UMR 5220, U1294, Lyon, France
|
16
|
Chen Q, Zhang L, Liu Y, Qin Z, Zhao T. PUTransGCN: identification of piRNA-disease associations based on attention encoding graph convolutional network and positive unlabelled learning. Brief Bioinform 2024; 25:bbae144. [PMID: 38581419 PMCID: PMC10998538 DOI: 10.1093/bib/bbae144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 02/25/2024] [Accepted: 03/15/2024] [Indexed: 04/08/2024]
Abstract
Piwi-interacting RNAs (piRNAs) play a crucial role in various biological processes and are implicated in disease. Consequently, there is an escalating demand for computational tools to predict piRNA-disease interactions. Although computational methods have been proposed for detecting piRNA-disease associations, imbalanced and sparse datasets make it challenging to capture the complex relationships between piRNAs and diseases. To address this, we have developed a novel computational architecture, PUTransGCN, which uses heterogeneous graph convolutional networks to uncover potential piRNA-disease associations. Additionally, an attention mechanism was used to automatically adjust the weights for aggregating heterogeneous node features. To tackle the imbalanced dataset problem, a combined positive unlabelled learning (PUL) method comprising PU bagging, two-step and spy techniques was applied to select reliable negative associations. PUTransGCN derives the features of piRNAs and diseases from three distinct biological sources: piRNA sequences, semantic terms related to diseases and the existing network of piRNA-disease associations. In 5-fold cross-validation, PUTransGCN achieves AUCs of 0.93 and 0.95 on two datasets, respectively, outperforming six other state-of-the-art models. We compared three different PUL methods, and the results of the ablation experiment indicate that the combined PUL method yields the best results. PUTransGCN could serve as a valuable piRNA-disease prediction tool for upcoming studies in the biomedical field. The code for PUTransGCN is available at https://github.com/chenqiuhao/PUTransGCN.
Affiliation(s)
- Qiuhao Chen
- Institute of Bioinformatics, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
- Liyuan Zhang
- School of Computer Science and Technology, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
- Yaojia Liu
- School of Computer Science and Technology, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
- Zhonghao Qin
- State Key Laboratory of Robotics and System, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
- Tianyi Zhao
- School of Computer Science and Technology, Harbin Institute of Technology, 150000, Harbin, Heilongjiang, China
|
17
|
Huang Y, Xiong J, Yao Z, Huang Q, Tang K, Jiang D, Yang Z. A fluorescence detection method for postharvest tomato epidermal defects based on improved YOLOv5m. J Sci Food Agric 2024. [PMID: 38523076 DOI: 10.1002/jsfa.13486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/19/2024] [Accepted: 03/22/2024] [Indexed: 03/26/2024]
Abstract
BACKGROUND Visual grading of tomato quality is hampered by smooth skin, uneven illumination and invisible defects that are difficult to identify. Intelligent detection of postharvest epidermal defects can further improve the economic value of postharvest tomatoes. RESULTS An image acquisition device using fluorescence technology was designed to capture a dataset of tomato skin defects, covering rot defects, crack defects and imperceptible defects. The YOLOv5m model was improved by introducing the Convolutional Block Attention Module and replacing part of the convolution kernels in the backbone network with Switchable Atrous Convolution. Comparison and ablation experiments show that the Precision, Recall and mean Average Precision of the improved YOLOv5m model were 89.93%, 82.33% and 87.57%, respectively, higher than those of YOLOv5m, Faster R-CNN and YOLOv7, and the average detection time was reduced by 47.04 ms per picture. CONCLUSION The present study uses fluorescence imaging and an improved YOLOv5m model to detect tomato epidermal defects, resulting in better identification of imperceptible defects and detection of multiple categories of defects. This provides strong technical support for intelligent detection and quality grading of tomatoes. © 2024 Society of Chemical Industry.
Affiliation(s)
- Yuhua Huang
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
- Juntao Xiong
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
- Zhaoshen Yao
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
- Qiyin Huang
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
- Kun Tang
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
- Dandan Jiang
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
- Zhengang Yang
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
|
18
|
Wu J, Qiao S, Li H, Sun B, Gao F, Hu H, Zhao R. Goal-Guided Graph Attention Network with Interactive State Refinement for Multi-Agent Trajectory Prediction. Sensors (Basel) 2024; 24:2065. [PMID: 38610277 PMCID: PMC11014028 DOI: 10.3390/s24072065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 03/20/2024] [Accepted: 03/21/2024] [Indexed: 04/14/2024]
Abstract
The accurate prediction of the future trajectories of traffic participants is crucial for enhancing the safety and decision-making capabilities of autonomous vehicles. Modeling social interactions among agents and revealing the inherent relationships is crucial for accurate trajectory prediction. In this context, we propose a goal-guided and interaction-aware state refinement graph attention network (SRGAT) for multi-agent trajectory prediction. This model effectively integrates high-precision map data and dynamic traffic states and captures long-term temporal dependencies through the Transformer network. Based on these dependencies, it generates multiple potential goals and Points of Interest (POIs). Through its dual-branch, multimodal prediction approach, the model not only proposes various plausible future trajectories associated with these POIs, but also rigorously assesses the confidence levels of each trajectory. This goal-oriented strategy enables SRGAT to accurately predict the future movement trajectories of other vehicles in complex traffic scenarios. Tested on the Argoverse and nuScenes datasets, SRGAT surpasses existing algorithms in key performance metrics by adeptly integrating past trajectories and current context. This goal-guided approach not only enhances long-term prediction accuracy, but also ensures its reliability, demonstrating a significant advancement in trajectory forecasting.
Affiliation(s)
- Jianghang Wu
- College of Automotive Engineering, Jilin University, Changchun 130025, China; (J.W.); (S.Q.)
- Senyao Qiao
- College of Automotive Engineering, Jilin University, Changchun 130025, China; (J.W.); (S.Q.)
- Haocheng Li
- College of Automotive Engineering, Jilin University, Changchun 130025, China; (J.W.); (S.Q.)
- Boyu Sun
- College of Automotive Engineering, Jilin University, Changchun 130025, China; (J.W.); (S.Q.)
- Fei Gao
- College of Automotive Engineering, Jilin University, Changchun 130025, China; (J.W.); (S.Q.)
- State Key Laboratory of Automotive Simulation and Control, Jilin University, Changchun 130025, China
- Hongyu Hu
- College of Automotive Engineering, Jilin University, Changchun 130025, China; (J.W.); (S.Q.)
- Rui Zhao
- College of Automotive Engineering, Jilin University, Changchun 130025, China; (J.W.); (S.Q.)
|
19
|
Wang Y, Liu W, Lu Y, Ling R, Wang W, Li S, Zhang F, Ning Y, Chen X, Yang G, Zhang H. Fully Automated Identification of Lymph Node Metastases and Lymphovascular Invasion in Endometrial Cancer From Multi-Parametric MRI by Deep Learning. J Magn Reson Imaging 2024. [PMID: 38471960 DOI: 10.1002/jmri.29344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/27/2024] [Accepted: 02/27/2024] [Indexed: 03/14/2024]
Abstract
BACKGROUND Early and accurate identification of lymphatic node metastasis (LNM) and lymphatic vascular space invasion (LVSI) in endometrial cancer (EC) patients is important for treatment design, but difficult on multi-parametric MRI (mpMRI) images. PURPOSE To develop a deep learning (DL) model to simultaneously identify LNM and LVSI of EC from mpMRI images. STUDY TYPE Retrospective. POPULATION Six hundred twenty-one patients with histologically proven EC from two institutions, including 111 LNM-positive and 168 LVSI-positive, divided into training, internal, and external test cohorts of 398, 169, and 54 patients, respectively. FIELD STRENGTH/SEQUENCE T2-weighted imaging (T2WI), contrast-enhanced T1WI (CE-T1WI), and diffusion-weighted imaging (DWI) were scanned with turbo spin-echo, gradient-echo, and two-dimensional echo-planar sequences, using either a 1.5 T or 3 T system. ASSESSMENT EC lesions were manually delineated on T2WI by two radiologists and used to train an nnU-Net model for automatic segmentation. A multi-task DL model was developed to simultaneously identify LNM and LVSI positive status using the segmented EC lesion regions and T2WI, CE-T1WI, and DWI images as inputs. The performance of the model for LNM-positive diagnosis was compared with that of three radiologists in the external test cohort. STATISTICAL TESTS Dice similarity coefficient (DSC) was used to evaluate segmentation results. Receiver Operating Characteristic (ROC) analysis was used to assess the performance of LNM and LVSI status identification. P value <0.05 was considered significant. RESULTS The EC lesion segmentation model achieved mean DSC values of 0.700 ± 0.25 and 0.693 ± 0.21 in the internal and external test cohorts, respectively. For LNM positive/LVSI positive identification, the proposed model achieved AUC values of 0.895/0.848, 0.806/0.795, and 0.804/0.728 in the training, internal, and external test cohorts, respectively, and outperformed the three radiologists (AUC = 0.770/0.648/0.674). DATA CONCLUSION The proposed model has potential to help clinicians identify LNM and LVSI status of EC patients and improve treatment planning. EVIDENCE LEVEL 3 TECHNICAL EFFICACY: Stage 2.
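For readers unfamiliar with the segmentation metric used here, the Dice similarity coefficient is DSC = 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The sketch below computes it for flattened binary masks; the function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two flat binary masks (0/1 values)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return 2 * intersection / total


pred_mask = [1, 1, 0, 0, 1, 0]   # toy predicted lesion mask
true_mask = [1, 0, 0, 1, 1, 0]   # toy manual delineation
dsc = dice_coefficient(pred_mask, true_mask)
```

A value of 1 means the two masks coincide exactly and 0 means they do not overlap at all.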
Affiliation(s)
- Yida Wang
- Shanghai Key Laboratory of Magnetic Resonance, East China Normal University, Shanghai, China
- Wei Liu
- Department of Gynecology, Obstetrics and Gynecology Hospital, Fudan University, Shanghai, China
- Yuanyuan Lu
- Department of Radiology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China
- Rennan Ling
- Department of Radiology, Shenzhen People's Hospital, Second Clinical Medical College of Jinan University, First Affiliated Hospital of Southern University of Science and Technology, Shanghai, China
- Wenjing Wang
- Department of Radiology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China
- Shengyong Li
- Shanghai Key Laboratory of Magnetic Resonance, East China Normal University, Shanghai, China
- Feiran Zhang
- Department of Pathology, Obstetrics and Gynecology Hospital, Fudan University, Shanghai, China
- Yan Ning
- Department of Pathology, Obstetrics and Gynecology Hospital, Fudan University, Shanghai, China
- Xiaojun Chen
- Department of Gynecology, Obstetrics and Gynecology Hospital, Fudan University, Shanghai, China
- Guang Yang
- Shanghai Key Laboratory of Magnetic Resonance, East China Normal University, Shanghai, China
- He Zhang
- Department of Radiology, Obstetrics and Gynecology Hospital, Fudan University, Shanghai, China
|
20
|
Ma N, Su Y, Yang L, Li Z, Yan H. Wheat Seed Detection and Counting Method Based on Improved YOLOv8 Model. Sensors (Basel) 2024; 24:1654. [PMID: 38475189 DOI: 10.3390/s24051654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 02/29/2024] [Accepted: 03/01/2024] [Indexed: 03/14/2024]
Abstract
Wheat seed detection has important applications in calculating thousand-grain weight and crop breeding. In order to solve the problems of seed accumulation, adhesion, and occlusion that can lead to low counting accuracy, while ensuring fast detection speed with high accuracy, a wheat seed counting method is proposed to provide technical support for the development of the embedded platform of the seed counter. This study proposes a lightweight real-time wheat seed detection model, YOLOv8-HD, based on YOLOv8. Firstly, we introduce the concept of shared convolutional layers to improve the YOLOv8 detection head, reducing the number of parameters and achieving a lightweight design to improve runtime speed. Secondly, we incorporate the Vision Transformer with a Deformable Attention mechanism into the C2f module of the backbone network to enhance the network's feature extraction capability and improve detection accuracy. The results show that in the stacked scenes with impurities (severe seed adhesion), the YOLOv8-HD model achieves an average detection accuracy (mAP) of 77.6%, which is 9.1% higher than YOLOv8. In all scenes, the YOLOv8-HD model achieves an average detection accuracy (mAP) of 99.3%, which is 16.8% higher than YOLOv8. The memory size of the YOLOv8-HD model is 6.35 MB, approximately 4/5 of YOLOv8. The GFLOPs of YOLOv8-HD decrease by 16%. The inference time of YOLOv8-HD is 2.86 ms (on GPU), which is lower than YOLOv8. Finally, we conducted numerous experiments and the results showed that YOLOv8-HD outperforms other mainstream networks in terms of mAP, speed, and model size. Therefore, our YOLOv8-HD can efficiently detect wheat seeds in various scenarios, providing technical support for the development of seed counting instruments.
Affiliation(s)
- Na Ma
- College of Information Science and Engineering, Shanxi Agricultural University, Taigu District, Jinzhong 030801, China
- Yaxin Su
- College of Information Science and Engineering, Shanxi Agricultural University, Taigu District, Jinzhong 030801, China
- Lexin Yang
- College of Information Science and Engineering, Shanxi Agricultural University, Taigu District, Jinzhong 030801, China
- Zhongtao Li
- College of Information Science and Engineering, Shanxi Agricultural University, Taigu District, Jinzhong 030801, China
- Hongwen Yan
- College of Information Science and Engineering, Shanxi Agricultural University, Taigu District, Jinzhong 030801, China
|
21
|
Xie Q, Lin Y, Wang M, Wu Y. Synthesis of gadolinium-enhanced glioma images on multisequence magnetic resonance images using contrastive learning. Med Phys 2024. [PMID: 38421681 DOI: 10.1002/mp.17004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 12/28/2023] [Accepted: 02/06/2024] [Indexed: 03/02/2024]
Abstract
BACKGROUND Gadolinium-based contrast agents are commonly used in brain magnetic resonance imaging (MRI); however, they cannot be used by patients with allergic reactions or poor renal function. For long-term follow-up patients, gadolinium deposition in the body can cause nephrogenic systemic fibrosis and other potential risks. PURPOSE Developing a new method of enhanced image synthesis based on the advantages of multisequence MRI has important clinical value for these patients. In this paper, an end-to-end synthesis model based on contrastive learning, structural similarity index measure (SSIM)-based Dual Contrastive Learning with Attention (SDACL), is proposed to synthesize contrast-enhanced T1 (T1ce) images from three unenhanced MRI sequences (T1, T2, and Flair) in patients with glioma. METHODS The model uses the attention-dilation generator to enlarge the receptive field by expanding the residual blocks and to strengthen the feature representation and context learning of multisequence MRI. To enhance the detail and texture of the imaged tumor area, a comprehensive loss function combining patch-level contrast loss and structural similarity loss is created, which can effectively suppress noise and ensure the consistency of synthesized images and real images. RESULTS The normalized root-mean-square error (NRMSE), peak signal-to-noise ratio (PSNR), and SSIM of the model on the independent test set are 0.307 ± 0.12, 23.337 ± 3.21, and 0.881 ± 0.05, respectively. CONCLUSIONS Results show this method can be used for the multisequence synthesis of T1ce images, which can provide valuable information for clinical diagnosis.
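Two of the image-quality metrics reported above can be written down compactly. The sketch below computes NRMSE and PSNR for flattened intensity arrays. The abstract does not state how NRMSE was normalized or what intensity range was used for PSNR, so normalizing by the reference intensity range and the default `data_range=1.0` are assumptions, as are the function names and the toy values.

```python
import math


def mse(reference, synthesized):
    """Mean squared error between two equally long intensity lists."""
    return sum((r - s) ** 2 for r, s in zip(reference, synthesized)) / len(reference)


def nrmse(reference, synthesized):
    """RMSE divided by the reference intensity range (one common choice)."""
    value_range = max(reference) - min(reference)
    if value_range == 0:
        raise ValueError("reference image has zero intensity range")
    return math.sqrt(mse(reference, synthesized)) / value_range


def psnr(reference, synthesized, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    err = mse(reference, synthesized)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(data_range ** 2 / err)


real_t1ce = [0.0, 0.5, 1.0, 0.5]    # toy "real" intensities in [0, 1]
synth_t1ce = [0.1, 0.5, 0.9, 0.5]   # toy synthesized intensities
nrmse_value = nrmse(real_t1ce, synth_t1ce)
psnr_value = psnr(real_t1ce, synth_t1ce)
```

Lower NRMSE and higher PSNR both indicate a synthesized image closer to the real one; values are only comparable between studies that use the same normalization.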
Affiliation(s)
- Qian Xie
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan, China
- Collaborative Innovation Center for Internet Healthcare, Zhengzhou University, Zhengzhou, Henan, China
- Yusong Lin
- Collaborative Innovation Center for Internet Healthcare, Zhengzhou University, Zhengzhou, Henan, China
- School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou, Henan, China
- Hanwei IoT Institute, Zhengzhou University, Zhengzhou, Henan, China
- Meiyun Wang
- Collaborative Innovation Center for Internet Healthcare, Zhengzhou University, Zhengzhou, Henan, China
- Department of Medical Imaging, Henan Provincial People's Hospital, Zhengzhou, Henan, China
- Laboratory of Brain Science and Brain-Like Intelligence Technology Biomedical Research Institute Henan Academy of Science, Zhengzhou, Henan, China
- Yaping Wu
- Department of Medical Imaging, Henan Provincial People's Hospital, Zhengzhou, Henan, China
- Laboratory of Brain Science and Brain-Like Intelligence Technology Biomedical Research Institute Henan Academy of Science, Zhengzhou, Henan, China
|
22
|
Lu J, Zhang S, Zhao S, Li D, Zhao R. A Metric-Based Few-Shot Learning Method for Fish Species Identification with Limited Samples. Animals (Basel) 2024; 14:755. [PMID: 38473140 DOI: 10.3390/ani14050755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 02/22/2024] [Accepted: 02/22/2024] [Indexed: 03/14/2024]
Abstract
Fish species identification plays a vital role in marine fisheries resource exploration, yet datasets related to marine fish resources are scarce. In open-water environments, various fish species often exhibit similar appearances and sizes. To solve these issues, we propose a few-shot learning approach to identifying fish species. Our approach involves two key components. Firstly, the embedding module was designed to address the challenges posed by a large number of fish species with similar phenotypes by utilizing the distribution relationships of species in the embedding space. Secondly, a metric function was introduced, effectively enhancing the performance of fish species classification and successfully addressing the issue of limited sample quantity. The proposed model is trained end to end on public fish species datasets including the Croatian fish dataset, Fish4Knowledge and WildFish. Compared with prototypical networks, our method performs more effectively and improves accuracy by 2% to 10%; it is able to identify fish effectively with small sample sizes and in complex scenes. This method provides a valuable technological tool for the development of fisheries resources and the preservation of fish biodiversity.
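The abstract above uses prototypical networks as its baseline. The following is a minimal sketch of that baseline's decision rule only, not of the authors' embedding module or metric function: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. The function names, the Euclidean distance, and the toy two-dimensional embeddings are assumptions made for illustration.

```python
import math
from collections import defaultdict


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype vector."""
    grouped = defaultdict(list)
    for embedding, label in zip(support_embeddings, support_labels):
        grouped[label].append(embedding)
    return {
        label: [sum(dimension) / len(vectors) for dimension in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(query_embedding, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query_embedding, prototypes[label]))


# Hypothetical 2-way, 2-shot episode with made-up embeddings.
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = ["species_a", "species_a", "species_b", "species_b"]
prototypes = class_prototypes(support, labels)
print(classify([1.0, 1.0], prototypes))
```

In a real few-shot pipeline the embeddings would come from a trained network; swapping the distance inside `classify` is where a learned or alternative metric function would plug in.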
Affiliation(s)
- Jiamin Lu
- National Innovation Center for Digital Fishery, China Agricultural University, Beijing 100083, China
- Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affair, Beijing 100083, China
- Beijing Engineering and Technology Research Centre for Internet of Things in Agriculture, China Agriculture University, Beijing 100083, China
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
| | - Song Zhang
- National Innovation Center for Digital Fishery, China Agricultural University, Beijing 100083, China
- Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affair, Beijing 100083, China
- Beijing Engineering and Technology Research Centre for Internet of Things in Agriculture, China Agriculture University, Beijing 100083, China
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
| | - Shili Zhao
- National Innovation Center for Digital Fishery, China Agricultural University, Beijing 100083, China
- Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affair, Beijing 100083, China
- Beijing Engineering and Technology Research Centre for Internet of Things in Agriculture, China Agriculture University, Beijing 100083, China
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
| | - Daoliang Li
- National Innovation Center for Digital Fishery, China Agricultural University, Beijing 100083, China
- Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affair, Beijing 100083, China
- Beijing Engineering and Technology Research Centre for Internet of Things in Agriculture, China Agriculture University, Beijing 100083, China
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
| | - Ran Zhao
- National Innovation Center for Digital Fishery, China Agricultural University, Beijing 100083, China
- Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affair, Beijing 100083, China
- Beijing Engineering and Technology Research Centre for Internet of Things in Agriculture, China Agriculture University, Beijing 100083, China
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
| |
|
23
|
Karn PK, Abdulla WH. Advancing Ocular Imaging: A Hybrid Attention Mechanism-Based U-Net Model for Precise Segmentation of Sub-Retinal Layers in OCT Images. Bioengineering (Basel) 2024; 11:240. [PMID: 38534514 DOI: 10.3390/bioengineering11030240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 02/21/2024] [Accepted: 02/26/2024] [Indexed: 03/28/2024] Open
Abstract
This paper presents a novel U-Net model incorporating a hybrid attention mechanism for automating the segmentation of sub-retinal layers in Optical Coherence Tomography (OCT) images. OCT is an ophthalmology tool that provides detailed insights into retinal structures. Manual segmentation of these layers is time-consuming and subjective, calling for automated solutions. Our proposed model combines edge and spatial attention mechanisms with the U-Net architecture to improve segmentation accuracy. By leveraging attention mechanisms, the U-Net focuses selectively on image features. Extensive evaluations using datasets demonstrate that our model outperforms existing approaches, making it a valuable tool for medical professionals. The study also highlights the model's robustness through performance metrics such as an average Dice score of 94.99%, Adjusted Rand Index (ARI) of 97.00%, and Strength of Agreement (SOA) classifications like "Almost Perfect", "Excellent", and "Very Strong". This advanced predictive model shows promise in expediting processes and enhancing the precision of ocular imaging in real-world applications.
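The abstract above summarizes segmentation quality with an average Dice score. As a reminder of what that metric measures, here is a minimal sketch for a single layer label over flattened label maps. The function name, the per-label formulation, and the convention of returning 1.0 when the label is absent from both maps are assumptions, not details taken from the paper.

```python
def dice_score(predicted, target, label=1):
    """Dice coefficient 2|P ∩ T| / (|P| + |T|) for one label in two flat label maps."""
    if len(predicted) != len(target):
        raise ValueError("label maps must be the same size")
    predicted_count = sum(1 for p in predicted if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(predicted, target) if p == label and t == label)
    if predicted_count + target_count == 0:
        return 1.0  # assumed convention: label absent from both maps counts as agreement
    return 2.0 * overlap / (predicted_count + target_count)


# Made-up five-pixel example: two of the three predicted pixels overlap the target.
print(dice_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

An average Dice score over several retinal layers would apply this per layer label and then take the mean.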
Affiliation(s)
- Prakash Kumar Karn
- Department of Electrical, Computer and Software Engineering, The University of Auckland, Auckland 1010, New Zealand
| | - Waleed H Abdulla
- Department of Electrical, Computer and Software Engineering, The University of Auckland, Auckland 1010, New Zealand
| |
|
24
|
Lee H, Cho M, Kwon HY. Attention-based speech feature transfer between speakers. Front Artif Intell 2024; 7:1259641. [PMID: 38469160 PMCID: PMC10926952 DOI: 10.3389/frai.2024.1259641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2023] [Accepted: 02/06/2024] [Indexed: 03/13/2024] Open
Abstract
In this study, we propose a simple yet effective method for incorporating the source speaker's characteristics in the target speaker's speech. This allows our model to generate the speech of the target speaker with the style of the source speaker. To achieve this, we focus on the attention model within the speech synthesis model, which learns various speaker features such as spectrogram, pitch, intensity, formant, pulse, and voice breaks. The model is trained separately using datasets specific to the source and target speakers. Subsequently, we replace the attention weights learned from the source speaker's dataset with the attention weights from the target speaker's model. Finally, by providing new input texts to the target model, we generate the speech of the target speaker with the styles of the source speaker. We validate the effectiveness of our model through similarity analysis utilizing five evaluation metrics and showcase real-world examples.
Affiliation(s)
| | | | - Hyuk-Yoon Kwon
- Department of Industrial Engineering, Seoul National University of Science and Technology, Seoul, Republic of Korea
| |
|
25
|
Tang Y, Li Y, Li P, Liu ZP. Drug-target Affinity Prediction by Molecule Secondary Structure Representation Network. Curr Med Chem 2024; 31:CMC-EPUB-138716. [PMID: 38409701 DOI: 10.2174/0109298673252287240215103035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Revised: 06/25/2023] [Accepted: 02/09/2024] [Indexed: 02/28/2024]
Abstract
INTRODUCTION Identification of drug-target interactions (DTI) is a crucial step in developing drugs with high specificity and low toxicity. To accelerate the process, computer-aided DTI prediction algorithms have been used to screen compounds or targets rapidly. Furthermore, DTI prediction can be used to identify potential targets for existing drugs, thus uncovering new indications and repositioning them. Therefore, it is of great importance to develop efficient and accurate DTI prediction algorithms. METHOD Current algorithms usually represent drugs as extracted features, which are learned by convolutional neural networks (CNNs) from their linear representation, or utilize graph neural networks (GNNs) to learn their graph representation. However, these methods either lose information or fail to capture the structural information of the drug. To address this issue, a novel molecule secondary structure representation network (MSSRN) is proposed to learn drug characterization more accurately. Firstly, the network performs relational graph convolutional networks (R-GCNs) on the drug's molecular graph and integrates drug sequence convolutions to learn the sequential information. Secondly, inspired by the attention mechanism, spatial importance weights of the drug sequence are calculated to guide R-GCNs to learn the topological information of the drug. RESULT A drug-target affinity model, called MSSRN-DTA, was then constructed by using MSSRN to learn drug structure and CNN to learn protein sequence. CONCLUSION The effectiveness of the proposed method is verified by comparing it with other alternative methods and baseline models on two benchmark datasets.
Affiliation(s)
- Yuewei Tang
- Center for Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, 250061, China
| | - Yunhai Li
- Center for Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, 250061, China
| | - Pengpai Li
- Center for Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, 250061, China
| | - Zhi-Ping Liu
- Center for Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, 250061, China
| |
|
26
|
Zhao L, Zuo Y, Zhang W, Li T, Chen CLP. End-to-end model-based trajectory prediction for ro-ro ship route using dual-attention mechanism. Front Comput Neurosci 2024; 18:1358437. [PMID: 38449670 PMCID: PMC10915009 DOI: 10.3389/fncom.2024.1358437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 01/31/2024] [Indexed: 03/08/2024] Open
Abstract
With the rapid increase of economic globalization, the significant expansion of shipping volume has resulted in shipping route congestion, making trajectory prediction necessary for effective service and efficient management. While trajectory prediction can achieve a relatively high level of accuracy, the performance and generalization of prediction models remain critical bottlenecks. Therefore, this article proposes a dual-attention (DA) based end-to-end (E2E) neural network (DAE2ENet) for trajectory prediction. In the E2E structure, long short-term memory (LSTM) units are included for the task of pursuing sequential trajectory data from the encoder layer to the decoder layer. In the DA mechanisms, global attention is introduced between the encoder and decoder layers to facilitate interactions between input and output trajectory sequences, and multi-head self-attention is utilized to extract sequential features from the input trajectory. In experiments, we use a ro-ro ship with a fixed navigation route as a case study. Compared with baseline models and benchmark neural networks, DAE2ENet can obtain higher performance on trajectory prediction, and better validation of environmental factors on ship navigation.
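The abstract above places global attention between the encoder and decoder layers. The following is a minimal sketch of one common form of that idea, dot-product attention of a single decoder state over all encoder states. The scoring function, the function names, and the toy vectors are assumptions; the abstract does not specify how DAE2ENet scores or combines the states.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exponentials = [math.exp(score - peak) for score in scores]
    total = sum(exponentials)
    return [value / total for value in exponentials]


def global_attention(decoder_state, encoder_states):
    """Weight every encoder state by its dot product with the decoder state.

    Returns the attention weights and the weighted-sum context vector.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state)) for state in encoder_states
    ]
    weights = softmax(scores)
    width = len(encoder_states[0])
    context = [
        sum(weight * state[i] for weight, state in zip(weights, encoder_states))
        for i in range(width)
    ]
    return weights, context


# Two made-up encoder states; the decoder state aligns with the first one.
weights, context = global_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(weights, context)
```

The context vector would typically be concatenated with the decoder state before predicting the next trajectory point; multi-head self-attention applies the same weighting idea within the input sequence itself.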
Affiliation(s)
- Licheng Zhao
- Navigation College, Dalian Maritime University, Dalian, China
| | - Yi Zuo
- Navigation College, Dalian Maritime University, Dalian, China
- Maritime Big Data and Artificial Intelligent Application Centre, Dalian Maritime University, Dalian, China
| | - Wenjun Zhang
- Navigation College, Dalian Maritime University, Dalian, China
- Key Laboratory of Safety and Security Technology for Autonomous Shipping, Dalian Maritime University, Dalian, China
| | - Tieshan Li
- School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - C. L. Philip Chen
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
| |
|
27
|
Li F, Peng T. Developing a New Constitutive Model of High Damping Rubber by Combining GRU and Attention Mechanism. Polymers (Basel) 2024; 16:567. [PMID: 38475250 DOI: 10.3390/polym16050567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Revised: 02/15/2024] [Accepted: 02/16/2024] [Indexed: 03/14/2024] Open
Abstract
High damping rubber (HDR) bearings are extensively used in seismic design for bridges due to their remarkable energy dissipation capabilities, which are critical during earthquakes. A thorough assessment of crucial factors such as temperature, rate, experienced maximum amplitude, and the Mullins effect of HDR on the mechanics-based constitutive model of HDR is lacking. To address this issue, we propose a deep learning approach that integrates the Gated Recurrent Unit (GRU) and attention mechanism to identify time series characteristics from compression-shear test data of HDR specimens. It is shown that the combination of GRU and attention mechanism enables accurate prediction of the mechanical behavior of HDR specimens. Compared to the sole use of GRU, this suggested method significantly reduces model complexity and computation time while maintaining good prediction performance. Therefore, it offers a new approach to constructing the HDR constitutive model. Finally, the HDR constitutive model was used to analyze the impact of experienced maximum amplitudes and cycles on subsequent processes. It was observed that maximum amplitudes directly influence the stress-strain relationship of HDR during subsequent processes. Consequently, a solid foundation is laid for evaluating the responses of HDR bearings under earthquakes.
Affiliation(s)
- Feng Li
- College of Civil Engineering, Tongji University, Shanghai 200092, China
| | - Tianbo Peng
- College of Civil Engineering, Tongji University, Shanghai 200092, China
- State Key Laboratory of Disaster Reduction in Civil Engineering, Tongji University, Shanghai 200092, China
| |
|
28
|
Ren Y, Gao Y, Du W, Qiao W, Li W, Yang Q, Liang Y, Li G. Classifying breast cancer using multi-view graph neural network based on multi-omics data. Front Genet 2024; 15:1363896. [PMID: 38444760 PMCID: PMC10912483 DOI: 10.3389/fgene.2024.1363896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Accepted: 02/02/2024] [Indexed: 03/07/2024] Open
Abstract
Introduction: As evaluation indices, cancer grading and subtyping have diverse clinical, pathological, and molecular characteristics with prognostic and therapeutic implications. Although researchers have begun to study cancer differentiation and subtype prediction, most relevant methods are based on traditional machine learning and rely on single omics data. It is necessary to explore a deep learning algorithm that integrates multi-omics data to achieve classification prediction of cancer differentiation and subtypes. Methods: This paper proposes a multi-omics data fusion algorithm based on a multi-view graph neural network (MVGNN) for predicting cancer differentiation and subtype classification. The model framework consists of a graph convolutional network (GCN) module for learning features from different omics data and an attention module for integrating multi-omics data. Three different types of omics data are used. For each type of omics data, feature selection is performed using methods such as the chi-square test and minimum redundancy maximum relevance (mRMR). Weighted patient similarity networks are constructed based on the selected omics features, and the GCN is trained using omics features and corresponding similarity networks. Finally, an attention module integrates different types of omics features and performs the final cancer classification prediction. Results: To validate the cancer classification predictive performance of the MVGNN model, we conducted experimental comparisons with traditional machine learning models and currently popular methods based on integrating multi-omics data using 5-fold cross-validation. Additionally, we performed comparative experiments on cancer differentiation and its subtypes based on single omics data, two omics data, and three omics data. Discussion: This paper proposed the MVGNN model, and it performed well in cancer classification prediction based on multiple omics data.
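The abstract above trains a GCN on omics features together with a weighted patient similarity network. As a rough illustration of what one graph-convolution step does with such inputs, here is a minimal sketch of the widely used symmetric-normalized propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W). This is the standard textbook layer, assumed here for illustration; the abstract does not state which GCN variant, similarity measure, or layer sizes MVGNN uses, and all names and numbers below are hypothetical.

```python
import math


def normalize_adjacency(adjacency):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    size = len(adjacency)
    with_loops = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(size)]
        for i in range(size)
    ]
    inverse_sqrt_degree = [1.0 / math.sqrt(sum(row)) for row in with_loops]
    return [
        [inverse_sqrt_degree[i] * with_loops[i][j] * inverse_sqrt_degree[j] for j in range(size)]
        for i in range(size)
    ]


def matmul(left, right):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*right)] for row in left]


def gcn_layer(adjacency, features, weights):
    """One propagation step: mix neighbor features, project them, apply ReLU."""
    propagated = matmul(normalize_adjacency(adjacency), features)
    return [[max(0.0, value) for value in row] for row in matmul(propagated, weights)]


# Two patients joined by a similarity edge of weight 1.0, one feature each (made-up values).
similarity = [[0.0, 1.0], [1.0, 0.0]]
omics_features = [[1.0], [3.0]]
layer_weights = [[1.0]]
print(gcn_layer(similarity, omics_features, layer_weights))
```

The sketch assumes non-negative edge weights, so every degree is positive once self-loops are added. In a multi-view setting, one such network per omics type would produce per-view embeddings that an attention module could then weight and combine.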
Affiliation(s)
- Yanjiao Ren
- College of Information Technology, Smart Agriculture Research Institute, Jilin Agricultural University, Changchun, Jilin, China
| | - Yimeng Gao
- College of Information Technology, Smart Agriculture Research Institute, Jilin Agricultural University, Changchun, Jilin, China
| | - Wei Du
- College of Computer Science and Technology, Jilin University, Changchun, China
| | - Weibo Qiao
- College of Computer Science and Technology, Jilin University, Changchun, China
| | - Wei Li
- College of Information Technology, Smart Agriculture Research Institute, Jilin Agricultural University, Changchun, Jilin, China
| | - Qianqian Yang
- College of Information Technology, Smart Agriculture Research Institute, Jilin Agricultural University, Changchun, Jilin, China
| | - Yanchun Liang
- College of Computer Science and Technology, Jilin University, Changchun, China
- School of Computer Science, Zhuhai College of Science and Technology, Zhuhai, China
| | - Gaoyang Li
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, China
| |
|
29
|
Chen X, Hassan MM, Yu J, Zhu A, Han Z, He P, Chen Q, Li H, Ouyang Q. Time series prediction of insect pests in tea gardens. J Sci Food Agric 2024. [PMID: 38372506 DOI: 10.1002/jsfa.13393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 02/08/2024] [Accepted: 02/15/2024] [Indexed: 02/20/2024]
Abstract
BACKGROUND Tea-garden pest control is crucial to ensure tea quality. In this context, the time-series prediction of insect pests in tea gardens is very important. Deep-learning-based time-series prediction techniques are advancing rapidly but research into their use in tea-garden pest prediction is limited. The current study investigates the time-series prediction of whitefly populations in the Tea Expo Garden, Jurong City, Jiangsu Province, China, employing three deep-learning algorithms, namely Informer, the Long Short-Term Memory (LSTM) network, and LSTM-Attention. RESULTS The comparative analysis of the three deep-learning algorithms revealed optimal results for LSTM-Attention, with an average root mean square error (RMSE) of 2.84 and an average mean absolute error (MAE) of 2.52 for a prediction length of 7 days. For a prediction length of 3 days, LSTM achieved the best performance, with an average RMSE of 2.60 and an average MAE of 2.24. CONCLUSION These findings suggest that different prediction lengths influence model performance in tea garden pest time series prediction. Deep learning could be applied satisfactorily to predict time series of insect pests in tea gardens based on LSTM-Attention. Thus, this study provides a theoretical basis for the research on the time series of pest and disease infestations in tea plants. © 2024 Society of Chemical Industry.
Affiliation(s)
- Xuanyu Chen
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, PR China
| | - Md Mehedi Hassan
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, PR China
| | - Jinghao Yu
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, PR China
| | - Afang Zhu
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, PR China
| | - Zhang Han
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, PR China
| | - Peihuan He
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, PR China
- School of Grain Science and Technology, Jiangsu University of Science and Technology, Zhenjiang, PR China
| | - Quansheng Chen
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, PR China
- College of Food and Biological Engineering, Jimei University, Xiamen, PR China
| | - Huanhuan Li
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, PR China
| | - Qin Ouyang
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, PR China
| |
|
30
|
Liu Z, Wu G, Xie T, Li S, Wu C, Zhang Z, Zhou J. A Light Multi-View Stereo Method with Patch-Uncertainty Awareness. Sensors (Basel) 2024; 24:1293. [PMID: 38400452 PMCID: PMC10892961 DOI: 10.3390/s24041293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 02/09/2024] [Accepted: 02/13/2024] [Indexed: 02/25/2024]
Abstract
Multi-view stereo methods utilize image sequences from different views to generate a 3D point cloud model of the scene. However, existing approaches often overlook coarse-stage features, impacting the final reconstruction accuracy. Moreover, using a fixed range for all the pixels during inverse depth sampling can adversely affect depth estimation. To address these challenges, we present a novel learning-based multi-view stereo method incorporating attention mechanisms and an adaptive depth sampling strategy. Firstly, we propose a lightweight, coarse-feature-enhanced feature pyramid network in the feature extraction stage, augmented by a coarse-feature-enhanced module. This module integrates features with channel and spatial attention, enriching the contextual features that are crucial for the initial depth estimation. Secondly, we introduce a novel patch-uncertainty-based depth sampling strategy for depth refinement, dynamically configuring depth sampling ranges within the GRU-based optimization process. Furthermore, we incorporate an edge detection operator to extract edge features from the reference image's feature map. These edge features are additionally integrated into the iterative cost volume construction, enhancing the reconstruction accuracy. Lastly, our method is rigorously evaluated on the DTU and Tanks and Temples benchmark datasets, revealing its low GPU memory consumption and competitive reconstruction quality compared to other learning-based MVS methods.
Affiliation(s)
- Zhen Liu
- College of Science, Zhejiang University of Technology, Hangzhou 310023, China; (Z.L.); (G.W.); (T.X.); (S.L.); (C.W.)
| | - Guangzheng Wu
- College of Science, Zhejiang University of Technology, Hangzhou 310023, China; (Z.L.); (G.W.); (T.X.); (S.L.); (C.W.)
| | - Tao Xie
- College of Science, Zhejiang University of Technology, Hangzhou 310023, China; (Z.L.); (G.W.); (T.X.); (S.L.); (C.W.)
| | - Shilong Li
- College of Science, Zhejiang University of Technology, Hangzhou 310023, China; (Z.L.); (G.W.); (T.X.); (S.L.); (C.W.)
| | - Chao Wu
- College of Science, Zhejiang University of Technology, Hangzhou 310023, China; (Z.L.); (G.W.); (T.X.); (S.L.); (C.W.)
| | | | - Jiali Zhou
- College of Science, Zhejiang University of Technology, Hangzhou 310023, China; (Z.L.); (G.W.); (T.X.); (S.L.); (C.W.)
| |
|
31
|
Kabir A, Bhattarai M, Rasmussen KØ, Shehu A, Bishop AR, Alexandrov B, Usheva A. Advancing Transcription Factor Binding Site Prediction Using DNA Breathing Dynamics and Sequence Transformers via Cross Attention. bioRxiv 2024:2024.01.16.575935. [PMID: 38293094 PMCID: PMC10827174 DOI: 10.1101/2024.01.16.575935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Understanding the impact of genomic variants on transcription factor binding and gene regulation remains a key area of research, with implications for unraveling the complex mechanisms underlying various functional effects. Our study delves into the role of DNA's biophysical properties, including thermodynamic stability, shape, and flexibility in transcription factor (TF) binding. We developed a multi-modal deep learning model integrating these properties with DNA sequence data. Trained on ChIP-Seq (chromatin immunoprecipitation sequencing) data in vivo involving 690 TF-DNA binding events in human genome, our model significantly improves prediction performance in over 660 binding events, with up to 9.6% increase in AUROC metric compared to the baseline model when using no DNA biophysical properties explicitly. Further, we expanded our analysis to in vitro high-throughput Systematic Evolution of Ligands by Exponential enrichment (SELEX) and Protein Binding Microarray (PBM) datasets, comparing our model with established frameworks. The inclusion of DNA breathing features consistently improved TF binding predictions across different cell lines in these datasets. Notably, for complex ChIP-Seq datasets, integrating DNABERT2 with a cross-attention mechanism provided greater predictive capabilities and insights into the mechanisms of disease-related non-coding variants found in genome-wide association studies. This work highlights the importance of DNA biophysical characteristics in TF binding and the effectiveness of multi-modal deep learning models in gene regulation studies.
Affiliation(s)
- Anowarul Kabir
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, 87544, NM, USA
- Department of Computer Science, George Mason University, 4400 University Dr, 22030, VA, USA
| | - Manish Bhattarai
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, 87544, NM, USA
| | - Kim Ø Rasmussen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, 87544, NM, USA
| | - Amarda Shehu
- Department of Computer Science, George Mason University, 4400 University Dr, 22030, VA, USA
| | - Alan R Bishop
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, 87544, NM, USA
| | - Boian Alexandrov
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, 87544, NM, USA
| | - Anny Usheva
- Department of Surgery, Brown University, 69 Brown St Box 1822, 02912, RI, USA
| |
|
32
|
An Q, Chen W, Shao W. A Deep Convolutional Neural Network for Pneumonia Detection in X-ray Images with Attention Ensemble. Diagnostics (Basel) 2024; 14:390. [PMID: 38396430 PMCID: PMC10887593 DOI: 10.3390/diagnostics14040390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 02/07/2024] [Accepted: 02/09/2024] [Indexed: 02/25/2024] Open
Abstract
In the domain of AI-driven healthcare, deep learning models have markedly advanced pneumonia diagnosis through X-ray image analysis, thus indicating a significant stride in the efficacy of medical decision systems. This paper presents a novel approach utilizing a deep convolutional neural network that effectively amalgamates the strengths of EfficientNetB0 and DenseNet121, and it is enhanced by a suite of attention mechanisms for refined pneumonia image classification. Leveraging pre-trained models, our network employs multi-head self-attention modules for meticulous feature extraction from X-ray images. The model's integration and processing efficiency are further augmented by a channel-attention-based feature fusion strategy, one that is complemented by a residual block and an attention-augmented feature enhancement and dynamic pooling strategy. Our dataset, which comprises a comprehensive collection of chest X-ray images, represents both healthy individuals and those affected by pneumonia, and it serves as the foundation for this research. This study delves deep into the algorithms, architectural details, and operational intricacies of the proposed model. The empirical outcomes of our model are noteworthy, with an exceptional performance marked by an accuracy of 95.19%, a precision of 98.38%, a recall of 93.84%, an F1 score of 96.06%, a specificity of 97.43%, and an AUC of 0.9564 on the test dataset. These results not only affirm the model's high diagnostic accuracy, but also highlight its promising potential for real-world clinical deployment.
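The metrics listed in the abstract above are all derived from the four cells of a binary confusion matrix. Here is a minimal sketch of those definitions, followed by a consistency check that recomputes the F1 score from the precision and recall the abstract reports. The function names and the confusion-matrix counts in the first call are hypothetical; only the 98.38% precision and 93.84% recall come from the abstract.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),       # also called sensitivity
        "specificity": tn / (tn + fp),
    }


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
print(binary_metrics(tp=90, fp=10, tn=80, fn=20))

# Recompute F1 from the precision and recall reported in the abstract.
print(round(100.0 * f1_score(0.9838, 0.9384), 2))  # → 96.06
```

The recomputed value matches the 96.06% F1 score stated in the abstract, so the reported precision, recall, and F1 are mutually consistent.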
Affiliation(s)
- Qiuyu An
- School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China;
| | - Wei Chen
- School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China;
| | - Wei Shao
- Nanjing University of Aeronautics and Astronautics Shenzhen Research Institute, Shenzhen 518067, China
| |
|
33
|
Alzoubi I, Zhang L, Zheng Y, Loh C, Wang X, Graeber MB. PathoGraph: An Attention-Based Graph Neural Network Capable of Prognostication Based on CD276 Labelling of Malignant Glioma Cells. Cancers (Basel) 2024; 16:750. [PMID: 38398141 PMCID: PMC10886785 DOI: 10.3390/cancers16040750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 02/07/2024] [Accepted: 02/08/2024] [Indexed: 02/25/2024] Open
Abstract
Computerized methods have been developed that allow quantitative morphological analyses of whole slide images (WSIs), e.g., of immunohistochemical stains. The latter are attractive because they can provide high-resolution data on the distribution of proteins in tissue. However, many immunohistochemical results are complex because the protein of interest occurs in multiple locations (in different cells and also extracellularly). We have recently established an artificial intelligence framework, PathoFusion, which utilises a bifocal convolutional neural network (BCNN) model for detecting and counting arbitrarily definable morphological structures. We have now complemented this model by adding an attention-based graph neural network (abGCN) for the advanced analysis and automated interpretation of such data. Classical convolutional neural network (CNN) models suffer from limitations when handling global information. In contrast, our abGCN is capable of creating a graph representation of cellular detail from entire WSIs. This abGCN method combines attention learning with visualisation techniques that pinpoint the location of informative cells and highlight cell-cell interactions. We have analysed cellular labelling for CD276, a protein of great interest in cancer immunology and a potential marker of malignant glioma cells/putative glioma stem cells (GSCs). We are especially interested in the relationship between CD276 expression and prognosis. The graphs permit predicting individual patient survival on the basis of GSC community features. Our experiments lay a foundation for the use of the BCNN-abGCN tool chain in automated diagnostic prognostication using immunohistochemically labelled histological slides, but the method is essentially generic and potentially a widely usable tool in medical research and AI-based healthcare applications.
Affiliation(s)
- Islam Alzoubi
- School of Computer Science, The University of Sydney, J12/1 Cleveland St, Darlington, Sydney, NSW 2008, Australia; (I.A.); (L.Z.)
- Lin Zhang
- School of Computer Science, The University of Sydney, J12/1 Cleveland St, Darlington, Sydney, NSW 2008, Australia; (I.A.); (L.Z.)
- Yuqi Zheng
- Ken Parker Brain Tumour Research Laboratories, Brain and Mind Centre, Faculty of Medicine and Health, University of Sydney, Camperdown, NSW 2050, Australia; (Y.Z.); (C.L.)
- Christina Loh
- Ken Parker Brain Tumour Research Laboratories, Brain and Mind Centre, Faculty of Medicine and Health, University of Sydney, Camperdown, NSW 2050, Australia; (Y.Z.); (C.L.)
- Xiuying Wang
- School of Computer Science, The University of Sydney, J12/1 Cleveland St, Darlington, Sydney, NSW 2008, Australia; (I.A.); (L.Z.)
- Manuel B. Graeber
- Ken Parker Brain Tumour Research Laboratories, Brain and Mind Centre, Faculty of Medicine and Health, University of Sydney, Camperdown, NSW 2050, Australia; (Y.Z.); (C.L.)
- University of Sydney Association of Professors (USAP), University of Sydney, Sydney, NSW 2006, Australia
34
|
Gao Q, Deng H, Zhang G. A Contraband Detection Scheme in X-ray Security Images Based on Improved YOLOv8s Network Model. Sensors (Basel) 2024; 24:1158. [PMID: 38400315 PMCID: PMC10893092 DOI: 10.3390/s24041158] [Received: 01/13/2024] [Revised: 02/03/2024] [Accepted: 02/07/2024] [Indexed: 02/25/2024]
Abstract
X-ray inspections of contraband are widely used to maintain public transportation safety and protect life and property when people travel. To improve detection accuracy and reduce the probability of missed and false detection, a contraband detection algorithm YOLOv8s-DCN-EMA-IPIO* based on YOLOv8s is proposed. Firstly, the super-resolution reconstruction method based on the SRGAN network enhances the original data set, which is more conducive to model training. Secondly, DCNv2 (deformable convolution net v2) is introduced in the backbone network and merged with the C2f layer to improve the ability of the feature extraction and robustness of the model. Then, an EMA (efficient multi-scale attention) mechanism is proposed to suppress the interference of complex background noise and occlusion overlap in the detection process. Finally, the IPIO (improved pigeon-inspired optimization), which is based on the cross-mutation strategy, is employed to maximize the convolutional neural network's learning rate to derive the optimal group's weight information and ultimately improve the model's detection and recognition accuracy. The experimental results show that on the self-built data set, the mAP (mean average precision) of the improved model YOLOv8s-DCN-EMA-IPIO* is 73.43%, 3.98% higher than that of the original model YOLOv8s, and the FPS is 95, meeting the deployment requirements of both high precision and real-time.
Affiliation(s)
- Qingji Gao
- Robotics Institute, Civil Aviation University of China, Tianjin 300300, China
- Haozhi Deng
- Robotics Institute, Civil Aviation University of China, Tianjin 300300, China
- Gaowei Zhang
- School of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China
35
|
Zhu PC, Wan JJ, Shao W, Meng XC, Chen BL. Colorectal image analysis for polyp diagnosis. Front Comput Neurosci 2024; 18:1356447. [PMID: 38404511 PMCID: PMC10884282 DOI: 10.3389/fncom.2024.1356447] [Received: 12/15/2023] [Accepted: 01/05/2024] [Indexed: 02/27/2024]
Abstract
Colorectal polyps are an important early manifestation of colorectal cancer, so detecting them is significant for the prevention of colorectal cancer. Although timely detection and manual intervention of colorectal polyps can reduce their chances of becoming cancerous, most existing methods ignore the uncertainties and location problems of polyps, causing a degradation in detection performance. To address these problems, we propose a novel colorectal image analysis method for polyp diagnosis via PAM-Net. Specifically, a parallel attention module is designed to enhance the analysis of colorectal polyp images and improve the certainty of polyp detection. In addition, our method introduces the GWD loss to enhance the accuracy of polyp diagnosis from the perspective of polyp location. Extensive experimental results demonstrate the effectiveness of the proposed method compared with the SOTA baselines. This study improves polyp detection accuracy and contributes to polyp detection in clinical medicine.
Affiliation(s)
- Peng-Cheng Zhu
- Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, China
- Jing-Jing Wan
- Department of Gastroenterology, The Second People's Hospital of Huai'an, The Affiliated Huai'an Hospital of Xuzhou Medical University, Huaian, Jiangsu, China
- Wei Shao
- Nanjing University of Aeronautics and Astronautics Shenzhen Research Institute, Shenzhen, China
- Xian-Chun Meng
- Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, China
- Bo-Lun Chen
- Faculty of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, China
- Department of Physics, University of Fribourg, Fribourg, Switzerland
36
|
Li M, Gong Y, Zheng Z. Finger Vein Identification Based on Large Kernel Convolution and Attention Mechanism. Sensors (Basel) 2024; 24:1132. [PMID: 38400290 PMCID: PMC10892868 DOI: 10.3390/s24041132] [Received: 01/07/2024] [Revised: 02/04/2024] [Accepted: 02/06/2024] [Indexed: 02/25/2024]
Abstract
FV (finger vein) identification is a biometric identification technology that extracts the features of FV images for identity authentication. To address the limitations of CNN-based FV identification, particularly small receptive fields and difficulty in capturing long-range dependencies, an FV identification method named Let-Net (large kernel and attention mechanism network) was introduced, which combines local and global information. Firstly, Let-Net employs large kernels to capture a broader spectrum of spatial contextual information, utilizing deep convolution in conjunction with residual connections to curtail the volume of model parameters. Subsequently, an integrated attention mechanism is applied to augment information flow within the channel and spatial dimensions, effectively modeling global information for the extraction of crucial FV features. The experimental results on nine public datasets show that Let-Net has excellent identification performance, and the EER and accuracy rate on the FV_USM dataset reach 0.04% and 99.77%, respectively. The parameter count and FLOPs of Let-Net are only 0.89M and 0.25G, which means that the time cost of training and inference is low and that the model is easier to deploy and integrate into various applications.
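For readers unfamiliar with the EER (equal error rate) quoted above, the following is a minimal sketch of how it is commonly computed from matching scores. It is not taken from the paper: the function name `equal_error_rate`, the assumption that higher scores mean a better match, and the simple threshold sweep are all illustrative choices.

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) for similarity scores (higher = more alike).

    genuine:  scores from same-identity comparisons (hypothetical input)
    impostor: scores from different-identity comparisons (hypothetical input)
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Candidate thresholds: every observed score.
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    best_gap, eer, best_t = np.inf, 1.0, thresholds[0]
    for t in thresholds:
        far = np.mean(impostor >= t)  # false acceptance rate
        frr = np.mean(genuine < t)    # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            # The EER is read off where the two error rates are closest.
            best_gap, eer, best_t = gap, (far + frr) / 2.0, t
    return eer, best_t
```

With finite score sets the two error curves rarely cross exactly, so this sketch reports the midpoint at the closest threshold; published evaluations may interpolate instead.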
Affiliation(s)
- Meihui Li
- School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Jiangsu Engineering Laboratory of Cyberspace Security, Suzhou 215006, China
- Yufei Gong
- School of Software Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- Zhaohui Zheng
- School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Jiangsu Engineering Laboratory of Cyberspace Security, Suzhou 215006, China
37
|
Xiang X, Gao J, Ding Y. DeepPPThermo: A Deep Learning Framework for Predicting Protein Thermostability Combining Protein-Level and Amino Acid-Level Features. J Comput Biol 2024; 31:147-160. [PMID: 38100126 DOI: 10.1089/cmb.2023.0097] [Indexed: 02/15/2024]
Abstract
Using wet experimental methods to discover new thermophilic proteins or improve protein thermostability is time-consuming and expensive. Machine learning methods have shown powerful performance in the study of protein thermostability in recent years. However, how to make full use of multiview sequence information to predict thermostability effectively is still a challenge. In this study, we proposed a deep learning-based classifier named DeepPPThermo that fuses classical sequence features with deep learning representation features for classifying thermophilic and mesophilic proteins. In this model, a deep neural network (DNN) and a bidirectional long short-term memory network (Bi-LSTM) are used to mine hidden features. Furthermore, local attention and global attention mechanisms give different importance to multiview features. The fused features are fed to a fully connected network classifier to distinguish thermophilic and mesophilic proteins. Comprehensive comparisons with advanced machine learning and deep learning algorithms show that our model performs better. We further compare the effects of removing different features on the classification results, demonstrating the importance of each feature and the robustness of the model. Our DeepPPThermo model can be further used to explore protein diversity, identify new thermophilic proteins, and guide directed mutations of mesophilic proteins.
Affiliation(s)
- Xiaoyang Xiang
- School of Science, Jiangnan University, Wuxi, P. R. China
- Jiaxuan Gao
- School of Science, Jiangnan University, Wuxi, P. R. China
- Yanrui Ding
- School of Science, Jiangnan University, Wuxi, P. R. China
38
|
Liu X, Huang Z, Zhang Y, Jia Y, Wen W. CNN and Attention-Based Joint Source Channel Coding for Semantic Communications in WSNs. Sensors (Basel) 2024; 24:957. [PMID: 38339674 PMCID: PMC10857329 DOI: 10.3390/s24030957] [Received: 12/25/2023] [Revised: 01/24/2024] [Accepted: 01/28/2024] [Indexed: 02/12/2024]
Abstract
Wireless Sensor Networks (WSNs) have emerged as an efficient solution for numerous real-time applications, attributable to their compactness, cost-effectiveness, and ease of deployment. The rapid advancement of 5G technology and mobile edge computing (MEC) in recent years has catalyzed the transition towards large-scale deployment of WSN devices. However, the resulting data proliferation and the dynamics of communication environments introduce new challenges for WSN communication: (1) ensuring robust communication in adverse environments and (2) effectively alleviating bandwidth pressure from massive data transmission. In response to the aforementioned challenges, this paper proposes a semantic communication solution. Specifically, considering the limited computational and storage resources of WSN devices, we propose a flexible Attention-based Adaptive Coding (AAC) module. This module integrates window and channel attention mechanisms, dynamically adjusts semantic information in response to the current channel state, and facilitates adaptation of a single model across various Signal-to-Noise Ratio (SNR) environments. Furthermore, to validate the effectiveness of this approach, the paper introduces an end-to-end Joint Source Channel Coding (JSCC) scheme for image semantic communication, employing the AAC module. Experimental results demonstrate that the proposed scheme surpasses existing deep JSCC schemes across datasets of varying resolutions; furthermore, they validate the efficacy of the proposed AAC module, which is capable of dynamically adjusting critical information according to the current channel state. This enables the model to be trained over a range of SNRs and obtain better results.
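As a rough illustration of what training or evaluating "across various SNR environments" involves, the sketch below passes a signal through a simulated noisy channel at a chosen SNR. It assumes an additive white Gaussian noise (AWGN) channel, which the abstract does not specify; the function name `awgn` and its parameters are hypothetical.

```python
import numpy as np

def awgn(signal, snr_db, rng):
    """Add white Gaussian noise so the result has the requested SNR in dB.

    Assumption: a real-valued AWGN channel, chosen here only for illustration.
    """
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Usage: corrupt the same encoded symbols at several channel qualities.
rng = np.random.default_rng(0)
symbols = rng.normal(size=1024)
noisy_versions = {snr: awgn(symbols, snr, rng) for snr in (0, 5, 10, 20)}
```

A single model meant to work over a range of SNRs would typically see a different randomly drawn SNR per batch during training, though the exact schedule used in the paper is not stated in the abstract.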
Affiliation(s)
- Wanli Wen
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 401331, China; (X.L.); (Z.H.); (Y.Z.); (Y.J.)
39
|
Li G, Shi G, Zhu C. Dynamic Serpentine Convolution with Attention Mechanism Enhancement for Beef Cattle Behavior Recognition. Animals (Basel) 2024; 14:466. [PMID: 38338110 PMCID: PMC10854982 DOI: 10.3390/ani14030466] [Received: 01/09/2024] [Revised: 01/25/2024] [Accepted: 01/26/2024] [Indexed: 02/12/2024]
Abstract
Behavior recognition in beef cattle is a crucial component of beef cattle behavior warning and intelligent farming. Traditional beef cattle behavior recognition faces challenges in both difficulty in identification and low accuracy. In this study, the YOLOv8n_BiF_DSC (Fusion of Dynamic Snake Convolution and BiFormer Attention) algorithm was employed for the non-intrusive recognition of beef cattle behavior. The specific steps are as follows: 45 beef cattle were observed using a fixed camera (A LINE OF DEFENSE) and a mobile phone (Huawei Mate20Pro) to collect and filter posture data, yielding usable videos ranging from 1 to 30 min in length. These videos cover nine different behaviors in various scenarios, including standing, lying, mounting, fighting, licking, eating, drinking, walking, and searching. After data augmentation, the dataset comprised 34,560 samples. The convolutional layer (CONV) was improved by introducing variable convolution and dynamic snake-like convolution modules. The dynamic snake-like convolution, which yielded the best results, expanded the model's receptive field, dynamically perceived key features of beef cattle behavior, and enhanced the algorithm's feature extraction capability. Attention mechanism modules, including SE (Squeeze-and-Excitation Networks), CBAM (Convolutional Block Attention Module), CA (Coordinate Attention), and BiFormer (Vision Transformer with Bi-Level Routing Attention), were introduced. The BiFormer attention mechanism, selected for its optimal performance, improved the algorithm's ability to capture long-distance context dependencies. The model's computational efficiency was enhanced through dynamic and query-aware perception. Experimental results indicated that YOLOv8n_BiF_DSC achieved the best results among all improved algorithms in terms of accuracy, average precision at IoU 50, and average precision at IoU 50:95. 
The accuracy of beef cattle behavior recognition reached 93.6%, with the average precision at IoU 50 and IoU 50:95 being 96.5% and 71.5%, respectively. This represents a 5.3%, 5.2%, and 7.1% improvement over the original YOLOv8n. Notably, the average accuracy of recognizing the lying posture of beef cattle reached 98.9%. In conclusion, the YOLOv8n_BiF_DSC algorithm demonstrates excellent performance in feature extraction and high-level data fusion, displaying high robustness and adaptability. It provides theoretical and practical support for the intelligent recognition and management of beef cattle.
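Of the attention modules compared in this abstract, SE (Squeeze-and-Excitation) is the simplest to state, so a minimal NumPy sketch of its forward pass may help. This is the generic SE computation rather than the authors' implementation; the function name `se_block`, the weight shapes, and the omission of bias terms are simplifying assumptions.

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-Excitation channel attention (forward pass only).

    x:  feature map of shape (C, H, W)
    w1: bottleneck weights of shape (C // r, C)   (hypothetical, bias omitted)
    w2: expansion weights of shape (C, C // r)    (hypothetical, bias omitted)
    Returns the reweighted feature map and the per-channel gates.
    """
    squeezed = x.mean(axis=(1, 2))                # squeeze: global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)       # excitation: bottleneck + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gate per channel, in (0, 1)
    return x * gates[:, None, None], gates        # scale every channel by its gate

# Usage with random weights and a reduction ratio r = 4.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))
scaled, gates = se_block(features, rng.normal(size=(2, 8)), rng.normal(size=(8, 2)))
```

The gates depend only on channel-wise averages, which is why SE adds very few parameters compared with spatial or routing-based attention such as CBAM or BiFormer.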
Affiliation(s)
- Guangbo Li
- College of Electronic and Information Engineering, Huaibei Institute of Technology, Huaibei 235000, China
- Guolong Shi
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, China
- Changjie Zhu
- College of Electronic and Information Engineering, Huaibei Institute of Technology, Huaibei 235000, China
40
|
Huang W, Li Y, Tang J, Qian L. Fault Diagnosis Methods for an Artillery Loading System Driving Motor in Complex Noisy Environments. Sensors (Basel) 2024; 24:847. [PMID: 38339564 PMCID: PMC10857249 DOI: 10.3390/s24030847] [Received: 01/09/2024] [Revised: 01/22/2024] [Accepted: 01/26/2024] [Indexed: 02/12/2024]
Abstract
With the development of modern military technology, electrical drive technology has become a power source for modern artillery. In fault monitoring of a driving motor mounted on a piece of artillery, various sensors are susceptible to interference from the complex environment, both inside and outside the artillery itself. In this study, we propose a fault diagnosis model based on an attention mechanism, the AdaBoost method and a wavelet noise reduction network to address the difficulty in obtaining high-quality motor signals in complex noisy interference environments. First, multiple fusion wavelet bases, soft thresholding, and index soft filter optimization were used to train multiple wavelet noise reduction networks that could recover sample signals under different noise conditions. Second, a convolutional neural network (CNN) classification module was added to construct end-to-end classification models that could correctly identify faults. These base classification models were then integrated into the AdaBoost method with an improved attention mechanism to develop a fault diagnosis model suitable for complex noisy environments. Finally, two experiments were conducted to validate the proposed method. On motor signals corrupted by noise at varying signal-to-noise ratios (SNRs), the proposed method achieved an average accuracy of 92%, surpassing the conventional method by over 8.5%.
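The soft thresholding mentioned above is a standard shrinkage operator in wavelet denoising. The sketch below shows only that operator, applied to an array of coefficients; how the paper learns or fuses its thresholds is not described in the abstract, so the fixed threshold `tau` and the function name `soft_threshold` are assumptions.

```python
import numpy as np

def soft_threshold(coeffs, tau):
    """Shrink coefficients toward zero by tau; values within [-tau, tau] become 0."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - tau, 0.0)

# Usage: small (noise-like) coefficients vanish, large ones are reduced by tau.
detail_coeffs = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
denoised = soft_threshold(detail_coeffs, tau=1.0)
```

In a classical pipeline this operator is applied to the detail coefficients of a wavelet decomposition before reconstruction; a learned variant would replace the fixed `tau` with a trainable or input-dependent value.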
Affiliation(s)
- Wenkuan Huang
- School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; (W.H.); (Y.L.); (J.T.)
- Yong Li
- School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; (W.H.); (Y.L.); (J.T.)
- Jinsong Tang
- School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; (W.H.); (Y.L.); (J.T.)
- Linfang Qian
- School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; (W.H.); (Y.L.); (J.T.)
- Northwest Institute of Mechanical and Electrical Engineering, Xianyang 712099, China
41
|
Song T, Yang Q, Qu P, Qiao L, Wang X. Attenphos: General Phosphorylation Site Prediction Model Based on Attention Mechanism. Int J Mol Sci 2024; 25:1526. [PMID: 38338804 PMCID: PMC10855885 DOI: 10.3390/ijms25031526] [Received: 01/03/2024] [Revised: 01/18/2024] [Accepted: 01/23/2024] [Indexed: 02/12/2024]
Abstract
Phosphorylation site prediction has important application value in the field of bioinformatics. It can act as an important reference and help with protein function research, protein structure research, and drug discovery. So, it is of great significance to propose scientific and effective calculation methods to accurately predict phosphorylation sites. In this study, we propose a new method, Attenphos, based on the self-attention mechanism for predicting general phosphorylation sites in proteins. The method not only captures the long-range dependence information of proteins but also better represents the correlation between amino acids through feature vector encoding transformation. Attenphos takes advantage of the one-dimensional convolutional layer to reduce the number of model parameters, improve model efficiency and prediction accuracy, and enhance model generalization. Comparisons between our method and existing state-of-the-art prediction tools were made using balanced datasets from human proteins and unbalanced datasets from mouse proteins. We performed prediction comparisons using independent test sets. The results showed that Attenphos demonstrated the best overall performance in the prediction of Serine (S), Threonine (T), and Tyrosine (Y) sites on both balanced and unbalanced datasets. Compared to current state-of-the-art methods, Attenphos has significantly higher prediction accuracy. This proves the potential of Attenphos in accelerating the identification and functional analysis of protein phosphorylation sites and provides new tools and ideas for biological research and drug discovery.
Affiliation(s)
- Xun Wang
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China; (T.S.); (Q.Y.); (P.Q.); (L.Q.)
42
|
Raj A, Gass A, Eisele P, Dabringhaus A, Kraemer M, Zöllner FG. A generalizable deep voxel-guided morphometry algorithm for the detection of subtle lesion dynamics in multiple sclerosis. Front Neurosci 2024; 18:1326108. [PMID: 38332857 PMCID: PMC10850259 DOI: 10.3389/fnins.2024.1326108] [Received: 10/22/2023] [Accepted: 01/10/2024] [Indexed: 02/10/2024]
Abstract
Introduction: Multiple sclerosis (MS) is a chronic neurological disorder characterized by the progressive loss of myelin and axonal structures in the central nervous system. Accurate detection and monitoring of MS-related changes in brain structures are crucial for disease management and treatment evaluation. We propose a deep learning algorithm for creating Voxel-Guided Morphometry (VGM) maps from longitudinal MRI brain volumes for analyzing MS disease activity. Our approach focuses on developing a generalizable model that can effectively be applied to unseen datasets. Methods: Longitudinal MS patient high-resolution 3D T1-weighted follow-up imaging from three different MRI systems was analyzed. We employed a 3D residual U-Net architecture with attention mechanisms. The U-Net serves as the backbone, enabling spatial feature extraction from MRI volumes. Attention mechanisms are integrated to enhance the model's ability to capture relevant information and highlight salient regions. Furthermore, we incorporate image normalization by histogram matching and resampling techniques to improve the network's ability to generalize to unseen datasets from different MRI systems across imaging centers. This ensures robust performance across diverse data sources. Results: Numerous experiments were conducted using a dataset of 71 longitudinal MRI brain volumes of MS patients. Our approach demonstrated a significant improvement of 4.3% in mean absolute error (MAE) against the state-of-the-art (SOTA) method. Furthermore, the algorithm's generalizability was evaluated on two unseen datasets (n = 116) with an average improvement of 4.2% in MAE over the SOTA approach. Discussion: Results confirm that the proposed approach is fast and robust and has the potential for broader clinical applicability.
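The histogram-matching normalization named in the Methods can be sketched in a few lines. This is a generic quantile-mapping version written with NumPy, not the authors' pipeline; the function name `match_histogram` and the choice to match whole volumes without a brain mask are assumptions.

```python
import numpy as np

def match_histogram(source, reference):
    """Remap source intensities so their distribution follows the reference.

    Both inputs are arrays of any shape (for example 3D MRI volumes);
    the output has the shape of `source`.
    """
    src = np.asarray(source, dtype=float)
    ref = np.asarray(reference, dtype=float)
    # Unique intensities, where each voxel falls among them, and their counts.
    src_values, src_idx, src_counts = np.unique(
        src.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(ref.ravel(), return_counts=True)
    # Empirical cumulative distribution functions of both images.
    src_cdf = np.cumsum(src_counts) / src.size
    ref_cdf = np.cumsum(ref_counts) / ref.size
    # For each source quantile, look up the reference intensity at that quantile.
    mapped = np.interp(src_cdf, ref_cdf, ref_values)
    return mapped[src_idx].reshape(src.shape)
```

Mapping every scan onto one reference distribution removes scanner-specific intensity scaling, which is one plausible reason such a step helps a model transfer between MRI systems.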
Affiliation(s)
- Anish Raj
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Baden Württemberg, Germany
- Mannheim Institute for Intelligent Systems in Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Baden Württemberg, Germany
- Achim Gass
- Department of Neurology, University Medical Centre Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Baden Württemberg, Germany
- Mannheim Center for Translational Neurosciences, Heidelberg University, Mannheim, Baden Württemberg, Germany
- Philipp Eisele
- Department of Neurology, University Medical Centre Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Baden Württemberg, Germany
- Mannheim Center for Translational Neurosciences, Heidelberg University, Mannheim, Baden Württemberg, Germany
- Matthias Kraemer
- VGMorph GmbH, Mülheim an der Ruhr, Nordrhein-Westfalen, Germany
- NeuroCentrum, Grevenbroich, Nordrhein-Westfalen, Germany
- Frank G. Zöllner
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Baden Württemberg, Germany
- Mannheim Institute for Intelligent Systems in Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Baden Württemberg, Germany
43
|
Wu J, Kong L, Kang S, Zuo H, Yang Y, Cheng Z. Aircraft Engine Fault Diagnosis Model Based on 1DCNN-BiLSTM with CBAM. Sensors (Basel) 2024; 24:780. [PMID: 38339497 PMCID: PMC10857147 DOI: 10.3390/s24030780] [Received: 01/03/2024] [Revised: 01/16/2024] [Accepted: 01/19/2024] [Indexed: 02/12/2024]
Abstract
As the operational status of aircraft engines evolves, their fault modes also undergo changes. In response to the operational degradation trend of aircraft engines, this paper proposes an aircraft engine fault diagnosis model based on 1DCNN-BiLSTM with CBAM. The model can be directly applied to raw monitoring data without the need for additional algorithms to extract fault degradation features. It fully leverages the advantages of 1DCNN in extracting local features along the spatial dimension and incorporates CBAM, a channel and spatial attention mechanism. CBAM could assign higher weights to features relevant to fault categories and make the model pay more attention to them. Subsequently, it utilizes BiLSTM to handle nonlinear time feature sequences and bidirectional contextual feature information. Finally, experimental validation is conducted on the publicly available CMAPSS dataset from NASA, categorizing fault modes into three types: faultless, HPC fault (the single fault), and HPC&Fan fault (the mixed fault). Comparative analysis with other models reveals that the proposed model has a higher classification accuracy, which is of practical significance in improving the reliability of aircraft engine operations and for Remaining Useful Life (RUL) prediction.
Affiliation(s)
- Jiaju Wu
- Institute of Computer Application China Academy of Engineering Physics, Mianyang 621999, China
- College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
- Linggang Kong
- Institute of Computer Application China Academy of Engineering Physics, Mianyang 621999, China
- Shijia Kang
- Institute of Computer Application China Academy of Engineering Physics, Mianyang 621999, China
- Hongfu Zuo
- College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
- Yonghui Yang
- Institute of Computer Application China Academy of Engineering Physics, Mianyang 621999, China
- Zheng Cheng
- Institute of Computer Application China Academy of Engineering Physics, Mianyang 621999, China
44
|
Zeng F, Guo M, Tan L, Guo F, Liu X. Wearable Sensor-Based Residual Multifeature Fusion Shrinkage Networks for Human Activity Recognition. Sensors (Basel) 2024; 24:758. [PMID: 38339474 PMCID: PMC10857031 DOI: 10.3390/s24030758] [Received: 11/22/2023] [Revised: 01/20/2024] [Accepted: 01/22/2024] [Indexed: 02/12/2024]
Abstract
Human activity recognition (HAR) based on wearable sensors has emerged as a low-cost key-enabling technology for applications such as human-computer interaction and healthcare. In wearable sensor-based HAR, deep learning is desired for extracting human active features. Due to the spatiotemporal dynamic of human activity, a special deep learning network for recognizing the temporal continuous activities of humans is required to improve the recognition accuracy for supporting advanced HAR applications. To this end, a residual multifeature fusion shrinkage network (RMFSN) is proposed. The RMFSN is an improved residual network which consists of a multi-branch framework, a channel attention shrinkage block (CASB), and a classifier network. The special multi-branch framework utilizes a 1D-CNN, a lightweight temporal attention mechanism, and a multi-scale feature extraction method to capture diverse activity features via multiple branches. The CASB is proposed to automatically select key features from the diverse features for each activity, and the classifier network outputs the final recognition results. Experimental results have shown that the accuracy of the proposed RMFSN for the public datasets UCI-HAR, WISDM, and OPPORTUNITY are 98.13%, 98.35%, and 93.89%, respectively. In comparison with existing advanced methods, the proposed RMFSN could achieve higher accuracy while requiring fewer model parameters.
Affiliation(s)
- Mian Guo
- School of Electronics and Information, Guangdong Polytechnic Normal University, Guangzhou 510660, China; (F.Z.); (L.T.); (F.G.)
45
|
Peng J, Su W, Chen H, Sun J, Tian Z. CL-SPO2Net: Contrastive Learning Spatiotemporal Attention Network for Non-Contact Video-Based SpO2 Estimation. Bioengineering (Basel) 2024; 11:113. [PMID: 38391599 PMCID: PMC10885926 DOI: 10.3390/bioengineering11020113] [Received: 12/17/2023] [Revised: 01/18/2024] [Accepted: 01/23/2024] [Indexed: 02/24/2024]
Abstract
Video-based peripheral oxygen saturation (SpO2) estimation, utilizing solely RGB cameras, offers a non-contact approach to measuring blood oxygen levels. Previous studies set a stable and unchanging environment as the premise for non-contact blood oxygen estimation. Additionally, they utilized a small amount of labeled data for system training and learning. However, it is challenging to train optimal model parameters with a small dataset. The accuracy of blood oxygen detection is easily affected by ambient light and subject movement. To address these issues, this paper proposes a contrastive learning spatiotemporal attention network (CL-SPO2Net), an innovative semi-supervised network for video-based SpO2 estimation. Spatiotemporal similarities in remote photoplethysmography (rPPG) signals were found in video segments containing facial or hand regions. Subsequently, integrating deep neural networks with machine learning expertise enabled the estimation of SpO2. The method had good feasibility in the case of small-scale labeled datasets, with the mean absolute error between the camera and the reference pulse oximeter of 0.85% in the stable environment, 1.13% with lighting fluctuations, and 1.20% in the facial rotation situation.
Affiliation(s)
- Jiahe Peng
  - School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
- Weihua Su
  - School of Mechanical Engineering, Hebei University of Technology, Tianjin 300401, China
- Haiyong Chen
  - School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
- Jingsheng Sun
  - School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
- Zandong Tian
  - School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
|
46
|
Zeng J, Gao X, Gao L, Yu Y, Shen L, Pan X. Recognition of rare antinuclear antibody patterns based on a novel attention-based enhancement framework. Brief Bioinform 2024; 25:bbad531. [PMID: 38279651 PMCID: PMC10818137 DOI: 10.1093/bib/bbad531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 12/17/2023] [Accepted: 12/19/2023] [Indexed: 01/28/2024] Open
Abstract
Rare antinuclear antibody (ANA) pattern recognition has been a widely applied technology for routine ANA screening in clinical laboratories. In recent years, the application of deep learning methods in recognizing ANA patterns has witnessed remarkable advancements. However, the majority of studies in this field have primarily focused on the classification of the most common ANA patterns, while another subset has concentrated on the detection of mitotic metaphase cells. To date, no prior research has been specifically dedicated to the identification of rare ANA patterns. In the present paper, we introduce a novel attention-based enhancement framework designed for the recognition of rare ANA patterns in ANA-indirect immunofluorescence images. More specifically, we selected the best-performing algorithm as our target detection network by conducting comparative experiments, and then further developed and enhanced it through a series of optimizations. An attention mechanism was then introduced to speed up learning and help the network extract more essential and distinctive features of the targets that belong to specific patterns. The proposed approach achieved 86.40% precision, 82.75% recall, an F1 score of 84.24% and a mean average precision of 84.64% on a 9-category rare ANA pattern detection task on our dataset. Finally, we evaluated the potential of the model as an assistant to medical technologists and observed that the technologist's performance improved after referring to the model's predictions. These promising results highlight its potential as an efficient and reliable tool to assist medical technologists in their clinical practice.
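The F1 score quoted in this abstract combines precision and recall as their harmonic mean. The sketch below applies that standard formula to the aggregate precision and recall from the abstract; the function name is hypothetical, and how the paper averaged its metrics across the nine categories is not stated here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Aggregate figures quoted in the abstract, expressed as fractions.
pooled_f1 = f1_score(0.8640, 0.8275)
print(f"F1 from aggregate precision and recall: {pooled_f1:.4f}")
```

This gives roughly 0.845, slightly above the reported 84.24%. One plausible explanation is that the reported F1 was computed per category and then averaged, in which case it need not equal the harmonic mean of the averaged precision and recall; that is an assumption, since the abstract does not describe the averaging.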
Affiliation(s)
- Junxiang Zeng
  - Department of Clinical Laboratory, Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
  - Faculty of Medical Laboratory Science, College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Institute of Artificial Intelligence Medicine, Shanghai Academy of Experimental Medicine, Shanghai, China
- Xiupan Gao
  - Department of Clinical Laboratory, Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Limei Gao
  - Department of Immunology and Rheumatology, Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Youyou Yu
  - Department of Clinical Laboratory, Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Lisong Shen
  - Department of Clinical Laboratory, Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
  - Faculty of Medical Laboratory Science, College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Institute of Artificial Intelligence Medicine, Shanghai Academy of Experimental Medicine, Shanghai, China
- Xiujun Pan
  - Department of Clinical Laboratory, Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
|
47
|
Tan D, Yang C, Wang J, Su Y, Zheng C. scAMAC: self-supervised clustering of scRNA-seq data based on adaptive multi-scale autoencoder. Brief Bioinform 2024; 25:bbae068. [PMID: 38426327 PMCID: PMC10905526 DOI: 10.1093/bib/bbae068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 01/15/2024] [Accepted: 01/26/2024] [Indexed: 03/02/2024] Open
Abstract
Cluster assignment is vital to analyzing single-cell RNA sequencing (scRNA-seq) data to understand high-level biological processes. Deep learning-based clustering methods have recently been widely used in scRNA-seq data analysis. However, existing deep models often overlook the interconnections and interactions among network layers, leading to the loss of structural information within the network layers. Herein, we develop a new self-supervised clustering method based on an adaptive multi-scale autoencoder, called scAMAC. The self-supervised clustering network utilizes the Multi-Scale Attention mechanism to fuse the feature information from the encoder, hidden and decoder layers of the multi-scale autoencoder, which enables the exploration of cellular correlations within the same scale and captures deep features across different scales. The self-supervised clustering network calculates the membership matrix using the fused latent features and optimizes the clustering network based on the membership matrix. scAMAC employs an adaptive feedback mechanism to supervise the parameter updates of the multi-scale autoencoder, obtaining a more effective representation of cell features. scAMAC not only enables cell clustering but also performs data reconstruction through the decoding layer. Through extensive experiments, we demonstrate that scAMAC is superior to several advanced clustering and imputation methods in both data clustering and reconstruction. In addition, scAMAC is beneficial for downstream analysis, such as cell trajectory inference. Our scAMAC model codes are freely available at https://github.com/yancy2024/scAMAC.
Affiliation(s)
- Dayu Tan
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 230601 Hefei, China
- Cheng Yang
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 230601 Hefei, China
- Jing Wang
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 230601 Hefei, China
- Yansen Su
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 230601 Hefei, China
- Chunhou Zheng
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 230601 Hefei, China
|
48
|
Zhou H, Wu S, Xu Z, Sun H. Automatic detection of standing dead trees based on improved YOLOv7 from airborne remote sensing imagery. Front Plant Sci 2024; 15:1278161. [PMID: 38318496 PMCID: PMC10839092 DOI: 10.3389/fpls.2024.1278161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 01/05/2024] [Indexed: 02/07/2024]
Abstract
Detecting and localizing standing dead trees (SDTs) is crucial for effective forest management and conservation. Due to challenges posed by mountainous terrain and road conditions, conducting a swift and comprehensive survey of SDTs through traditional manual inventory methods is considerably difficult. In recent years, advancements in deep learning and remote sensing technology have facilitated real-time and efficient detection of dead trees. Nevertheless, challenges persist in identifying individual dead trees in airborne remote sensing images because of small target size, mutual occlusion and complex backgrounds, which together make detection at the single-tree scale more difficult. To address this issue, the paper introduces an improved You Only Look Once version 7 (YOLOv7) model that incorporates the Simple Parameter-Free Attention Module (SimAM), an unparameterized attention mechanism. This improvement aims to enhance the network's feature extraction capabilities and increase the model's sensitivity to small target dead trees. To validate the superiority of SimAM_YOLOv7, we compared it with four widely adopted attention mechanisms. Additionally, a method to enhance model robustness is presented, involving the replacement of the Complete Intersection over Union (CIoU) loss in the original YOLOv7 model with the Wise-IoU (WIoU) loss function. Following these changes, we evaluated detection accuracy using a self-developed dataset of SDTs in forests. The results indicate that the improved YOLOv7 model can effectively identify dead trees in airborne remote sensing images, achieving precision, recall and mAP@0.5 values of 94.31%, 93.13% and 98.03%, respectively. These values are 3.67%, 2.28% and 1.56% higher than those of the original YOLOv7 model. The improved model provides a convenient solution for forest management.
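SimAM, the parameter-free attention module named in this abstract, weights each activation by how much it stands out from the other activations in the same channel. The following is a minimal standard-library sketch for a single 2-D feature map, assuming the commonly described SimAM formulation (squared deviation from the channel mean, normalised by the channel variance plus a small constant, then passed through a sigmoid). The function names and the value of `e_lambda` are illustrative assumptions, and this is not the authors' YOLOv7 integration.

```python
import math


def simam_weights(channel, e_lambda=1e-4):
    """Attention weights for one 2-D feature map given as a list of rows."""
    values = [v for row in channel for v in row]
    n = len(values) - 1  # normaliser used in the commonly described formulation
    if n < 1:
        raise ValueError("feature map needs at least two activations")
    mean = sum(values) / len(values)
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n
    weights = []
    for dev_row in sq_dev:
        weights.append([
            # Sigmoid of the inverse energy: distinctive activations score higher.
            1.0 / (1.0 + math.exp(-(d / (4.0 * (variance + e_lambda)) + 0.5)))
            for d in dev_row
        ])
    return weights


def apply_simam(channel, e_lambda=1e-4):
    """Scale every activation by its attention weight."""
    weights = simam_weights(channel, e_lambda)
    return [[v * w for v, w in zip(row, w_row)]
            for row, w_row in zip(channel, weights)]


# Toy feature map: one strong activation on a flat background.
feature_map = [[0.0, 0.0, 0.0],
               [0.0, 9.0, 0.0],
               [0.0, 0.0, 0.0]]
weights = simam_weights(feature_map)
refined = apply_simam(feature_map)
```

In the toy map the isolated peak receives a larger weight than the background, and no learnable parameters are involved, which is the property the abstract refers to as "unparameterized".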
Affiliation(s)
- Hongwei Zhou
  - College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
- Shangxin Wu
  - College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
- Zihan Xu
  - College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
- Hong Sun
  - Key Laboratory of National Forestry and Grassland Administration on Forest and Grassland Pest Monitoring and Warning, Center for Biological Disaster Prevention and Control, National Forestry and Grassland Administration, Shenyang, China
|
49
|
Mehmood F, Arshad S, Shoaib M. ADH-Enhancer: an attention-based deep hybrid framework for enhancer identification and strength prediction. Brief Bioinform 2024; 25:bbae030. [PMID: 38385876 PMCID: PMC10885011 DOI: 10.1093/bib/bbae030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 12/30/2023] [Accepted: 01/11/2024] [Indexed: 02/23/2024] Open
Abstract
Enhancers play an important role in the regulation of gene expression. In a DNA sequence, the abundance or absence of enhancers and irregularities in their strength affect gene expression, which can lead to the initiation and propagation of diverse genetic diseases such as hemophilia, bladder cancer, diabetes and congenital disorders. Enhancer identification and strength prediction through experimental approaches are expensive, time-consuming and error-prone. To accelerate research on enhancer identification and strength prediction, around 19 computational frameworks have been proposed. These frameworks use machine learning and deep learning methods that take raw DNA sequences and predict the presence and strength of enhancers. However, they still fall short in performance and are not useful for real-time analysis. This paper presents a novel deep learning framework that uses language modeling strategies to transform DNA sequences into a statistical feature space. It applies transfer learning by training a language model in an unsupervised fashion to predict a group of nucleotides, also known as a k-mer, from the context of the other k-mers in a sequence. At the classification stage, it presents a novel classifier that reaps the benefits of two different architectures: a convolutional neural network and an attention mechanism. The proposed framework is evaluated on the enhancer identification benchmark dataset, where it outperforms the existing best-performing framework by 5% and 9% in terms of accuracy and MCC. Similarly, when evaluated on the enhancer strength prediction benchmark dataset, it outperforms the existing best-performing framework by 4% and 7% in terms of accuracy and MCC.
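The language-modeling step described in this abstract operates on k-mers, short overlapping groups of nucleotides, and predicts one k-mer from the others around it. The sketch below shows only that tokenisation and one possible way to form (context, target) pairs; the function names, the stride of 1, the window size and the example sequence are assumptions for illustration, not details from the paper.

```python
def kmers(sequence, k, stride=1):
    """Split a DNA sequence into k-mers taken every `stride` positions."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def context_target_pairs(tokens, window=1):
    """Pair each k-mer with its neighbours within `window` positions."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        pairs.append((left + right, target))
    return pairs


tokens = kmers("ACGTAC", 3)
pairs = context_target_pairs(tokens)
print(tokens)  # ['ACG', 'CGT', 'GTA', 'TAC']
```

Each pair gives a model the surrounding k-mers as input and the held-out k-mer as the prediction target, which is the general shape of the unsupervised objective the abstract describes.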
Affiliation(s)
- Faiza Mehmood
  - Department of Computer Science, University of Engineering and Technology Lahore (Faisalabad Campus), Pakistan
- Shazia Arshad
  - Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan
- Muhammad Shoaib
  - Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan
|
50
|
Wang W, Zhang Q, Qi Z, Huang M. CenterNet-Saccade: Enhancing Sonar Object Detection with Lightweight Global Feature Extraction. Sensors (Basel) 2024; 24:665. [PMID: 38276357 DOI: 10.3390/s24020665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 01/14/2024] [Accepted: 01/17/2024] [Indexed: 01/27/2024]
Abstract
Sonar imaging technology is widely used in the field of marine and underwater monitoring because sound waves can be transmitted in elastic media, such as the atmosphere and seawater, without much interference. In underwater object detection, the target in a sonar image is often accompanied by its own shadow, so the relative relationship between the shadow and the target can be used for detection. To make use of shadow-information-aided detection and realize accurate real-time detection in sonar images, we put forward a network based on a lightweight module. By using an attention mechanism with a global receptive field, the network lets the target attend to shadow information across the whole image, and its design greatly reduces the network's computation time. Specifically, we design a ShuffleBlock model adapted to Hourglass to make the backbone network lighter. The concept of CNN dimension reduction is applied to MHSA to make it more efficient while still attending to global features. Finally, we improve CenterNet's method of assigning positive and negative samples. Simulation experiments were carried out using the proposed sonar object detection dataset. The experimental results verify that our improved model has clear advantages over many existing conventional deep learning models. Moreover, the real-time performance of the proposed model makes it better suited to deployment in ocean monitoring.
Affiliation(s)
- Wenling Wang
  - College of Information and Communication Engineering, Hainan University, Haikou 570228, China
- Qiaoxin Zhang
  - College of Electronic and Information Engineering, Guangdong Ocean University, Zhanjiang 524088, China
- Zhisheng Qi
  - College of Information and Communication Engineering, Hainan University, Haikou 570228, China
- Mengxing Huang
  - College of Information and Communication Engineering, Hainan University, Haikou 570228, China
|