1
Chen W, Lim LJR, Lim RQR, Yi Z, Huang J, He J, Yang G, Liu B. Artificial intelligence powered advancements in upper extremity joint MRI: A review. Heliyon 2024; 10:e28731. PMID: 38596104; PMCID: PMC11002577; DOI: 10.1016/j.heliyon.2024.e28731.
Abstract
Magnetic resonance imaging (MRI) is an indispensable imaging technique in musculoskeletal medicine. Modern MRI achieves high-quality multiplanar imaging of soft tissue and skeletal pathologies without the harmful effects of ionizing radiation. Its current limitations include long acquisition times, artifacts, and noise, and it is often challenging to distinguish abutting or closely applied soft tissue structures with similar signal characteristics. In the past decade, artificial intelligence (AI) has been widely employed in musculoskeletal MRI to reduce image acquisition time and improve image quality. Besides reducing medical costs, AI can assist clinicians in diagnosing diseases more accurately, helping to formulate appropriate treatment plans and ultimately improve patient care. This review summarizes current research on and applications of AI in musculoskeletal MRI, particularly advances in deep learning (DL) for identifying the structures and lesions of upper extremity joints in MRI images.
Affiliation(s)
- Wei Chen
- Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
- Lincoln Jian Rong Lim
- Department of Medical Imaging, Western Health, Footscray Hospital, Victoria, Australia
- Department of Surgery, The University of Melbourne, Victoria, Australia
- Rebecca Qian Ru Lim
- Department of Hand & Reconstructive Microsurgery, Singapore General Hospital, Singapore
- Zhe Yi
- Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
- Jiaxing Huang
- Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Jia He
- Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Ge Yang
- Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Bo Liu
- Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
2
Sannasi Chakravarthy SR, Bharanidharan N, Vinoth Kumar V, Mahesh TR, Alqahtani MS, Guluwadi S. Deep transfer learning with fuzzy ensemble approach for the early detection of breast cancer. BMC Med Imaging 2024; 24:82. PMID: 38589813; DOI: 10.1186/s12880-024-01267-8.
Abstract
Breast cancer is a significant global health challenge, particularly affecting women, with higher mortality than many other cancer types. Timely detection is crucial, and recent research employing deep learning techniques shows promise for earlier detection. This work focuses on the early detection of such tumors from mammogram images using deep learning models. Four public databases were utilized, with 986 mammograms taken for each of three classes (normal, benign, malignant) for evaluation. Three deep CNN models, VGG-11, Inception v3, and ResNet50, are employed as base classifiers. The proposed ensemble approach uses a modified Gompertz function to build a fuzzy ranking of the base classification models, and their decision scores are integrated adaptively to construct the final prediction. The classification results of the proposed fuzzy ensemble outperform the individual transfer learning models and other ensemble approaches such as weighted averaging and the Sugeno integral. The proposed ResNet50-based ensemble using the modified Gompertz function fuzzy-ranking approach achieves a superior classification accuracy of 98.986%.
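The paper's exact modified Gompertz formulation is not reproduced here, but the general fuzzy-rank ensemble idea — map each base model's per-class confidence through a Gompertz-shaped function so that higher confidence yields a fuzzy rank closer to zero, then fuse by summing ranks and choosing the class with the smallest total — can be sketched as follows (the scaling constant and the example scores are illustrative assumptions, not the authors' values):

```python
import math

def gompertz_rank(score):
    # Gompertz-shaped mapping of a confidence score in [0, 1] to a fuzzy rank;
    # higher confidence -> rank closer to 0 (i.e., "better").
    return 1.0 - math.exp(-math.exp(-2.0 * score))

def fuzzy_ensemble(score_lists):
    """score_lists: one list of per-class confidence scores per base model.
    Returns the index of the class with the smallest fused fuzzy rank."""
    n_classes = len(score_lists[0])
    fused = [0.0] * n_classes
    for scores in score_lists:
        for c, s in enumerate(scores):
            fused[c] += gompertz_rank(s)
    return min(range(n_classes), key=lambda c: fused[c])

# Three hypothetical base classifiers voting over (normal, benign, malignant)
scores = [
    [0.10, 0.30, 0.60],
    [0.20, 0.20, 0.60],
    [0.15, 0.45, 0.40],
]
best = fuzzy_ensemble(scores)  # the class most models are confident about
```

Because the rank function is monotonically decreasing in the score, summing ranks and taking the minimum rewards classes that receive consistently high confidence across models.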
Affiliation(s)
- S R Sannasi Chakravarthy
- Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam, India
- N Bharanidharan
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, 632014, India
- V Vinoth Kumar
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, 632014, India
- T R Mahesh
- Department of Computer Science and Engineering, JAIN (Deemed-to-be University), Bengaluru, 562112, India
- Mohammed S Alqahtani
- Radiological Sciences Department, College of Applied Medical Sciences, King Khalid University, Abha, 61421, Saudi Arabia
- Suresh Guluwadi
- Adama Science and Technology University, Adama, 302120, Ethiopia
3
Wu R, Qin K, Fang Y, Xu Y, Zhang H, Li W, Luo X, Han Z, Liu S, Li Q. Application of the convolution neural network in determining the depth of invasion of gastrointestinal cancer: a systematic review and meta-analysis. J Gastrointest Surg 2024; 28:538-547. PMID: 38583908; DOI: 10.1016/j.gassur.2023.12.029.
Abstract
BACKGROUND With the development of endoscopic technology, endoscopic submucosal dissection (ESD) has been widely used in the treatment of gastrointestinal tumors. The depth of tumor invasion must be evaluated before ESD is applied. The convolutional neural network (CNN) is a type of artificial intelligence with the potential to assist in classifying the depth of invasion in endoscopic images. This meta-analysis evaluated the performance of CNNs in determining the depth of invasion of gastrointestinal tumors. METHODS PubMed, Web of Science, and SinoMed were searched for original publications on the use of CNNs in determining the depth of invasion of gastrointestinal neoplasms. Pooled sensitivity and specificity were calculated using an exact binomial rendition of the bivariate mixed-effects regression model. I2 was used to evaluate heterogeneity. RESULTS A total of 17 articles were included; the pooled sensitivity was 84% (95% CI, 0.81-0.88), specificity was 91% (95% CI, 0.85-0.94), and the area under the curve (AUC) was 0.93 (95% CI, 0.90-0.95). The performance of the CNNs was significantly better than that of endoscopists (AUC: 0.93 vs 0.83; P = .0005). CONCLUSION CNNs are among the most effective endoscopic methods for evaluating the depth of invasion of early gastrointestinal tumors, with the potential to serve as a valuable tool for helping clinical endoscopists decide whether a lesion is amenable to endoscopic treatment.
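The bivariate mixed-effects model named above jointly models sensitivity and specificity and is normally fitted with specialist software; as a much simpler illustration of the pooling idea only, a fixed-effect inverse-variance average on the logit scale (with hypothetical per-study values, not the review's data) looks like this:

```python
import math

def pool_logit(props, ns):
    """Fixed-effect inverse-variance pooling of proportions on the logit scale.
    props: per-study sensitivities (or specificities); ns: per-study sample sizes."""
    num = den = 0.0
    for p, n in zip(props, ns):
        logit = math.log(p / (1 - p))
        var = 1.0 / (n * p * (1 - p))  # delta-method variance of the logit
        w = 1.0 / var                  # weight = inverse variance
        num += w * logit
        den += w
    pooled_logit = num / den
    return 1.0 / (1.0 + math.exp(-pooled_logit))  # back-transform to a proportion

sens = [0.82, 0.88, 0.85]   # hypothetical per-study sensitivities
sizes = [120, 200, 150]     # hypothetical study sizes
pooled = pool_logit(sens, sizes)
```

The full bivariate model additionally estimates between-study variance and the sensitivity-specificity correlation, which this fixed-effect sketch deliberately omits.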
Affiliation(s)
- Ruo Wu
- Nanfang Hospital (The First School of Clinical Medicine), Southern Medical University, Guangzhou, Guangdong, China
- Kaiwen Qin
- Department of Gastroenterology, Guangdong Provincial Key Laboratory of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Yuxin Fang
- Department of Gastroenterology, Guangdong Provincial Key Laboratory of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Yuyuan Xu
- Department of Hepatology Unit and Infectious Diseases, State Key Laboratory of Organ Failure Research, Guangdong Provincial Key Laboratory of Viral Hepatitis Research, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Haonan Zhang
- Department of Gastroenterology, Guangdong Provincial Key Laboratory of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Wenhua Li
- Nanfang Hospital (The First School of Clinical Medicine), Southern Medical University, Guangzhou, Guangdong, China
- Xiaobei Luo
- Department of Gastroenterology, Guangdong Provincial Key Laboratory of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Zelong Han
- Department of Gastroenterology, Guangdong Provincial Key Laboratory of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Side Liu
- Department of Gastroenterology, Guangdong Provincial Key Laboratory of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China; Pazhou Lab, Guangzhou, Guangdong, China
- Qingyuan Li
- Department of Gastroenterology, Guangdong Provincial Key Laboratory of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
4
Tan Y, Wang Z, Tan L, Li C, Deng C, Li J, Tang H, Qin J. Image detection of aortic dissection complications based on multi-scale feature fusion. Heliyon 2024; 10:e27678. PMID: 38533058; PMCID: PMC10963251; DOI: 10.1016/j.heliyon.2024.e27678.
Abstract
Background Aortic dissection is the separation of the aortic wall into a true and a false lumen: blood enters the aortic media through a tear in the intima, separating the layers and propagating along the long axis of the aorta. Purpose To address individual variability, complex complications, and the many small targets involved in clinical aortic dissection detection, this paper proposes a convolutional neural network, MFF-FPN (Multi-scale Feature Fusion based Feature Pyramid Network), for detecting aortic dissection complications. Methods The proposed model uses ResNet50 as the backbone for feature extraction and builds a pyramid structure to fuse low-level and high-level feature information. An attention mechanism added to the backbone network establishes inter-dependencies between feature-map channels and enhances the representation quality of the CNN. Results The proposed method achieves a mean average precision (mAP) of 99.40% on multi-object detection of aortic dissection and its complications, higher than the 96.3% of the SSD model and the 99.05% of the YOLOv7 model. It substantially improves the accuracy of small-target detection, such as cysts, making it better suited to clinical lesion detection. Conclusions The proposed deep learning model achieves feature reuse and focuses on locally important information. By adding only a small number of model parameters, it greatly improves detection accuracy, is effective at detecting the small target lesions commonly found in clinical settings, and also performs well on other medical and natural datasets.
Affiliation(s)
- Yun Tan
- Central South University of Forestry and Technology, Hunan, China
- Zhenxu Wang
- Central South University of Forestry and Technology, Hunan, China
- Ling Tan
- The Second Xiangya Hospital of Central South University, Hunan, China
- Chunzhi Li
- Central South University of Forestry and Technology, Hunan, China
- Chao Deng
- The Second Xiangya Hospital of Central South University, Hunan, China
- Jingyu Li
- The Second Xiangya Hospital of Central South University, Hunan, China
- Hao Tang
- The Second Xiangya Hospital of Central South University, Hunan, China
- Jiaohua Qin
- Central South University of Forestry and Technology, Hunan, China
5
Liu Y, Zhao Q, Wang Y. Peak ground acceleration prediction for on-site earthquake early warning with deep learning. Sci Rep 2024; 14:5485. PMID: 38448483; PMCID: PMC10917772; DOI: 10.1038/s41598-024-56004-6.
Abstract
Rapid and accurate prediction of peak ground acceleration (PGA) is an important basis for determining seismic damage through on-site earthquake early warning (EEW). Current on-site EEW predicts PGA from feature parameters of the first-arriving P-wave, but the selection of these parameters depends on human experience, which limits both the accuracy and the timeliness of the prediction. Therefore, an end-to-end deep learning model for predicting PGA (DLPGA), based on convolutional neural networks (CNNs), is proposed. In DLPGA, the vertical initial 3-6 s of seismic waveform from a single station is used as input and PGA as output; features are extracted automatically by a multilayer CNN to achieve rapid PGA prediction. DLPGA is trained, validated, and tested on Japanese seismic records. Compared with the widely used peak displacement (Pd) method, the correlation coefficient of DLPGA's PGA predictions increases by 12-23%, the standard deviation of the error decreases by 22-25%, and the mean error decreases by 6.92-19.66% with the initial 3-6 s of waveform. In particular, DLPGA predicts PGA more accurately from the initial 3 s of waveform than Pd does from the initial 6 s. In a generalization test on Chilean seismic records, DLPGA also generalizes better than Pd, improving the accuracy of distinguishing destructive ground motion by 35-150%. These results confirm that DLPGA has significant accuracy and timeliness advantages over hand-crafted feature parameters for predicting PGA, which can greatly improve on-site EEW judgments of ground-motion destructiveness.
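The evaluation metrics quoted above — the correlation coefficient between predicted and observed PGA, plus the mean and standard deviation of the prediction error — are straightforward to compute. A minimal sketch with made-up values (PGA comparisons are often done on a log scale, which is assumed here):

```python
import math

def pga_metrics(pred, obs):
    """Correlation coefficient, mean error, and error standard deviation
    between predicted and observed (log-)PGA values."""
    n = len(pred)
    err = [p - o for p, o in zip(pred, obs)]
    mean_err = sum(err) / n
    std_err = math.sqrt(sum((e - mean_err) ** 2 for e in err) / n)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_o = sum((o - mo) ** 2 for o in obs)
    r = cov / math.sqrt(var_p * var_o)  # Pearson correlation coefficient
    return r, mean_err, std_err

# hypothetical log-PGA predictions vs. observations
pred = [1.0, 1.4, 2.1, 2.4, 3.0]
obs  = [1.1, 1.3, 2.0, 2.6, 2.9]
r, me, se = pga_metrics(pred, obs)
```

A higher r with a smaller error spread is exactly the improvement the abstract reports for DLPGA over the Pd method.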
Affiliation(s)
- Qingxu Zhao
- Key Laboratory of Urban Security and Disaster Engineering of China Ministry of Education, Beijing University of Technology, Beijing, China
- Yanwei Wang
- Guangxi Key Laboratory of Geomechanics and Geotechnical Engineering, Guilin University of Technology, Guilin, China
6
Fu B, Peng Y, He J, Tian C, Sun X, Wang R. HmsU-Net: A hybrid multi-scale U-net based on a CNN and transformer for medical image segmentation. Comput Biol Med 2024; 170:108013. PMID: 38271837; DOI: 10.1016/j.compbiomed.2024.108013.
Abstract
Accurate medical image segmentation is of great significance for subsequent diagnosis and analysis. The acquisition of multi-scale information plays an important role in segmenting regions of interest of different sizes. With the emergence of Transformers, numerous networks adopted hybrid structures incorporating Transformers and CNNs to learn multi-scale information. However, the majority of research has focused on the design and composition of CNN and Transformer structures, neglecting the inconsistencies in feature learning between Transformer and CNN. This oversight has resulted in the hybrid network's performance not being fully realized. In this work, we proposed a novel hybrid multi-scale segmentation network named HmsU-Net, which effectively fused multi-scale features. Specifically, HmsU-Net employed a parallel design incorporating both CNN and Transformer architectures. To address the inconsistency in feature learning between CNN and Transformer within the same stage, we proposed the multi-scale feature fusion module. For feature fusion across different stages, we introduced the cross-attention module. Comprehensive experiments conducted on various datasets demonstrate that our approach surpasses current state-of-the-art methods.
Affiliation(s)
- Bangkang Fu
- Medical College, Guizhou University, Guizhou 550000, China; Department of Medical Imaging, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guizhou 550002, China
- Yunsong Peng
- Department of Medical Imaging, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guizhou 550002, China
- Junjie He
- Department of Medical Imaging, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guizhou 550002, China
- Chong Tian
- Department of Medical Imaging, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guizhou 550002, China
- Xinhuan Sun
- Department of Medical Imaging, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guizhou 550002, China
- Rongpin Wang
- Department of Medical Imaging, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guizhou 550002, China
7
Oura D, Gekka M, Sugimori H. The montage method improves the classification of suspected acute ischemic stroke using the convolution neural network and brain MRI. Radiol Phys Technol 2024; 17:297-305. PMID: 37934345; DOI: 10.1007/s12194-023-00754-x.
Abstract
This study investigated the usefulness of a montage method that combines four different magnetic resonance images into one image for automatic acute ischemic stroke (AIS) diagnosis with deep learning. The montage image consisted of the diffusion-weighted image (DWI), fluid-attenuated inversion recovery (FLAIR), arterial spin labeling (ASL), and apparent diffusion coefficient (ADC) images. The montage method was compared with a pseudo color map (pCM) consisting of FLAIR, ASL, and ADC. A total of 473 AIS patients were classified into four categories: mechanical thrombectomy, conservative therapy, hemorrhage, and other diseases. The montage image significantly outperformed pCM in accuracy (montage image = 0.76 ± 0.01, pCM = 0.54 ± 0.05) and area under the curve (AUC) (montage image = 0.94 ± 0.01, pCM = 0.76 ± 0.01). This study demonstrates the usefulness of the montage method and its potential for overcoming the limitations of pCM.
Affiliation(s)
- Daisuke Oura
- Department of Radiology, Otaru General Hospital, Otaru, 047-0152, Japan
- Graduate School of Health Sciences, Hokkaido University, Sapporo, 060-0812, Japan
- Masayuki Gekka
- Department of Neurosurgery, Otaru General Hospital, Otaru, 047-0152, Japan
- Hiroyuki Sugimori
- Faculty of Health Sciences, Hokkaido University, Sapporo, 060-0812, Japan
8
Khatri U, Kwon GR. Diagnosis of Alzheimer's disease via optimized lightweight convolution-attention and structural MRI. Comput Biol Med 2024; 171:108116. PMID: 38346370; DOI: 10.1016/j.compbiomed.2024.108116.
Abstract
Alzheimer's disease (AD) poses a substantial public health challenge, demanding accurate screening and diagnosis. Identifying AD in its early stages, including mild cognitive impairment (MCI) and healthy control (HC), is crucial given the global aging population. Structural magnetic resonance imaging (sMRI) is essential for understanding the brain's structural changes due to atrophy. While current deep learning networks overlook voxel long-term dependencies, vision transformers (ViT) excel at recognizing such dependencies in images, making them valuable in AD diagnosis. Our proposed method integrates convolution-attention mechanisms in transformer-based classifiers for AD brain datasets, enhancing performance without excessive computing resources. Replacing multi-head attention with lightweight multi-head self-attention (LMHSA), employing inverted residual (IRU) blocks, and introducing local feed-forward networks (LFFN) yields exceptional results. Training on AD datasets with a gradient-centralized optimizer and Adam achieves an impressive accuracy rate of 94.31% for multi-class classification, rising to 95.37% for binary classification (AD vs. HC) and 92.15% for HC vs. MCI. These outcomes surpass existing AD diagnosis approaches, showcasing the model's efficacy. Identifying key brain regions aids future clinical solutions for AD and neurodegenerative diseases. However, this study focused exclusively on the AD Neuroimaging Initiative (ADNI) cohort, emphasizing the need for a more robust, generalizable approach incorporating diverse databases beyond ADNI in future research.
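The paper's lightweight multi-head self-attention (LMHSA) block is not reproduced here; as a reference point, the standard scaled dot-product self-attention that such lightweight variants approximate can be written, for a single head and tiny vectors, as follows (identity projections and the token values are purely illustrative assumptions):

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a token sequence.
    x: list of d-dim token vectors; wq/wk/wv: d x d projection matrices."""
    def matvec(w, v):
        return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]
    q = [matvec(wq, t) for t in x]
    k = [matvec(wk, t) for t in x]
    v = [matvec(wv, t) for t in x]
    d = len(x[0])
    out = []
    for qi in q:
        # each query attends to every key: long-range (here, all-pairs) dependencies
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        attn = softmax(scores)
        out.append([sum(a * vj[i] for a, vj in zip(attn, v)) for i in range(d)])
    return out

# two 2-d tokens with identity projections
eye = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]
y = self_attention(tokens, eye, eye, eye)
```

Lightweight variants reduce the cost of exactly this computation, e.g. by shrinking the key/value dimensions, which is the kind of saving the abstract's "without excessive computing resources" refers to.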
Affiliation(s)
- Uttam Khatri
- Dept. of Information and Communication Engineering, Chosun University, 309 Pilmun-Daero, Dong-Gu, Gwangju, 61452, Republic of Korea
- Goo-Rak Kwon
- Dept. of Information and Communication Engineering, Chosun University, 309 Pilmun-Daero, Dong-Gu, Gwangju, 61452, Republic of Korea
9
Cherukuru P, Mustafa MB. CNN-based noise reduction for multi-channel speech enhancement system with discrete wavelet transform (DWT) preprocessing. PeerJ Comput Sci 2024; 10:e1901. PMID: 38435554; PMCID: PMC10909157; DOI: 10.7717/peerj-cs.1901.
Abstract
Speech enhancement algorithms are applied at multiple levels of enhancement to improve the quality of speech signals in noisy environments, in what are known as multi-channel speech enhancement (MCSE) systems. Numerous existing algorithms are used to filter noise in such systems, typically as a pre-processor that reduces noise and improves speech quality, but they may perform poorly at low signal-to-noise ratios (SNRs). Speech devices are exposed to many kinds of environmental noise, including high-frequency noise. The objective of this research is to conduct noise reduction experiments for an MCSE system in stationary and non-stationary noisy environments at varying speech-signal SNR levels. The experiments examined the performance of the existing and proposed MCSE systems in filtering environmental noises from low to high SNRs (-10 dB to 20 dB), using the AURORA and LibriSpeech datasets, which contain different types of environmental noise. The existing MCSE (BAV-MCSE) uses beamforming, adaptive noise reduction, and voice activity detection (BAV) algorithms to filter noise from speech signals. The proposed MCSE (DWT-CNN-MCSE) system was developed with discrete wavelet transform (DWT) preprocessing and a convolutional neural network (CNN) for denoising the noisy input speech signals to improve performance accuracy. The performance of the existing BAV-MCSE and the proposed DWT-CNN-MCSE was measured using spectrogram analysis and word recognition rate (WRR). The existing BAV-MCSE reported the highest WRR of 93.77% at a high SNR (20 dB) but only 5.64% on average at a low SNR (-10 dB) across noise types. The proposed DWT-CNN-MCSE system proved to perform well at low SNR, with a WRR of 70.55% and the highest improvement (64.91% WRR) at -10 dB SNR.
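The DWT preprocessing stage can be illustrated with the simplest wavelet, a one-level Haar transform: decompose the signal, soft-threshold the detail (noise-dominated) coefficients, and reconstruct. This is a generic sketch of wavelet denoising under that assumption, not the authors' exact filter bank (the threshold and signal are illustrative):

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients.
    Assumes an even-length signal."""
    approx, detail = [], []
    for i in range(0, len(signal) - 1, 2):
        a, b = signal[i], signal[i + 1]
        approx.append((a + b) / math.sqrt(2))
        detail.append((a - b) / math.sqrt(2))
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse one-level Haar DWT (perfect reconstruction)."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / math.sqrt(2))
        out.append((a - d) / math.sqrt(2))
    return out

def denoise(signal, thresh):
    """Soft-threshold the detail coefficients, then reconstruct."""
    approx, detail = haar_dwt(signal)
    detail = [math.copysign(max(abs(d) - thresh, 0.0), d) for d in detail]
    return haar_idwt(approx, detail)

noisy = [1.0, 1.2, 1.1, 0.9, 3.0, 3.1, 2.9, 3.2]
clean = denoise(noisy, 0.2)
```

In the DWT-CNN-MCSE pipeline, coefficients produced by a stage like this would feed the CNN denoiser rather than a fixed threshold.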
Affiliation(s)
- Pavani Cherukuru
- Department of Software Engineering, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia
- Department of Information Science, Dayananda Sagar Academy of Technology and Management, Bangalore, Karnataka, India
- Mumtaz Begum Mustafa
- Department of Software Engineering, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia
10
王 慧, 张 玭, 金 丰, 赵 宝, 曾 勤, 肖 文. [Mental fatigue state recognition method based on convolution neural network and long short-term memory]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi 2024; 41:34-40. PMID: 38403602; PMCID: PMC10894741; DOI: 10.7507/1001-5515.202306016.
Abstract
The pace of modern life is accelerating, the pressures of life are gradually increasing, and the long-term accumulation of mental fatigue poses a threat to health. By analyzing physiological signals and parameters, this paper proposes a method that can identify the state of mental fatigue, helping to maintain a healthy life. The proposed method is a new approach to recognizing mental fatigue states from electrocardiogram (ECG) signals based on a convolutional neural network and long short-term memory. First, the convolution layers of a one-dimensional convolutional neural network extract local features, the pooling layers extract the key information, and some redundant data are removed. The extracted features are then fed into the long short-term memory model to further fuse the ECG features. Finally, the key information is integrated through a fully connected layer, successfully achieving accurate recognition of the mental fatigue state. The results show that, compared with traditional machine learning algorithms, the proposed method significantly improves the accuracy of mental fatigue recognition to 96.3%, providing a reliable basis for the early warning and evaluation of mental fatigue.
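The front end described above — convolution layers extracting local features, followed by pooling that keeps key information and discards redundancy — reduces, for one channel and one kernel, to the following (the toy signal and edge-detecting kernel are assumptions for illustration):

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation form, as used in CNN layers)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def maxpool1d(xs, size):
    """Non-overlapping max pooling: keeps the strongest local response."""
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]

# a difference kernel applied to a toy ECG-like sequence: responds to rises and falls
sig = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
feat = conv1d(sig, [1.0, -1.0])
pooled = maxpool1d([abs(f) for f in feat], 2)
```

In the paper's architecture the pooled feature sequence would then be passed to the LSTM, which models how these local responses evolve over time.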
Affiliation(s)
- 慧 王
- School of Automation, University of Science and Technology Beijing, Beijing 100083, P. R. China
- 玭 张
- School of Automation, University of Science and Technology Beijing, Beijing 100083, P. R. China
- 丰护 金
- School of Automation, University of Science and Technology Beijing, Beijing 100083, P. R. China
- 宝永 赵
- School of Automation, University of Science and Technology Beijing, Beijing 100083, P. R. China
- 勤波 曾
- School of Automation, University of Science and Technology Beijing, Beijing 100083, P. R. China
- 文栋 肖
- School of Automation, University of Science and Technology Beijing, Beijing 100083, P. R. China
- China Ordnance Equipment Group Automation Research Institute Co., Mianyang, Sichuan 621000, P. R. China
- Shunde Innovation School, University of Science and Technology Beijing, Shunde, Guangdong 528399, P. R. China
11
Wang Z, Shi F, Zou F. Deep learning based ultrasonic reconstruction of rough surface morphology. Ultrasonics 2024; 138:107265. PMID: 38354524; DOI: 10.1016/j.ultras.2024.107265.
Abstract
This paper introduces a methodology to recover the morphology of a complex rough surface from ultrasonic pulse-echo measurements with an array of equidistant sensors using a one-dimensional convolutional neural network (1DCNN). The network is trained on datasets simulated by high-fidelity finite element simulations of surfaces with a range of roughness parameters and is tested on both numerical and real experimental data. To assess the performance of the proposed method, the rough-surface reconstructions from the deep learning approach are compared with those obtained by conventional ultrasonic array imaging methods. Unlike array-imaging-based methods, which require a large number of sensors (e.g., 128, 64, or 32), the deep learning method uses pulse-echo signals and can achieve accurate results with far fewer sensors. The developed approach has the potential to enable low-cost, accurate, and real-time reconstruction of complex surface profiles.
Affiliation(s)
- Zhengjun Wang
- Department of Mechanical and Aerospace Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong Special Administrative Region of China
- Fan Shi
- Department of Mechanical and Aerospace Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong Special Administrative Region of China
- Fangxin Zou
- Department of Aeronautical and Aviation Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China
12
Maheswari BU, Sam D, Mittal N, Sharma A, Kaur S, Askar SS, Abouhawwash M. Explainable deep-neural-network supported scheme for tuberculosis detection from chest radiographs. BMC Med Imaging 2024; 24:32. PMID: 38317098; PMCID: PMC10840197; DOI: 10.1186/s12880-024-01202-x.
Abstract
Chest radiographs are examined in typical clinical settings by competent physicians for tuberculosis diagnosis. However, this procedure is time-consuming and subjective. With the growing use of machine learning techniques in the applied sciences, researchers have begun applying comparable concepts to medical diagnostics, such as tuberculosis screening. In an era of extremely deep neural networks comprising hundreds of convolution layers for feature extraction, we create a shallow CNN for screening tuberculosis (TB) from chest X-rays so that the model can offer appropriate interpretation to support correct diagnosis. The suggested model consists of four convolution-maxpooling layers whose hyperparameters were tuned for optimal performance using a Bayesian optimization technique. The model reported a peak classification accuracy, F1-score, sensitivity, and specificity of 0.95. In addition, the receiver operating characteristic (ROC) curve for the proposed shallow CNN showed a peak area under the curve of 0.976. Moreover, we employed class activation maps (CAM) and Local Interpretable Model-agnostic Explanations (LIME) explainer systems to assess the transparency and explainability of the model in comparison with a state-of-the-art pre-trained network such as DenseNet.
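Of the two explainers mentioned, class activation mapping (CAM) is the simpler: weight each final-convolution feature map by the fully connected weight of the predicted class and sum them into a coarse localization heatmap. A minimal sketch with hypothetical 2x2 feature maps and class weights:

```python
def class_activation_map(feature_maps, fc_weights):
    """CAM: weight each final-conv feature map by the FC weight for the
    predicted class and sum, yielding a coarse localization heatmap."""
    h = len(feature_maps[0])
    w = len(feature_maps[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fmap, wc in zip(feature_maps, fc_weights):
        for i in range(h):
            for j in range(w):
                cam[i][j] += wc * fmap[i][j]
    return cam

# two hypothetical 2x2 feature maps and the class's FC weights
fmaps = [
    [[0.0, 1.0], [0.0, 2.0]],
    [[1.0, 0.0], [1.0, 0.0]],
]
cam = class_activation_map(fmaps, [0.8, 0.1])
```

Upsampled to the radiograph's resolution, the hottest CAM cells indicate the regions that drove the TB prediction, which is what makes the model's decision inspectable.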
Affiliation(s)
- B Uma Maheswari
- Department of Computer Science and Engineering, St. Joseph's College of Engineering, OMR, Chennai, Tamilnadu, 600119, India
- Dahlia Sam
- Department of Information Technology, Dr. M.G.R Educational and Research Institute, Periyar E.V.R High Road, Vishwas Nagar, Maduravoyal, Chennai, Tamilnadu, 600095, India
- Nitin Mittal
- University Centre for Research and Development, Chandigarh University, Mohali, Punjab, 140413, India
- Abhishek Sharma
- Department of Computer Engineering and Applications, GLA University, Mathura, Uttar Pradesh, 281406, India
- Sandeep Kaur
- Department of Computer Engineering & Technology, Guru Nanak Dev University, Amritsar, Punjab, 143005, India
- S S Askar
- Department of Statistics and Operations Research, College of Science, King Saud University, P.O. Box 2455, Riyadh, 11451, Saudi Arabia
- Mohamed Abouhawwash
- Department of Computational Mathematics, Science, and Engineering (CMSE), College of Engineering, Michigan State University, East Lansing, MI, 48824, USA
- Department of Mathematics, Faculty of Science, Mansoura University, Mansoura, 35516, Egypt

13
Chen J, Shen X, Zhao Y, Qian W, Ma H, Sang L. Attention gate and dilation U-shaped network (GDUNet): an efficient breast ultrasound image segmentation network with multiscale information extraction. Quant Imaging Med Surg 2024; 14:2034-2048. [PMID: 38415149 PMCID: PMC10895089 DOI: 10.21037/qims-23-947] [Received: 06/29/2023] [Accepted: 01/08/2024] [Indexed: 02/29/2024]
Abstract
Background In recent years, computer-aided diagnosis (CAD) systems have played an important role in breast cancer screening and diagnosis. The image segmentation task is the key step in a CAD system for the rapid identification of lesions. Therefore, an efficient breast image segmentation network is necessary for improving the diagnostic accuracy in breast cancer screening. However, due to the blurred boundaries, low contrast, and speckle noise characteristic of breast ultrasound images, breast lesion segmentation is challenging. In addition, many of the proposed breast tumor segmentation networks are too complex to be applied in practice. Methods We developed the attention gate and dilation U-shaped network (GDUNet), a lightweight breast lesion segmentation model. This model improves the inverted bottleneck, integrating it with a tokenized multilayer perceptron (MLP) to construct the encoder. Additionally, we introduce a lightweight attention gate (AG) within the skip connection, which effectively filters noise in low-level semantic information across spatial and channel dimensions, thus attenuating irrelevant features. To further improve performance, we designed the AG dilation (AGDT) block and embedded it between the encoder and decoder in order to capture critical multiscale contextual information. Results We conducted experiments on two breast cancer datasets. The results show that compared to UNet, GDUNet reduces the number of parameters by 10 times and the computational complexity by 58 times while doubling the inference speed. Moreover, GDUNet achieved better segmentation performance than state-of-the-art medical image segmentation architectures. Conclusions Our proposed GDUNet method can achieve advanced segmentation performance on different breast ultrasound image datasets with high efficiency.
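The attention gate in the skip connection described here follows a common additive pattern: a sigmoid coefficient computed from the skip features and a gating (decoder) signal rescales the skip features, suppressing noisy regions. A minimal scalar sketch with hypothetical weights, not the GDUNet code:

```python
import math

def attention_gate(skip, gate, w_skip=1.0, w_gate=1.0, bias=-1.0):
    """Additive attention gate over a 1-D feature vector:
    alpha = sigmoid(w_skip*s + w_gate*g + bias); output = alpha * s."""
    out = []
    for s, g in zip(skip, gate):
        alpha = 1.0 / (1.0 + math.exp(-(w_skip * s + w_gate * g + bias)))
        out.append(alpha * s)
    return out

# A feature supported by the gating signal passes almost unchanged;
# one without support is attenuated.
gated = attention_gate([1.0, 0.2], [1.0, 0.0])
```

In a real network the scalars are learned 1x1 convolutions applied per pixel, but the gating logic is the same.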
Affiliation(s)
- Jiadong Chen
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Xiaoyan Shen
- School of Life and Health Technology, Dongguan University of Technology, Dongguan, China
- Yu Zhao
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Wei Qian
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- He Ma
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- Liang Sang
- Department of Ultrasound, The First Hospital of China Medical University, Shenyang, China

14
Zhou Y, Fu C, Jiang X, Yu Q, Liu H. Who might encounter hard-braking while speeding? Analysis for regular speeders using low-frequency taxi trajectories on arterial roads and explainable AI. Accid Anal Prev 2024; 195:107382. [PMID: 37979465 DOI: 10.1016/j.aap.2023.107382] [Received: 06/07/2023] [Revised: 09/29/2023] [Accepted: 11/13/2023] [Indexed: 11/20/2023]
Abstract
Regular speeders are those who commit speeding recidivism during a period. Among their speeding behaviors, some occurring in specific scenarios may pose more hazards to road users. Therefore, there is a need to evaluate the driving risks if the regular speeders have different speeding propensities. This study considers speeding-related hard-braking events (SHEs) as a safety surrogate measure and recognizes the regular speeders who encounter at least one SHE during the study period as risky individuals. To identify speeding behaviors and hard-braking events from low-frequency GPS trajectories, we compare the average travel speed between pairwise adjacent GPS points to the posted speed limit and examine the speed curve and the corresponding travel distance between these GPS points, respectively. Thereafter, a logistic model, XGBoost, and three 1D convolutional neural networks (CNNs), namely AlexNet CNN, Mini-AlexNet CNN, and Simple CNN, are developed to recognize the regular speeders who encountered SHEs based on their speeding propensities. The proposed Mini-AlexNet CNN achieves a global F1-score of 91% and recall of 90% on the testing data, which are superior to the other models. Further, the study uses the SHapley Additive exPlanations (SHAP) framework to visually interpret the contribution of speeding propensities to SHE likelihood. It is found that speeding by 50% or greater for no more than 285 m is the most dangerous of all the speeding behaviors. Speeding on roads without bicycle lanes, or on roads with roadside parking and excessive access points, increases the probability of encountering SHEs. Based on the analyses, we put forward tailored recommendations that aim to restrict hazard-related speeding behaviors rather than speeding behaviors of all kinds.
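The speeding-identification step described here, comparing the average travel speed between adjacent low-frequency GPS fixes to the posted limit, can be sketched in a few lines. A toy version with hypothetical planar coordinates in meters (the paper works with map-matched taxi trajectories):

```python
def speeding_segments(points, speed_limit_kmh):
    """points: list of (t_seconds, x_m, y_m) GPS fixes, time-ordered.
    Flags each consecutive pair whose average speed exceeds the limit."""
    flags = []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dist_m = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        speed_kmh = dist_m / (t1 - t0) * 3.6   # m/s -> km/h
        flags.append(speed_kmh > speed_limit_kmh)
    return flags

# Fixes 30 s apart; the second leg covers 600 m -> 72 km/h in a 60 km/h zone
pts = [(0, 0.0, 0.0), (30, 400.0, 0.0), (60, 1000.0, 0.0)]
flags = speeding_segments(pts, 60.0)
```

With real lat/lon fixes, the straight-line distance would be replaced by a haversine or map-matched road distance.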
Affiliation(s)
- Yue Zhou
- Flight Technology College, Civil Aviation Flight University of China, Guanghan 618307, China
- Chuanyun Fu
- School of Transportation Science and Engineering, Harbin Institute of Technology, Harbin 150090, China
- Xinguo Jiang
- School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756, China; National United Engineering Laboratory of Integrated and Intelligent Transportation, Southwest Jiaotong University, Chengdu 611756, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Southwest Jiaotong University, Chengdu 611756, China
- Qiong Yu
- School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756, China
- Haiyue Liu
- School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756, China

15
Talaat FM, El-Sappagh S, Alnowaiser K, Hassan E. Improved prostate cancer diagnosis using a modified ResNet50-based deep learning architecture. BMC Med Inform Decis Mak 2024; 24:23. [PMID: 38267994 PMCID: PMC10809762 DOI: 10.1186/s12911-024-02419-0] [Received: 07/02/2023] [Accepted: 01/08/2024] [Indexed: 01/26/2024]
Abstract
Prostate cancer, the most common cancer in men, is influenced by age, family history, genetics, and lifestyle factors. Early detection of prostate cancer using screening methods improves outcomes, but the balance between overdiagnosis and early detection remains debated. Using deep learning (DL) algorithms for prostate cancer detection offers a promising solution for accurate and efficient diagnosis, particularly in cases where prostate imaging is challenging. In this paper, we propose a Prostate Cancer Detection Model (PCDM) for the automatic diagnosis of prostate cancer and demonstrate its clinical applicability for aiding the early detection and management of prostate cancer in real-world healthcare environments. The PCDM is a modified ResNet50-based architecture that integrates Faster R-CNN and dual optimizers to improve the performance of the detection process. The model is trained on a large dataset of annotated medical images, and the experimental results show that the proposed model outperforms both the ResNet50 and VGG19 architectures. Specifically, the proposed model achieves high sensitivity, specificity, precision, and accuracy of 97.40%, 97.09%, 97.56%, and 95.24%, respectively.
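The four figures reported at the end of this abstract are the standard binary-classification metrics derived from confusion-matrix counts. A quick reference implementation (the counts below are illustrative, not the paper's):

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),           # recall on true positives
        "specificity": tn / (tn + fp),           # recall on true negatives
        "precision":   tp / (tp + fp),           # positive predictive value
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts: 90 true positives, 5 false positives,
# 95 true negatives, 10 false negatives
m = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
```

Note that precision can exceed accuracy (as in the paper's 97.56% vs 95.24%) whenever false negatives outnumber false positives.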
Affiliation(s)
- Fatma M Talaat
- Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, 33516, Egypt
- Faculty of Computer Science & Engineering, New Mansoura University, Gamasa, 35712, Egypt
- Shaker El-Sappagh
- Faculty of Computer Science and Engineering, Galala University, Suez, 435611, Egypt
- Information Systems Department, Faculty of Computers and Artificial Intelligence, Benha University, Banha, 13518, Egypt
- Khaled Alnowaiser
- College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al Kharj, 11942, Saudi Arabia
- Esraa Hassan
- Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, 33516, Egypt

16
Hyer AP, McMillin RE, Ferri JK. The shape of things to come: Axisymmetric drop shape analysis using deep learning. J Colloid Interface Sci 2024; 653:1188-1195. [PMID: 37793245 DOI: 10.1016/j.jcis.2023.09.120] [Received: 06/07/2023] [Revised: 09/14/2023] [Accepted: 09/19/2023] [Indexed: 10/06/2023]
Abstract
HYPOTHESIS In the traditional approach to Axisymmetric Drop Shape Analysis (ADSA), the determination of surface tension or interfacial tension is constrained by computational speed and image quality. By implementing a machine learning-based approach, particularly a convolutional neural network (CNN), it is posited that analysis of pendant drop images can be both faster and more accurate. EXPERIMENTS A CNN model was trained and used to predict the surface tension of drop images. The performance of our CNN model was compared to traditional ADSA, i.e. direct numerical integration, in terms of precision, computational speed, and robustness in dealing with images of varying quality. Additionally, the ability of the CNN model to predict other drop properties such as volume and surface area was evaluated. FINDINGS Our CNN demonstrated a significant enhancement in experimental fit precision, predicting surface tension with an accuracy of ±1.22×10⁻¹ mN/m and at a speed of 1.50 ms⁻¹, outpacing the traditional method by more than 5×10³ times. The model maintained an average surface tension error of 2.42×10⁻¹ mN/m even for experimental images with challenges such as misalignment and poor focus. The CNN model also showcased a high degree of accuracy in determining other drop properties.
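The volume and surface-area properties mentioned in the findings follow directly from the axisymmetric drop profile r(z): V = π ∫ r² dz and the lateral area A = 2π ∫ r √(1 + (dr/dz)²) dz. A sketch of this classical quadrature (this is the geometry, not the paper's CNN), validated on a unit cylinder where both integrals are exact:

```python
import math

def drop_volume_area(z, r):
    """Volume and lateral surface area of an axisymmetric profile r(z),
    trapezoid rule: V = pi*int r^2 dz, A = 2*pi*int r*sqrt(1 + r'^2) dz."""
    vol = area = 0.0
    for i in range(len(z) - 1):
        dz = z[i + 1] - z[i]
        vol += math.pi * 0.5 * (r[i] ** 2 + r[i + 1] ** 2) * dz
        drdz = (r[i + 1] - r[i]) / dz
        arc = math.sqrt(1.0 + drdz ** 2)          # local arc-length factor
        area += 2.0 * math.pi * 0.5 * (r[i] + r[i + 1]) * arc * dz
    return vol, area

# Unit cylinder r(z) = 1 on [0, 1]: exact V = pi, lateral A = 2*pi
zs = [i / 100 for i in range(101)]
rs = [1.0] * 101
V, A = drop_volume_area(zs, rs)
```

A CNN trained on drop images effectively learns to map pixels to these integral quantities without an explicit profile fit.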
Affiliation(s)
- Andres P Hyer
- Department of Chemical and Life Science Engineering, Virginia Commonwealth University, 601 Main Street, Richmond, 23220, VA, United States
- Robert E McMillin
- Department of Chemical and Life Science Engineering, Virginia Commonwealth University, 601 Main Street, Richmond, 23220, VA, United States
- James K Ferri
- Department of Chemical and Life Science Engineering, Virginia Commonwealth University, 601 Main Street, Richmond, 23220, VA, United States

17
Long J, Yang C, Ren Y, Zeng Z. Semi-supervised medical image segmentation via feature similarity and reliable-region enhancement. Comput Biol Med 2023; 167:107668. [PMID: 37931524 DOI: 10.1016/j.compbiomed.2023.107668] [Received: 05/26/2023] [Revised: 10/07/2023] [Accepted: 10/31/2023] [Indexed: 11/08/2023]
Abstract
Semantic segmentation is a crucial task in the field of computer vision, and medical image segmentation, as its downstream task, has made significant breakthroughs in recent years. However, the need for a large number of annotations has remained a major challenge in medical image segmentation. Semi-supervised semantic segmentation provides a powerful approach to the annotation problem. Nevertheless, existing semi-supervised semantic segmentation methods for medical images have drawbacks, such as insufficient exploitation of the information in unlabeled data and inefficient utilization of pseudo-label information. We introduce a novel segmentation model, the Feature Similarity and Reliable-region Enhancement Network (FSRENet), to overcome these limitations. Firstly, this paper proposes a Feature Similarity Module (FSM), which uses the similarity relationships among the dense features predicted for unlabeled images as additional constraints on the segmentation features, thus fully exploiting the dense feature information of unlabeled data. Additionally, the Reliable-region Enhancement Module (REM) builds a high-confidence network structure by fusing two networks that can learn from each other, forming a triple-network structure. The high-confidence network generates reliable pseudo-labels that further constrain the predictions of the two networks, enhancing the weight of reliable regions, reducing the noise interference of pseudo-labels, and efficiently utilizing all pseudo-label information. Experimental results on the ACDC and LA datasets demonstrate that the proposed FSRENet model excels at semi-supervised semantic segmentation of medical images and outperforms the majority of existing methods. Our code is available at: https://github.com/gdghds0/FSRENet-master.
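A common building block behind "reliable-region" schemes like the one above is confidence-based pseudo-label filtering: only pixels whose predicted class probability clears a threshold contribute to training. A minimal sketch (the threshold and probabilities are illustrative; FSRENet's actual mechanism uses a fused high-confidence network):

```python
def reliable_mask(probs, tau=0.9):
    """probs: per-pixel class-probability vectors (softmax outputs).
    Returns a pseudo-label per pixel, or None where the prediction
    is not confident enough to be treated as reliable."""
    labels = []
    for p in probs:
        conf = max(p)
        labels.append(p.index(conf) if conf >= tau else None)
    return labels

# Three pixels: confident background, ambiguous, confident foreground
pix = [[0.95, 0.05], [0.6, 0.4], [0.05, 0.95]]
labels = reliable_mask(pix, tau=0.9)
```

Pixels mapped to None are excluded from (or down-weighted in) the unsupervised loss, which is how pseudo-label noise is kept out of training.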
Affiliation(s)
- Jianwu Long
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, 400054, China
- Chengxin Yang
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, 400054, China
- Yan Ren
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, 400054, China
- Ziqin Zeng
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, 400054, China

18
Sawant S, Garg RD, Meshram V, Mistry S. Sen-2 LULC: Land use land cover dataset for deep learning approaches. Data Brief 2023; 51:109724. [PMID: 37965594 PMCID: PMC10641585 DOI: 10.1016/j.dib.2023.109724] [Received: 09/20/2023] [Revised: 10/19/2023] [Accepted: 10/20/2023] [Indexed: 11/16/2023]
Abstract
Land Use Land Cover (LULC) classification is pivotal to sustainable environment and natural resource management. It is critical in planning, monitoring, and management programs at various local and national levels. Monitoring changes in LULC patterns over time is crucial for understanding evolving landscapes. Traditionally, LULC classification has been achieved through satellite data by remote sensing, geographic information system (GIS) techniques, machine learning classifiers, and deep learning models. Semantic segmentation, a technique for assigning land cover classes to individual pixels in an image, is commonly employed for LULC mapping. In recent years, the deep learning revolution, particularly Convolutional Neural Networks (CNNs), has reshaped the field of computer vision and LULC classification. Deep architectures have consistently outperformed traditional methods, offering greater accuracy and efficiency. However, the availability of high-quality datasets has been a limiting factor. Bridging the gap between modern computer vision and remote sensing data analysis can revolutionize our understanding of the environment and drive breakthroughs in urban planning and ecosystem change research. The "Sen-2 LULC Dataset" has been created to facilitate this convergence. This dataset comprises 213,761 pre-processed 10 m resolution images representing seven LULC classes: water bodies, dense forests, sparse forests, barren land, built-up areas, agricultural land, and fallow land. Importantly, each image may contain multiple coexisting land use and land cover classes, mirroring the real-world complexity of landscapes. The dataset is derived from Sentinel-2 satellite imagery sourced from the Copernicus Open Access Hub (https://scihub.copernicus.eu/) platform. It includes spectral bands B4, B3, and B2, corresponding to the red, green, and blue (RGB) channels, at a spatial resolution of 10 m. The dataset also provides an equal number of mask images. Structured into six folders, the dataset offers training, testing, and validation sets for images and masks. Researchers across various domains can leverage this resource to advance LULC classification in the context of the Indian region. Additionally, it catalyzes collaboration between the remote sensing and computer vision communities, enabling novel insights into environmental dynamics and urban planning challenges.
Affiliation(s)
- Suraj Sawant
- Geomatics Engineering, IIT Roorkee, Uttarakhand 247667, India
- COEP Technological University, Pune, Maharashtra 411005, India
- Rahul Dev Garg
- Geomatics Engineering, IIT Roorkee, Uttarakhand 247667, India
- Vishal Meshram
- Vishwakarma Institute of Information Technology, Pune, Maharashtra 411048, India
- Shrayank Mistry
- COEP Technological University, Pune, Maharashtra 411005, India

19
Abd El-Wahab BS, Nasr ME, Khamis S, Ashour AS. BTC-fCNN: Fast Convolution Neural Network for Multi-class Brain Tumor Classification. Health Inf Sci Syst 2023; 11:3. [PMID: 36606077 PMCID: PMC9807719 DOI: 10.1007/s13755-022-00203-w] [Accepted: 11/22/2022] [Indexed: 01/04/2023]
Abstract
Timely prognosis of brain tumors plays a crucial role in effective healthcare and treatment planning. Manual classification of brain tumors in magnetic resonance imaging (MRI) images is a challenging task that relies on experienced radiologists to identify and classify the tumor. Automated classification of different brain tumors is significant for designing computer-aided diagnosis (CAD) systems. Existing classification methods suffer from unsatisfactory performance and/or large computational cost/time. This paper proposes a fast and efficient classification process, called BTC-fCNN, which is a deep learning-based system to distinguish between different views of three brain tumor types, namely meningioma, glioma, and pituitary tumors. The proposed model was applied to MRI images from the Figshare dataset. It consists of 13 layers with few trainable parameters, involving a convolution layer, a 1 × 1 convolution layer, average pooling, a fully connected layer, and a softmax layer. Five iterations including transfer learning, and five-fold cross-validation for retraining, are considered to increase the proposed model's performance. The proposed model achieved 98.63% average accuracy using five iterations with transfer learning, and 98.86% using retrained five-fold cross-validation (internal transfer learning between the folds). Various evaluation metrics were measured to evaluate the proposed model, such as precision, F-score, recall, specificity, and the confusion matrix. The proposed BTC-fCNN model outstrips the state-of-the-art and other well-known convolutional neural networks (CNNs).
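The five-fold cross-validation used for retraining above partitions the data into five folds, each serving once as the validation set while the remaining four train the model. A minimal index-splitting sketch (contiguous folds for clarity; in practice the split is usually shuffled and stratified by tumor class):

```python
def k_fold_indices(n, k=5):
    """Split range(n) into k near-equal contiguous folds;
    yield (train_indices, val_indices) for each fold."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for s in sizes:
        folds.append(list(range(start, start + s)))
        start += s
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val

splits = list(k_fold_indices(10, k=5))
```

Averaging the five per-fold accuracies gives the cross-validated figure reported by studies like this one.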
Affiliation(s)
- Basant S. Abd El-Wahab
- Department of Electronics and Electrical Communications Engineering, Faculty of Engineering, Tanta University, Tanta, Egypt
- Mohamed E. Nasr
- Department of Electronics and Electrical Communications Engineering, Faculty of Engineering, Tanta University, Tanta, Egypt
- Salah Khamis
- Department of Electronics and Electrical Communications Engineering, Faculty of Engineering, Tanta University, Tanta, Egypt
- Amira S. Ashour
- Department of Electronics and Electrical Communications Engineering, Faculty of Engineering, Tanta University, Tanta, Egypt

20
Shi C, Wu H, Wang L. CEGAT: A CNN and enhanced-GAT based on key sample selection strategy for hyperspectral image classification. Neural Netw 2023; 168:105-122. [PMID: 37748391 DOI: 10.1016/j.neunet.2023.08.059] [Received: 02/13/2023] [Revised: 05/16/2023] [Accepted: 08/31/2023] [Indexed: 09/27/2023]
Abstract
In recent years, the application of convolutional neural networks (CNNs) and graph convolutional networks (GCNs) in hyperspectral image classification (HSIC) has achieved remarkable results. However, limited labeled samples remain a major challenge when using CNNs and GCNs to classify hyperspectral images. To alleviate this problem, a double-branch fusion network of a CNN and an enhanced graph attention network (CEGAT), based on a key sample selection strategy, is proposed. First, a linear discrimination of spectral inter-class slices (LD_SICS) module is designed to eliminate the spectral redundancy of HSIs. Then, a spatial-spectral correlation attention (SSCA) module is proposed, which can extract and assign attention weights to the spatial and spectral correlation features. On the graph attention (GAT) branch, the HSI is segmented into superpixels as input to reduce the number of network parameters. In addition, an enhanced graph attention (EGAT) module is constructed to strengthen the relationships between nodes. Finally, a key sample selection (KSS) strategy is proposed to enable the network to achieve better classification performance with few labeled samples. Compared with other state-of-the-art methods, CEGAT has better classification performance under limited labeled samples.
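At the heart of any graph attention (GAT) layer, including the enhanced variant described here, raw attention scores over a node's neighbors are softmax-normalized into coefficients that weight the neighbor features. A minimal sketch of that normalization step (the scores below are hypothetical; a real layer computes them from learned projections):

```python
import math

def attention_coefficients(scores):
    """Softmax-normalize raw attention scores e_ij over the
    neighborhood of one node -- the normalization used in GAT."""
    mx = max(scores)                       # subtract max for stability
    exps = [math.exp(s - mx) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One node with three neighbors and raw compatibility scores e_ij
alphas = attention_coefficients([1.0, 1.0, 2.0])
```

The node's new feature is then the alpha-weighted sum of its neighbors' (projected) features, so higher-scoring neighbors contribute more.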
Affiliation(s)
- Cuiping Shi
- College of Communication and Electronic Engineering, Qiqihar University, Qiqihar 161000, China
- Haiyang Wu
- College of Communication and Electronic Engineering, Qiqihar University, Qiqihar 161000, China
- Liguo Wang
- College of Information and Communication Engineering, Dalian Nationalities University, Dalian 116000, China

21
Zhang R, Wang L, Cheng S, Song S. MLP-based classification of COVID-19 and skin diseases. Expert Syst Appl 2023; 228:120389. [PMID: 37193247 PMCID: PMC10170962 DOI: 10.1016/j.eswa.2023.120389] [Received: 03/11/2023] [Revised: 05/03/2023] [Accepted: 05/04/2023] [Indexed: 05/18/2023]
Abstract
Recent years have witnessed a growing interest in neural network-based medical image classification methods, which have demonstrated remarkable performance in this field. Typically, convolutional neural network (CNN) architectures have been employed to extract local features. However, the transformer, a newly emerged architecture, has gained popularity due to its ability to explore the relevance of remote elements in an image through a self-attention mechanism. Despite this, it is crucial to establish not only local connectivity but also remote relationships between lesion features, and to capture the overall image structure, to improve image classification accuracy. Therefore, to tackle these issues, this paper proposes a network based on multilayer perceptrons (MLPs) that can learn the local features of medical images on the one hand and capture the overall feature information in both spatial and channel dimensions on the other, thus utilizing image features effectively. The method has been extensively validated on the COVID19-CT and ISIC 2018 datasets, and the results show that it is more competitive and achieves higher performance in medical image classification than existing methods. This shows that using MLPs to capture image features and establish connections between lesions is expected to provide novel ideas for medical image classification tasks in the future.
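The "spatial and channel dimensions" idea behind MLP-based vision models can be seen in miniature as two matrix products on a (tokens x channels) feature matrix: one weight matrix mixes information across spatial positions, the other across channels. A stripped-down linear sketch (real MLP-mixing layers add nonlinearities, normalization, and residual connections; the matrices here are toy values):

```python
def matmul(a, b):
    """Plain-Python matrix product of a (m x n) and b (n x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def mix(x, w_token, w_channel):
    """Linear sketch of MLP mixing on a (T x C) feature matrix:
    token mixing (w_token @ x) then channel mixing (.. @ w_channel)."""
    return matmul(matmul(w_token, x), w_channel)

x = [[1.0, 2.0],        # 2 spatial tokens x 2 channels
     [3.0, 4.0]]
w_tok = [[0.0, 1.0],    # swaps the two spatial tokens
         [1.0, 0.0]]
w_ch = [[1.0, 0.0],     # identity over channels
        [0.0, 1.0]]
y = mix(x, w_tok, w_ch)
```

Because `w_tok` acts across positions, every output token can depend on every input token, which is how MLP mixers capture global structure without attention.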
Affiliation(s)
- Ruize Zhang
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, Xinjiang, China
- Liejun Wang
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, Xinjiang, China
- Shuli Cheng
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, Xinjiang, China
- Shiji Song
- Department of Automation, Tsinghua University, Beijing, 100084, China

22
He Z, Li X, Chen Y, Lv N, Cai Y. Attention-based dual-path feature fusion network for automatic skin lesion segmentation. BioData Min 2023; 16:28. [PMID: 37807076 PMCID: PMC10561442 DOI: 10.1186/s13040-023-00345-x] [Received: 04/10/2023] [Accepted: 09/27/2023] [Indexed: 10/10/2023]
Abstract
Automatic segmentation of skin lesions is a critical step in computer-aided diagnosis (CAD) of melanoma. However, blurred lesion boundaries, uneven color distribution, and low image contrast often lead to poor segmentation results. To address the difficulty of segmenting skin lesions, this paper proposes an Attention-based Dual-path Feature Fusion Network (ADFFNet) for automatic skin lesion segmentation. Firstly, in the spatial path, a Boundary Refinement (BR) module is designed for the output of low-level features to filter out irrelevant background information and retain more boundary details of the lesion area. Secondly, in the context path, a Multi-scale Feature Selection (MFS) module is constructed for the high-level feature output to capture multi-scale context information and use the attention mechanism to filter out redundant semantic information. Finally, we design a Dual-path Feature Fusion (DFF) module, which uses high-level global attention information to guide the step-by-step fusion of high-level semantic features and low-level detail features, helping to restore image detail and further improve the pixel-level segmentation accuracy of skin lesions. In the experiments, the ISIC 2018 and PH2 datasets are employed to evaluate the effectiveness of the proposed method. It achieves 0.890/0.925 and 0.933/0.954 on the F1-score and SE index, respectively. Comparative analysis with state-of-the-art segmentation methods reveals that the ADFFNet algorithm exhibits superior segmentation performance.
Affiliation(s)
- Zhenxiang He
- School of Information Engineering, Southwest University of Science and Technology, Mianyang, China
- Tianfu College of Southwest University of Finance and Economics, Mianyang, China
- Xiaoxia Li
- School of Information Engineering, Southwest University of Science and Technology, Mianyang, China
- Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Mianyang, China
- Yuling Chen
- School of Information Engineering, Southwest University of Science and Technology, Mianyang, China
- Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Mianyang, China
- Nianzu Lv
- School of Information Engineering, Southwest University of Science and Technology, Mianyang, China
- Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Mianyang, China
- Yong Cai
- School of Manufacturing Science and Engineering, Southwest University of Science and Technology, Mianyang, China

23
Zou X, Ji Z, Zhang T, Huang T, Wu S. Visual information processing through the interplay between fine and coarse signal pathways. Neural Netw 2023; 166:692-703. [PMID: 37604078 DOI: 10.1016/j.neunet.2023.07.048] [Received: 11/23/2022] [Revised: 07/19/2023] [Accepted: 07/30/2023] [Indexed: 08/23/2023]
Abstract
Object recognition is often viewed as a feedforward, bottom-up process in machine learning, but in real neural systems, object recognition is a complicated process which involves the interplay between two signal pathways. One is the parvocellular pathway (P-pathway), which is slow and extracts fine features of objects; the other is the magnocellular pathway (M-pathway), which is fast and extracts coarse features of objects. It has been suggested that the interplay between the two pathways endows the neural system with the capacity of processing visual information rapidly, adaptively, and robustly. However, the underlying computational mechanism remains largely unknown. In this study, we build a two-pathway model to elucidate the computational properties associated with the interactions between two visual pathways. Specifically, we model two visual pathways using two convolution neural networks: one mimics the P-pathway, referred to as FineNet, which is deep, has small-size kernels, and receives detailed visual inputs; the other mimics the M-pathway, referred to as CoarseNet, which is shallow, has large-size kernels, and receives blurred visual inputs. We show that CoarseNet can learn from FineNet through imitation to improve its performance, FineNet can benefit from the feedback of CoarseNet to improve its robustness to noise; and the two pathways interact with each other to achieve rough-to-fine information processing. Using visual backward masking as an example, we further demonstrate that our model can explain visual cognitive behaviors that involve the interplay between two pathways. We hope that this study gives us insight into understanding the interaction principles between two visual pathways.
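The "blurred visual inputs" that the CoarseNet pathway receives can be produced by any low-pass operation, for example a simple mean filter. A toy sketch of generating a coarse input from a fine one (illustrative only; the paper's preprocessing may differ):

```python
def box_blur(img, k=3):
    """k x k mean filter with zero padding at the borders -- a crude
    stand-in for the blurred, coarse input an M-pathway-like net sees."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += img[ii][jj]
            out[i][j] = acc / (k * k)
    return out

fine = [[0.0, 0.0, 0.0],
        [0.0, 9.0, 0.0],     # a single sharp feature
        [0.0, 0.0, 0.0]]
coarse = box_blur(fine)      # the bright pixel spreads across the patch
```

The sharp detail is smeared into a broad, low-amplitude pattern: exactly the fast-but-coarse signal the model's M-pathway analogue operates on, while the fine original goes to the P-pathway analogue.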
Affiliation(s)
- Xiaolong Zou
- School of Psychological and Cognitive Sciences, IDG/McGovern Institute for Brain Research, Peking-Tsinghua Center for Life Sciences, Center of Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China; Beijing Academy of Artificial Intelligence, Beijing, China.
- Zilong Ji
- School of Psychological and Cognitive Sciences, IDG/McGovern Institute for Brain Research, Peking-Tsinghua Center for Life Sciences, Center of Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China; Institute of Cognitive Neuroscience, University College London, London, UK.
- Tianqiu Zhang
- School of Psychological and Cognitive Sciences, IDG/McGovern Institute for Brain Research, Peking-Tsinghua Center for Life Sciences, Center of Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China.
- Tiejun Huang
- Beijing Academy of Artificial Intelligence, Beijing, China; School of Computer Science, Peking University, Beijing, China.
- Si Wu
- School of Psychological and Cognitive Sciences, IDG/McGovern Institute for Brain Research, Peking-Tsinghua Center for Life Sciences, Center of Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China; Beijing Academy of Artificial Intelligence, Beijing, China.

24
Nawaz M, Uvaliyev A, Bibi K, Wei H, Abaxi SMD, Masood A, Shi P, Ho HP, Yuan W. Unraveling the complexity of Optical Coherence Tomography image segmentation using machine and deep learning techniques: A review. Comput Med Imaging Graph 2023; 108:102269. [PMID: 37487362 DOI: 10.1016/j.compmedimag.2023.102269] [Received: 05/03/2023] [Revised: 06/30/2023] [Accepted: 07/03/2023] [Indexed: 07/26/2023]
Abstract
Optical Coherence Tomography (OCT) is an emerging technology that provides three-dimensional images of the microanatomy of biological tissue in-vivo and at micrometer-scale resolution. OCT imaging has been widely used to diagnose and manage various medical diseases, such as macular degeneration, glaucoma, and coronary artery disease. Despite its wide range of applications, the segmentation of OCT images remains difficult due to the complexity of tissue structures and the presence of artifacts. In recent years, different approaches have been used for OCT image segmentation, such as intensity-based, region-based, and deep learning-based methods. This paper reviews the major advances in state-of-the-art OCT image segmentation techniques. It provides an overview of the advantages and limitations of each method and presents the most relevant research works related to OCT image segmentation. It also provides an overview of existing datasets and discusses potential clinical applications. Additionally, this review gives an in-depth analysis of machine learning and deep learning approaches for OCT image segmentation. It outlines challenges and opportunities for further research in this field.
Affiliation(s)
- Mehmood Nawaz
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Adilet Uvaliyev
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Khadija Bibi
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Hao Wei
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Sai Mu Dalike Abaxi
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Anum Masood
- Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Trondheim, Norway.
- Peilun Shi
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Ho-Pui Ho
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.
- Wu Yuan
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Hong Kong Special Administrative Region of China.

25
Wu Y, Zhao S, Qi S, Feng J, Pang H, Chang R, Bai L, Li M, Xia S, Qian W, Ren H. Two-stage contextual transformer-based convolutional neural network for airway extraction from CT images. Artif Intell Med 2023; 143:102637. [PMID: 37673569 DOI: 10.1016/j.artmed.2023.102637] [Received: 01/08/2023] [Revised: 06/14/2023] [Accepted: 08/11/2023] [Indexed: 09/08/2023]
Abstract
Accurate airway segmentation from computed tomography (CT) images is critical for planning navigation bronchoscopy and realizing quantitative assessment of airway-related chronic obstructive pulmonary disease (COPD). Existing methods struggle with airway segmentation, particularly for small airway branches; these difficulties arise from the constraints of limited labeling and lead to failure to meet clinical requirements in COPD. We propose a two-stage framework with a novel 3D contextual transformer for segmenting the overall airway and small airway branches from CT images. The method consists of two training stages sharing the same modified 3D U-Net. The novel 3D contextual transformer block is integrated into both the encoder and decoder paths of the network to effectively capture contextual and long-range information. In the first training stage, the network segments the overall airway, supervised by the overall airway mask. To further improve segmentation, we generate intrapulmonary airway branch labels and train the network to focus on producing small airway branches in the second training stage. Extensive experiments were performed on in-house and multiple public datasets. Quantitative and qualitative analyses demonstrate that our method extracts significantly more branches and longer airway-tree lengths while achieving state-of-the-art airway segmentation performance. The code is available at https://github.com/zhaozsq/airway_segmentation.
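Airway segmentation quality of the kind reported above is typically scored with the Dice similarity coefficient between the predicted and ground-truth masks. A small illustrative helper (the `dice` function here is a generic metric sketch, not the authors' evaluation code):

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

# Toy 2D masks: ground truth covers 4 pixels, prediction covers 6, 4 overlap.
gt = np.zeros((4, 4), dtype=bool); gt[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool); pred[1:3, 1:4] = True
score = dice(pred, gt)  # 2*4 / (6+4) = 0.8
```

For airway trees, such voxel-overlap scores are usually reported alongside branch counts and tree length, since Dice alone under-weights the thin small branches the second training stage targets.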
Affiliation(s)
- Yanan Wu
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China.
- Shuiqing Zhao
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China.
- Shouliang Qi
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China.
- Jie Feng
- School of Chemical Equipment, Shenyang University of Technology, Liaoyang, China.
- Haowen Pang
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China.
- Runsheng Chang
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China.
- Long Bai
- Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China.
- Mengqi Li
- Department of Respiratory, the Second Affiliated Hospital of Dalian Medical University, Dalian, China.
- Shuyue Xia
- Respiratory Department, Central Hospital Affiliated to Shenyang Medical College, Shenyang, China.
- Wei Qian
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China.
- Hongliang Ren
- Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China.

26
Song Y, Sun X, Ding L, Peng J, Song L, Zhang X. AHI estimation of OSAHS patients based on snoring classification and fusion model. Am J Otolaryngol 2023; 44:103964. [PMID: 37392727 DOI: 10.1016/j.amjoto.2023.103964] [Received: 03/16/2023] [Revised: 06/13/2023] [Accepted: 06/17/2023] [Indexed: 07/03/2023]
Abstract
Obstructive sleep apnea-hypopnea syndrome (OSAHS) is a common chronic sleep-breathing disease that can negatively affect patients' lives and cause serious concomitant diseases. Polysomnography (PSG) is the gold standard for diagnosing OSAHS, but it is expensive and requires overnight hospitalization. Snoring is a typical symptom of OSAHS. This study proposes an effective OSAHS screening method based on snoring sound analysis. Snores were labeled as OSAHS-related snoring sounds or simple snoring sounds according to real-time PSG records. Three models were used: acoustic features combined with XGBoost, Mel-spectrograms combined with a convolutional neural network (CNN), and Mel-spectrograms combined with a residual neural network (ResNet). The three models were then fused by soft voting to detect these two types of snoring sounds, and the subject's apnea-hypopnea index (AHI) was estimated from the recognized snoring sounds. The fusion model achieved an accuracy of 83.44% and a recall of 85.27%, and the predicted AHI has a Pearson correlation coefficient of 0.913 (R2 = 0.834, p < 0.001) with PSG. The results demonstrate the validity of predicting AHI from snoring sound analysis and show great potential for monitoring OSAHS at home.
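Soft voting, as used for the fusion model, simply averages the class probabilities of the three base models before taking the argmax; the AHI is then events per hour of sleep. A toy sketch with made-up probabilities (the per-event-counting step is an illustrative simplification of the paper's AHI estimation):

```python
import numpy as np

# Hypothetical per-snore class probabilities [simple, OSAHS-related]
# from the three base models (XGBoost, CNN, ResNet) for four snore events.
p_xgb = np.array([[0.8, 0.2], [0.4, 0.6], [0.3, 0.7], [0.9, 0.1]])
p_cnn = np.array([[0.7, 0.3], [0.2, 0.8], [0.4, 0.6], [0.8, 0.2]])
p_res = np.array([[0.9, 0.1], [0.3, 0.7], [0.2, 0.8], [0.7, 0.3]])

p_fused = (p_xgb + p_cnn + p_res) / 3.0  # soft voting: average the probabilities
labels = p_fused.argmax(axis=1)          # 1 = OSAHS-related snore

# AHI = (apnea + hypopnea events) / hours of sleep; here each OSAHS-related
# snore is assumed to mark one respiratory event (illustrative simplification).
hours_of_sleep = 0.5
ahi = labels.sum() / hours_of_sleep
```

Averaging probabilities (rather than hard majority voting) lets a confident model outvote two uncertain ones, which is usually why soft voting is preferred for probability-calibrated classifiers.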
Affiliation(s)
- Yujun Song
- School of Physics and Optoelectronics, South China University of Technology, Guangzhou 510640, China.
- Xiaoran Sun
- School of Physics and Optoelectronics, South China University of Technology, Guangzhou 510640, China.
- Li Ding
- School of Physics and Optoelectronics, South China University of Technology, Guangzhou 510640, China.
- Jianxin Peng
- School of Physics and Optoelectronics, South China University of Technology, Guangzhou 510640, China.
- Lijuan Song
- State Key Laboratory of Respiratory Disease, Department of Otolaryngology-Head and Neck Surgery, Laboratory of ENT-HNS Disease, First Affiliated Hospital, Guangzhou Medical University, Guangzhou 510120, China.
- Xiaowen Zhang
- State Key Laboratory of Respiratory Disease, Department of Otolaryngology-Head and Neck Surgery, Laboratory of ENT-HNS Disease, First Affiliated Hospital, Guangzhou Medical University, Guangzhou 510120, China.

27
Gendy G, Sabor N, He G. Lightweight image super-resolution based multi-order gated aggregation network. Neural Netw 2023; 166:286-295. [PMID: 37531728 DOI: 10.1016/j.neunet.2023.07.002] [Received: 02/08/2023] [Revised: 04/25/2023] [Accepted: 07/04/2023] [Indexed: 08/04/2023]
Abstract
Recently, Transformer-based models have attracted much attention for image super-resolution (SR) because of their strong performance. However, these models incur a huge computational cost when computing the self-attention mechanism. To address this problem, we propose a multi-order gated aggregation super-resolution network (MogaSRN) for low-level vision, based on the concept of MogaNet, which was developed for high-level vision. MogaSRN is built on spatial multi-order context aggregation and adaptive channel-wise reallocation with the aid of a multi-layer perceptron (MLP). In contrast to MogaNet, in which the resolution of each stage is decreased by a factor of 2, the resolution in MogaSRN stays fixed during deep feature extraction. Moreover, the structure of MogaSRN is designed to balance performance against model complexity. We evaluated our model on five benchmark datasets and found that MogaSRN achieves significant improvements over the state of the art, with good visual quality and reconstruction accuracy. Finally, our model has a 3.7x faster runtime at scale x4 than LWSwinIR, with better performance.
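SR networks that keep resolution fixed during feature extraction, as MogaSRN does, typically upsample once at the tail; a common mechanism for that final step is sub-pixel (pixel-shuffle) rearrangement, which trades channels for spatial resolution. A minimal numpy sketch of that generic tail operation (the abstract does not state MogaSRN's upsampler, so this is an assumption about a typical design):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) feature tensor into a (C, H*r, W*r) image:
    each output r x r block is filled from r*r consecutive channel groups."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)    # split channels into (r, r) sub-pixel offsets
    x = x.transpose(0, 3, 1, 4, 2)  # -> (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)

feat = np.arange(16, dtype=float).reshape(4, 2, 2)  # 4 channels, 2x2 feature map
img = pixel_shuffle(feat, 2)                        # 1 channel, 4x4 output
```

Keeping spatial resolution fixed until this single rearrangement is cheaper than repeatedly upsampling inside the body of the network.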
Affiliation(s)
- Garas Gendy
- Department of Micro-Nano Electronics, Shanghai Jiao Tong University, Shanghai 200240, China.
- Nabil Sabor
- Electrical Engineering Department, Faculty of Engineering, Assiut University, Assiut 71516, Egypt.
- Guanghui He
- Department of Micro-Nano Electronics, Shanghai Jiao Tong University, Shanghai 200240, China.

28
Wang H, Jin X, Zhang T, Wang J. Convolutional neural network-based recognition method for volleyball movements. Heliyon 2023; 9:e18124. [PMID: 37533999 PMCID: PMC10391947 DOI: 10.1016/j.heliyon.2023.e18124] [Received: 03/27/2023] [Revised: 07/07/2023] [Accepted: 07/07/2023] [Indexed: 08/04/2023] Open
Abstract
With the development of network technology and intelligent computer monitoring, large volumes of video data have emerged. For analyzing specific targets in video, traditional manual analysis can no longer meet current needs. In college volleyball teaching, because each student moves differently, intelligent processing of video data has become a key issue. A survey of research on behavior recognition shows that the deep learning architecture applied in this paper is representative: its strong learning ability allows it to recognize human movements more accurately than traditional algorithms, yet research on volleyball action recognition remains scarce. This paper therefore constructs a dataset and improves a convolutional neural network model; new models are built on the neural network structure to improve the accuracy of its nonlinear expression and to optimize the input data. To analyze the algorithm's effectiveness more accurately, new data were obtained by grouping the volleyball games in college sports courses. Compared with the original network, the improved 3D network raises accuracy by 3.3% to 88.5% and reduces complexity by 33.6%.
Affiliation(s)
- Hua Wang
- Physical Education Department, Xinxiang Institute of Engineering, Xinxiang, 453000, China.
- Xiaojiao Jin
- Physical Education Department, Xingtai Medical College, Xingtai, 054000, China.
- Tianyang Zhang
- Sports Work Department, Hebei Vocational University of Technology and Engineering, Xingtai, 054000, China.
- Jianbin Wang
- Physical Education Department, Xingtai Medical College, Xingtai, 054000, China.

29
Lyu H, Li X, Zhang J, Zhou C, Tang X, Xu F, Yang Y, Huang Q, Xiang W, Li D. Automated inter-patient arrhythmia classification with dual attention neural network. Comput Methods Programs Biomed 2023; 236:107560. [PMID: 37116424 DOI: 10.1016/j.cmpb.2023.107560] [Received: 08/20/2022] [Revised: 04/13/2023] [Accepted: 04/18/2023] [Indexed: 05/21/2023]
Abstract
BACKGROUND AND OBJECTIVES Arrhythmia classification based on electrocardiograms (ECG) can enhance clinical diagnostic efficiency. However, because of the large differences in the number of heartbeats across categories, the performance on classes with fewer samples has not met expectations under the inter-patient paradigm. This paper aims to mitigate the adverse effects of category imbalance and improve arrhythmia classification performance. METHODS We constructed a novel dual attention hybrid network (DA-Net) for arrhythmia classification under sample imbalance, based on modified convolutional networks with channel attention (MCC-Net) and a sequence-to-sequence network with global attention (Seq2Seq). The refined local features of the input heartbeat are first extracted by MCC-Net and then sent to Seq2Seq for further feature fusion. By applying local and global attention in the feature extraction and fusion parts, respectively, the method fully fuses low-level feature details with high-level context information and enhances the ability to extract discriminative features. RESULTS On the MIT-BIH arrhythmia database, under the inter-patient paradigm and without any data augmentation, the proposed method achieved 99.98% accuracy (ACC) for five categories. The per-class performance is as follows: Class N: sensitivity (SEN) = 99.96%, specificity (SPEC) = 99.93%, positive predictive value (PPV) = 99.99%; Class S: SEN = 99.67%, SPEC = 99.98%, PPV = 99.56%; Class V: SEN = 100%, SPEC = 99.99%, PPV = 99.91%; Class F: SEN = 100%, SPEC = 97.17%, PPV = 99.98%. In further experiments simulating extreme cases, the model still achieved ACC of 99.54% and 98.91% on the three-category and five-category tasks, respectively, when the training sample size was much smaller than the test set.
CONCLUSIONS Without any data augmentation, the proposed model not only alleviates the negative impact of class imbalance and achieves excellent performance in all categories but also provides a new approach for dealing with class imbalance in arrhythmia classification. Additionally, our method demonstrates potential in conditions with fewer samples.
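The per-class figures reported above (SEN, SPEC, PPV) are all derived from a one-vs-rest confusion matrix. A small helper showing the definitions, with illustrative counts that are not from the paper:

```python
def binary_metrics(tp, fn, tn, fp):
    """Per-class sensitivity, specificity, and positive predictive value
    from one-vs-rest confusion counts."""
    sen = tp / (tp + fn)    # sensitivity (recall): detected positives
    spec = tn / (tn + fp)   # specificity: correctly rejected negatives
    ppv = tp / (tp + fp)    # positive predictive value (precision)
    return sen, spec, ppv

# Illustrative confusion counts for one heartbeat class (hypothetical numbers)
sen, spec, ppv = binary_metrics(tp=95, fn=5, tn=890, fp=10)
```

Reporting all three together matters for imbalanced data: accuracy alone can stay high even when a minority class (like Class F here) is poorly detected.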
Affiliation(s)
- He Lyu
- Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission (Southwest Minzu University), Chengdu, China.
- Xiangkui Li
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, 37 Guoxue Alley, Chengdu 610041, China.
- Jian Zhang
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, 37 Guoxue Alley, Chengdu 610041, China.
- Chenchen Zhou
- Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission (Southwest Minzu University), Chengdu, China.
- Xuezhi Tang
- Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission (Southwest Minzu University), Chengdu, China.
- Fanxin Xu
- Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission (Southwest Minzu University), Chengdu, China.
- Ye Yang
- Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission (Southwest Minzu University), Chengdu, China.
- Qinzhen Huang
- Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission (Southwest Minzu University), Chengdu, China.
- Wei Xiang
- Key Laboratory of Electronic and Information Engineering, State Ethnic Affairs Commission (Southwest Minzu University), Chengdu, China.
- Dong Li
- Division of Hospital Medicine, Emory School of Medicine, 201 Dowman Dr, Atlanta, GA 30322, USA.

30
Song E, Long J, Ma G, Liu H, Hung CC, Jin R, Wang P, Wang W. Prostate lesion segmentation based on a 3D end-to-end convolution neural network with deep multi-scale attention. Magn Reson Imaging 2023; 99:98-109. [PMID: 36681311 DOI: 10.1016/j.mri.2023.01.015] [Received: 10/13/2021] [Revised: 07/06/2022] [Accepted: 01/14/2023] [Indexed: 01/20/2023]
Abstract
Prostate cancer is one of the deadliest cancers in humans. To better diagnose prostate cancer, prostate lesion segmentation is very important, but progress has been slow because prostate lesions are small in size, irregular in shape, and blurred in contour. Automatic prostate lesion segmentation from mp-MRI is therefore significant and challenging work. Most existing multi-step segmentation methods based on voxel-level classification are time-consuming and may introduce errors at different steps that accumulate. To decrease computation time, harness richer 3D spatial features, and fuse the multi-level contextual information of mp-MRI, we present an automatic segmentation method in which all steps are optimized jointly as one step, forming an end-to-end convolutional neural network. The proposed end-to-end network, DMSA-V-Net, consists of two parts: (1) a 3D V-Net used as the backbone network, the first attempt at employing a 3D convolutional neural network for CS prostate lesion segmentation; and (2) a deep multi-scale attention mechanism introduced into the 3D V-Net, which focuses strongly on the ROI while suppressing the redundant background. As a merit, the attention can adaptively re-align the context information between feature maps at different scales and the high-level saliency maps. We performed experiments with five-fold cross-validation on data from 97 patients. The Dice and sensitivity are 0.7014 and 0.8652, respectively, demonstrating that our segmentation approach is more accurate than other methods.
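The "focus on the ROI while suppressing background" behavior of attention mechanisms like the one above is usually realized by multiplying a feature map with a sigmoid-activated saliency map. A schematic numpy sketch (the multiplicative gating form is an assumption about the general technique, not the paper's exact DMSA design):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_gate(feature, saliency_logits):
    """Re-weight a feature map with a sigmoid saliency map so responses
    inside the ROI are kept and background responses are suppressed."""
    gate = sigmoid(saliency_logits)  # values in (0, 1)
    return feature * gate

feat = np.ones((2, 4, 4))            # toy 3D feature block (channels, H, W)
logits = np.full((2, 4, 4), -10.0)   # strongly negative = background ...
logits[:, 1:3, 1:3] = 10.0           # ... strongly positive = central ROI
gated = attention_gate(feat, logits)
```

After gating, background activations are driven near zero while ROI activations pass through almost unchanged, which is what lets subsequent layers concentrate on the lesion.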
Affiliation(s)
- Enmin Song
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
- Jiaosong Long
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
- Guangzhi Ma
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
- Hong Liu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
- Chih-Cheng Hung
- College of Computing and Software Engineering, Kennesaw State University, Atlanta, USA.
- Renchao Jin
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
- Peijun Wang
- Department of Radiology, Tongji Hospital, School of Medicine, Tongji University, Shanghai 200065, China.
- Wei Wang
- Department of Radiology, Tongji Hospital, School of Medicine, Tongji University, Shanghai 200065, China.

31
Shu YC, Lo YC, Chiu HC, Chen LR, Lin CY, Wu WT, Özçakar L, Chang KV. Deep learning algorithm for predicting subacromial motion trajectory: Dynamic shoulder ultrasound analysis. Ultrasonics 2023; 134:107057. [PMID: 37290256 DOI: 10.1016/j.ultras.2023.107057] [Received: 10/12/2022] [Revised: 05/14/2023] [Accepted: 05/24/2023] [Indexed: 06/10/2023]
Abstract
Subacromial motion metrics can be extracted from dynamic shoulder ultrasonography and are useful for identifying abnormal motion patterns in painful shoulders. However, frame-by-frame manual labeling of anatomical landmarks in ultrasound images is time-consuming. The present study investigates the feasibility of a deep learning algorithm for extracting subacromial motion metrics from dynamic ultrasonography. Dynamic ultrasound imaging was acquired by asking 17 participants to perform cyclic shoulder abduction and adduction along the scapular plane, whereby the trajectory of the humeral greater tubercle (in relation to the lateral acromion) was depicted by the deep learning algorithm. The subacromial motion metrics were extracted using a convolutional neural network (CNN) or a self-transfer learning-based (STL)-CNN, with or without an autoencoder (AE). The mean absolute error (MAE) against the manually labeled data (ground truth) served as the main outcome variable. Using eight-fold cross-validation, the average MAE was significantly higher with CNN than with STL-CNN or STL-CNN+AE for the relative difference between the greater tubercle and lateral acromion on the horizontal axis. The MAE for localizing the two landmarks on the vertical axis also appeared larger with CNN than with STL-CNN. In the testing dataset, the errors relative to the ground truth for the minimal vertical acromiohumeral distance were 0.081-0.333 cm using CNN, compared with 0.002-0.007 cm using STL-CNN. We successfully demonstrated the feasibility of a deep learning algorithm for automatic detection of the greater tubercle and lateral acromion during dynamic shoulder ultrasonography. Our framework also captured the minimal vertical acromiohumeral distance, the most important indicator of subacromial motion in daily clinical practice.
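The outcome measure above, MAE against manually labeled landmarks, is just the mean of per-frame absolute coordinate differences. A minimal sketch with hypothetical trajectory values (not data from the study):

```python
def mean_absolute_error(pred, truth):
    """Frame-wise mean absolute error between predicted and manually
    labeled landmark coordinates (e.g. in cm)."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)

# Hypothetical horizontal positions of the greater tubercle over 5 frames (cm)
pred = [1.20, 1.35, 1.52, 1.41, 1.28]
truth = [1.18, 1.38, 1.50, 1.45, 1.30]
mae = mean_absolute_error(pred, truth)
```

Because the metric is averaged per frame, a model can only achieve the sub-millimeter errors reported for STL-CNN by tracking the landmark consistently through the whole abduction-adduction cycle.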
Affiliation(s)
- Yi-Chung Shu
- Institute of Applied Mechanics, College of Engineering, National Taiwan University, Taipei, Taiwan.
- Yu-Cheng Lo
- Institute of Applied Mechanics, College of Engineering, National Taiwan University, Taipei, Taiwan.
- Hsiao-Chi Chiu
- Institute of Applied Mechanics, College of Engineering, National Taiwan University, Taipei, Taiwan.
- Lan-Rong Chen
- Department of Physical Medicine and Rehabilitation and Community and Geriatric Research Center, National Taiwan University Hospital, Bei-Hu Branch, Taipei, Taiwan.
- Che-Yu Lin
- Institute of Applied Mechanics, College of Engineering, National Taiwan University, Taipei, Taiwan.
- Wei-Ting Wu
- Department of Physical Medicine and Rehabilitation and Community and Geriatric Research Center, National Taiwan University Hospital, Bei-Hu Branch, Taipei, Taiwan; Department of Physical Medicine and Rehabilitation, National Taiwan University College of Medicine, Taipei, Taiwan.
- Levent Özçakar
- Department of Physical and Rehabilitation Medicine, Hacettepe University Medical School, Ankara, Turkey.
- Ke-Vin Chang
- Department of Physical Medicine and Rehabilitation and Community and Geriatric Research Center, National Taiwan University Hospital, Bei-Hu Branch, Taipei, Taiwan; Department of Physical Medicine and Rehabilitation, National Taiwan University College of Medicine, Taipei, Taiwan; Center for Regional Anesthesia and Pain Medicine, Wang-Fang Hospital, Taipei Medical University, Taipei, Taiwan.

32
ELsayed Y, ELSayed A, Abdou MA. An automatic improved facial expression recognition for masked faces. Neural Comput Appl 2023; 35:14963-14972. [PMID: 37274419 PMCID: PMC10067009 DOI: 10.1007/s00521-023-08498-w] [Received: 08/12/2022] [Accepted: 03/16/2023] [Indexed: 04/05/2023]
Abstract
Automatic facial expression recognition (AFER), sometimes referred to as emotion recognition, is important for socializing. In the past two years, automatic methods have faced challenges due to Covid-19 and the widespread wearing of masks. Machine learning techniques can process tremendously increased amounts of data and have achieved good emotion-detection results in AFER; however, these techniques are not designed for masked faces and thus achieve poor recognition on them. This paper introduces a hybrid convolutional neural network aided by a local binary pattern to extract features accurately, especially for masked faces. The seven basic emotions to be recognized are anger, happiness, sadness, surprise, contempt, disgust, and fear. The proposed method is applied to two datasets: the first comprises CK and CK+, and the second is M-LFW-FER. The results show that emotion recognition with a face mask achieved an accuracy of 70.76% on three emotions. Results are compared with existing techniques and show significant improvement.
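The local binary pattern (LBP) operator used to aid the CNN encodes local texture by thresholding each pixel's 8 neighbours against the centre and packing the comparisons into a byte. A minimal sketch of the classic operator for a single 3x3 patch (the neighbour ordering is one common convention, assumed here):

```python
import numpy as np

def lbp_code(patch):
    """Classic 8-neighbour local binary pattern for the centre pixel of a
    3x3 patch: each neighbour >= centre sets one bit, read clockwise from
    the top-left corner."""
    center = patch[1, 1]
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum((1 << i) for i, v in enumerate(neighbours) if v >= center)

patch = np.array([[9, 1, 9],
                  [1, 5, 9],
                  [1, 1, 9]])
code = lbp_code(patch)  # bits 0, 2, 3, 4 set -> 1 + 4 + 8 + 16 = 29
```

Because the code depends only on relative intensities, LBP features are robust to monotonic illumination changes, which helps when most of the face is occluded by a mask and only the eye region carries texture.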
Affiliation(s)
- Yasmeen ELsayed
- Faculty of Science, Alexandria University, Alexandria, Egypt.
- Ashraf ELSayed
- Faculty of Science, Alexandria University, Alexandria, Egypt.
- Faculty of Computer Science and Engineering, Alamein International University, El Alamein, Egypt.

33
Wei J, Liu G, Liu S, Xiao Z. A novel algorithm for small object detection based on YOLOv4. PeerJ Comput Sci 2023; 9:e1314. [PMID: 37346537 PMCID: PMC10280595 DOI: 10.7717/peerj-cs.1314] [Received: 11/30/2022] [Accepted: 03/06/2023] [Indexed: 06/23/2023]
Abstract
Small object detection remains one of the difficulties in computer vision, especially against complex image backgrounds, where detection accuracy still needs improvement. In this article, we present a small object detection network based on YOLOv4 that addresses obstacles hindering traditional methods in complex road environments, such as few effective features, image noise, and occlusion by large objects, and that improves the detection of small objects in complex-background settings such as drone aerial survey images. The improved architecture reduces the computation and GPU memory consumption of the network by incorporating the cross-stage partial network (CSPNet) structure into the spatial pyramid pooling (SPP) structure of YOLOv4 and into the convolutional layers after the concatenation operation. Second, accuracy on small object detection is improved by adding a detection head better suited to small objects and removing one used for large object detection. A new branch is then added to extract feature information from a shallow location in the backbone, and this information is fused in the neck to enrich the small-object location information extracted by the model; when fusing feature information from different backbone levels, a weighting mechanism increases the fusion weight of useful information to improve detection performance at each scale. Finally, a coordinate attention (CA) module is embedded at a suitable location in the neck, enabling the model to focus on spatial location relationships and inter-channel relationships and enhancing its feature representation capability. The proposed model was tested on detecting 10 different target objects in drone aerial images and five road traffic signal signs in images taken from vehicles in a complex road environment. Its detection speed meets real-time criteria, it outperforms existing state-of-the-art detection models in accuracy, and it has only 44M parameters. On the drone aerial photography dataset, YOLOv4 and YOLOv5L achieve a mean average precision (mAP) of 42.79% and 42.10%, respectively, while our model achieves 52.76%; on the urban road traffic light dataset, the proposed model achieves 96.98%, again better than YOLOv4 (95.32%), YOLOv5L (94.79%) and other advanced models. This work provides an efficient method for small object detection in complex road environments that can be extended to scenarios such as drone cruising and autonomous driving.
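The abstract does not give the exact form of the weighting mechanism used when fusing features from different backbone levels; a minimal numpy sketch of one common formulation (a ReLU-constrained normalized weighted sum over same-shape feature maps, with hypothetical weight values) is:

```python
import numpy as np

def weighted_fusion(features, weights, eps=1e-4):
    """Fuse same-shape feature maps with non-negative per-level weights.

    fused = sum_i(w_i * F_i) / (sum_i w_i + eps), so each level's
    contribution is a learnable relative importance.
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # keep weights >= 0
    num = sum(wi * f for wi, f in zip(w, features))
    return num / (w.sum() + eps)

# Two feature maps from different levels (already resized to the same shape).
f_shallow = np.ones((4, 4))        # rich in small-object location detail
f_deep = np.full((4, 4), 3.0)      # rich in semantics
fused = weighted_fusion([f_shallow, f_deep], weights=[2.0, 1.0])
```

The `eps` term keeps the division stable when all weights are driven to zero during training.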
34
Li J, Sun W, von Deneen KM, Fan X, An G, Cui G, Zhang Y. MG-Net: Multi-level global-aware network for thymoma segmentation. Comput Biol Med 2023; 155:106635. PMID: 36791547; DOI: 10.1016/j.compbiomed.2023.106635.
Abstract
BACKGROUND AND OBJECTIVE Automatic thymoma segmentation in preoperative contrast-enhanced computed tomography (CECT) images is of great value for diagnosis. Although convolutional neural networks (CNNs) are distinguished in medical image segmentation, they are challenged by thymomas of varying shapes, scales and textures, owing to the intrinsic locality of convolution operations. To overcome this deficit, we built a deep learning network with enhanced global awareness for thymoma segmentation. METHODS We propose a multi-level global-aware network (MG-Net) for thymoma segmentation, in which multi-level feature interaction and integration are jointly designed to enhance the global awareness of CNNs. In particular, we design the cross-attention block (CAB) to calculate pixel-wise interactions of multi-level features, resulting in the Global Enhanced Convolution Block, which enables the network to handle various thymomas by strengthening the global awareness of the encoder. We further devise the Global Spatial Attention Module to integrate coarse- and fine-grained information, enhancing the semantic consistency between the encoder and decoder with CABs. We also develop an Adaptive Attention Fusion Module to adaptively aggregate features of different semantic scales in the decoder to preserve comprehensive details. RESULTS MG-Net has been evaluated against several state-of-the-art models on a self-collected CECT dataset and the NIH Pancreas-CT dataset. The results suggest that all designed components are effective and that MG-Net has superior segmentation performance and generalization ability over existing models. CONCLUSION Both the qualitative and quantitative experimental results indicate that our MG-Net, with its global-aware ability, achieves accurate thymoma segmentation and generalizes across different tasks. The code is available at: https://github.com/Leejyuan/MGNet.
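The paper's cross-attention block computes pixel-wise interactions between feature levels; a minimal numpy sketch of the underlying idea (scaled dot-product attention between flattened feature maps, with toy shapes and random features standing in for real encoder outputs) is:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_feat, kv_feat, d):
    """Pixel-wise cross-attention between two feature levels.

    q_feat: (N, d) query positions from one level; kv_feat: (M, d)
    key/value positions from another. Every query attends over all key
    positions, which is what gives the block its global receptive field.
    """
    scores = q_feat @ kv_feat.T / np.sqrt(d)   # (N, M) pairwise interactions
    attn = softmax(scores, axis=-1)            # each row sums to 1
    return attn @ kv_feat                      # (N, d) globally mixed features

rng = np.random.default_rng(0)
q = rng.standard_normal((16, 8))   # 4x4 spatial positions flattened, 8 channels
kv = rng.standard_normal((64, 8))  # 8x8 positions from a finer level
out = cross_attention(q, kv, d=8)
```

A real CAB would add learned query/key/value projections; this sketch only shows why the operation escapes convolution's locality.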
Affiliation(s)
- Jingyuan Li
- Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
- Wenfang Sun
- International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China; School of Aerospace Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China.
- Karen M von Deneen
- Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
- Xiao Fan
- Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
- Gang An
- Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China
- Guangbin Cui
- Department of Radiology, Tangdu Hospital, Fourth Military Medical University, Xi'an, Shaanxi, 710038, China.
- Yi Zhang
- Center for Brain Imaging, School of Life Science and Technology, Xidian University & Engineering Research Center of Molecular and Neuro Imaging, Ministry of Education, Xi'an, Shaanxi, 710126, China; International Joint Research Center for Advanced Medical Imaging and Intelligent Diagnosis and Treatment & Xi'an Key Laboratory of Intelligent Sensing and Regulation of Trans-Scale Life Information, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi, 710126, China.
35
Shamshiri MA, Krzyżak A, Kowal M, Korbicz J. Compatible-domain Transfer Learning for Breast Cancer Classification with Limited Annotated Data. Comput Biol Med 2023; 154:106575. PMID: 36758326; DOI: 10.1016/j.compbiomed.2023.106575.
Abstract
Microscopic analysis of breast cancer images is the primary task in diagnosing cancer malignancy. Recent attempts to automate this task have employed deep learning models, whose success depends on large volumes of data, while acquiring annotated data in biomedical domains is time-consuming and not always feasible. A typical strategy to address this is transfer learning with models pre-trained on a large natural image database (e.g., ImageNet) instead of training a model from scratch. This approach, however, has not been effective in several previous studies due to fundamental differences between natural and medical images. In this study, we propose for the first time the idea of using a compatible dataset of histopathological images to classify breast cancer cytological biopsy specimens. Despite intrinsic differences between histopathological and cytological images, we demonstrate that the features learned by deep networks during pre-training are compatible with those obtained throughout the fine-tuning process. To thoroughly investigate this assertion, we explore three different training strategies as well as two different approaches to fine-tuning deep learning models. Comparing the obtained results with previous state-of-the-art research on the same dataset, we demonstrate that the proposed method improves classification accuracy by 6% to 17% over studies based on traditional machine learning techniques, and by roughly 7% over those using deep learning methods, eventually achieving 98.73% validation accuracy and 94.55% test accuracy. Exploring different training scenarios also revealed that using a compatible dataset elevated classification accuracy by 3.0% compared to the typical approach of using ImageNet. Experimental results show that our approach, despite using a very small number of training images, achieves performance comparable to that of experienced pathologists and has the potential to be applied in clinical settings.
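The abstract does not spell out the fine-tuning approaches it compares; a minimal numpy sketch of the frozen-backbone variant (training only a new logistic-regression head on fixed pre-trained features, with synthetic stand-in embeddings for the two classes) is:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_head(features, labels, lr=0.5, epochs=200):
    """Train only a logistic-regression head on frozen features.

    Mirrors frozen-backbone fine-tuning: the pre-trained weights that
    produced `features` stay fixed; only the new classifier is learned.
    """
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(features @ w + b)
        grad = p - labels                       # dL/dlogit for cross-entropy
        w -= lr * features.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b

# Toy "frozen features": two separable groups standing in for embeddings
# of benign vs malignant specimens.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.2, (30, 4)), rng.normal(1, 0.2, (30, 4))])
y = np.array([0] * 30 + [1] * 30)
w, b = train_head(X, y)
acc = ((sigmoid(X @ w + b) > 0.5) == y).mean()
```

The alternative approach, unfreezing some or all backbone layers, would update `features` jointly with the head and is not shown here.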
Affiliation(s)
- Mohammad Amin Shamshiri
- Department of Computer Science and Software Engineering, Concordia University, Montreal, H3G 1M8, Canada.
- Adam Krzyżak
- Department of Computer Science and Software Engineering, Concordia University, Montreal, H3G 1M8, Canada
- Marek Kowal
- Institute of Control and Computation Engineering, University of Zielona Góra, Zielona Góra, Poland
- Józef Korbicz
- Institute of Control and Computation Engineering, University of Zielona Góra, Zielona Góra, Poland
36
Kong X, Zhang K. A novel text sentiment analysis system using improved depthwise separable convolution neural networks. PeerJ Comput Sci 2023; 9:e1236. PMID: 37346624; PMCID: PMC10280403; DOI: 10.7717/peerj-cs.1236.
Abstract
Human behavior is strongly affected by emotions, so classifying the emotions expressed in text can help predict the behavior of target groups and support decision-making; effective emotion classification technology can therefore produce substantial social and economic benefits. However, with the rapid development of the Internet, the volume of text generated online grows so quickly that the previous approach of manually classifying texts one by one can no longer meet practical needs. One of the most pressing problems in sentiment analysis is how to use computer technology to extract emotional tendencies from text data more efficiently and accurately. Currently available deep learning algorithms for text-based sentiment analysis have two primary issues: the high complexity of training the model, and the failure to account for all aspects of language or to make full use of word vector information. This research employs an upgraded convolutional neural network (CNN) model to address these challenges. First, a text separable convolution algorithm performs hierarchical convolution on text features to refine the extraction of word vector and context information, which avoids semantic confusion and reduces the complexity of the convolutional network. Second, the text separable convolution algorithm is applied to text sentiment analysis, and an improved CNN is proposed. Compared with other models, the proposed model performs better on text-based sentiment analysis tasks.
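The complexity reduction that motivates depthwise separable convolution can be made concrete by counting weights. A short sketch (standard kernel-size and channel values chosen for illustration, not taken from the paper):

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Depthwise (one k x k filter per input channel) + pointwise (1 x 1)."""
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 128, 128
standard = conv_params(k, c_in, c_out)        # 147456 weights
separable = separable_params(k, c_in, c_out)  # 17536 weights
reduction = standard / separable              # roughly 8x fewer weights
```

Splitting spatial filtering from channel mixing is what lets the improved network extract word-vector and context information in separate, cheaper stages.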
Affiliation(s)
- Xiaoyu Kong
- Wuxi Vocational Institute of Commerce, Wuxi, Jiangsu, China
- Ke Zhang
- Wuxi Vocational Institute of Commerce, Wuxi, Jiangsu, China
37
Meroueh C, Chen ZE. Artificial intelligence in anatomical pathology: building a strong foundation for precision medicine. Hum Pathol 2023; 132:31-38. PMID: 35870567; DOI: 10.1016/j.humpath.2022.07.008.
Abstract
With the convergence of digital pathology (DP) and artificial intelligence (AI), anatomic pathology practice is experiencing an exciting paradigm shift. Pathologists will gain an augmented ability to improve diagnostic accuracy, efficiency, and consistency. Subvisual morphometric features will be discovered and multiomics data integrated to provide better prognostic and theragnostic information to guide individual patient management. The outlook for precision medicine is promising. However, many challenges remain before AI-assisted DP diagnostic workflows can be successfully implemented. Herein, we briefly review examples of AI application in anatomic pathology, with an emphasis on the subspecialty of gastrointestinal pathology, and discuss potential challenges for clinical implementation.
Affiliation(s)
- Chady Meroueh
- Division of Anatomic Pathology, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, 55905, USA
- Zongming Eric Chen
- Division of Anatomic Pathology, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, 55905, USA.
38
Abunasser BS, Al-Hiealy MRJ, Zaqout IS, Abu-Naser SS. Convolution Neural Network for Breast Cancer Detection and Classification Using Deep Learning. Asian Pac J Cancer Prev 2023; 24:531-544. PMID: 36853302; PMCID: PMC10162639; DOI: 10.31557/apjcp.2023.24.2.531.
Abstract
OBJECTIVE Early detection and precise diagnosis of breast cancer (BC) play an essential part in enhancing diagnosis and improving the breast cancer survival rate of patients from 30% to 50%. Through advances in healthcare technology, deep learning takes a significant role in handling and inspecting large numbers of X-ray, MRI, and CT images. The aim of this study is to propose a deep learning model (BCCNN) to detect and classify breast cancers into eight classes: benign adenosis (BA), benign fibroadenoma (BF), benign phyllodes tumor (BPT), benign tubular adenoma (BTA), malignant ductal carcinoma (MDC), malignant lobular carcinoma (MLC), malignant mucinous carcinoma (MMC), and malignant papillary carcinoma (MPC). METHODS Breast cancer MRI images were classified into BA, BF, BPT, BTA, MDC, MLC, MMC, and MPC using the proposed deep learning model and five fine-tuned deep learning models (Xception, InceptionV3, VGG16, MobileNet and ResNet50) pre-trained on the ImageNet database. The dataset was collected from the Kaggle repository for breast cancer detection and classification, and was augmented using a GAN technique. The images come in four magnifications (40X, 100X, 200X, 400X); together with the complete dataset, this yields five datasets, and evaluating the proposed model and the five pre-trained models on each dataset individually gives a total of 30 experiments. The measures used to evaluate all models were F1-score, recall, precision, and accuracy. RESULTS The classification F1-scores of Xception, InceptionV3, ResNet50, VGG16, MobileNet, and the proposed model (BCCNN) were 97.54%, 95.33%, 98.14%, 97.67%, 93.98%, and 98.28%, respectively. CONCLUSION Dataset augmentation, preprocessing and balancing played a substantial role in enhancing the detection and classification accuracy of the proposed model (BCCNN) and the fine-tuned pre-trained models. The best accuracies were attained at the 400X magnification, owing to its higher image resolution.
Affiliation(s)
- Basem S Abunasser
- University Malaysia of Computer Science & Engineering (UNIMY), Cyberjaya, Malaysia
- Ihab S Zaqout
- Faculty of Engineering and Information Technology, Al-Azhar University, Gaza, Palestine
- Samy S Abu-Naser
- Faculty of Engineering and Information Technology, Al-Azhar University, Gaza, Palestine
39
Ma P, Ge B, Yang H, Guo T, Pan J, Wang W. Application of time-frequency domain and deep learning fusion feature in non-invasive diagnosis of congenital heart disease-related pulmonary arterial hypertension. MethodsX 2023; 10:102032. PMID: 36718204; DOI: 10.1016/j.mex.2023.102032.
Abstract
Pulmonary arterial hypertension associated with congenital heart disease (CHD-PAH) is a fatal cardiovascular disease. This work puts forward a novel method for non-invasive initial diagnosis of CHD-PAH. First, original heart sounds were segmented into cardiac cycles using a double-threshold adaptive method. Clinical auscultation indicates that the pathological information of CHD-PAH is concentrated in the second heart sound (S2), so time-frequency features were extracted from both the entire cardiac cycle and S2. These time-frequency features are combined with deep learning features to form a fusion feature vector, which is input into a classifier. Finally, a majority voting algorithm was used to obtain the optimal classification result. The method achieved a classification accuracy of 88.61%. Three points are essential:
•A double-threshold adaptive method is used to segment heart sounds into cardiac cycles.
•Time-frequency domain features from both the entire cardiac cycle and S2 are combined with deep learning features to form the fusion feature.
•XGBoost is used as a three-class classifier to distinguish normal, CHD and CHD-PAH, with a majority voting algorithm producing the optimal classification result.
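The final majority-voting step is straightforward to sketch; a minimal stdlib example (the label strings and per-cycle predictions are illustrative, not from the paper's data):

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent predicted label across votes."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical per-cardiac-cycle predictions for one recording:
labels = ["CHD-PAH", "normal", "CHD-PAH", "CHD-PAH", "CHD"]
final = majority_vote(labels)
```

Voting over many cardiac cycles smooths out cycles corrupted by noise or poor segmentation before the recording-level diagnosis is issued.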
40
Narula A, Vaegae NK. Development of CNN-LSTM combinational architecture for COVID-19 detection. J Ambient Intell Humaniz Comput 2022; 14:2645-2656. PMID: 36590235; PMCID: PMC9789730; DOI: 10.1007/s12652-022-04508-2.
Abstract
The world has been under extreme pressure due to the spread of the coronavirus. The urgency to eradicate the virus has caused distress amongst civilians and medical agencies in equal measure. Due to anomalies observed in the results of reverse transcription-polymerase chain reaction (RT-PCR) tests, more reliable options like computed tomography (CT) scan-based tests are being researched. In this paper, a novel combinational architecture built upon the principles of Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks is proposed to detect the COVID-19 virus. This method uses chest X-ray images as inputs to the combinational architecture for the classification of samples. The CNN part of the network extracts features that help in the classification, and the LSTM part performs classification based on the extracted features. A total of 8 convolutional layers and 4 pooling layers are used in the CNN, and 4 LSTM layers with 64 and 128 cells respectively. Instead of the sigmoid function, a rectified linear unit function is used as the activation function; this provides non-linearity to the CNN and better accuracy in comparison. The proposed model employs a padding layer to prevent the loss of information. Accuracy, loss, F1 score, and Matthews Correlation Coefficient (MCC) are calculated to analyse the effectiveness of the proposed architecture, which is validated using a relatively large dataset of 7292 images. The MCC provides a more informative and truthful evaluation of classification because it accounts for the sizes of both the positive and negative classes in the dataset. The proposed CNN-LSTM model achieves an accuracy of 98.91% and an MCC value of 97.84%. The model is also compared with models employing transfer learning methods for similar applications.
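The MCC metric the abstract leans on has a closed form over the binary confusion matrix; a short stdlib sketch (the confusion-matrix counts are illustrative, not the paper's results):

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction);
    0 corresponds to random guessing.
    """
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# A toy confusion matrix: 95 true positives, 90 true negatives, 5 FP, 10 FN.
score = mcc(tp=95, tn=90, fp=5, fn=10)
```

Because every cell of the confusion matrix enters the formula, MCC stays honest on imbalanced datasets where plain accuracy can look inflated.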
Affiliation(s)
- Abhinav Narula
- School of Electronics Engineering, Vellore Institute of Technology, Vellore, 632014 India
- Naveen Kumar Vaegae
- School of Electronics Engineering, Vellore Institute of Technology, Vellore, 632014 India
41
Sree Ganesh TN, Satish R, Sridhar R. Learning effective embedding for automated COVID-19 prediction from chest X-ray images. Multimed Syst 2022; 29:739-751. PMID: 36310764; PMCID: PMC9596346; DOI: 10.1007/s00530-022-01015-4.
Abstract
The pandemic that SARS-CoV-2 set off in 2019 continues to wreak serious havoc on the global population's health, economy, and livelihood. A critical way to suppress and restrain this pandemic is the early detection of COVID-19, which will help to control the virus. Chest X-rays are one of the more straightforward ways to detect COVID-19 compared to standard methods like CT scans and RT-PCR diagnosis, which are complex, expensive, and time-consuming. Our survey of the literature shows that researchers are currently working toward an efficient deep learning model that produces unbiased detection of COVID-19 from chest X-ray images. In this work, we propose a novel convolutional neural network model based on supervised classification that simultaneously computes identification and verification loss. We adopt a transfer learning approach using models pretrained on the ImageNet dataset, such as AlexNet and VGG16, as backbone models, and use data augmentation techniques to address class imbalance and boost the classifier's performance. Finally, our proposed classifier architecture ensures unbiased, high-accuracy results, outperforming existing deep learning models for COVID-19 detection from chest X-ray images and producing state-of-the-art performance. It shows strong and robust performance and proves easily deployable and scalable, thereby increasing the efficiency of analyzing chest X-ray images with high detection accuracy.
Affiliation(s)
- Sree Ganesh T N
- Department of Computer Science and Engineering, National Institute of Technology, Tiruchirappalli, Tamil Nadu 620015 India
- Rishi Satish
- Department of Computer Science and Engineering, National Institute of Technology, Tiruchirappalli, Tamil Nadu 620015 India
- Rajeswari Sridhar
- Department of Computer Science and Engineering, National Institute of Technology, Tiruchirappalli, Tamil Nadu 620015 India
42
Gupta RK, Kunhare N, Pathik N, Pathik B. An AI-enabled pre-trained model-based Covid detection model using chest X-ray images. Multimed Tools Appl 2022; 81:37351-37377. PMID: 35844979; PMCID: PMC9273923; DOI: 10.1007/s11042-021-11580-x.
Abstract
The years 2020 and 2021 witnessed COVID-19 as the leading cause of death throughout the world. It affected a large geographic area, particularly countries with large populations. Because this novel coronavirus was detected in all countries around the world, the World Health Organization (WHO) declared COVID-19 a pandemic. The virus spread quickly from person to person through saliva droplets and direct or indirect contact with an infected person. The tests used to detect COVID-19 are time-consuming, which is a primary cause of the rapid growth in cases. Early detection of COVID-19 patients can play a significant role in breaking the transmission chain by isolating patients and providing proper treatment at the right time. Recent research claims that chest CT and X-ray images can be used for preliminary COVID-19 screening. This paper suggests an Artificial Intelligence (AI) based approach for detecting COVID-19 using X-ray and CT scan images. Because only a small Covid dataset is available, we use pre-trained models. Four pre-trained models, VGGNet-19, ResNet50, InceptionResNetV2 and MobileNet, are trained to classify X-ray images into Covid and Normal classes. Each model is tuned, using normalization and regularization techniques, so that a smaller percentage of Covid cases are classified as Normal. The updated binary cross-entropy loss (BCEL) function imposes a large penalty for classifying any Covid case as Normal. The experimental results reveal that the proposed InceptionResNetV2 model outperforms the other pre-trained models, with training, validation and test accuracies of 99.2%, 98% and 97%, respectively.
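The abstract does not give the exact form of its updated BCEL; a minimal sketch of one standard way to penalize Covid-to-Normal misclassification (class-weighted binary cross-entropy, with a hypothetical weight of 5 on the positive class) is:

```python
import math

def weighted_bce(y_true, p_pred, w_pos=5.0, w_neg=1.0, eps=1e-7):
    """Binary cross-entropy with a larger penalty on missed positive cases.

    Setting w_pos > w_neg makes the -y*log(p) term (a Covid case scored
    as Normal) cost more than the -(1-y)*log(1-p) term.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Missing a Covid case (y=1 scored p=0.1) costs 5x the unweighted loss:
loss_missed_covid = weighted_bce([1], [0.1])
loss_unweighted = weighted_bce([1], [0.1], w_pos=1.0)
```

During training this pushes the decision boundary toward fewer false negatives at the cost of more false positives, which is the safer trade-off for a screening test.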
Affiliation(s)
- Babita Pathik
- Sagar Institute of Science and Technology, Bhopal, India
43
Kuo CE, Lu TH, Chen GT, Liao PY. Towards precision sleep medicine: Self-attention GAN as an innovative data augmentation technique for developing personalized automatic sleep scoring classification. Comput Biol Med 2022; 148:105828. PMID: 35816855; DOI: 10.1016/j.compbiomed.2022.105828.
Abstract
Good-quality sleep is very important: it affects memory consolidation, emotional regulation, learning, physical development, and quality of life. Diagnosing human sleep quality and problems quickly and accurately is therefore an important issue for human well-being, and many automatic sleep scoring methods have been proposed. However, these methods have been developed using sleep data from different individuals or groups, so their accuracy can decrease due to individual differences. In this study, the self-attention generative adversarial network (SAGAN) was applied as an advanced data augmentation technique to build an improved personalized automatic sleep scoring classifier. First, spectrograms were computed from electroencephalography (EEG) recordings. Then, SAGAN was used to generate synthesized spectrograms for each subject. Finally, the real and synthesized spectrograms of each subject were used to train a personalized classifier. The average accuracy and standard deviation of the proposed method are 95.74% and 3.78%, respectively. Compared to a classifier trained with all subjects' training data, the average accuracy increased by 8.08%. The results prove that the generated spectrograms significantly improved the performance of personalized automatic sleep scoring, saving medical staff and subjects substantial resources and time otherwise spent on manual recording and scoring.
44
Huang X, Shi Y, Yan J, Qu W, Li X, Tan J. LPI-CSFFR: Combining serial fusion with feature reuse for predicting LncRNA-protein interactions. Comput Biol Chem 2022; 99:107718. PMID: 35785626; DOI: 10.1016/j.compbiolchem.2022.107718.
Abstract
Long non-coding RNAs (lncRNAs) play important roles in a series of life activities, and they function primarily through proteins. Wet-lab experimental methods for studying lncRNA-protein interactions (lncRPIs) are time-consuming and expensive. In this study, we propose for the first time a novel feature fusion method, LPI-CSFFR, to train and predict lncRPIs based on a convolutional neural network (CNN) with feature reuse and serial fusion of the sequences, secondary structures, and physicochemical properties of proteins and lncRNAs. The experimental results indicate that LPI-CSFFR achieves excellent performance on the datasets RPI1460 and RPI1807, with accuracies of 83.7% and 98.1%, respectively. We further compare LPI-CSFFR with state-of-the-art existing methods on the same benchmark datasets to evaluate its performance. In addition, to test the generalization of the model, we independently test sample pairs from five model organisms, with Mus musculus achieving the highest prediction accuracy of 99.5%, and we find multiple hotspot proteins after constructing an interaction network. Finally, we test the predictive power of LPI-CSFFR on sample pairs with unknown interactions. The results indicate that LPI-CSFFR is promising for predicting potential lncRPIs. The relevant source code and data used in this study are available at https://github.com/JianjunTan-Beijing/LPI-CSFFR.
Affiliation(s)
- Xiaoqian Huang
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
- Yi Shi
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
- Jing Yan
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
- Wenyan Qu
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
| | - Xiaoyi Li
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
| | - Jianjun Tan
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China.
| |
Collapse
|
45
|
Guo F, Zhu X, Wu Z, Zhu L, Wu J, Zhang F. Clinical applications of machine learning in the survival prediction and classification of sepsis: coagulation and heparin usage matter. J Transl Med 2022; 20:265. [PMID: 35690822 PMCID: PMC9187899 DOI: 10.1186/s12967-022-03469-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2022] [Accepted: 05/30/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Sepsis is a life-threatening syndrome eliciting highly heterogeneous host responses. Current prognostic evaluation methods used in clinical practice are inadequately effective at predicting sepsis mortality. Rapid identification of patients with a high mortality risk is urgently needed, and phenotyping patients would be invaluable for tailoring treatments. METHODS Machine learning and deep learning techniques were used to characterize patient phenotypes and determine sepsis severity. The databases used in this study are MIMIC-III and MIMIC-IV ('Medical Information Mart for Intensive Care'), which are large, public, and freely available. K-means clustering was used to classify the sepsis phenotypes. A convolutional neural network (CNN) was used to predict the 28-day survival rate based on 35 blood test variables of the sepsis patients, whereas a double coefficient quadratic multivariate fitting function (DCQMFF) was utilized to predict the 28-day survival rate with only 11 features. RESULTS The patients were grouped into four clusters with a clear survival nomogram. The first cluster (C_1) was characterized by a low white blood cell count, a low neutrophil count, and the highest lymphocyte proportion. C_2 obtained the lowest Sequential Organ Failure Assessment (SOFA) score and the highest survival rate. C_3 was characterized by significantly prolonged partial thromboplastin time (PTT), a high septic coagulation disease score (SIC), and a higher proportion of patients using heparin than the other clusters. The early mortality rate of patients in C_3 was high, but their long-term survival rate was better than that in C_4. C_4 contained septic coagulation patients with the worst prognosis, characterized by slightly prolonged PTT, significantly prolonged prothrombin time (PT), and a high SIC score. The survival rate prediction accuracies of the CNN and DCQMFF models reached 92% and 82%, respectively. The models were tested on an external dataset (MIMIC-IV) and achieved good performance. A DCQMFF-based application platform was established for fast prediction of the 28-day survival rate. CONCLUSION CNN and DCQMFF accurately predicted the sepsis patients' survival, while K-means successfully identified the phenotype groups. The distinct phenotypes associated with survival, and the significant features correlated with mortality, were identified. The findings suggest that sepsis patients with abnormal coagulation have poor outcomes, and that abnormal coagulation increases mortality during sepsis. The anticoagulation effects of appropriate heparin sodium treatment may improve extensive microthrombosis-caused organ failure.
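The K-means phenotyping step above can be sketched with plain Lloyd's algorithm. The two synthetic "blood test" features and the farthest-point initialization below are assumptions made only to keep the toy example deterministic; the study clusters real MIMIC blood-test variables, not this synthetic data.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain Lloyd's algorithm: returns (centroids, labels).

    Farthest-point initialization is used here only to make the toy
    example deterministic; it is not claimed to match the paper's setup.
    """
    centroids = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)                 # assign each patient
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)  # recentre
    return centroids, labels

# Toy "patients": two synthetic, standardized blood-test features
# (stand-ins for e.g. WBC and PTT), forming two well-separated phenotypes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
_, labels = kmeans(X, k=2)
```

In the study, the resulting cluster labels (four phenotypes rather than two) are then related to survival, SOFA scores, and coagulation markers.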
Collapse
Affiliation(s)
- Fei Guo
- Ningbo Institute for Medicine & Biomedical Engineering Combined Innovation, Ningbo Medical Treatment Centre Lihuili Hospital, Ningbo University, Ningbo, 315040, Zhejiang, China
| | - Xishun Zhu
- School of Mechatronics Engineering, Nanchang University, Nanchang, 330031, Jiangxi, China
| | - Zhiheng Wu
- School of Information Engineering, Nanchang University, Nanchang, 330031, Jiangxi, China
| | - Li Zhu
- School of Information Engineering, Nanchang University, Nanchang, 330031, Jiangxi, China
| | - Jianhua Wu
- School of Information Engineering, Nanchang University, Nanchang, 330031, Jiangxi, China.
| | - Fan Zhang
- Department of Critical Care Medicine, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China.
| |
Collapse
|
46
|
Wen H, Zhao J, Xiang S, Lin L, Liu C, Wang T, An L, Liang L, Huang B. Towards more efficient ophthalmic disease classification and lesion location via convolution transformer. Comput Methods Programs Biomed 2022; 220:106832. [PMID: 35525213 DOI: 10.1016/j.cmpb.2022.106832] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/13/2021] [Revised: 04/01/2022] [Accepted: 04/21/2022] [Indexed: 06/14/2023]
Abstract
OBJECTIVE A retinal optical coherence tomography (OCT) image differs from a traditional image due to its significant speckle noise, irregularity, and inconspicuous features. A conventional deep learning architecture cannot effectively improve classification accuracy, sensitivity, and specificity on OCT images, and noisy images are not conducive to further diagnosis. This paper proposes a novel lesion-localization convolution transformer (LLCT) method, which combines convolution and self-attention to classify ophthalmic diseases more accurately and localize lesions in retinal OCT images. METHODS The novel architecture is built by using customized feature maps generated by a convolutional neural network (CNN) as the input sequence of a self-attention network. This design takes advantage of the CNN's ability to extract image features and the transformer's consideration of global context and dynamic attention. Part of the model is backward-propagated to calculate gradients as weight parameters, which are multiplied and summed with the global features generated by the forward propagation process to locate the lesion. RESULTS Extensive experiments show that our proposed design achieves improvements of about 7.6% in overall accuracy, 10.9% in overall sensitivity, and 9.2% in overall specificity compared with previous methods, and the lesions can be localized without lesion-location labels for the OCT images. CONCLUSION The results show that our method significantly improves performance and reduces the computational complexity of artificial-intelligence-assisted analysis of ophthalmic disease from OCT images. SIGNIFICANCE Our method provides a significant boost in ophthalmic disease classification and lesion localization via a convolution transformer, and is applicable to greatly assist ophthalmologists.
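The core design idea here — treating each spatial location of a CNN feature map as a token for self-attention — can be sketched as follows. The random 4x4x8 "feature map", the single attention head, and the projection matrices are all illustrative assumptions; the LLCT's actual feature maps and attention configuration are not specified at this level of detail in the abstract.

```python
import numpy as np

def self_attention(tokens, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a token sequence of shape (N, d)."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))  # stable softmax
    weights /= weights.sum(axis=1, keepdims=True)                 # rows sum to 1
    return weights @ V

rng = np.random.default_rng(0)
d = 8
fmap = rng.normal(size=(4, 4, d))     # stand-in for a CNN output feature map
tokens = fmap.reshape(-1, d)          # each spatial location becomes a token
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(tokens, Wq, Wk, Wv)   # globally contextualized tokens
```

Each output token now mixes information from every spatial location, which is what lets the transformer half of the model reason about global context that pure convolution misses.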
Collapse
Affiliation(s)
- Huajie Wen
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China; College of Applied Science, Shenzhen University, Shenzhen 518060, China
| | - Jian Zhao
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
| | - Shaohua Xiang
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
| | - Lin Lin
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
| | - Chengjian Liu
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
| | - Tao Wang
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
| | - Lin An
- Guangdong Vision Medical Science & Technology Co., Ltd. Foshan 528000, China
| | - Lixin Liang
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China.
| | - Bingding Huang
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China.
| |
Collapse
|
47
|
Zhang G, Luo W, Lyu J, Yu ZG, Huang G. CNNLSTMac4CPred: A Hybrid Model for N4-Acetylcytidine Prediction. Interdiscip Sci 2022; 14:439-51. [PMID: 35106702 DOI: 10.1007/s12539-021-00500-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2021] [Revised: 12/04/2021] [Accepted: 12/13/2021] [Indexed: 12/23/2022]
Abstract
N4-Acetylcytidine (ac4C) is a highly conserved and widely occurring post-transcriptional RNA modification that plays versatile roles in cellular processes. Owing to the limitations of current techniques and knowledge, large-scale identification of ac4C remains a challenging task. RNA sequences are like sentences in natural language, carrying semantics. Inspired by this analogy, we proposed a hybrid model for ac4C prediction. The model used long short-term memory and a convolutional neural network to extract the semantic features hidden in the sequences. The semantic features and two traditional features (k-nucleotide frequencies and pseudo tri-tuple nucleotide composition) were combined to represent ac4C or non-ac4C sequences. eXtreme Gradient Boosting was used as the learning algorithm. Five-fold cross-validation over a training set of 1160 ac4C and 10,855 non-ac4C sequences obtained an area under the receiver operating characteristic curve (AUROC) of 0.9004, and an independent test over 469 ac4C and 4343 non-ac4C sequences reached an AUROC of 0.8825. The model obtained a sensitivity of 0.6474 in the five-fold cross-validation and 0.6290 in the independent test, outperforming two state-of-the-art methods. The performance of the semantic features alone was better than that of the k-nucleotide frequencies and pseudo tri-tuple nucleotide composition, implying that ac4C sequences do carry semantic information. The proposed hybrid model was implemented in a user-friendly web server freely available to the scientific community at http://47.113.117.61/ac4c/. The presented model and tool are beneficial for identifying ac4C on a large scale.
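The "traditional feature" half of this hybrid representation — k-nucleotide frequencies — can be sketched directly. The 32-dimensional zero vector standing in for the learned semantic embedding is a placeholder assumption; in the paper that part comes from the LSTM/CNN branch, and the concatenated vector is fed to eXtreme Gradient Boosting.

```python
import numpy as np
from itertools import product

def knf(seq, k, alphabet="ACGU"):
    """k-nucleotide frequency vector (4**k components)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    idx = {km: i for i, km in enumerate(kmers)}
    v = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        v[idx[seq[i:i + k]]] += 1.0
    return v / max(len(seq) - k + 1, 1)

def traditional_features(seq):
    """Concatenated 1-, 2-, and 3-nucleotide frequencies (4 + 16 + 64 = 84 dims)."""
    return np.concatenate([knf(seq, k) for k in (1, 2, 3)])

semantic = np.zeros(32)   # placeholder for the CNN/LSTM-derived semantic embedding
x = np.concatenate([traditional_features("ACGUACGUACGU"), semantic])
```

The choice of k = 1..3 here is illustrative; what matters is that hand-crafted composition statistics and learned sequence embeddings live side by side in one feature vector for the boosted-tree learner.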
Collapse
|
48
|
Hu S, Zhu Y, Dong D, Wang B, Zhou Z, Wang C, Tian J, Peng Y. Chest Radiographs Using a Context-Fusion Convolution Neural Network (CNN): Can It Distinguish the Etiology of Community-Acquired Pneumonia (CAP) in Children? J Digit Imaging 2022; 35:1079-1090. [PMID: 35585465 PMCID: PMC9116701 DOI: 10.1007/s10278-021-00543-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2020] [Revised: 11/02/2021] [Accepted: 11/11/2021] [Indexed: 11/25/2022] Open
Abstract
Clinical symptoms and inflammatory markers cannot reliably distinguish the etiology of CAP, whereas chest radiographs contain abundant information related to CAP. Hence, we developed a context-fusion convolutional neural network (CNN) to explore the use of chest radiographs to distinguish the etiology of CAP in children. This retrospective study included 1769 cases of pediatric pneumonia (viral pneumonia, n = 487; bacterial pneumonia, n = 496; and mycoplasma pneumonia, n = 786). The chest radiographs from the first examination, C-reactive protein (CRP), and white blood cell (WBC) count were collected for analysis. All patients were randomly divided into training, validation, and test cohorts in a 7:1:2 ratio. Automatic lung segmentation and hand-crafted pneumonia lesion segmentation were performed, from which three image-based models were built: a full-lung model, a local-lesion model, and a context-fusion model; the two clinical characteristics were used to build a clinical model, while a logistic regression model combined the best CNN model with the two clinical characteristics. Our experiments showed that the context-fusion model, which integrated the features of the full lung and the local lesion, performed better than the full-lung and local-lesion models. The context-fusion model had areas under the curve of 0.86, 0.88, and 0.93 for identifying viral, bacterial, and mycoplasma pneumonia on the test cohort, respectively. Adding the clinical characteristics to the context-fusion model yielded only slight improvement. Mycoplasma pneumonia was more easily identified than the other two types. Using chest radiographs, we developed a context-fusion CNN model with good performance for noninvasively diagnosing the etiology of community-acquired pneumonia in children, which would help improve early diagnosis and treatment.
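The two fusion steps this abstract describes — joining global and local image features, then combining the image score with CRP and WBC via logistic regression — can be sketched as below. The 16-dimensional feature vectors, the weights, and the concrete CRP/WBC values are all illustrative assumptions, not the paper's trained parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def context_fusion(full_lung_feat, lesion_feat):
    """Fuse global (full-lung) and local (lesion) CNN features by concatenation."""
    return np.concatenate([full_lung_feat, lesion_feat])

def combined_score(image_feat, crp, wbc, w_img, w_clin, b):
    """Logistic combination of image features with the two clinical variables."""
    clinical = np.array([crp, wbc])
    return sigmoid(image_feat @ w_img + clinical @ w_clin + b)

rng = np.random.default_rng(0)
fused = context_fusion(rng.normal(size=16), rng.normal(size=16))  # 32-dim joint vector
p = combined_score(
    fused, crp=12.0, wbc=9.5,                 # hypothetical lab values
    w_img=rng.normal(size=32) * 0.01,         # placeholder learned weights
    w_clin=np.array([0.02, 0.03]), b=0.0,
)
```

In the study the logistic layer is a one-vs-rest style decision over three etiologies rather than the single probability shown here; the binary version keeps the sketch minimal.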
Collapse
Affiliation(s)
- Shasha Hu
- Department of Radiology, National Center for Children' Health, Beijing Children's Hospital, Capital Medical University, Beijing, 100045, China
| | - Yongbei Zhu
- CAS Key Laboratory of Molecular Imaging, State Key Laboratory of Management and Control for Complex Systems, Beijing Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, Beijing, 100191, China
| | - Di Dong
- CAS Key Laboratory of Molecular Imaging, State Key Laboratory of Management and Control for Complex Systems, Beijing Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Bei Wang
- Department of Radiology, National Center for Children' Health, Beijing Children's Hospital, Capital Medical University, Beijing, 100045, China
| | - Zuofu Zhou
- Department of Radiology, Fujian Provincial Maternity and Children's Hospital, Fujian Medical University, Fuzhou, 350000, China
| | - Chi Wang
- Department of Radiology, National Center for Children' Health, Beijing Children's Hospital, Capital Medical University, Beijing, 100045, China
| | - Jie Tian
- CAS Key Laboratory of Molecular Imaging, State Key Laboratory of Management and Control for Complex Systems, Beijing Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China.
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, Beijing, 100191, China.
| | - Yun Peng
- Department of Radiology, National Center for Children' Health, Beijing Children's Hospital, Capital Medical University, Beijing, 100045, China.
| |
Collapse
|
49
|
Kim J, Shujaat M, Tayara H. iProm-Zea: A two-layer model to identify plant promoters and their types using convolutional neural network. Genomics 2022; 114:110384. [PMID: 35533969 DOI: 10.1016/j.ygeno.2022.110384] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2022] [Revised: 04/18/2022] [Accepted: 05/02/2022] [Indexed: 01/14/2023]
Abstract
A promoter is a short DNA sequence near the start codon that is responsible for initiating the transcription of a specific gene in the genome. Accurate recognition of promoters is important for achieving a better understanding of transcriptional regulation. Because of their importance in biological transcriptional regulation, there is an urgent need for in silico tools that identify promoters and their types in a timely and accurate manner. A number of prediction methods have been developed in this regard; however, almost all of them merely identify promoters and their strength or sigma types. The TATA-box region in a TATA promoter influences post-transcriptional processes; therefore, in the current study, we developed a two-layer predictor called "iProm-Zea", based on a convolutional neural network (CNN), to identify TATA and TATA-less promoters. The first layer identifies a given DNA sequence as a promoter or non-promoter. The second layer identifies whether the recognized promoter is a TATA promoter. To find an optimal feature encoding scheme and model, we evaluated four feature encoding schemes on different machine learning and CNN algorithms, and based on the results, we selected a one-hot encoding scheme and a CNN model for iProm-Zea. The 5-fold cross-validation results demonstrated that the predictor shows great potential for identifying promoters and classifying them as TATA or TATA-less. Furthermore, we performed a cross-species analysis of iProm-Zea to evaluate its performance in other species. Moreover, to make it easier for experimental scientists to obtain the results they need, we established a freely accessible and user-friendly web server at http://nsclbio.jbnu.ac.kr/tools/iProm-Zea/.
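The two pieces the abstract names — one-hot encoding of DNA and the two-layer cascade — can be sketched together. The stub lambdas below stand in for the two trained CNNs (they ignore their input entirely), so this shows only the control flow of the cascade, not any real classifier.

```python
import numpy as np

def one_hot(seq, alphabet="ACGT"):
    """One-hot encode a DNA sequence into an (L, 4) matrix."""
    m = np.zeros((len(seq), len(alphabet)))
    for i, base in enumerate(seq):
        m[i, alphabet.index(base)] = 1.0
    return m

def two_layer_predict(seq, is_promoter, is_tata):
    """Cascade: layer 1 screens promoters; layer 2 subtypes only the positives."""
    x = one_hot(seq)
    if not is_promoter(x):
        return "non-promoter"
    return "TATA promoter" if is_tata(x) else "TATA-less promoter"

# Stub classifiers standing in for the two trained CNNs (illustrative only).
label = two_layer_predict("GGTATAATGC", lambda x: True, lambda x: True)
```

The point of the cascade design is that the second, harder subtype decision is only ever made on sequences the first layer has already accepted as promoters.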
Collapse
|
50
|
Tang X, Zheng P, Li X, Wu H, Wei DQ, Liu Y, Huang G. Deep6mAPred: A CNN and Bi-LSTM-based deep learning method for predicting DNA N6-methyladenosine sites across plant species. Methods 2022; 204:142-150. [PMID: 35477057 DOI: 10.1016/j.ymeth.2022.04.011] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2022] [Revised: 04/16/2022] [Accepted: 04/20/2022] [Indexed: 12/11/2022] Open
Abstract
DNA N6-methyladenine (6mA) is a key DNA modification that plays versatile roles in cellular processes, including the regulation of gene expression, DNA repair, and DNA replication. DNA 6mA is closely associated with many diseases in mammals and with the growth and development of plants. Precisely detecting DNA 6mA sites is of great importance for exploring 6mA functions. Although many computational methods have been presented for DNA 6mA prediction, there is still a wide gap in practical application. We present a convolutional neural network (CNN) and bi-directional long short-term memory (Bi-LSTM)-based deep learning method (Deep6mAPred) for predicting DNA 6mA sites across plant species. Deep6mAPred stacks the CNNs and Bi-LSTMs in a parallel rather than a series-connected manner, and employs an attention mechanism to improve the sequence representations. Deep6mAPred reached an accuracy of 0.9556 on the independent rice dataset, far outperforming the state-of-the-art methods, and the cross-species tests showed that it holds a remarkable advantage over those methods as well. We developed a user-friendly web application for DNA 6mA prediction, freely available at http://106.13.196.152:7001/ for all scientific researchers. Deep6mAPred enriches the tools for predicting DNA 6mA sites and will speed up the exploration of DNA modification.
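The parallel (rather than series) branch combination with attention pooling can be sketched as follows. The random per-position matrices below are placeholders for what the CNN and Bi-LSTM branches would actually emit, and the shared attention vector `w` and sequence length 41 are assumptions for the sketch only.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def attention_pool(H, w):
    """Attention pooling: score each position, softmax, then weighted sum."""
    scores = softmax(H @ w)      # one weight per sequence position
    return scores @ H            # (d,) pooled representation

rng = np.random.default_rng(0)
H_cnn = rng.normal(size=(41, 8))    # per-position features from the CNN branch
H_lstm = rng.normal(size=(41, 8))   # per-position features from the Bi-LSTM branch
w = rng.normal(size=8)              # placeholder attention parameters

# Parallel combination: pool each branch independently, then concatenate,
# instead of feeding the CNN output into the Bi-LSTM (series connection).
z = np.concatenate([attention_pool(H_cnn, w), attention_pool(H_lstm, w)])
```

Because neither branch consumes the other's output, the classifier sees the convolutional local-motif view and the recurrent long-range view side by side, which is the design contrast the abstract draws with series-connected models.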
Collapse
Affiliation(s)
- Xingyu Tang
- School of Electrical Engineering, Shaoyang University, Shaoyang, Hunan 422000, China
| | - Peijie Zheng
- School of Electrical Engineering, Shaoyang University, Shaoyang, Hunan 422000, China
| | - Xueyong Li
- School of Electrical Engineering, Shaoyang University, Shaoyang, Hunan 422000, China
| | - Hongyan Wu
- The Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Dong-Qing Wei
- The Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China.
| | - Yuewu Liu
- College of Information and Intelligence, Hunan Agricultural University, Changsha, Hunan 410081, China
| | - Guohua Huang
- School of Electrical Engineering, Shaoyang University, Shaoyang, Hunan 422000, China.
| |
Collapse
|