1. Attention-based convolutional neural network with multi-modal temporal information fusion for motor imagery EEG decoding. Comput Biol Med 2024; 175:108504. PMID: 38701593. DOI: 10.1016/j.compbiomed.2024.108504.
Abstract
Convolutional neural networks (CNNs) have been widely applied in motor imagery (MI)-based brain-computer interfaces (BCIs) to decode electroencephalography (EEG) signals. However, due to the limited receptive field of the convolutional kernel, a CNN only extracts features from local regions without considering long-term dependencies for EEG decoding. Apart from long-term dependencies, multi-modal temporal information is equally important for EEG decoding because it offers a more comprehensive understanding of the temporal dynamics of neural processes. In this paper, we propose a novel deep learning network that combines a CNN with a self-attention mechanism to encapsulate multi-modal temporal information and global dependencies. The network first extracts multi-modal temporal information from two distinct perspectives: average and variance. A shared self-attention module is then designed to capture global dependencies along these two feature dimensions. We further design a convolutional encoder to explore the relationship between average-pooled and variance-pooled features and fuse them into more discriminative features. Moreover, a data augmentation method called signal segmentation and recombination is proposed to improve the generalization capability of the network. Experimental results on the BCI Competition IV-2a (BCIC-IV-2a) and BCI Competition IV-2b (BCIC-IV-2b) datasets show that the proposed method outperforms state-of-the-art methods, achieving a 4-class average accuracy of 85.03% on the BCIC-IV-2a dataset. These results demonstrate the effectiveness of multi-modal temporal information fusion in attention-based deep learning networks and provide a new perspective for MI-EEG decoding. The code is available at https://github.com/Ma-Xinzhi/EEG-TransNet.
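The average/variance temporal pooling this abstract describes can be sketched in a few lines; the window length and array shapes below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def temporal_pool(x, win):
    """Pool an EEG feature map (channels x time) into average and
    variance features over non-overlapping windows of length `win`,
    mirroring the idea of describing temporal dynamics from two
    perspectives. Window size here is an assumption."""
    c, t = x.shape
    n = t // win
    seg = x[:, : n * win].reshape(c, n, win)
    return seg.mean(axis=2), seg.var(axis=2)

rng = np.random.default_rng(0)
eeg = rng.standard_normal((22, 1000))        # 22 channels, 1000 samples
avg_feat, var_feat = temporal_pool(eeg, 125)
print(avg_feat.shape, var_feat.shape)        # (22, 8) (22, 8)
```

The two pooled maps would then feed the shared self-attention module along each feature dimension.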
2. Classification of lung cancer subtypes on CT images with synthetic pathological priors. Med Image Anal 2024; 95:103199. PMID: 38759258. DOI: 10.1016/j.media.2024.103199.
Abstract
Accurate diagnosis of pathological subtypes of lung cancer is of significant importance for follow-up treatment and prognosis management. In this paper, we propose the self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on computed tomography (CT) images. Inspired by studies showing that cross-scale associations exist in the image patterns between a case's CT images and its pathological images, we developed a pathological feature synthetic module (PFSM), which quantitatively maps cross-modality associations through deep neural networks, to derive from CT images the "gold standard" information contained in the corresponding pathological images. Additionally, we designed a radiological feature extraction module (RFEM) to acquire CT image information directly and integrated it with the pathological priors under an effective feature fusion framework, enabling the classification model to generate more indicative and specific pathologically related features and output more accurate predictions. The superiority of the proposed model lies in its ability to self-generate hybrid features that contain multi-modality image information from a single-modality input. To evaluate the effectiveness, adaptability, and generalization ability of our model, we performed extensive experiments on a large-scale multi-center dataset (829 cases from three hospitals) to compare our model with a series of state-of-the-art (SOTA) classification models. The experimental results demonstrated the superiority of our model for lung cancer subtype classification, with significant improvements in accuracy (ACC), area under the curve (AUC), positive predictive value (PPV), and F1-score.
3. Use of one-dimensional CNN for input data size reduction in LSTM for improved computational efficiency and accuracy in hourly rainfall-runoff modeling. J Environ Manage 2024; 359:120931. PMID: 38678895. DOI: 10.1016/j.jenvman.2024.120931.
Abstract
A deep learning architecture, denoted CNNsLSTM, is proposed for hourly rainfall-runoff modeling in this study. The architecture serially couples a one-dimensional convolutional neural network (1D-CNN) and a long short-term memory (LSTM) network. In the proposed framework, multiple layers of the CNN component process long-term hourly meteorological time series data, while the LSTM component handles short-term meteorological time series data and utilizes the features extracted by the 1D-CNN. To demonstrate the effectiveness of the proposed approach, it was implemented for hourly rainfall-runoff modeling in the Ishikari River watershed, Japan. A meteorological dataset, including precipitation, air temperature, evapotranspiration, longwave radiation, and shortwave radiation, was used as input. The results of the proposed approach (CNNsLSTM) were compared with those of deep learning approaches previously used in hydrologic modeling: 1D-CNN, LSTM with only hourly inputs (LSTMwHour), a parallel architecture of 1D-CNN and LSTM (CNNpLSTM), and an LSTM architecture that uses both daily and hourly input data (LSTMwDpH). The meteorological and runoff datasets were separated into training, validation, and test periods to train the deep learning models without overfitting and to evaluate them on an independent dataset. The proposed approach clearly improved estimation accuracy compared with the previously utilized deep learning approaches to rainfall-runoff modeling. In comparison with the observed flows, the median Nash-Sutcliffe efficiency values for the test period were 0.455-0.469 for 1D-CNN, 0.639-0.656 for CNNpLSTM, 0.745 for LSTMwHour, 0.831 for LSTMwDpH, and 0.865-0.873 for the proposed CNNsLSTM. Furthermore, the proposed CNNsLSTM reduced the median root mean square error (RMSE) of 1D-CNN by 50.2%-51.4%, of CNNpLSTM by 37.4%-40.8%, of LSTMwHour by 27.3%-29.5%, and of LSTMwDpH by 10.6%-13.4%. In particular, the proposed CNNsLSTM improved the estimates for high flows (≥75th percentile) and peak flows (≥95th percentile). The computational speed of LSTMwDpH is the fastest among the five architectures. Although CNNsLSTM computes more slowly than LSTMwDpH, it is still 6.9-7.9 times faster than LSTMwHour. Therefore, the proposed CNNsLSTM would be an effective approach for flood management and hydraulic structure design, particularly under climate change conditions that require estimating hourly river flows from meteorological datasets.
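The input-size-reduction role the 1D-CNN plays upstream of the LSTM can be illustrated with a single strided 1D convolution that compresses a long hourly series into a short feature sequence; the averaging kernel and stride below are illustrative choices, not the paper's configuration.

```python
import numpy as np

def conv1d_stride(x, kernel, stride):
    """Strided 1D convolution over a series: each output value summarizes
    `len(kernel)` consecutive inputs, so the sequence handed to the LSTM
    is `stride` times shorter."""
    k = len(kernel)
    out = [np.dot(x[i:i + k], kernel)
           for i in range(0, len(x) - k + 1, stride)]
    return np.array(out)

hourly = np.arange(240, dtype=float)                 # 10 days of hourly values
feat = conv1d_stride(hourly, np.ones(24) / 24, 24)   # daily-mean-like features
print(len(feat))                                     # 10
```

A learned kernel would extract richer features than this running mean, but the length reduction (240 steps to 10) is the mechanism that cuts the LSTM's computational cost.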
4. STCGRU: A hybrid model based on CNN and BiGRU for mild cognitive impairment diagnosis. Comput Methods Programs Biomed 2024; 248:108123. PMID: 38471292. DOI: 10.1016/j.cmpb.2024.108123.
Abstract
BACKGROUND AND OBJECTIVE Early diagnosis of mild cognitive impairment (MCI) is one of the essential measures to prevent its further development into Alzheimer's disease (AD). In this paper, we propose a hybrid deep learning model for early diagnosis of MCI, called spatio-temporal convolutional gated recurrent unit network (STCGRU). METHODS The STCGRU comprises three bespoke convolutional neural network (CNN) modules and a bi-directional gated recurrent unit (BiGRU) module, which can effectively extract the spatial and temporal features of EEG and obtain excellent diagnostic results. We use a publicly available EEG dataset that has not undergone pre-processing to verify the robustness and accuracy of the model. Ablation experiments on STCGRU are conducted to showcase the individual performance improvement of each module. RESULTS Compared with other state-of-the-art approaches using the same publicly available EEG dataset, the results show that STCGRU is more suitable for early diagnosis of MCI. After 10-fold cross-validation, the average classification accuracy of the hybrid model reached 99.95 %, while the average kappa value reached 0.9989. CONCLUSIONS The experimental results show that the hybrid model proposed in this paper can directly extract compelling spatio-temporal features from the raw EEG data for classification. The STCGRU allows for accurate diagnosis of patients with MCI and has a high practical value.
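The BiGRU half of STCGRU is built from gated recurrent unit steps like the following; a bi-directional layer runs one such pass forward and one backward over the sequence. Weight shapes are illustrative and biases are omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: update gate z, reset gate r, candidate state
    h_tilde, then a convex blend of old and candidate state."""
    z = sigmoid(Wz @ x + Uz @ h)
    r = sigmoid(Wr @ x + Ur @ h)
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(1)
d, hdim = 4, 3
W = [rng.standard_normal((hdim, d)) for _ in range(3)]
U = [rng.standard_normal((hdim, hdim)) for _ in range(3)]
h = np.zeros(hdim)
for x in rng.standard_normal((5, d)):   # five time steps of EEG features
    h = gru_step(x, h, W[0], U[0], W[1], U[1], W[2], U[2])
print(h.shape)                           # (3,)
```

Because the candidate state passes through tanh and the blend is convex, the hidden state stays bounded, which is one reason GRUs train stably on long EEG sequences.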
5. Comparative study on convolutional neural network and regression analysis to evaluate uniaxial compressive strength of Sandy Dolomite. Sci Rep 2024; 14:9880. PMID: 38688970. PMCID: PMC11061122. DOI: 10.1038/s41598-024-60085-8.
Abstract
Sandy Dolomite is a widely distributed rock. Its uniaxial compressive strength (UCS) is an important metric in civil, geotechnical, and underground engineering applications. Direct measurement of UCS is costly, time-consuming, and in some cases infeasible. To address this problem, we establish an indirect measuring method based on a convolutional neural network (CNN) and regression analysis (RA). The new method is straightforward and effective for UCS prediction and has significant practical implications. To evaluate its performance, 158 dolomite samples of different sandification grades were collected along and near the Yuxi section of the Central Yunnan Water Diversion (CYWD) Project in Yunnan Province, southwest China, and their UCS was tested. Two regression equations with high correlation coefficients are established from the RA results to predict the UCS of Sandy Dolomite. Moreover, the minimum thickness of Sandy Dolomite was determined by the Schmidt hammer rebound test. Results show that CNN outperforms RA in the precision of Sandy Dolomite UCS prediction. In addition, CNN can effectively handle uncertainty in test results, making it one of the most effective tools for predicting the UCS of Sandy Dolomite.
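The RA baseline amounts to fitting a regression equation of UCS against an index property and checking its correlation coefficient. A minimal sketch with synthetic rebound-number data (not the paper's measurements):

```python
import numpy as np

# Synthetic index-property data: UCS roughly linear in Schmidt rebound
# number, with small scatter. Coefficients are hypothetical.
rebound = np.array([20., 25., 30., 35., 40., 45.])
ucs = 2.1 * rebound - 12.0 + np.array([0.4, -0.3, 0.2, -0.1, 0.3, -0.5])

# Least-squares fit gives the regression equation UCS = a*R + b,
# and the correlation coefficient measures how well it predicts.
slope, intercept = np.polyfit(rebound, ucs, 1)
pred = slope * rebound + intercept
r = np.corrcoef(ucs, pred)[0, 1]
print(round(slope, 2), round(intercept, 2), round(r, 3))
```

A CNN replaces the fixed linear form with a learned mapping, which is where the reported precision gain over RA comes from.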
6. Nondestructive identification and classification of starch types based on multispectral techniques coupled with chemometrics. Spectrochim Acta A Mol Biomol Spectrosc 2024; 311:123976. PMID: 38330764. DOI: 10.1016/j.saa.2024.123976.
Abstract
Starch is a main source of energy and nutrition. Some merchants therefore illegally add cheaper starches to other types of starch, or package cheaper starches as higher-priced ones, to raise the price. In this study, 159 samples of commercially available wheat, potato, corn, and sweet potato starch were selected for identification and classification based on multispectral techniques, including near-infrared (NIR), mid-infrared (MIR), and Raman spectroscopy, combined with chemometrics, including pretreatment methods, characteristic wavelength selection methods, and classification algorithms. The results indicate that all three spectral techniques can be used to discriminate starch types. Raman spectroscopy demonstrated superior performance compared with NIR and MIR spectroscopy. The accuracy of models after characteristic wavelength selection is generally superior to that of the full spectrum, and two-dimensional correlation spectroscopy (2D-COS) achieves better model performance than other wavelength selection methods. Among the four classification methods, the convolutional neural network (CNN) exhibited the best prediction performance, achieving accuracies of 99.74%, 97.57%, and 98.65% on NIR, MIR, and Raman spectra, respectively, without pretreatment or characteristic wavelength selection.
7. Developing a multivariate time series forecasting framework based on stacked autoencoders and multi-phase feature. Heliyon 2024; 10:e27860. PMID: 38689959. PMCID: PMC11059412. DOI: 10.1016/j.heliyon.2024.e27860.
Abstract
Time series forecasting across different domains has received massive attention as it eases intelligent decision-making activities. Recurrent neural networks and various deep learning algorithms have been applied to modeling and forecasting multivariate time series data. Due to intricate non-linear patterns and significant variations in the randomness of characteristics across various categories of real-world time series data, achieving effectiveness and robustness simultaneously poses a considerable challenge for specific deep-learning models. We have proposed a novel prediction framework with a multi-phase feature selection technique, a long short-term memory-based autoencoder, and a temporal convolution-based autoencoder to fill this gap. The multi-phase feature selection is applied to retrieve the optimal feature selection and optimal lag window length for different features. Moreover, the customized stacked autoencoder strategy is employed in the model. The first autoencoder is used to resolve the random weight initialization problem. Additionally, the second autoencoder models the temporal relation between non-linear correlated features with convolution networks and recurrent neural networks. Finally, the model's ability to generalize, predict accurately, and perform effectively is validated through experimentation with three distinct real-world time series datasets. In this study, we conducted experiments on three real-world datasets: Energy Appliances, Beijing PM2.5 Concentration, and Solar Radiation. The Energy Appliances dataset consists of 29 attributes with a training size of 15,464 instances and a testing size of 4239 instances. For the Beijing PM2.5 Concentration dataset, there are 18 attributes, with 34,952 instances in the training set and 8760 instances in the testing set. The Solar Radiation dataset comprises 11 attributes, with 22,857 instances in the training set and 9797 instances in the testing set. 
The experimental setup involved evaluating the performance of forecasting models using two error measures: root mean square error and mean absolute error. To ensure robust evaluation, the errors were calculated on the identical scale of the data. The results demonstrate the superiority of the proposed model over existing models, with significant advantages in metrics such as mean squared error and mean absolute error. For the PM2.5 air quality data, the proposed model's mean absolute error is 7.51 versus 12.45, a ∼40% improvement. Similarly, the mean squared error for the dataset improved from 23.75 to 11.62, a ∼51% improvement. For the solar radiation dataset, the proposed model achieved a ∼34.7% improvement in mean squared error and a ∼75% improvement in mean absolute error. The recommended framework demonstrates outstanding generalization capabilities and performs well across datasets spanning multiple domains.
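The lag-window construction that the multi-phase feature selection step tunes (an optimal window length per feature) can be sketched as a supervised-matrix builder; the window length below is a hypothetical choice.

```python
import numpy as np

def lag_matrix(series, lags):
    """Build a supervised learning matrix from a series: each row holds
    `lags` consecutive past values, and the target is the next value.
    Feature selection would pick `lags` separately per input variable."""
    X = np.array([series[i:i + lags] for i in range(len(series) - lags)])
    y = series[lags:]
    return X, y

s = np.arange(10, dtype=float)
X, y = lag_matrix(s, 3)
print(X.shape, y.shape)   # (7, 3) (7,)
```

Rows of such matrices, one per selected feature, are what the stacked LSTM- and convolution-based autoencoders then compress and model.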
8. Establishing a novel deep learning model for detecting peri-implantitis. J Dent Sci 2024; 19:1165-1173. PMID: 38618118. PMCID: PMC11010782. DOI: 10.1016/j.jds.2023.11.017.
Abstract
BACKGROUND/PURPOSE The diagnosis of peri-implantitis using periapical radiographs is crucial. Recently, artificial intelligence has been applied effectively to radiographic image analysis. The aim of this study was to differentiate the degree of marginal bone loss around an implant and to classify the severity of peri-implantitis using a deep learning model. MATERIALS AND METHODS A dataset of 800 periapical radiographic images of implants was divided into training (n = 600), validation (n = 100), and test (n = 100) datasets for deep learning. An object detection algorithm (YOLOv7) was used to identify peri-implantitis. The classification performance of this model was evaluated using metrics including specificity, precision, recall, and F1 score. RESULTS Regarding classification performance, the specificity was 100%, precision was 100%, recall was 94.44%, and F1 score was 97.10%. CONCLUSION The results of this study suggest that implants can be identified from periapical radiographic images using deep learning-based object detection. This identification system could help dentists and patients suffering from implant problems. However, more images of other implant systems are needed to increase the learning performance before applying this system in clinical practice.
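The reported metrics follow the standard precision/recall/F1 definitions over detection counts; the counts below are illustrative, not the study's confusion matrix.

```python
# Detection counts (hypothetical): true positives, false positives,
# false negatives. Perfect precision with some missed detections
# reproduces the pattern reported above.
tp, fp, fn = 34, 0, 2
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(precision, round(recall, 4), round(f1, 4))  # 1.0 0.9444 0.9714
```

F1 is the harmonic mean of precision and recall, so it sits between the two and punishes whichever is lower.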
9. Identification of kidney stones in KUB X-ray images using VGG16 empowered with explainable artificial intelligence. Sci Rep 2024; 14:6173. PMID: 38486010. PMCID: PMC10940612. DOI: 10.1038/s41598-024-56478-4.
Abstract
A kidney stone is a solid formation that can lead to kidney failure, severe pain, and reduced quality of life from urinary system blockages. While medical experts can interpret kidney-ureter-bladder (KUB) X-ray images, certain images pose challenges for human detection and require significant analysis time. Consequently, developing a detection system becomes crucial for accurately classifying KUB X-ray images. This article applies a transfer learning (TL) model with a pre-trained VGG16, empowered with explainable artificial intelligence (XAI), to establish a system that takes KUB X-ray images and accurately categorizes them as kidney stone or normal cases. The findings demonstrate that the model achieves a testing accuracy of 97.41% in identifying kidney stones or normal KUB X-rays in the dataset used. The VGG16 model delivers highly accurate predictions but lacks fairness and explainability in its decision-making process. To address this concern, this study incorporates Layer-Wise Relevance Propagation (LRP), an XAI technique, to enhance the transparency and effectiveness of the model. LRP increases the model's fairness and transparency, facilitating human comprehension of the predictions. Consequently, XAI can play an important role in assisting doctors with the accurate identification of kidney stones, thereby facilitating the execution of effective treatment strategies.
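LRP redistributes a prediction's relevance backward through the network, layer by layer, in proportion to each input's contribution. A one-layer sketch of the epsilon-rule on a toy linear layer (weights are hypothetical, not VGG16's):

```python
import numpy as np

def lrp_linear(x, W, relevance_out, eps=1e-9):
    """LRP epsilon-rule for a single linear layer y = W @ x:
    each input receives a share of the output relevance proportional
    to its contribution z_ij = W_ij * x_j. Relevance is conserved."""
    z = W * x                                   # contributions, shape (out, in)
    denom = z.sum(axis=1, keepdims=True) + eps  # per-output normalizer
    return (z / denom * relevance_out[:, None]).sum(axis=0)

x = np.array([1.0, 2.0, 0.5])
W = np.array([[0.2, -0.1, 0.4],
              [0.3, 0.3, -0.2]])
out_rel = np.array([1.0, 1.0])
in_rel = lrp_linear(x, W, out_rel)
print(round(in_rel.sum(), 6))   # ~2.0: total relevance is conserved
```

Applied through every layer of VGG16, the same rule yields a pixel-level heat map showing which image regions drove the kidney-stone prediction.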
10. Predicting high-resolution air quality using machine learning: Integration of large eddy simulation and urban morphology data. Environ Pollut 2024; 344:123371. PMID: 38266694. DOI: 10.1016/j.envpol.2024.123371.
Abstract
Accurately predicting air pollutants, especially in urban areas with well-defined spatial structures, is crucial. Over the past decade, machine learning techniques have been widely used to forecast urban air quality. However, traditional machine learning approaches have limitations in accuracy and interpretability for predicting pollutants. In this study, we propose a convolutional neural network (CNN) model to predict the spatial distribution of CO concentration in the Nanjing urban area at 10 m resolution. Our model takes various factors as input, such as building height, topography, and emissions, and is trained against the outputs simulated by the parallelized large-eddy simulation model (PALM). The PALM runs cover 48 scenarios that vary in emissions, wind speeds, and wind directions. The results display strong consistency between the two models. Furthermore, we evaluate the performance of our model using 10-fold cross-validation and an out-of-sample cross-validation approach. This yields a robust correlation (R² > 0.8 in both cases) and a low RMSE between the CO predicted by the PALM and CNN models, demonstrating the generalization capability of our CNN model. The CNN can extract crucial features from the resulting weight contribution map. This map indicates that the CO concentration at a location is influenced more by nearby buildings and emissions than by distant ones. The interpretable patterns uncovered by our model relate to neighborhood effects, wind speeds and directions, and the impact of orientation on urban CO distribution. The model also shows high prediction accuracy (R > 0.8) when applied to another city. Overall, integrating our CNN framework with the PALM model enhances the accuracy of air quality predictions while enabling an interpretation grounded in fluid dynamics, providing effective tools for air quality management.
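The R² and RMSE scores used to compare the CNN surrogate against PALM follow their standard definitions; the synthetic data below stands in for simulated and predicted CO fields.

```python
import numpy as np

def r2_rmse(y_true, y_pred):
    """Coefficient of determination and root mean square error,
    the two scores used to judge the surrogate against the LES."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot, np.sqrt(np.mean((y_true - y_pred) ** 2))

rng = np.random.default_rng(2)
co_palm = rng.uniform(0.2, 1.5, 500)          # stand-in for PALM CO values
co_cnn = co_palm + rng.normal(0, 0.05, 500)   # surrogate with small error
r2, rmse = r2_rmse(co_palm, co_cnn)
print(round(r2, 3), round(rmse, 3))
```

A surrogate whose errors are small relative to the spatial variance of the field clears the R² > 0.8 bar easily, which is what the cross-validation checks.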
11. MSDEnet: Multi-scale detail enhanced network based on human visual system for medical image segmentation. Comput Biol Med 2024; 170:108010. PMID: 38262203. DOI: 10.1016/j.compbiomed.2024.108010.
Abstract
In medical image segmentation, accuracy is commonly high for tasks involving clear boundary partitioning features, as seen in the segmentation of X-ray images. However, for objects with less obvious boundary partitioning features, such as skin regions with similar color textures or CT images of adjacent organs with similar Hounsfield value ranges, segmentation accuracy significantly decreases. Inspired by the human visual system, we proposed the multi-scale detail enhanced network. Firstly, we designed a detail enhanced module to enhance the contrast between central and peripheral receptive field information using the superposition of two asymmetric convolutions in different directions and a standard convolution. Then, we expanded the scale of the module into a multi-scale detail enhanced module. The difference between central and peripheral information at different scales makes the network more sensitive to changes in details, resulting in more accurate segmentation. In order to reduce the impact of redundant information on segmentation results and increase the effective receptive field, we proposed the channel multi-scale module, adapted from the Res2net module. This creates independent parallel multi-scale branches within a single residual structure, increasing the utilization of redundant information and the effective receptive field at the channel level. We conducted experiments on four different datasets, and our method outperformed the common medical image segmentation algorithms currently being used. Additionally, we carried out detailed ablation experiments to confirm the effectiveness of each module.
12. Detection and mitigation of coordinated cyber-physical attack in CPPS. Heliyon 2024; 10:e26332. PMID: 38420452. PMCID: PMC10900950. DOI: 10.1016/j.heliyon.2024.e26332.
Abstract
A Cyber-Physical Power System (CPPS) is a system in which the elements of the internet and the physical power system communicate and work together. With the use of modern communication and information technology, grid monitoring and control have improved. However, the components of a cyber system are extremely vulnerable to cyberattacks via cyber connections due to inadequate cyber security measures. Therefore, an adaptive defence strategy is required for the analysis and mitigation of coordinated attacks. The conventional approach of using an offline controller requires tuning for changes in the operating conditions of the system, which is inappropriate for the modern CPPS. To counter coordinated attacks, a framework that integrates a STATCOM-based Adaptive Model Predictive Controller with RPME and a time delay compensator is proposed. This paper addresses attack impact, detection, and mitigation methods in CPPS. Case studies are conducted in both time-domain and frequency-domain simulations for three distinct situations: physical attack, cyberattack, and coordinated attack. Convolutional Neural Network (CNN), Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbour (KNN) are the four data-driven methods used for detecting anomalies in PMU measurement data. Simulation studies show that CNN performs better in anomaly detection than the other classifiers on the assessed performance metrics. For coordinated attack mitigation, the proposed STATCOM-based Adaptive Model Predictive Controller with RPME recovers the system more quickly than the STATCOM-based conventional lead-lag controller. The efficacy of the proposed strategy is validated on the WSCC 3-machine 9-bus system.
13. Convolutional neural networks combined with classification algorithms for the diagnosis of periodontitis. Oral Radiol 2024 (online ahead of print). PMID: 38393548. DOI: 10.1007/s11282-024-00739-5.
Abstract
OBJECTIVES We aim to develop a deep learning model based on a convolutional neural network (CNN) combined with a classification algorithm (CA) to assist dentists in quickly and accurately diagnosing the stage of periodontitis. MATERIALS AND METHODS Periapical radiographs (PERs) and clinical data were collected. The CNNs, including Alexnet, VGG16, and ResNet18, were trained on PERs to establish the PER-CNN models for no periodontal bone loss (PBL) and PBL. The CAs, including random forest (RF), support vector machine (SVM), naive Bayes (NB), logistic regression (LR), and k-nearest neighbor (KNN), were added to the PER-CNN model for control, stage I, stage II, and stage III/IV periodontitis. A heat map was produced using a gradient-weighted class activation mapping method to visualize the regions of interest of the PER-Alexnet model. Clustering analysis was performed based on the ten PER-CNN scores and the clinical characteristics. RESULTS The accuracies of the two best-performing models, PER-Alexnet and PER-VGG16, were 0.872 and 0.853, respectively. The accuracy of the best-performing combined model, PER-Alexnet + RF, for control, stage I, stage II, and stage III/IV was 0.968, 0.960, 0.835, and 0.842, respectively. The heat map showed that the regions of interest predicted by the model were periodontitis bone lesions. We found that age and smoking were significantly related to periodontitis based on the PER-Alexnet scores. CONCLUSION The PER-Alexnet + RF model reached high performance for whole-case periodontal diagnosis. CNN models combined with CAs can assist dentists in quickly and accurately diagnosing the stage of periodontitis.
14. CDC-NET: a cell detection and confirmation network of bone marrow aspirate images for the aided diagnosis of AML. Med Biol Eng Comput 2024; 62:575-589. PMID: 37953336. DOI: 10.1007/s11517-023-02955-3.
Abstract
Standardized morphological evaluation in pathology is usually qualitative. Classifying and quantitatively analyzing the nucleated cells in bone marrow aspirate images based on morphology is crucial for the diagnosis of acute myeloid leukemia (AML), acute lymphoblastic leukemia (ALL), and myelodysplastic syndrome (MDS), among others. However, it is time-consuming and difficult to accurately identify nucleated cells and calculate the percentage of each cell type because of the complexity of bone marrow aspirate images. This paper proposes a deep learning analysis model for bone marrow aspirate images, termed the Cell Detection and Confirmation Network (CDC-NET), for the aided diagnosis of AML by improving the accuracy of cell detection and recognition. Specifically, we take the nucleated cells in the bone marrow aspirate images as the detection objects to establish the model. Since some cells from different categories have similar morphology, classification error is inevitable. We design a confirmation network in which multiple trained classifiers act as pathologists, confirming the cell category by a voting method. To demonstrate the effectiveness of the proposed approach, experiments on clinical microscopic datasets are conducted. The recall and precision of CDC-NET are 78.54% and 91.74%, respectively, and the miss rate of our method is lower than those of the other popular methods. The experimental results demonstrate that the proposed model has potential for the pathological analysis of aspirate smears and the aided diagnosis of AML.
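The confirmation step reduces to a majority vote over the labels the trained classifiers assign to an ambiguous cell; a minimal sketch (cell labels are hypothetical).

```python
from collections import Counter

def confirm_cell(votes):
    """Majority vote over several classifiers' labels for one detected
    cell. On a tie, Counter returns the label first encountered."""
    return Counter(votes).most_common(1)[0][0]

# Three classifiers disagree; the majority label wins.
print(confirm_cell(["myeloblast", "lymphocyte", "myeloblast"]))  # myeloblast
```

Voting only helps when the classifiers make somewhat independent errors, which is why the confirmation network trains multiple distinct classifiers rather than copies of one.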
15. Watershed groundwater level multistep ahead forecasts by fusing convolutional-based autoencoder and LSTM models. J Environ Manage 2024; 351:119789. PMID: 38100860. DOI: 10.1016/j.jenvman.2023.119789.
Abstract
The development of deep learning-based groundwater level forecast models can tackle the challenge of high-dimensional groundwater dynamics, predict groundwater variation trends accurately, and support effective management of groundwater resources, thereby contributing to sustainable water resources management. This study proposed a novel ConvAE-LSTM model, which fuses a convolutional-based autoencoder (ConvAE) and a long short-term memory neural network (LSTM), to provide accurate spatiotemporal groundwater level forecasts over the next three months. The HBV-light and LSTM models were chosen as benchmarks. The case study comprised an ensemble of point data and the corresponding derived images concerning the past (observations) and the future (forecasts from a conceptual model) of groundwater levels at 33 groundwater wells in the Jhuoshuei River basin of Taiwan between 2000 and 2019. The findings showcase the effectiveness of the ConvAE-LSTM model in extracting crucial features from both point and imagery datasets. The model successfully establishes spatiotemporal dependencies between regional images and groundwater level data over diverse time frames, leading to accurate multi-step-ahead forecasts of groundwater levels. Notably, the ConvAE-LSTM model exhibits a substantial improvement over the HBV-light model, with R-squared values increasing by more than 18%, 22%, and 49% for the R1, R2, and R3 regions, respectively; it also outperforms the LSTM model. This study represents a noteworthy milestone in environmental modeling, offering key insights for designing sustainable groundwater management strategies to ensure the long-term availability of this vital resource.
|
16
|
EESCN: A novel spiking neural network method for EEG-based emotion recognition. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 243:107927. [PMID: 38000320 DOI: 10.1016/j.cmpb.2023.107927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/18/2023] [Revised: 10/16/2023] [Accepted: 11/13/2023] [Indexed: 11/26/2023]
Abstract
BACKGROUND AND OBJECTIVE Although existing artificial neural networks have achieved good results in electroencephalography (EEG) emotion recognition, further improvements are needed in terms of biological interpretability and robustness. In this research, we aim to develop a highly efficient, high-performance method for EEG-based emotion recognition. METHODS We propose Emo-EEGSpikeConvNet (EESCN), a novel emotion recognition method based on a spiking neural network (SNN). It consists of a neuromorphic data generation module and a NeuroSpiking framework. The neuromorphic data generation module converts EEG data into a 2D frame format as input to the NeuroSpiking framework, which extracts spatio-temporal features of the EEG for classification. RESULTS EESCN achieves high emotion recognition accuracies, ranging from 94.56% to 94.81% on DEAP and reaching a mean accuracy of 79.65% on SEED-IV. Compared to existing SNN methods, EESCN significantly improves EEG emotion recognition performance, with the additional advantages of a faster running speed and a smaller memory footprint. CONCLUSIONS EESCN shows excellent performance and efficiency in EEG-based emotion recognition, with potential for practical applications requiring portability under resource constraints.
|
17
|
Developing an automatic warning system for anomalous chicken dispersion and movement using deep learning and machine learning. Poult Sci 2023; 102:103040. [PMID: 37769488 PMCID: PMC10539969 DOI: 10.1016/j.psj.2023.103040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2023] [Revised: 08/07/2023] [Accepted: 08/13/2023] [Indexed: 10/02/2023] Open
Abstract
Chicken is a major source of dietary protein worldwide. The dispersion and movement of chickens are vital indicators of their health and status. This is especially evident in Taiwanese native chickens (TNCs), a local variety that is highly physically active when healthy. Conventionally, the dispersion and movement of chicken flocks are observed during patrols. However, manual patrolling is laborious and time-consuming, and frequent patrols increase the risk of carrying pathogens into chicken farms. To address these issues, this study proposes an approach for developing an automatic warning system for anomalous dispersion and movement of chicken flocks in commercial chicken farms. Embedded systems were developed to acquire videos of chickens from an overhead view in a chicken house, in which approximately 20,000 TNCs were raised for a period of 10 wk. Each video was 5 min long. The videos were transmitted to a remote cloud server and converted into images. A You Only Look Once-version 7 tiny (YOLOv7-tiny) object detection model was trained to detect chickens in the images. The dispersion of the chicken flock in each 5-min video was calculated using the nearest neighbor index (NNI), and the movement was quantified using the simple online and real-time tracking (SORT) algorithm. The normal ranges (i.e., 95% confidence intervals) of chicken dispersion and movement were established using an autoregressive integrated moving average (ARIMA) model and a seasonal autoregressive integrated moving average with exogenous factors (SARIMAX) model, respectively. The system allows farmers to check the chicken farm only when the dispersion or movement values fall outside the normal ranges; thus, labor time can be saved and the risk of carrying pathogens into chicken farms can be reduced. The trained YOLOv7-tiny model achieved an average precision of 98.2% in chicken detection. SORT achieved a multiple object tracking accuracy of 95.3%. The ARIMA and SARIMAX models achieved mean absolute percentage errors of 3.71% and 13.39%, respectively, in forecasting dispersion and movement. The proposed approach can serve as a solution for automatic monitoring of anomalous chicken dispersion and movement in chicken farming, alerting farmers to potential health risks and environmental hazards in chicken farms.
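The dispersion measure above, the nearest neighbor index, is the ratio of the observed mean nearest-neighbor distance to the value expected under complete spatial randomness. A brute-force sketch, assuming the standard Clark-Evans formulation (the paper's exact implementation may differ):

```python
import math

def nearest_neighbor_index(points, area):
    """Clark-Evans nearest neighbor index: observed mean nearest-neighbor
    distance divided by the expected value for a random pattern,
    0.5 / sqrt(n / area). NNI < 1 suggests clustering (e.g. huddling
    birds), NNI > 1 suggests dispersion."""
    n = len(points)
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)   # random-pattern mean NN distance
    return observed / expected

# A regular 2x2 grid of birds in a 4-unit area is maximally dispersed
grid = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
print(nearest_neighbor_index(grid, area=4.0))  # 2.0
```

In practice the point set would be the per-frame chicken centroids from the YOLOv7-tiny detections, averaged over each 5-min video.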
|
18
|
Multi-task learning framework to predict the status of central venous catheter based on radiographs. Artif Intell Med 2023; 146:102721. [PMID: 38042594 DOI: 10.1016/j.artmed.2023.102721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2023] [Revised: 09/29/2023] [Accepted: 11/14/2023] [Indexed: 12/04/2023]
Abstract
Hospital patients may have catheters and lines inserted during the course of their admission to administer medicines, especially the central venous catheter (CVC). However, malposition of a CVC can lead to many complications, even death. Clinicians routinely check the status of the catheter on X-ray images to avoid these issues. To reduce the workload of clinicians and improve the efficiency of CVC status detection, a multi-task learning framework for catheter status classification based on a convolutional neural network (CNN) is proposed. The proposed framework contains three significant components: a modified HRNet, multi-task supervision comprising segmentation supervision and heatmap regression supervision, and a classification branch. The modified HRNet maintains high-resolution features from start to end, ensuring the generation of high-quality assisting information for classification. The multi-task supervision helps mitigate interference from other line-like structures, such as other tubes and anatomical structures, visible in the X-ray image. Furthermore, during inference, this module also serves as an interpretation interface showing where the framework focuses its attention. Finally, the classification branch predicts the status class of the catheter. A public CVC dataset is used to evaluate the performance of the proposed method, which achieves an AUC (area under the ROC curve) of 0.823 and an accuracy of 82.6% on the test dataset. Compared with two state-of-the-art methods (the ATCM and EDMC methods), the proposed method performs best.
|
19
|
Enhanced premature ventricular contraction pulse detection and classification using deep convolutional neural network. Phys Eng Sci Med 2023; 46:1677-1691. [PMID: 37721684 DOI: 10.1007/s13246-023-01329-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2022] [Accepted: 09/03/2023] [Indexed: 09/19/2023]
Abstract
Access to accurate and precise monitoring systems for cardiac arrhythmia could contribute significantly to preventing damage and subsequent heart disorders. The present research concentrates on using photoplethysmography (PPG) and arterial blood pressure (ABP) with deep convolutional neural networks (CNNs) for the classification and detection of fetal cardiac arrhythmia, or premature ventricular contractions (PMVCs). The framework of the study involves pre-training on Icentia 11k, a public dataset of ECG signals covering different cardiac abnormalities. The weights obtained from the Icentia 11k dataset are then transferred to the proposed CNN. Finally, fine-tuning is carried out to improve the classification accuracy. The results showcase the capacity of the proposed method to detect and classify PMVCs into three types, Normal, P1, and P2, with accuracies of 99.9%, 99.8%, and 99.5%, respectively.
|
20
|
Visual transformer and deep CNN prediction of high-risk COVID-19 infected patients using fusion of CT images and clinical data. BMC Med Inform Decis Mak 2023; 23:265. [PMID: 37978393 PMCID: PMC10656999 DOI: 10.1186/s12911-023-02344-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2023] [Accepted: 10/16/2023] [Indexed: 11/19/2023] Open
Abstract
BACKGROUND Despite globally declining hospitalization rates and the much lower risk of COVID-19 mortality, accurate diagnosis of the infection stage and prediction of outcomes remain of clinical interest. Current advanced technology can automate the process and help identify those at higher risk of developing severe illness. This work explores and presents deep-learning-based schemes for predicting clinical outcomes in COVID-19-infected patients, using Visual Transformers and convolutional neural networks (CNNs) fed with a 3D data fusion of CT scan images and patients' clinical data. METHODS We report on the efficiency of Video Swin Transformers and several CNN models fed with the fusion datasets or CT scans only, versus a set of conventional classifiers fed with patients' clinical data only. A relatively large clinical dataset from 380 COVID-19-diagnosed patients was used to train and test the models. RESULTS The 3D Video Swin Transformer fed with the fusion dataset of 64 sectional CT scans plus 67 clinical labels outperformed all other approaches for predicting outcomes in COVID-19-infected patients (TPR = 0.95, FPR = 0.40, F0.5 score = 0.82, AUC = 0.77, Kappa = 0.6). CONCLUSIONS We demonstrate how our proposed novel 3D data fusion approach, concatenating CT scan images with patients' clinical data, can remarkably improve model performance in predicting COVID-19 infection outcomes. SIGNIFICANCE The findings indicate the possibility of predicting outcome severity using patients' CT images and clinical data collected at the time of hospital admission.
|
21
|
DeepAutoGlioma: a deep learning autoencoder-based multi-omics data integration and classification tools for glioma subtyping. BioData Min 2023; 16:32. [PMID: 37968655 PMCID: PMC10652591 DOI: 10.1186/s13040-023-00349-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2023] [Accepted: 11/06/2023] [Indexed: 11/17/2023] Open
Abstract
BACKGROUND AND OBJECTIVE The classification of glioma subtypes is essential for precision therapy. Due to the heterogeneity of gliomas, subtype-specific molecular patterns can be captured by integrating and analyzing high-throughput omics data from different genomic layers. The development of a deep-learning framework enables the integration of multi-omics data to classify glioma subtypes and support clinical diagnosis. RESULTS Transcriptome and methylome data of glioma patients were preprocessed, and differentially expressed features from both datasets were identified. Subsequently, a Cox regression analysis determined genes and CpGs associated with survival. Gene set enrichment analysis was carried out to examine the biological significance of the features. Further, we identified CpG-gene pairs by mapping CpGs to the promoter regions of the corresponding genes. The methylation and gene expression levels of these CpGs and genes were embedded in a lower-dimensional space with an autoencoder. Next, an artificial neural network (ANN) and a convolutional neural network (CNN) were used to classify subtypes using the latent features from the embedding space. The CNN performs better than the ANN for subtyping lower-grade glioma (LGG) and glioblastoma multiforme (GBM). The subtyping accuracy of the CNN was 98.03% (± 0.06) in LGG and 94.07% (± 0.01) in GBM. The precision of the models was 97.67% in LGG and 90.40% in GBM, and the sensitivity was 96.96% in LGG and 91.18% in GBM. Additionally, we observed the superior performance of the CNN on external datasets. The CpG-gene pairs used to develop the model performed better than random CpG-gene pairs, the preprocessed data, and single-omics data. CONCLUSIONS The current study shows that a novel feature selection and data integration strategy led to the development of DeepAutoGlioma, an effective framework for diagnosing glioma subtypes.
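The autoencoder embedding step above can be illustrated with a minimal linear autoencoder in NumPy, trained by plain gradient descent on a toy stand-in for the concatenated expression/methylation matrix. The data shape, latent size, and learning rate are assumptions for illustration, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for paired methylation/expression features:
# 100 samples x 20 concatenated features (the real matrices are far larger)
X = rng.normal(size=(100, 20))

d_in, d_latent = X.shape[1], 4
W_enc = rng.normal(scale=0.1, size=(d_in, d_latent))
W_dec = rng.normal(scale=0.1, size=(d_latent, d_in))

lr = 0.01
for _ in range(500):
    Z = X @ W_enc            # encode to the latent space
    X_hat = Z @ W_dec        # decode / reconstruct
    err = X_hat - X
    # Gradient steps on the mean squared reconstruction error
    g_dec = Z.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_enc -= lr * g_enc
    W_dec -= lr * g_dec

latent = X @ W_enc           # low-dimensional embedding fed to the ANN/CNN
```

A real implementation would use a nonlinear deep autoencoder; the point here is only that the classifier consumes `latent`, not the raw omics matrix.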
|
22
|
A self-attention model for cross-subject seizure detection. Comput Biol Med 2023; 165:107427. [PMID: 37683531 DOI: 10.1016/j.compbiomed.2023.107427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2023] [Revised: 08/03/2023] [Accepted: 08/28/2023] [Indexed: 09/10/2023]
Abstract
Epilepsy is a neurological disorder characterized by recurring seizures, which are detected by electroencephalography (EEG). Seizures in EEG signals can be identified through time-consuming manual analysis or, more recently, by automatic detection. The latter poses a significant challenge due to the high-dimensional and non-stationary nature of EEG signals. Recently, deep learning (DL) techniques have emerged as valuable tools for seizure detection. In this study, a novel data-driven model based on DL, incorporating a self-attention mechanism (SAT), is proposed. One notable advantage of the proposed method is its simplicity of application: the raw signal data are fed directly into the network without requiring expertise in signal processing. The model leverages a one-dimensional convolutional neural network (CNN) to extract relevant features from EEG signals. These features are then passed through a long short-term memory (LSTM) module, benefiting from its memory capabilities, together with a SAT mechanism. The key contribution of this paper lies in the addition of the SAT layer to the LSTM encoder, enabling enhanced exploration of the latent mapping during the encoding step. Cross-subject experiments showed good performance of this approach, with F1-scores of 97.8% and 92.7% for binary and five-class epileptic seizure recognition tasks, respectively, on the public UCI dataset, and 97.9% on the CHB-MIT database, surpassing state-of-the-art DL performance. Moreover, the proposed method exhibits robustness to inter-subject variability.
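The SAT layer added to the LSTM encoder can be sketched as scaled dot-product self-attention over the sequence of hidden states. This is a minimal NumPy illustration of the mechanism only; real implementations (and likely the paper's) learn separate query/key/value projections rather than reusing the states directly.

```python
import numpy as np

def self_attention(H):
    """Scaled dot-product self-attention over hidden states H (T x d).
    Each output row is a softmax-weighted mixture of all time steps,
    letting every step attend to the whole sequence."""
    d = H.shape[1]
    scores = H @ H.T / np.sqrt(d)                  # (T, T) similarities
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)              # softmax over time steps
    return w @ H                                   # context-weighted states

H = np.random.default_rng(1).normal(size=(50, 8))  # 50 steps, 8 LSTM units
context = self_attention(H)
```

Because each output is a convex combination of the inputs, the attended states stay in the range of the original hidden states while mixing in global temporal context.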
|
23
|
Effective automatic detection of anterior cruciate ligament injury using convolutional neural network with two attention mechanism modules. BMC Med Imaging 2023; 23:120. [PMID: 37697236 PMCID: PMC10494428 DOI: 10.1186/s12880-023-01091-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2022] [Accepted: 08/30/2023] [Indexed: 09/13/2023] Open
Abstract
BACKGROUND To develop a fully automated CNN-based detection system for ACL injury using magnetic resonance imaging (MRI), and to explore the feasibility of CNNs for ACL injury detection on MRI images. METHODS The study included 313 patients aged 16-65 years; the raw data comprise 368 images with an injured ACL and 100 images with an intact ACL. After expanding the data by flipping, rotation, scaling, and other augmentation methods, the final dataset contains 630 images: 355 of injured ACLs and 275 of intact ACLs. The proposed CNN model with two attention mechanism modules is trained and tested with fivefold cross-validation. RESULTS The performance of the proposed CNN model is evaluated using accuracy, precision, sensitivity, specificity, and F1 score, yielding 0.8063, 0.7741, 0.9268, 0.6509, and 0.8436, respectively. The average accuracy in the fivefold cross-validation is 0.8064, and the average area under the curve (AUC) for detecting an injured ACL is 0.8886. CONCLUSION We propose an effective and automatic CNN model to detect ACL injury from MRI of human knees. This model can help clinicians diagnose ACL injury, improving diagnostic efficiency and reducing misdiagnosis and missed diagnosis.
|
24
|
Deep Learning-based Assessment of Internal Carotid Artery Anatomy to Predict Difficult Intracranial Access in Endovascular Recanalization of Acute Ischemic Stroke. Clin Neuroradiol 2023; 33:783-792. [PMID: 36928398 PMCID: PMC10449951 DOI: 10.1007/s00062-023-01276-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2022] [Accepted: 02/03/2023] [Indexed: 03/18/2023]
Abstract
BACKGROUND Endovascular thrombectomy (EVT) duration is an important predictor of neurological outcome. It was recently shown that an internal carotid artery (ICA) angle of ≤ 90° is predictive of longer EVT duration. As manual angle measurement is nontrivial and time-consuming, deep learning (DL) could help identify difficult EVT cases in advance. METHODS We included 379 CT angiographies (CTA) of patients who underwent EVT between January 2016 and December 2020. Manual segmentation of 121 CTAs was performed for the aortic arch, common carotid artery (CCA), and ICA; these were used to train an nnUNet. The remaining 258 CTAs were segmented using the trained nnUNet with subsequent manual verification. Angles of the left and right ICAs were measured, resulting in two classes: acute angle ≤ 90° and > 90°. The segmentations together with the angle measurements were used to train a convolutional neural network (CNN) determining the ICA angle. Segmentation performance was evaluated using Dice scores, and classification was evaluated using AUC and accuracy. Associations between the ICA angle and procedural times were explored using medians and the Mann-Whitney U test. RESULTS Median EVT duration was 48 min for cases with an ICA angle > 90° and 64 min for those with ≤ 90° (p = 0.001). Segmentation evaluation showed Dice scores of 0.94 for the aorta and 0.86 for the CCA/ICA. Evaluation of the ICA angle determination resulted in an AUC of 0.92 and an accuracy of 0.85. CONCLUSION The association between ICA angle and EVT duration was verified, and a DL-based method for semi-automatic assessment with the potential for full automation was developed. More anatomical features of interest could be examined in a similar fashion.
|
25
|
An effective convolutional neural network for classification of benign and malignant breast and thyroid tumors from ultrasound images. Phys Eng Sci Med 2023; 46:995-1013. [PMID: 37195403 DOI: 10.1007/s13246-023-01262-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2022] [Accepted: 04/16/2023] [Indexed: 05/18/2023]
Abstract
Breast and thyroid cancers are the two most common cancers among women worldwide. Early clinical diagnosis of breast and thyroid cancers often utilizes ultrasonography. Most ultrasound images of breast and thyroid cancer lack specificity, which reduces the accuracy of clinical ultrasound diagnosis. This study attempts to develop an effective convolutional neural network (E-CNN) for classifying benign and malignant breast and thyroid tumors from ultrasound images. Two-dimensional (2D) ultrasound images of 1052 breast tumors were collected, and 8245 2D tumor images were obtained from 76 thyroid cases. We performed tenfold cross-validation on the breast and thyroid data, with mean classification accuracies of 0.932 and 0.902, respectively. In addition, the proposed E-CNN was applied to classify 9297 mixed images (breast and thyroid images), achieving a mean classification accuracy of 0.875 and a mean area under the curve (AUC) of 0.955. Based on data in the same modality, we transferred the breast model to classify typical tumor images of the 76 thyroid cases. The fine-tuned model achieved a mean classification accuracy of 0.945 and a mean AUC of 0.958. Meanwhile, the transferred thyroid model achieved a mean classification accuracy of 0.932 and a mean AUC of 0.959 on the 1052 breast tumor images. The experimental results demonstrate the ability of the E-CNN to learn the features of, and classify, breast and thyroid tumors, and suggest that transfer models within the same modality are a promising approach to classifying benign and malignant tumors from ultrasound images.
|
26
|
Conv-Swinformer: Integration of CNN and shift window attention for Alzheimer's disease classification. Comput Biol Med 2023; 164:107304. [PMID: 37549456 DOI: 10.1016/j.compbiomed.2023.107304] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2023] [Revised: 07/14/2023] [Accepted: 07/28/2023] [Indexed: 08/09/2023]
Abstract
Deep learning (DL) algorithms based on brain MRI images have achieved great success in the prediction of Alzheimer's disease (AD), with classification accuracy exceeding even that of the most experienced clinical experts. As a novel feature fusion method, the Transformer has achieved excellent performance in many computer vision tasks, which has also greatly promoted its application to medical images. However, when Transformers are used for 3D MRI feature fusion, existing DL models treat the input local features equally, which is inconsistent with the fact that adjacent voxels have stronger semantic connections than spatially distant ones. In addition, because medical image datasets are relatively small, it is difficult to capture local lesion features within limited iterative training when all input features are treated equally. This paper proposes a deep learning model, Conv-Swinformer, that focuses on extracting and integrating local fine-grained features. Conv-Swinformer consists of a CNN module and a Transformer encoder module. The CNN module summarizes the planar features of the MRI slices, and the Transformer module establishes semantic connections in 3D space for these planar features. By introducing the shifted-window attention mechanism in the Transformer encoder, attention is focused on small spatial areas of the MRI image, which effectively reduces unnecessary background semantic information and enables the model to capture local features more accurately. In addition, the layer-by-layer enlarged attention window further integrates local fine-grained features, enhancing the model's attention ability. Compared with DL algorithms that indiscriminately fuse local features of MRI images, Conv-Swinformer extracts local lesion features at a fine granularity, achieving better classification results.
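The core of the shifted-window mechanism is partitioning a feature map into small non-overlapping windows so that attention is computed only within each window. A 2D sketch of the partition step follows (Conv-Swinformer applies the idea to 3D features; the shift and window-merge steps are omitted here):

```python
import numpy as np

def window_partition(feat, win):
    """Split an (H, W, C) feature map into non-overlapping win x win
    windows. Attention run per-window then costs O(win^2) per token
    instead of O(H*W), which is the efficiency point of Swin-style
    attention."""
    H, W, C = feat.shape
    feat = feat.reshape(H // win, win, W // win, win, C)
    return feat.transpose(0, 2, 1, 3, 4).reshape(-1, win, win, C)

feat = np.arange(8 * 8 * 2).reshape(8, 8, 2)
windows = window_partition(feat, 4)    # 4 windows of shape (4, 4, 2)
```

Shifting the partition grid between successive layers (not shown) lets information flow across window boundaries.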
|
27
|
Overlapping filter bank convolutional neural network for multisubject multicategory motor imagery brain-computer interface. BioData Min 2023; 16:19. [PMID: 37434221 DOI: 10.1186/s13040-023-00336-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2022] [Accepted: 07/03/2023] [Indexed: 07/13/2023] Open
Abstract
BACKGROUND Motor imagery brain-computer interface (BCI) is a classic and promising BCI technology for achieving brain-computer integration. In motor imagery BCI, the operational frequency band of the EEG greatly affects the performance of the motor imagery EEG recognition model. However, because most algorithms use a single broad frequency band, the discriminative information in multiple sub-bands is not fully utilized. Thus, using convolutional neural networks (CNNs) to extract discriminative features from EEG signals of different frequency components is a promising method for multisubject EEG recognition. METHODS This paper presents a novel overlapping filter bank CNN to incorporate discriminative information from multiple frequency components in multisubject motor imagery recognition. Specifically, two overlapping filter banks, with a fixed low-cut frequency or a sliding low-cut frequency, are employed to obtain multiple frequency-component representations of the EEG signals. Multiple CNN models are then trained separately, and their output probabilities are integrated to determine the predicted EEG label. RESULTS Experiments were conducted with four popular CNN backbone models and three public datasets. The results showed that the overlapping filter bank CNN is efficient and universal in improving multisubject motor imagery BCI performance. Specifically, compared with the original backbone models, the proposed method improves the average accuracy by 3.69 percentage points, the F1 score by 0.04, and the AUC by 0.03. In addition, the proposed method performed best in comparisons with state-of-the-art methods. CONCLUSION The proposed overlapping filter bank CNN framework with a fixed low-cut frequency is an efficient and universal method for improving the performance of multisubject motor imagery BCI.
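The overlapping filter bank can be sketched by band-pass filtering the EEG into overlapping sub-bands, each of which would feed its own CNN before the output probabilities are averaged. The crude FFT filter and the specific band edges below are assumptions for illustration; the paper does not specify its filter design.

```python
import numpy as np

def fft_bandpass(x, fs, low, high):
    """Zero out spectral components outside [low, high] Hz and invert --
    a crude FFT band-pass filter standing in for a proper filter design."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    X[(freqs < low) | (freqs > high)] = 0
    return np.fft.irfft(X, n=len(x))

# Overlapping bank with a fixed 4 Hz low cut (the "fixed low-cut" variant);
# each sub-band signal would train a separate CNN whose probabilities are
# then combined for the final label.
bands = [(4, 8), (4, 16), (4, 24), (4, 32), (4, 40)]
fs = 250
t = np.arange(fs) / fs
eeg = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 50 * t)
sub_bands = [fft_bandpass(eeg, fs, lo, hi) for lo, hi in bands]
```

With this toy signal, the (4, 16) Hz sub-band keeps the 10 Hz component and removes the 50 Hz one.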
|
28
|
An Exploration into Human-Computer Interaction: Hand Gesture Recognition Management in a Challenging Environment. SN COMPUTER SCIENCE 2023; 4:441. [PMID: 37334142 PMCID: PMC10258789 DOI: 10.1007/s42979-023-01751-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/23/2022] [Accepted: 02/21/2023] [Indexed: 06/20/2023]
Abstract
Scientists are developing hand gesture recognition systems to enable authentic, efficient, and effortless human-computer interaction without additional gadgets, particularly for the speech-impaired community, which relies on hand gestures as its only mode of communication. Unfortunately, the speech-impaired community has been underrepresented in the majority of human-computer interaction research, such as natural language processing and other automation fields, which makes it more difficult for them to interact with systems and people through such advanced technology. The system's algorithm works in two phases. The first phase is region-of-interest segmentation, based on a color-space segmentation technique with a preset color range that separates the hand pixels (the region of interest) from the background (pixels outside the desired area of interest). The second phase feeds the segmented images into a convolutional neural network (CNN) model for image categorization; the Python Keras package was used for training. The system demonstrated the need for image segmentation in hand gesture recognition: the optimal model reaches an accuracy of 58%, about 10 percentage points higher than the accuracy obtained without image segmentation.
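The first-phase color-range segmentation can be sketched as a simple per-channel threshold mask. The channel bounds below are illustrative placeholders, not the paper's calibrated skin-color range (which would typically be tuned in a color space such as HSV or YCrCb).

```python
import numpy as np

def color_range_mask(image, lower, upper):
    """Keep pixels whose channels all fall inside [lower, upper] and zero
    everything else -- the color-space segmentation step that isolates the
    hand before CNN classification."""
    lower, upper = np.asarray(lower), np.asarray(upper)
    mask = np.all((image >= lower) & (image <= upper), axis=-1)
    segmented = np.where(mask[..., None], image, 0)
    return segmented, mask

# Toy 2x2 RGB image: two skin-like pixels, one dark and one white background
img = np.array([[[200, 150, 120], [10, 10, 10]],
                [[190, 140, 110], [250, 250, 250]]])
segmented, mask = color_range_mask(img, (150, 100, 90), (220, 180, 150))
print(mask.tolist())  # [[True, False], [True, False]]
```

The boolean mask doubles as a cheap visual check of whether the preset range actually captures the hand under the current lighting.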
|
29
|
Transfer learning-driven ensemble model for detection of diabetic retinopathy disease. Med Biol Eng Comput 2023:10.1007/s11517-023-02863-6. [PMID: 37296285 DOI: 10.1007/s11517-023-02863-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2023] [Accepted: 05/29/2023] [Indexed: 06/12/2023]
Abstract
In this study, we propose a transfer learning-driven ensemble model for the detection of diabetic retinopathy (DR). DR is an eye condition caused by diabetes: the retinal blood vessels of a person with high blood sugar deteriorate, and the vessels may swell and leak, or close and block the flow of blood. If DR is not treated, it can become severe, damage vision, and eventually result in blindness. Medical experts therefore study colored fundus photos to diagnose the disease manually, but this is an error-prone and time-consuming process. As a result, the condition has been identified automatically from retinal scans using a number of computer vision-based methods. In the transfer learning (TL) technique, a model is trained on one task or dataset, and the pre-trained models or weights are then applied to another task or dataset. Six deep learning (DL)-based convolutional neural network (CNN) models were trained in this study on large photo datasets, including DenseNet-169, VGG-19, ResNet101-V2, MobileNet-V2, and Inception-V3. We also applied a data-preprocessing strategy to improve accuracy and lower training costs. The experimental results demonstrate that the proposed model works better than existing approaches on the same dataset, with an accuracy of up to 98%, and detects the stage of DR.
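One plausible way to combine the fine-tuned CNNs into an ensemble is to average their class-probability outputs and take the argmax; the paper does not state its exact combination rule, so this sketch is an assumption.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average the softmax outputs of several fine-tuned CNNs and take
    the argmax -- a common soft-voting scheme for transfer-learned
    ensembles."""
    avg = np.mean(prob_list, axis=0)
    return int(np.argmax(avg)), avg

# Two stand-in models scoring one fundus image over three DR stages
p_model_a = [0.60, 0.30, 0.10]
p_model_b = [0.10, 0.60, 0.30]
stage, avg = ensemble_predict([p_model_a, p_model_b])
print(stage)  # 1
```

Soft voting lets a confident model outvote an uncertain one, unlike hard majority voting over labels.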
|
30
|
CNNSplice: Robust models for splice site prediction using convolutional neural networks. Comput Struct Biotechnol J 2023; 21:3210-3223. [PMID: 37304005 PMCID: PMC10250157 DOI: 10.1016/j.csbj.2023.05.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2023] [Revised: 05/25/2023] [Accepted: 05/28/2023] [Indexed: 06/13/2023] Open
Abstract
The identification of splice sites, the segments of an RNA gene where noncoding and coding sequences are connected in the 5' and 3' directions, is an essential post-transcriptional step for the annotation of functional genes and is required for the study and analysis of biological function in eukaryotic organisms through protein production and gene expression. Splice site detection tools have been proposed for this purpose; however, the models underlying these tools target specific use cases and are inefficient or typically untransferable between organisms. Here, we present CNNSplice, a set of deep convolutional neural network models for splice site prediction. Using five-fold cross-validation for model selection, we explore several models based on typical machine learning applications and propose five high-performing models to efficiently predict true and false splice sites in balanced and imbalanced datasets. Our evaluation results indicate that CNNSplice's models achieve better performance than existing methods across five organisms' datasets. In addition, our generality test shows the ability of CNNSplice's models to predict and annotate splice sites in new or poorly trained genome datasets, indicating a broad application spectrum. CNNSplice demonstrates improved prediction, interpretability, and generalizability on genomic datasets compared to existing splice site prediction tools. We have developed a web server for the CNNSplice algorithm, publicly accessible at http://www.cnnsplice.online.
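CNN splice-site models typically consume DNA windows as one-hot matrices. A sketch of that standard preprocessing step follows; CNNSplice's exact encoding and window length may differ.

```python
import numpy as np

BASES = {"A": 0, "C": 1, "G": 2, "T": 3}

def one_hot(seq):
    """One-hot encode a DNA window into a 4 x L matrix, the usual input
    representation for convolutional splice-site models. Unknown bases
    (e.g. N) are left as all-zero columns."""
    m = np.zeros((4, len(seq)), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        if base in BASES:
            m[BASES[base], i] = 1.0
    return m

# A donor-site-like window: exon...GT...intron (GT is the canonical 5' motif)
print(one_hot("ACGTGTAA").shape)  # (4, 8)
```

The 4 x L matrix is then treated like a single-channel image, so 1D convolutions slide over positions while spanning all four base channels.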
|
31
|
Diagnosis of COVID-19 from blood parameters using convolutional neural network. Soft comput 2023; 27:1-16. [PMID: 37362276 PMCID: PMC10225057 DOI: 10.1007/s00500-023-08508-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/10/2023] [Indexed: 06/28/2023]
Abstract
Asymptomatic presentation of COVID-19 complicates the detection of infected individuals. Additionally, the virus mutates into many genomic variants, which increases its ability to spread. Because no specific treatment for COVID-19 has been available in the short term, the essential goal is to reduce the virulence of the disease. Blood parameters, which contain essential clinical information about infectious diseases and are easy to access, have an important place in COVID-19 detection. The convolutional neural network (CNN) architecture, which is popular in image processing, produces highly successful results for COVID-19 detection models. When the literature is examined, it is seen that COVID-19 studies with CNNs are generally done using lung images. In this study, one-dimensional (1D) blood parameter data were converted into two-dimensional (2D) image data after preprocessing, and COVID-19 detection was performed with a CNN. The t-distributed stochastic neighbor embedding method was applied to transfer the feature vectors to the 2D plane. All data were framed with convex hull and minimum bounding rectangle algorithms to obtain image data. The image data obtained by pixel mapping was presented to the developed 3-line CNN architecture. This study proposes an effective model that combines low-cost, rapidly accessible blood parameters with a CNN architecture, making image data processing highly successful for COVID-19 detection. Ultimately, COVID-19 detection was achieved with a success rate of 94.85%. This study brings a new perspective to COVID-19 detection by obtaining 2D image data from 1D blood parameters and using a CNN.
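A hedged sketch of the pixel-mapping step only (the paper's t-SNE, convex hull, and bounding-rectangle details are not reproduced here): given a fixed 2D layout for the blood parameters, e.g. produced by t-SNE, each sample's parameter values are rasterized onto an image grid for the CNN. The grid size and layout below are illustrative.

```python
import numpy as np

def to_image(coords: np.ndarray, values: np.ndarray, size: int = 8) -> np.ndarray:
    """coords: (n_features, 2) layout shared by all samples (e.g. from t-SNE);
       values: (n_features,) one sample's blood parameter values."""
    img = np.zeros((size, size), dtype=np.float32)
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    scaled = (coords - lo) / np.maximum(hi - lo, 1e-9)   # normalize into [0, 1]
    px = np.minimum((scaled * size).astype(int), size - 1)
    for (x, y), v in zip(px, values):
        img[y, x] = v   # one pixel per feature
    return img

rng = np.random.default_rng(0)
layout = rng.normal(size=(10, 2))     # stand-in for a t-SNE feature layout
img = to_image(layout, rng.random(10))
print(img.shape)  # (8, 8)
```

Every sample shares the same feature layout, so the CNN can learn spatial patterns across parameters.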
|
32
|
Hierarchical neural network with efficient selection inference. Neural Netw 2023; 161:535-549. [PMID: 36812830 DOI: 10.1016/j.neunet.2023.02.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2022] [Revised: 10/28/2022] [Accepted: 02/09/2023] [Indexed: 02/17/2023]
Abstract
Image classification precision has been vastly enhanced by the growing complexity of convolutional neural network (CNN) structures. However, the uneven visual separability between categories leads to various difficulties in classification. The hierarchical structure of categories can be leveraged to deal with this, but few CNNs pay attention to this characteristic of the data. Besides, a network model with a hierarchical structure is more promising for extracting specific features from the data than current CNNs, since, for the latter, all categories share the same fixed number of layers for feed-forward computation. In this paper, we propose to use category hierarchies to integrate ResNet-style modules into a hierarchical network model in a top-down manner. To extract abundant discriminative features and improve computational efficiency, we adopt residual block selection based on coarse categories to allocate different computation paths. Each residual block works as a switch that determines the JUMP or JOIN mode for an individual coarse category. Interestingly, since some categories need less feed-forward computation than others by jumping layers, the average inference time is reduced. Extensive experiments show that our hierarchical network achieves higher prediction accuracy with similar FLOPs on the CIFAR-10, CIFAR-100, SVHN, and Tiny-ImageNet datasets compared to the original residual networks and other existing selection inference methods.
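A toy sketch of the JUMP/JOIN mechanics as the abstract describes them (assumed behavior, not the paper's code): each residual block is gated per coarse category, so JOIN applies the residual branch while JUMP skips it, and easy categories spend fewer FLOPs at inference.

```python
import numpy as np

def residual_block(x, w):
    return x + np.tanh(x @ w)          # simplified residual branch

def forward(x, weights, modes):
    """modes[i] is 'JOIN' or 'JUMP' for block i, chosen per coarse category."""
    flops = 0
    for w, mode in zip(weights, modes):
        if mode == "JOIN":
            x = residual_block(x, w)
            flops += w.size            # crude per-block cost proxy
        # 'JUMP': identity, no compute
    return x, flops

rng = np.random.default_rng(1)
ws = [rng.normal(scale=0.1, size=(4, 4)) for _ in range(3)]
x = rng.normal(size=(1, 4))
_, full = forward(x, ws, ["JOIN"] * 3)
_, cheap = forward(x, ws, ["JOIN", "JUMP", "JUMP"])
print(full, cheap)  # 48 16
```

Averaging this cost over categories is what reduces mean inference time in the paper's setting.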
|
33
|
Cross-modal guiding and reweighting network for multi-modal RSVP-based target detection. Neural Netw 2023; 161:65-82. [PMID: 36736001 DOI: 10.1016/j.neunet.2023.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2022] [Revised: 10/31/2022] [Accepted: 01/11/2023] [Indexed: 01/17/2023]
Abstract
Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interfaces (BCIs) facilitate the high-throughput detection of rare target images by detecting evoked event-related potentials (ERPs). At present, the decoding accuracy of RSVP-based BCI systems limits their practical application. This study introduces eye movements (gaze and pupil information), referred to as the EYE modality, as another useful source of information to combine with an EEG-based BCI, forming a novel system for detecting target images in RSVP tasks. We performed an RSVP experiment, recorded EEG signals and eye movements simultaneously during a target detection task, and constructed a multi-modal dataset including 20 subjects. We also propose a cross-modal guiding and fusion network to fully utilize the EEG and EYE modalities and fuse them for better RSVP decoding performance. In this network, a two-branch backbone extracts features from the two modalities. A Cross-Modal Feature Guiding (CMFG) module guides EYE-modality features to complement the EEG modality for better feature extraction. A Multi-scale Multi-modal Reweighting (MMR) module enhances the multi-modal features by exploring intra- and inter-modal interactions. Finally, a Dual Activation Fusion (DAF) module modulates the enhanced multi-modal features for effective fusion. Our proposed network achieved a balanced accuracy of 88.00% (±2.29) on the collected dataset. Ablation studies and visualizations revealed the effectiveness of the proposed modules. This work demonstrates the value of introducing the EYE modality in RSVP tasks, and the proposed network is a promising method for RSVP decoding that further improves the performance of RSVP-based target detection systems.
|
34
|
Model-based and model-free deep features fusion for high performed human gait recognition. THE JOURNAL OF SUPERCOMPUTING 2023; 79:1-38. [PMID: 37359324 PMCID: PMC10024915 DOI: 10.1007/s11227-023-05156-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Accepted: 03/01/2023] [Indexed: 06/28/2023]
Abstract
In the last decade, the need for a non-contact biometric model for recognizing candidates has increased, especially after the COVID-19 pandemic appeared and spread worldwide. This paper presents a novel deep convolutional neural network (CNN) model that guarantees quick, safe, and precise human authentication via pose and walking style. A concatenated fusion of the proposed CNN and a fully connected model has been formulated, utilized, and tested. The proposed CNN extracts human features from two main sources: (1) human silhouette images (model-free) and (2) human joints, limbs, and static joint distances (model-based), via a novel fully connected deep-layer structure. The most commonly used gait dataset family, CASIA, has been utilized for testing. Numerous performance metrics have been evaluated to measure system quality, including accuracy, specificity, sensitivity, false negative rate, and training time. Experimental results reveal that the proposed model enhances recognition performance in a superior manner compared with the latest state-of-the-art studies. Moreover, the suggested system provides robust real-time authentication under any covariate conditions, scoring 99.8% and 99.6% accuracy on the CASIA-B and CASIA-A datasets, respectively.
|
35
|
VGG-TSwinformer: Transformer-based deep learning model for early Alzheimer's disease prediction. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 229:107291. [PMID: 36516516 DOI: 10.1016/j.cmpb.2022.107291] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/06/2022] [Revised: 11/22/2022] [Accepted: 11/28/2022] [Indexed: 06/17/2023]
Abstract
BACKGROUND AND OBJECTIVE Mild cognitive impairment (MCI) is a transitional state between normal aging and Alzheimer's disease (AD), and accurately predicting the progression trend of MCI is critical to the early prevention and treatment of AD. Brain structural magnetic resonance imaging (sMRI), one of the most important biomarkers for the diagnosis of AD, has been applied in various deep learning models. However, due to the inherent disadvantage of deep learning in dealing with longitudinal medical image data, there are few applications of deep learning to the longitudinal analysis of MCI, and the majority of existing deep learning algorithms for MCI progression prediction rely on analyzing sMRI images collected at a single time point, ignoring the progressive nature of the disorder. METHODS In this work, we propose a VGG-TSwinformer model based on a convolutional neural network (CNN) and Transformer for short-term longitudinal study of MCI. In this model, a VGG-16-based CNN extracts low-level spatial features from longitudinal sMRI images and maps them to high-level feature representations; sliding-window attention performs fine-grained fusion of spatially adjacent feature representations and gradually fuses distant ones through the superposition of attention windows of different sizes; and temporal attention measures how these feature representations evolve as the disease progresses. RESULTS We validated our model on the ADNI dataset. For the classification task of sMCI vs. pMCI, accuracy, sensitivity, specificity, and AUC reached 77.2%, 79.97%, 71.59%, and 0.8153, respectively. Compared with other cross-sectional studies also applied to sMRI, the proposed model achieved better results in terms of accuracy, sensitivity, and AUC.
CONCLUSION The proposed VGG-TSwinformer is a deep learning model for short-term longitudinal study of MCI that can build a brain atrophy progression model from longitudinal sMRI images and improve diagnostic efficiency compared with algorithms using only cross-sectional sMRI images.
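An illustrative numpy sketch of the windowed self-attention idea the model builds on (window sizes and shapes are made up; this is not the VGG-TSwinformer implementation): attention is computed only within local windows of the feature sequence, and applying larger windows on top fuses progressively more distant features.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def windowed_attention(tokens: np.ndarray, window: int) -> np.ndarray:
    """tokens: (n, d); self-attention restricted to non-overlapping windows."""
    n, d = tokens.shape
    out = np.empty_like(tokens)
    for s in range(0, n, window):
        w = tokens[s:s + window]
        scores = softmax(w @ w.T / np.sqrt(d))   # scaled dot-product attention
        out[s:s + window] = scores @ w
    return out

rng = np.random.default_rng(2)
x = rng.normal(size=(8, 4))
y = windowed_attention(x, window=2)   # fine-grained, local fusion
z = windowed_attention(y, window=4)   # a larger window fuses farther features
print(y.shape, z.shape)  # (8, 4) (8, 4)
```

Stacking windows of increasing size is the mechanism by which distant spatial features end up interacting without the quadratic cost of full global attention.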
|
36
|
Automatic identification of harmful algae based on multiple convolutional neural networks and transfer learning. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:15311-15324. [PMID: 36169848 DOI: 10.1007/s11356-022-23280-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/12/2021] [Accepted: 09/22/2022] [Indexed: 06/16/2023]
Abstract
The monitoring of harmful phytoplankton is very important for maintaining the aquatic ecological environment. Traditional algae monitoring methods require professionals with substantial experience in algae species and are time-consuming, expensive, and limited in practice. In this work, the automatic classification of algae cell images and the identification of harmful phytoplankton images were realized by combining multiple convolutional neural networks (CNNs) with deep learning techniques based on transfer learning. Eleven common harmful and 31 harmless phytoplankton genera were collected as input samples; five CNN classification models (AlexNet, VGG16, GoogLeNet, ResNet50, and MobileNetV2) were fine-tuned to automatically classify phytoplankton images, and the average accuracy improved by 11.9% compared to models without fine-tuning. To monitor harmful phytoplankton, which can cause red tides or produce toxins that severely pollute drinking water, a new identification method that combines the recognition results of the five CNN models was proposed, and the recall rate reached 98.0%. The experimental results validate that the recognition performance for harmful phytoplankton can be significantly improved by transfer learning, and that the proposed identification method is effective in the preliminary screening of harmful phytoplankton and greatly reduces the workload of professional personnel.
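One plausible way to combine five models' outputs for high recall on the harmful class (the abstract does not spell out the exact combination rule, so this is an assumed OR-style rule with made-up genus names): flag a cell image as harmful if any model predicts a harmful genus, trading some precision for the recall a screening task needs.

```python
# Hedged sketch of a recall-oriented combination rule, not the paper's code.
def harmful_vote(predictions, harmful_genera):
    """predictions: one predicted genus per CNN model; True if any is harmful."""
    return any(p in harmful_genera for p in predictions)

# Illustrative genus sets (Microcystis/Anabaena are known harmful cyanobacteria).
HARMFUL = {"Microcystis", "Anabaena", "Alexandrium"}

print(harmful_vote(["Chlorella", "Microcystis", "Chlorella",
                    "Chlorella", "Chlorella"], HARMFUL))  # True
print(harmful_vote(["Chlorella"] * 5, HARMFUL))           # False
```

A single dissenting model is enough to route the sample to a human expert, which is exactly what preliminary screening wants.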
|
37
|
ChestX-Ray6: Prediction of multiple diseases including COVID-19 from chest X-ray images using convolutional neural network. EXPERT SYSTEMS WITH APPLICATIONS 2023; 211:118576. [PMID: 36062267 PMCID: PMC9420006 DOI: 10.1016/j.eswa.2022.118576] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/18/2021] [Revised: 08/10/2022] [Accepted: 08/13/2022] [Indexed: 05/27/2023]
Abstract
In the last few decades, several epidemic diseases have emerged. In some cases, doctors and medical physicians face difficulties in identifying these diseases correctly. A machine can perform some of these identification tasks more accurately than a human if it is trained correctly. Over time, the amount of medical data is increasing; a machine can analyze this data and extract knowledge from it, which can help doctors and medical physicians. This study proposed a lightweight convolutional neural network (CNN) named ChestX-ray6 that automatically detects pneumonia, COVID-19, cardiomegaly, lung opacity, and pleural from digital chest X-ray images. Multiple databases were combined, containing 9,514 chest X-ray images of normal patients and the other five diseases. The lightweight ChestX-ray6 model achieved an accuracy of 80% for the detection of six classes. The ChestX-ray6 model was then saved and used for binary classification of normal and pneumonia patients to reveal the model's generalization power. The pre-trained ChestX-ray6 model achieved an accuracy and recall of 97.94% and 98% for binary classification, which outperforms the state-of-the-art (SOTA) models.
|
38
|
Learning for retinal image quality assessment with label regularization. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 228:107238. [PMID: 36423485 DOI: 10.1016/j.cmpb.2022.107238] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/26/2022] [Revised: 10/03/2022] [Accepted: 11/08/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Assessing image quality is crucial before the computer-aided diagnosis of fundus images, and the task is very challenging. Firstly, graders' subjective judgments of image quality lead to ambiguous labels. Secondly, despite being treated as classification in existing works, grading has regression properties that cannot be ignored. Solving the ambiguity and regression problems in the label space, and extracting discriminative features, are therefore the keys to quality assessment. METHODS In this paper, we propose a framework based on deep convolutional neural networks that assesses the quality of fundus images accurately and reasonably. Drawing on the experience of human graders, a dual-path convolutional neural network with attention blocks is designed to better extract discriminative features and present the bases for its decisions. Label smoothing and cost-sensitive regularization are designed to solve the label ambiguity problem and the potential regression problem, respectively. In addition, we annotated a large number of images to further improve the results. RESULTS We conducted our experiments on the largest retinal image quality assessment dataset, with 28,792 retinal images. Our approach achieves 0.8868 precision, 0.8786 recall, 0.8820 F1, and a 0.9138 Kappa score. Results show that our approach outperforms state-of-the-art methods. CONCLUSIONS The promising performance reveals that our methods are beneficial to retinal image quality assessment and have potential in other grading tasks.
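A minimal sketch of the label-smoothing idea used here for ambiguous quality grades (the epsilon value is illustrative, not taken from the paper): the hard one-hot target is mixed with a uniform distribution so the network is not forced to be overconfident on subjectively graded images.

```python
import numpy as np

def smooth_labels(y: np.ndarray, n_classes: int, eps: float = 0.1) -> np.ndarray:
    """Mix one-hot targets with a uniform distribution: (1-eps)*onehot + eps/K."""
    one_hot = np.eye(n_classes)[y]
    return one_hot * (1.0 - eps) + eps / n_classes

targets = smooth_labels(np.array([0, 2]), n_classes=3, eps=0.1)
print(targets.round(4))
# [[0.9333 0.0333 0.0333]
#  [0.0333 0.0333 0.9333]]
```

Each row still sums to 1, so the smoothed targets drop straight into a cross-entropy loss.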
|
39
|
Utilizing an Emotional Robot Capable of Lip-Syncing in Robot-Assisted Speech Therapy Sessions for Children with Language Disorders. Int J Soc Robot 2023; 15:165-183. [PMID: 36467283 PMCID: PMC9684761 DOI: 10.1007/s12369-022-00946-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/04/2022] [Indexed: 11/24/2022]
Abstract
This study scrutinizes the impact of utilizing a socially assistive robot, the RASA robot, during speech therapy sessions for children with language disorders. Two capabilities were developed for the robotic platform to enhance child-robot interaction during speech therapy interventions: facial expression communication (comprising recognition and expression) and lip-syncing. Facial expression recognition was implemented by training several well-known CNN architectures on one of the most extensive facial expression databases, AffectNet, and then adapting them via transfer learning on the CK+ dataset. The robot's lip-syncing capability was designed in two steps. The first step involved designing precise schemes of the articulatory elements needed to pronounce the Persian phonemes (i.e., consonants and vowels). The second step involved developing an algorithm to pronounce words by disassembling them into their components (consonants and vowels) and then morphing these into each other successively. To pursue the study's primary goal, two comparable groups of children with language disorders were considered: an intervention group and a control group. The intervention group attended therapy sessions in which the robot acted as the therapist's assistant, while the control group communicated only with the human therapist. The study's first purpose was to compare the children's engagement while playing a mimic game with the affective robot versus the therapist, assessed via video coding. The second objective was to assess the efficacy of the robot's presence in the speech therapy sessions alongside the therapist, accomplished by administering the Persian Test of Language Development (Persian TOLD). Under the first scenario, playing with the affective robot proved more engaging than playing with the therapist. Furthermore, statistical analysis of the study's results indicates that participating in robot-assisted speech therapy (RAST) sessions improves the achievements of children with language disorders compared with conventional speech therapy interventions.
|
40
|
A novel GCL hybrid classification model for paddy diseases. INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY : AN OFFICIAL JOURNAL OF BHARATI VIDYAPEETH'S INSTITUTE OF COMPUTER APPLICATIONS AND MANAGEMENT 2023; 15:1127-1136. [PMID: 36159716 PMCID: PMC9484355 DOI: 10.1007/s41870-022-01094-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/01/2022] [Accepted: 09/06/2022] [Indexed: 11/25/2022]
Abstract
The demand for agricultural products has increased exponentially as the global population has grown. The rapid development of computer-vision-based artificial intelligence and deep learning technologies has impacted a wide range of industries, including disease detection and classification. This paper introduces a novel neural-network-based hybrid model (GCL), a dataset-augmenting fusion of a generative adversarial network (GAN), a convolutional neural network (CNN), and long short-term memory (LSTM): the GAN augments the dataset, the CNN extracts features, and the LSTM classifies the various paddy diseases. The GCL model aims to improve the classification model's accuracy and reliability. The dataset was compiled from secondary resources such as Mendeley, Kaggle, UCI, and GitHub, with images of bacterial blight, leaf smut, and rice blast. The experimental setup demonstrates that GCL is suitable for disease classification, achieving 97% testing accuracy, and can further be extended to classify more paddy diseases.
|
41
|
Automatic lesion detection and segmentation in 18F-flutemetamol positron emission tomography images using deep learning. Biomed Eng Online 2022; 21:88. [PMID: 36539779 PMCID: PMC9768895 DOI: 10.1186/s12938-022-01058-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2022] [Accepted: 12/07/2022] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Beta amyloid in the brain, originally confirmed only by post-mortem examination, can now be confirmed in living patients using amyloid positron emission tomography (PET) tracers, and diagnostic accuracy can be improved by confirming beta amyloid plaques in patients. Amyloid deposition in the brain is often associated with the expression of dementia. Hence, it is important to identify anatomically and functionally meaningful areas of the human cortical surface using PET to diagnose the possibility of developing dementia. In this study, we demonstrated the validity of automated 18F-flutemetamol PET lesion detection and segmentation based on a complete 2D U-Net convolutional neural network via masking treatment strategies. METHODS PET data were first normalized by volume and divided into five amyloid accumulation zones through axial, coronal, and sagittal slices. A single U-Net was trained on the divided dataset for each of these zones. Ground-truth segmentations were obtained by manual delineation and thresholding (1.5 × background). RESULTS The following intersection-over-union values were obtained for the various slices in the verification dataset: frontal lobe axial/sagittal: 0.733/0.804; posterior cingulate cortex and precuneus coronal/sagittal: 0.661/0.726; lateral temporal lobe axial/coronal: 0.864/0.892; parietal lobe axial/coronal: 0.542/0.759; and striatum axial/sagittal: 0.679/0.752. The U-Net convolutional neural network architecture allowed fully automated 2D segmentation of the 18F-flutemetamol PET brain images of Alzheimer's patients. CONCLUSIONS As dementia should be tested and evaluated in various ways, there is a need for artificial intelligence programs. This study can serve as a reference for future research on auxiliary tools for Alzheimer's diagnosis.
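Two evaluation pieces named in the abstract, sketched in plain numpy on toy data: a ground-truth mask from thresholding at 1.5 × the background uptake, and the intersection-over-union score used to verify the U-Net masks.

```python
import numpy as np

def threshold_mask(img: np.ndarray, background: float) -> np.ndarray:
    """Ground-truth mask: voxels above 1.5 x the background uptake."""
    return img > 1.5 * background

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

img = np.array([[1.0, 2.0],
                [0.5, 4.0]])                    # toy uptake values
gt = threshold_mask(img, background=1.0)        # uptake > 1.5
pred = np.array([[False, True],
                 [True,  True]])                # toy U-Net prediction
print(iou(gt, pred))  # 2 overlapping pixels / 3 union pixels
```

The same IoU computation, applied per zone and per slice orientation, yields the table of verification scores the study reports.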
|
42
|
A generalized framework of feature learning enhanced convolutional neural network for pathology-image-oriented cancer diagnosis. Comput Biol Med 2022; 151:106265. [PMID: 36401968 DOI: 10.1016/j.compbiomed.2022.106265] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2022] [Revised: 10/24/2022] [Accepted: 10/30/2022] [Indexed: 11/11/2022]
Abstract
In this paper, a feature learning enhanced convolutional neural network (FLE-CNN) is proposed for cancer detection from histopathology images. To build a highly generalized computer-aided diagnosis (CAD) system, an information refinement unit employing depth- and point-wise convolutions is meticulously designed, in which a dual-domain attention mechanism focuses primarily on the important areas. By deploying a residual fusion unit, context information is further integrated to extract highly discriminative features with strong representation ability. Experimental results demonstrate the merits of the proposed FLE-CNN in terms of feature extraction: it achieved average sensitivity, specificity, precision, accuracy, and F1 score of 0.9992, 0.9998, 0.9992, 0.9997, and 0.9992 in a five-class cancer detection task, improving on other advanced deep learning models by 1.23%, 0.31%, 1.24%, 0.5%, and 1.26% on these indicators, respectively. Moreover, the proposed FLE-CNN provides satisfactory results in three important diagnostic tasks, which further validates that FLE-CNN is a competitive CAD model with high generalization ability.
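A hedged numpy sketch of the depth-wise plus point-wise convolution pair the information refinement unit employs (shapes and data are illustrative, not taken from the paper): a depth-wise 3×3 filter processes each channel separately, then a 1×1 point-wise convolution mixes channels, at far lower cost than a full convolution.

```python
import numpy as np

def depthwise(x: np.ndarray, k: np.ndarray) -> np.ndarray:
    """x: (C, H, W); k: (C, 3, 3); 'valid' 3x3 convolution applied per channel."""
    C, H, W = x.shape
    out = np.zeros((C, H - 2, W - 2))
    for c in range(C):
        for i in range(H - 2):
            for j in range(W - 2):
                out[c, i, j] = (x[c, i:i + 3, j:j + 3] * k[c]).sum()
    return out

def pointwise(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """x: (C, H, W); w: (C_out, C); a 1x1 convolution is just channel mixing."""
    return np.tensordot(w, x, axes=([1], [0]))

rng = np.random.default_rng(3)
x = rng.normal(size=(4, 6, 6))                  # toy feature map
y = depthwise(x, rng.normal(size=(4, 3, 3)))    # per-channel spatial filtering
z = pointwise(y, rng.normal(size=(8, 4)))       # cross-channel mixing
print(y.shape, z.shape)  # (4, 4, 4) (8, 4, 4)
```

Splitting spatial filtering from channel mixing is what makes the refinement unit cheap relative to a standard convolution with the same receptive field.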
|
43
|
The prediction of cardiac abnormality and enhancement in minority class accuracy from imbalanced ECG signals using modified deep neural network models. Comput Biol Med 2022; 150:106142. [PMID: 36182760 DOI: 10.1016/j.compbiomed.2022.106142] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Revised: 09/04/2022] [Accepted: 09/18/2022] [Indexed: 11/03/2022]
Abstract
Cardiovascular disease (CVD) is the most fatal disease in the world, so its accurate and automated detection in the early stages will certainly support medical experts in timely diagnosis and treatment, which can save many lives. Much research has been carried out in this regard, but due to the problem of data imbalance in the medical and healthcare sector, existing approaches may not provide the desired results in all aspects. To overcome this problem, a sequential ensemble technique has been proposed that detects six types of cardiac arrhythmias on large imbalanced ECG datasets; the data imbalance issue is addressed with a hybrid resampling technique called "Synthetic Minority Oversampling Technique and Tomek Link (SMOTE + Tomek)". The sequential ensemble technique employs two distinct deep learning models: a Convolutional Neural Network (CNN) and a hybrid model, CNN with Long Short-Term Memory Network (CNN-LSTM). The two standard datasets, the "MIT-BIH arrhythmia database" (MITDB) and the "PTB diagnostic database" (PTBDB), were combined, and 23,998 ECG beats were extracted for model validation. In this work, the three models (CNN, CNN-LSTM, and the ensemble approach) were tested on four kinds of ECG datasets: the original (imbalanced) data, data resampled with random oversampling, data resampled with SMOTE, and data resampled with the SMOTE + Tomek algorithm. The highest overall accuracy, 99.02%, was obtained by the ensemble technique on the SMOTE + Tomek-resampled dataset, and the minority-class accuracy (recall) improved by 20% compared to the imbalanced data.
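A hedged numpy reimplementation of SMOTE's core idea only (the study presumably uses an off-the-shelf SMOTE + Tomek pipeline; the Tomek-link cleaning step is omitted here): synthesize minority-class samples by interpolating between a minority point and one of its minority-class nearest neighbours.

```python
import numpy as np

def smote_like(X_min: np.ndarray, n_new: int, k: int = 2, seed: int = 0) -> np.ndarray:
    """Generate n_new synthetic samples from minority-class points X_min."""
    rng = np.random.default_rng(seed)
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]          # k nearest minority neighbours
        j = rng.choice(nbrs)
        lam = rng.random()                     # interpolation factor in [0, 1)
        synth.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synth)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # toy minority beats
new = smote_like(X_min, n_new=4)
print(new.shape)  # (4, 2)
```

Each synthetic point lies on a segment between two real minority samples, which is why SMOTE densifies the minority region rather than duplicating points the way random oversampling does.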
|
44
|
Reconstruction of missing spring discharge by using deep learning models with ensemble empirical mode decomposition of precipitation. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:82451-82466. [PMID: 35751724 DOI: 10.1007/s11356-022-21597-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/11/2022] [Accepted: 06/16/2022] [Indexed: 06/15/2023]
Abstract
A continuous and complete spring discharge record is critical to understanding the hydrodynamic behavior of karst aquifers and the variability of freshwater resources. However, due to equipment errors, failures of observation, and other reasons, missing data is a common problem for spring discharge monitoring and for further hydrological investigation and data analysis. In this study, a novel approach that integrates deep learning algorithms and ensemble empirical mode decomposition (EEMD) is proposed to reconstruct missing spring discharge data from a given local precipitation record. Using EEMD, the local precipitation data is decomposed into several intrinsic mode functions (IMFs), from high to low frequency, plus a residual function, which serve as the input of convolutional neural network (CNN), long short-term memory (LSTM), and hybrid CNN-LSTM models to reconstruct the missing discharge data. Evaluation metrics, including root mean squared error (RMSE), mean absolute error (MAE), and the Nash-Sutcliffe efficiency coefficient (NSE), are calculated to evaluate reconstruction performance. Monthly spring discharge and precipitation data from March 1978 to October 2021, collected at Barton Springs in Texas, are used for validation and evaluation of the newly proposed deep learning models. The results indicate that deep learning models coupled with EEMD outperform the models without EEMD and significantly improve the reconstruction results. The LSTM-EEMD model obtains the best reconstruction results among the three deep learning algorithms. For models with monthly data, the missing rate affects reconstruction performance because of the limited number of data samples: the best results are achieved when the missing rate is low, and reconstruction becomes notably poorer when the missing rate reaches 50%. However, when daily precipitation and discharge data are used, the models obtain satisfactory reconstruction results with missing rates ranging from 10% to 50%.
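The three reconstruction metrics the study reports, in plain numpy (toy data for illustration):

```python
import numpy as np

def rmse(obs, sim):
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def mae(obs, sim):
    return float(np.mean(np.abs(obs - sim)))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect; <= 0 is no better than the mean."""
    return float(1 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

obs = np.array([2.0, 4.0, 6.0, 8.0])   # toy observed discharge
sim = np.array([2.5, 3.5, 6.5, 7.5])   # toy reconstructed discharge
print(rmse(obs, sim), mae(obs, sim), round(nse(obs, sim), 3))  # 0.5 0.5 0.95
```

Unlike RMSE and MAE, NSE is scale-free relative to the variance of the observations, which is why it is the standard skill score in hydrology.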
|
45
|
EnsembleSplice: ensemble deep learning model for splice site prediction. BMC Bioinformatics 2022; 23:413. [PMID: 36203144 PMCID: PMC9535948 DOI: 10.1186/s12859-022-04971-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2022] [Accepted: 09/29/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Identifying splice site regions is an important step in the genomic DNA sequencing pipelines of biomedical and pharmaceutical research. Within this research purview, efficient and accurate splice site detection is highly desirable, and a variety of computational models have been developed toward this end. Neural network architectures have recently been shown to outperform classical machine learning approaches for the task of splice site prediction. Despite these advances, there is still considerable potential for improvement, especially regarding model prediction accuracy and error rate. RESULTS Given these deficits, we propose EnsembleSplice, an ensemble learning architecture combining four (4) distinct convolutional neural network (CNN) architectures that outperforms existing splice site detection methods on the experimental evaluation metrics considered, including accuracy and error rate. We trained and tested a variety of ensembles made up of CNNs and DNNs using five-fold cross-validation to identify the model that performed best across the evaluation and diversity metrics. As a result, we developed our diverse and highly effective splice site (SS) detection model, which we evaluated using two (2) genomic Homo sapiens datasets and an Arabidopsis thaliana dataset. On the Homo sapiens datasets, EnsembleSplice achieved accuracies of 94.16% for acceptor splice sites and 95.97% for donor splice sites, with corresponding error rates of 5.84% and 4.03%. CONCLUSIONS Our five-fold cross-validation ensured that the prediction accuracy of our models is consistent. For reproducibility, all the datasets used, models generated, and results in our work are publicly available in our GitHub repository: https://github.com/OluwadareLab/EnsembleSplice.
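The five-fold cross-validation protocol used for model selection can be sketched in plain Python (a generic k-fold split, not EnsembleSplice's code): each fold serves once as the held-out test set while the remaining folds train the candidate ensemble.

```python
def k_fold_indices(n: int, k: int = 5):
    """Yield (train_indices, test_indices) for k contiguous folds over n samples."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for s in sizes:
        folds.append(list(range(start, start + s)))
        start += s
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(k_fold_indices(10, k=5))
print(len(splits))    # 5
print(splits[0][1])   # [0, 1]
```

In practice the data would be shuffled (and stratified by true/false splice site) before splitting; averaging the per-fold scores is what makes the reported accuracies consistent.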
|
46
|
Application of Deep Learning Techniques in Diagnosis of Covid-19 (Coronavirus): A Systematic Review. Neural Process Lett 2022; 55:1-53. [PMID: 36158520 PMCID: PMC9483290 DOI: 10.1007/s11063-022-11023-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/29/2022] [Indexed: 01/09/2023]
Abstract
Covid-19 is one of the most severe illnesses of the twenty-first century. It has already endangered the lives of millions of people worldwide due to its acute pulmonary effects. Image-based diagnostic techniques such as X-ray, CT, and ultrasound are commonly employed to obtain a quick and reliable assessment of the clinical condition. Identifying Covid-19 from such clinical scans is exceedingly time-consuming, labor-intensive, and susceptible to human error. As a result, radiography imaging approaches using Deep Learning (DL) are increasingly employed to achieve strong results. Various artificial-intelligence-based systems have been developed for the early prediction of coronavirus from radiography pictures. Specific DL methods such as CNNs and RNNs extract highly discriminative characteristics, primarily in diagnostic imaging, and recent coronavirus studies have applied these techniques extensively to radiography image scans. The disease, as well as the present pandemic, was studied using public and private data. A total of 64 pre-trained and custom DL models, organized in a taxonomy by imaging modality, are selected from the studied articles. The constraints relevant to DL-based techniques include sample selection, network architecture, training with minimally annotated databases, and security issues. The review also covers causal agents, pathophysiology, immunological reactions, and the epidemiology of the illness. DL-based Covid-19 detection systems are the key focus of this review article, which is intended to help accelerate Covid-19 research.
|
47
|
A multi-channel deep convolutional neural network for multi-classifying thyroid diseases. Comput Biol Med 2022; 148:105961. [PMID: 35985185 DOI: 10.1016/j.compbiomed.2022.105961] [Received: 02/12/2022] [Revised: 07/28/2022] [Accepted: 08/06/2022] [Indexed: 11/03/2022]
Abstract
BACKGROUND AND OBJECTIVE Thyroid disease instances have been continuously increasing since the 1990s, and thyroid cancer has become the most rapidly rising disease among all malignancies in recent years. Most existing studies focused on applying deep convolutional neural networks for detecting thyroid cancer. Despite their satisfactory performance on binary classification tasks, limited studies have explored multi-class classification of thyroid disease types, and even less is known about diagnosing the co-existence of different types of thyroid diseases. METHOD This study proposed a novel multi-channel convolutional neural network (CNN) architecture to address the multi-class classification task of thyroid disease. The multi-channel CNN leverages computed tomography characteristics to drive a comprehensive diagnostic decision for the overall thyroid gland, with emphasis on circumstances where diseases co-exist. Moreover, this study also examined alternative strategies to enhance the diagnostic accuracy of CNN models through concatenation of feature maps at different scales. RESULTS Benchmarking experiments demonstrate the improved performance of the proposed multi-channel CNN architecture compared with the standard single-channel CNN architecture. More specifically, the multi-channel CNN achieved an accuracy of 0.909±0.048, precision of 0.944±0.062, recall of 0.896±0.047, specificity of 0.994±0.001, and F1 of 0.917±0.057, in contrast to the single-channel CNN, which obtained 0.902±0.004, 0.892±0.005, 0.909±0.002, 0.993±0.001, and 0.898±0.003, respectively. In addition, the proposed model was evaluated in different gender groups; it reached a diagnostic accuracy of 0.908 for the female group and 0.901 for the male group. CONCLUSION Collectively, the results highlight that the proposed multi-channel CNN generalizes well and has the potential to be deployed to provide computational decision support in clinical settings.
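The idea of concatenating feature maps at different scales can be sketched in a few lines: pool one feature map at several window sizes, flatten each result, and concatenate into a single descriptor. This is a hypothetical numpy illustration of the general technique, not the paper's network; the names `avg_pool2d` and `multi_scale_concat` and the choice of scales are ours.

```python
import numpy as np

def avg_pool2d(fmap, k):
    """Non-overlapping k x k average pooling of a square feature map."""
    h, w = fmap.shape
    return fmap.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def multi_scale_concat(fmap, scales=(1, 2, 4)):
    """Pool one feature map at several scales, flatten, and concatenate."""
    return np.concatenate([avg_pool2d(fmap, k).ravel() for k in scales])

fmap = np.arange(16, dtype=float).reshape(4, 4)
feat = multi_scale_concat(fmap)
print(feat.shape)  # → (21,)  i.e. 16 + 4 + 1 values
```

In a real CNN the concatenation happens on learned feature tensors and feeds a classification head; the principle of merging coarse and fine spatial summaries is the same.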
|
48
|
Study and analysis of different segmentation methods for brain tumor MRI application. MULTIMEDIA TOOLS AND APPLICATIONS 2022; 82:7117-7139. [PMID: 35991584 PMCID: PMC9379244 DOI: 10.1007/s11042-022-13636-y] [Received: 04/22/2021] [Revised: 04/26/2022] [Accepted: 08/01/2022] [Indexed: 06/15/2023]
Abstract
Magnetic Resonance Imaging (MRI) is one of the preferred imaging methods for brain tumor diagnosis, providing detailed information on tumor type, location, size, identification, and detection. Segmentation divides an image into multiple segments; here it describes separating the suspicious region from pre-processed MRI images to produce a simpler image that is more meaningful and easier to examine. Many segmentation methods are embedded in detection devices, and the response of each method is different. This study compares the performance of several image segmentation algorithms for brain tumor diagnosis: Otsu's method, watershed, level set, K-means, HAAR Discrete Wavelet Transform (DWT), and a Convolutional Neural Network (CNN). All of the techniques are simulated in MATLAB using online images from the Brain Tumor Image Segmentation Benchmark (BRATS) dataset-2018. The performance of these methods is analyzed based on response time and measures such as recall, precision, F-measure, and accuracy. The measured accuracy of the Otsu's, watershed, level set, K-means, DWT, and CNN methods is 71.42%, 78.26%, 80.45%, 84.34%, 86.95%, and 91.39%, respectively. The response time of the CNN is 2.519 s in the MATLAB simulation environment for the designed algorithm. The novelty of the work is that the CNN proved to be the best algorithm among all compared methods for brain tumor image segmentation. The simulated and estimated parameters give researchers direction in choosing a specific algorithm for embedded hardware solutions and in developing optimal machine-learning models, as industry is looking for optimal CNN and deep-learning-based hardware models for brain tumor analysis.
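Of the classical methods compared above, Otsu's thresholding is simple enough to state in full: pick the gray level that maximizes the between-class variance of the resulting foreground/background split. The sketch below is a standard numpy implementation of that textbook algorithm (the function name and the 256-bin assumption are ours, not taken from the study's MATLAB code).

```python
import numpy as np

def otsu_threshold(img, nbins=256):
    """Return the gray level maximizing between-class variance (Otsu, 1979)."""
    hist, _ = np.histogram(img.ravel(), bins=nbins, range=(0, nbins))
    p = hist / hist.sum()
    omega = np.cumsum(p)                   # class-0 probability up to level k
    mu = np.cumsum(p * np.arange(nbins))   # cumulative mean up to level k
    mu_t = mu[-1]                          # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    sigma_b = np.nan_to_num(sigma_b)       # undefined at empty classes -> 0
    return int(np.argmax(sigma_b))

# A synthetic bimodal image: the threshold lands between the two modes.
img = np.array([50] * 100 + [200] * 100)
t = otsu_threshold(img)
```

For MRI data this would run on intensity-normalized slices; libraries such as scikit-image provide an equivalent `threshold_otsu`.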
|
49
|
Improved COVID-19 detection with chest x-ray images using deep learning. MULTIMEDIA TOOLS AND APPLICATIONS 2022; 81:37657-37680. [PMID: 35968409 PMCID: PMC9361266 DOI: 10.1007/s11042-022-13509-4] [Received: 03/31/2021] [Revised: 10/18/2021] [Accepted: 07/13/2022] [Indexed: 06/15/2023]
Abstract
The novel coronavirus disease, which originated in Wuhan, developed into a severe public health problem worldwide. The multiplying numbers of COVID carriers and deaths placed immense stress on society and health departments. This stress can be lowered by high-speed diagnosis of the disease, a crucial stride in opposing the deadly virus. Diagnosis consumes a large amount of time, and applications that use medical images such as X-rays or CT scans can speed it up. Hence, this paper aims to create a computer-aided diagnosis system that takes a chest X-ray as input and classifies it into one of three classes: COVID-19, viral pneumonia, and healthy. Since the available set of COVID-19-positive chest X-rays was small, we exploited four pre-trained deep neural networks (DNNs) to find the best for this system. The dataset consisted of 2905 images: 219 COVID-19 cases, 1341 healthy cases, and 1345 viral pneumonia cases. The models were evaluated on 30 images of each class for testing, while the rest were used for training. AlexNet attained an accuracy of 97.6%, with average precision, recall, and F1 score of 0.98, 0.97, and 0.98, respectively.
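The per-class precision, recall, and F1 figures reported for a three-class classifier like this one all derive from a single confusion matrix. A minimal sketch of that computation follows (the function names are ours; real evaluations would typically use `sklearn.metrics`, but the arithmetic is the same).

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_metrics(cm):
    """Precision, recall, and F1 per class, plus overall accuracy."""
    tp = np.diag(cm).astype(float)
    precision = tp / cm.sum(axis=0)   # correct / predicted as that class
    recall = tp / cm.sum(axis=1)      # correct / actually that class
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return precision, recall, f1, accuracy

# Example with classes 0=COVID-19, 1=viral pneumonia, 2=healthy.
cm = confusion_matrix([0, 1, 2, 0], [0, 1, 2, 0], n_classes=3)
precision, recall, f1, accuracy = per_class_metrics(cm)
```

The "average precision, recall, and F1" quoted in the abstract are the unweighted means of these per-class values (macro averaging), assuming the standard convention.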
|
50
|
Classification of breast cancer using a manta-ray foraging optimized transfer learning framework. PeerJ Comput Sci 2022; 8:e1054. [PMID: 36092017 PMCID: PMC9454783 DOI: 10.7717/peerj-cs.1054] [Received: 04/15/2022] [Accepted: 07/07/2022] [Indexed: 06/15/2023]
Abstract
Due to its high prevalence and wide dissemination, breast cancer is a particularly dangerous disease. Breast cancer survival chances can be improved by early detection and diagnosis. For medical image analysts, diagnosis is difficult, time-consuming, routine, and repetitive, and medical image analysis can be a useful method for detecting the disease. Recently, artificial intelligence technology has been utilized to help radiologists identify breast cancer more rapidly and reliably. Convolutional neural networks, among other technologies, are promising medical image recognition and classification tools. This study proposes a framework for automatic and reliable breast cancer classification based on histological and ultrasound data. The system is built on CNNs and employs transfer learning and metaheuristic optimization. The Manta Ray Foraging Optimization (MRFO) approach is deployed to improve the framework's adaptability. Using the Breast Cancer Dataset (two classes) and the Breast Ultrasound Dataset (three classes), eight modern pre-trained CNN architectures are examined to apply the transfer learning technique. The framework uses MRFO to improve the performance of the CNN architectures by optimizing their hyperparameters. Extensive experiments recorded performance parameters including accuracy, AUC, precision, F1-score, sensitivity, dice, recall, IoU, and cosine similarity. The proposed framework scored 97.73% accuracy on histopathological data and 99.01% on ultrasound data. The experimental results show that the proposed framework is superior to other state-of-the-art approaches in the literature.
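MRFO is a population-based metaheuristic in which candidate solutions drift toward the best-known position (chain foraging) and occasionally "somersault" around it. The sketch below is a deliberately simplified version on a toy objective: it keeps only the chain- and somersault-foraging moves and omits the cyclone phase of the original algorithm, so it is our own hedged approximation of the technique, not the paper's optimizer. In the paper's setting the objective `f` would be the validation error of a CNN as a function of its hyperparameters.

```python
import numpy as np

def mrfo_minimize(f, dim, n_agents=20, iters=50, lb=-5.0, ub=5.0, seed=0):
    """Simplified MRFO: chain + somersault foraging only (no cyclone phase)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lb, ub, (n_agents, dim))
    fit = np.apply_along_axis(f, 1, x)
    best, best_fit = x[fit.argmin()].copy(), fit.min()
    history = [best_fit]                      # best-so-far per iteration
    for _ in range(iters):
        # Chain foraging: each agent moves toward the current best.
        r = rng.random((n_agents, dim))
        alpha = 2 * r * np.sqrt(np.abs(np.log(rng.random((n_agents, dim)))))
        x = x + r * (best - x) + alpha * (best - x)
        # Somersault foraging: flip around the best (somersault factor 2).
        r2, r3 = rng.random((n_agents, dim)), rng.random((n_agents, dim))
        x = np.clip(x + 2 * (r2 * best - r3 * x), lb, ub)
        fit = np.apply_along_axis(f, 1, x)
        if fit.min() < best_fit:
            best_fit, best = fit.min(), x[fit.argmin()].copy()
        history.append(best_fit)
    return best, best_fit, history

sphere = lambda v: float(np.sum(v ** 2))      # toy stand-in objective
best, best_fit, hist = mrfo_minimize(sphere, dim=3)
```

For hyperparameter tuning, each coordinate of an agent would be decoded into one hyperparameter (learning rate, batch size, etc.) before evaluating `f`.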
|