1. Deng C, Han Y, Zhao B. High-Performance Visual Tracking With Extreme Learning Machine Framework. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:2781-2792. [PMID: 30624237] [DOI: 10.1109/tcyb.2018.2886580]
Abstract
In real-time applications, a fast and robust visual tracker should generally have the following important properties: 1) feature representation of an object that is not only efficient but also has good discriminative capability and 2) appearance modeling that can quickly adapt to variations of the foreground and background. However, most existing tracking algorithms cannot achieve satisfactory performance in both aspects. To address this issue, in this paper, we advocate a novel and efficient visual tracker that exploits the excellent feature learning and classification capabilities of an emerging learning technique, the extreme learning machine (ELM). The contributions of the proposed work are as follows: 1) motivated by the simplicity and learning ability of the ELM autoencoder (ELM-AE), an ELM-AE-based feature extraction model is presented, which can efficiently provide a compact and discriminative representation of the inputs, and 2) owing to the fast learning speed of an ELM classifier, an ELM-based appearance model is developed for feature classification, able to rapidly distinguish the object of interest from its surroundings. In addition, to cope with visual changes of the target and its background, the online sequential ELM is used to incrementally update the appearance model. Extensive experiments on challenging image sequences demonstrate the effectiveness and robustness of the proposed tracker.
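As a rough illustration of the generic ELM recipe this tracker builds on (a sketch, not the authors' implementation), the code below trains a basic ELM in NumPy: hidden weights are drawn randomly and never trained, and only the output weights are solved in closed form by least squares. Setting the targets equal to the inputs turns it into an ELM autoencoder; the layer sizes and the 0.1 weight scale are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, T, n_hidden=32):
    """Basic ELM: random fixed hidden layer, closed-form output weights."""
    W = 0.1 * rng.standard_normal((X.shape[1], n_hidden))  # random, never trained
    b = 0.1 * rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)              # hidden-layer activations
    beta = np.linalg.pinv(H) @ T        # least-squares readout weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# ELM autoencoder: targets are the inputs themselves; beta then acts as
# a compact learned projection of the data.
X = rng.standard_normal((200, 10))
W, b, beta = elm_train(X, X)
recon_err = np.mean((elm_predict(X, W, b, beta) - X) ** 2)
```

Because the hidden layer is never trained, the whole "training" step is a single pseudoinverse, which is what gives ELM-based trackers their speed.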
2. Ghosh SK, Tripathy RK, Paternina MRA, Arrieta JJ, Zamora-Mendez A, Naik GR. Detection of Atrial Fibrillation from Single Lead ECG Signal Using Multirate Cosine Filter Bank and Deep Neural Network. J Med Syst 2020; 44:114. [PMID: 32388733] [DOI: 10.1007/s10916-020-01565-y]
Abstract
Atrial fibrillation (AF) is a cardiac arrhythmia characterized by the irregular beating of the atria, resulting in the abnormal atrial patterns observed in the electrocardiogram (ECG) signal. Early detection of this pathology is very helpful for minimizing the chances of stroke, other heart-related disorders, and coronary artery diseases. This paper proposes a novel method for the detection of AF pathology based on the analysis of the ECG signal. The method adopts a multi-rate cosine filter bank architecture to evaluate coefficients of the ECG signal at different subbands; in turn, the fractional norm (FN) feature is evaluated from the extracted coefficients at each subband. Then, AF detection is carried out from the FN features using a deep learning approach known as the hierarchical extreme learning machine (H-ELM). The proposed method is evaluated on normal and AF pathological ECG signals from public databases. The experimental results reveal that the proposed multi-rate cosine filter bank based on FN features is effective for the detection of AF pathology, with accuracy, sensitivity, and specificity values of 99.40%, 98.77%, and 100%, respectively. The performance of the proposed diagnostic features of the ECG signal is compared with other existing features for the detection of AF. The low-frequency subband FN features were found to be more significant, with a difference in mean values of 0.69 between the normal and AF classes.
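For readers unfamiliar with the feature, a fractional norm is the usual p-norm formula evaluated with 0 < p < 1. A minimal sketch follows; the paper's exact p and its cosine-filter-bank front end are not reproduced, and the subband arrays below are hypothetical placeholders for real filter-bank coefficients.

```python
import numpy as np

def fractional_norm(x, p=0.5):
    """||x||_p = (sum |x_i|^p)^(1/p); called 'fractional' when 0 < p < 1."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

# One FN value per subband; these arrays merely stand in for the
# coefficients a multirate cosine filter bank would produce.
subbands = [np.array([0.1, -0.4, 0.2]), np.array([0.05, 0.02, -0.01])]
fn_features = [fractional_norm(s) for s in subbands]
```

Each subband thus collapses to a single scalar, so a signal decomposed into k subbands yields a k-dimensional feature vector for the classifier.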
Affiliation(s)
- S K Ghosh, MLR Institute of Technology, Hyderabad, India
- R K Tripathy, Birla Institute of Technology and Science Pilani, Hyderabad, India
- Mario R A Paternina, National Autonomous University of Mexico (UNAM), Mexico City 04510, Mexico
- Ganesh R Naik, Biomedical Engineering and Neuromorphic Systems (BENS) Research Group, MARCS Institute, Western Sydney University, Penrith, New South Wales, Australia
3. Wang B, Zhao W, Gao P, Zhang Y, Wang Z. Crack Damage Detection Method via Multiple Visual Features and Efficient Multi-Task Learning Model. SENSORS 2018; 18:s18061796. [PMID: 29865256] [PMCID: PMC6021921] [DOI: 10.3390/s18061796]
Abstract
This paper proposes an effective and efficient model for concrete crack detection. The presented work consists of two modules: multi-view image feature extraction and multi-task crack region detection. Specifically, multiple visual features (such as texture, edge, etc.) of image regions are calculated, which can suppress various background noises (such as illumination, pockmark, stripe, blurring, etc.). With the computed multiple visual features, a novel crack region detector is advocated using a multi-task learning framework, which involves restraining the variability of different crack region features and emphasizing the separability between crack region features and complex background ones. Furthermore, the extreme learning machine is utilized to construct this multi-task learning model, leading to high computing efficiency and good generalization. Experimental results on practical concrete images demonstrate that the developed algorithm achieves favorable crack detection performance compared with traditional crack detectors.
Affiliation(s)
- Baoxian Wang, Structure Health Monitoring and Control Institute, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
- Weigang Zhao, Structure Health Monitoring and Control Institute, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
- Po Gao, School of Technology, Beijing Forestry University, Beijing 100083, China
- Yufeng Zhang, School of Electrical and Electronic Engineering, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
- Zhe Wang, School of Electrical and Electronic Engineering, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
4. Barzegar R, Moghaddam AA, Deo R, Fijani E, Tziritis E. Mapping groundwater contamination risk of multiple aquifers using multi-model ensemble of machine learning algorithms. THE SCIENCE OF THE TOTAL ENVIRONMENT 2018; 621:697-712. [PMID: 29197289] [DOI: 10.1016/j.scitotenv.2017.11.185]
Abstract
Constructing accurate and reliable groundwater risk maps provides scientifically prudent and strategic measures for the protection and management of groundwater. The objectives of this paper are to design and validate machine learning-based risk maps using ensemble-based modelling with an integrative approach. We employ extreme learning machines (ELM), multivariate adaptive regression splines (MARS), M5 Tree and support vector regression (SVR), applied to multiple aquifer systems (e.g., unconfined, semi-confined and confined) in the Marand plain, North West Iran, to encapsulate the merits of the individual learning algorithms in a final committee-based ANN model. The DRASTIC Vulnerability Index (VI) ranged from 56.7 to 128.1, categorized into no-risk, low and moderate vulnerability thresholds. The correlation coefficient (r) and Willmott's Index (d) between NO3 concentrations and VI were 0.64 and 0.314, respectively. To improve on the original DRASTIC method, the vulnerability indices were adjusted by NO3 concentrations, termed the groundwater contamination risk (GCR). The seven DRASTIC parameters utilized as the model inputs and the GCR values utilized as the outputs of the individual machine learning models were fed into the fully optimized committee-based ANN predictive model. The correlation indicators demonstrated that the ELM and SVR models outperformed the MARS and M5 Tree models, by virtue of larger d and r values. Subsequently, the r and d metrics for the ANN committee-based multi-model in the testing phase were 0.8889 and 0.7913, respectively, revealing the superiority of the integrated (or ensemble) machine learning models compared with the original DRASTIC approach. The newly designed multi-model ensemble-based approach can be considered a pragmatic step for mapping groundwater contamination risks of multiple aquifer systems with multi-model techniques, given the high accuracy of the ANN committee-based model.
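The committee idea, abstracted from the paper: each member model's predictions become inputs to a trainable combiner. The paper's combiner is an ANN; the sketch below substitutes a linear least-squares combiner over synthetic stand-in predictions (the member models and noise levels are invented for the demo), which already shows why a fitted committee can do no worse in-sample than its best member.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for four member models (think ELM, MARS, M5 Tree,
# SVR): each "predicts" the target with a different noise level.
y = rng.uniform(56.7, 128.1, size=100)                       # GCR-like targets
members = np.column_stack([y + rng.normal(0, s, 100) for s in (3, 8, 10, 4)])

# Linear committee: one learned weight per member plus a bias term.
A = np.column_stack([members, np.ones(len(y))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
committee = A @ w

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

best_member = min(rmse(members[:, i], y) for i in range(4))
```

Since any single member is itself expressible by the combiner (unit weight on that member, zeros elsewhere), the least-squares fit cannot have a larger training error than the best individual model; the ensemble gain on held-out data is what the paper evaluates.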
Affiliation(s)
- Rahim Barzegar, Department of Earth Sciences, Faculty of Natural Sciences, University of Tabriz, Tabriz, Iran
- Ravinesh Deo, School of Agricultural Computational and Environmental Sciences, Institute of Agriculture and Environment (IAg&E), University of Southern Queensland, Springfield, Australia
- Elham Fijani, School of Geology, College of Science, University of Tehran, Tehran, Iran
- Evangelos Tziritis, Hellenic Agricultural Organization, Soil and Water Resources Institute, 57400 Sindos, Greece
5. Deep Spatial-Temporal Joint Feature Representation for Video Object Detection. SENSORS 2018; 18:s18030774. [PMID: 29510529] [PMCID: PMC5876594] [DOI: 10.3390/s18030774]
Abstract
With the development of deep neural networks, many object detection frameworks have shown great success in the fields of smart surveillance, self-driving cars, and facial recognition. However, the data sources are usually videos, and the object detection frameworks are mostly established on still images and only use the spatial information, which means that the feature consistency cannot be ensured because the training procedure loses temporal information. To address these problems, we propose a single, fully-convolutional neural network-based object detection framework that involves temporal information by using Siamese networks. In the training procedure, first, the prediction network combines the multiscale feature map to handle objects of various sizes. Second, we introduce a correlation loss by using the Siamese network, which provides neighboring frame features. This correlation loss represents object co-occurrences across time to aid the consistent feature generation. Since the correlation loss should use the information of the track ID and detection label, our video object detection network has been evaluated on the large-scale ImageNet VID dataset where it achieves a 69.5% mean average precision (mAP).
6. Multi-View Structural Local Subspace Tracking. SENSORS 2017; 17:s17040666. [PMID: 28333088] [PMCID: PMC5419779] [DOI: 10.3390/s17040666]
Abstract
In this paper, we propose a multi-view structural local subspace tracking algorithm based on sparse representation. We approximate the optimal state from three views: (1) the template view; (2) the PCA (principal component analysis) basis view; and (3) the target candidate view. We then propose a unified objective function to integrate these three view problems together. The proposed model not only exploits the intrinsic relationship among target candidates and their local patches, but also takes advantage of both sparse representation and incremental subspace learning. The optimization problem can be solved efficiently by a customized APG (accelerated proximal gradient) method in an iterative manner. Then, we propose an alignment-weighting average method to obtain the optimal state of the target. Furthermore, an occlusion detection strategy is proposed to accurately update the model. Both qualitative and quantitative evaluations demonstrate that our tracker outperforms state-of-the-art trackers in a wide range of tracking scenarios.
7. Visual Object Tracking Based on Cross-Modality Gaussian-Bernoulli Deep Boltzmann Machines with RGB-D Sensors. SENSORS 2017; 17:s17010121. [PMID: 28075373] [PMCID: PMC5298694] [DOI: 10.3390/s17010121]
Abstract
Visual object tracking technology is one of the key issues in computer vision. In this paper, we propose a visual object tracking algorithm based on cross-modality deep feature learning using Gaussian-Bernoulli deep Boltzmann machines (DBM) with RGB-D sensors. First, a cross-modality feature learning network based on a Gaussian-Bernoulli DBM is constructed, which can extract cross-modality features of the samples in RGB-D video data. Second, the cross-modality features of the samples are input into the logistic regression classifier, and the observation likelihood model is established according to the confidence score of the classifier. Finally, the object tracking results over RGB-D data are obtained using a Bayesian maximum a posteriori (MAP) probability estimation algorithm. The experimental results show that the proposed method has strong robustness to abnormal changes (e.g., occlusion, rotation, illumination change, etc.). The algorithm can steadily track multiple targets and has higher accuracy.
8. Real-Time Tracking Framework with Adaptive Features and Constrained Labels. SENSORS 2016; 16:s16091449. [PMID: 27618052] [PMCID: PMC5038727] [DOI: 10.3390/s16091449]
Abstract
This paper proposes a novel tracking framework with adaptive features and constrained labels (AFCL) to handle illumination variation, occlusion and appearance changes caused by variation of position. A novel ensemble classifier, incorporating the Forward-Backward error and a location constraint, is applied to obtain the precise coordinates of the promising bounding boxes. The Forward-Backward error can enhance the adaptation and accuracy of the binary features, whereas the location constraint can overcome label noise to a certain degree. We use a combiner that evaluates the online templates and the outputs of the classifier to accommodate complex situations. Evaluation on a widely used tracking benchmark shows that the proposed framework can significantly improve the tracking accuracy while reducing the processing time. The proposed framework has been tested and implemented on an embedded system using TMS320C6416 and Cyclone III kernel processors. The results show that satisfactory performance can be achieved.
9. Gao C, Shi H, Yu JG, Sang N. Enhancement of ELDA Tracker Based on CNN Features and Adaptive Model Update. SENSORS 2016; 16:s16040545. [PMID: 27092505] [PMCID: PMC4851059] [DOI: 10.3390/s16040545]
Abstract
Appearance representation and the observation model are the most important components in designing a robust visual tracking algorithm for video-based sensors. Additionally, the exemplar-based linear discriminant analysis (ELDA) model has shown good performance in object tracking. Based on that, we improve the ELDA tracking algorithm by deep convolutional neural network (CNN) features and adaptive model update. Deep CNN features have been successfully used in various computer vision tasks. Extracting CNN features on all of the candidate windows is time consuming. To address this problem, a two-step CNN feature extraction method is proposed by separately computing convolutional layers and fully-connected layers. Due to the strong discriminative ability of CNN features and the exemplar-based model, we update both object and background models to improve their adaptivity and to deal with the tradeoff between discriminative ability and adaptivity. An object updating method is proposed to select the “good” models (detectors), which are quite discriminative and uncorrelated to other selected models. Meanwhile, we build the background model as a Gaussian mixture model (GMM) to adapt to complex scenes, which is initialized offline and updated online. The proposed tracker is evaluated on a benchmark dataset of 50 video sequences with various challenges. It achieves the best overall performance among the compared state-of-the-art trackers, which demonstrates the effectiveness and robustness of our tracking algorithm.
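The two-step split described above (convolutional layers computed once per frame, fully-connected layers applied per candidate window) can be mimicked without any deep learning library. Everything below, from the fake downsampling "feature map" to the weight matrix, is a hypothetical stand-in for a real pretrained CNN; only the computation-sharing pattern is the point.

```python
import numpy as np

rng = np.random.default_rng(2)

def conv_stage(frame):
    """Stand-in for the shared convolutional layers: run ONCE per frame.
    Here just a 4x downsampling; a real tracker would use CNN conv maps."""
    return frame[::4, ::4]

def fc_stage(patch, W):
    """Stand-in for the fully-connected layers: run per candidate window
    on a small crop of the shared feature map."""
    return np.tanh(patch.ravel() @ W)

frame = rng.standard_normal((64, 64))
fmap = conv_stage(frame)                    # step 1: computed once, reused
W_fc = rng.standard_normal((16, 8))         # hypothetical fc weights
candidates = [(0, 0), (4, 4), (8, 8)]       # window corners on the feature map
feats = [fc_stage(fmap[r:r + 4, c:c + 4], W_fc) for r, c in candidates]
```

The saving is that the expensive stage runs once regardless of how many candidate windows are scored, while only the cheap per-window stage scales with the number of candidates.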
Affiliation(s)
- Changxin Gao, National Key Laboratory of Science and Technology on Multispectral Information Processing, School of Automation, Huazhong University of Science and Technology, Wuhan 430074, China
- Huizhang Shi, National Key Laboratory of Science and Technology on Multispectral Information Processing, School of Automation, Huazhong University of Science and Technology, Wuhan 430074, China
- Jin-Gang Yu, Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE 68503, USA
- Nong Sang, National Key Laboratory of Science and Technology on Multispectral Information Processing, School of Automation, Huazhong University of Science and Technology, Wuhan 430074, China