1. Discriminative Deep Non-Linear Dictionary Learning for Visual Object Tracking. Neural Process Lett 2022. DOI: 10.1007/s11063-022-11025-y.
2. He X, Chen CYC. Exploring reliable visual tracking via target embedding network. Knowl Based Syst 2022. DOI: 10.1016/j.knosys.2022.108584.
3.
4. Sheng B, Li P, Zhang Y, Mao L, Chen CLP. GreenSea: Visual Soccer Analysis Using Broad Learning System. IEEE Trans Cybern 2021; 51:1463-1477. PMID: 32452777. DOI: 10.1109/tcyb.2020.2988792.
Abstract
Modern soccer increasingly places trust in visual analysis and statistics rather than relying only on human experience. However, soccer is an extraordinarily complex game for which no widely accepted quantitative analysis methods exist, and statistics collection and visualization are time consuming, resulting in numerous adjustments. To tackle this issue, we developed GreenSea, a visual-based assessment system designed for soccer game analysis, tactics, and training. The system uses a broad learning system (BLS) to train the model, avoiding the time-consuming training from which traditional deep learning may suffer. Users can view a soccer game from multiple perspectives, with visual summarization of essential statistics through advanced visualization and animation. A marking system trained by BLS is designed to perform quantitative analysis, and a novel recurrent discriminative BLS (RDBLS) is proposed to carry out long-term tracking. In our RDBLS, the structure is adjusted for better performance on the binary classification problem of the discriminative model. Several experiments verify that the proposed RDBLS model outperforms the standard BLS and other methods. Two studies were conducted to verify the effectiveness of GreenSea: the first examined how GreenSea helps a youth-training coach assess each trainee's performance to select the most promising players; the second examined how GreenSea helped the U20 Shanghai soccer team coaching staff analyze games and devise tactics during the 13th National Games. Our studies demonstrate the usability of GreenSea and the value of the system to both amateur and expert users.
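The closed-form training that gives BLS its speed advantage over deep learning can be illustrated with a minimal sketch; the node counts, random mappings, and ridge parameter below are illustrative assumptions, not the configuration used in GreenSea:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: 200 samples, 10 raw features, 3 classes (one-hot labels).
X = rng.standard_normal((200, 10))
Y = np.eye(3)[rng.integers(0, 3, size=200)]

# Feature nodes: random linear maps of the input.
Z = X @ rng.standard_normal((10, 20))
# Enhancement nodes: a nonlinear expansion of the feature nodes.
H = np.tanh(Z @ rng.standard_normal((20, 30)))
# "Broad" expansion: concatenate feature and enhancement nodes.
A = np.hstack([Z, H])

# Output weights in closed form via ridge regression -- no backpropagation,
# which is what makes BLS training fast.
lam = 1e-2
W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)

train_acc = ((A @ W).argmax(axis=1) == Y.argmax(axis=1)).mean()
```

Adding nodes only widens `A`, so the network can be grown incrementally while the output weights are re-solved in closed form.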
5.
6. Zhou T, Zhang C, Gong C, Bhaskar H, Yang J. Multiview Latent Space Learning With Feature Redundancy Minimization. IEEE Trans Cybern 2020; 50:1655-1668. PMID: 30571651. DOI: 10.1109/tcyb.2018.2883673.
Abstract
Multiview learning has received extensive research interest and has demonstrated promising results in recent years. Despite the progress made, there are two significant challenges within multiview learning. First, some existing methods directly use the original features to reconstruct data points without considering feature redundancy. Second, existing methods cannot fully exploit the complementary information across multiple views while preserving the view-specific properties, which degrades learning performance. To address these issues, we propose a novel multiview latent space learning framework with feature redundancy minimization. We aim to learn a latent space that mitigates feature redundancy and to use the learned representation to reconstruct every original data point. More specifically, we first project the original features from multiple views onto a latent space, and then learn a shared dictionary and view-specific dictionaries to, respectively, exploit the correlations across multiple views and preserve the view-specific properties. Furthermore, the Hilbert-Schmidt independence criterion (HSIC) is adopted as a diversity constraint to explore the complementarity of the multiview representations, which further ensures diversity across views and preserves the local structure of the data in each view. Experimental results on six public datasets demonstrate the effectiveness of our multiview learning approach against other state-of-the-art methods.
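The HSIC diversity constraint mentioned above has a simple empirical estimator, tr(KHLH)/(n-1)^2; a small sketch with Gaussian kernels (the bandwidth and the toy two-view data are illustrative choices, not the paper's setup):

```python
import numpy as np

def hsic(X, Y, sigma=1.0):
    """Empirical HSIC with Gaussian kernels: tr(K H L H) / (n - 1)^2."""
    n = X.shape[0]
    def gram(A):
        sq = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2 * sigma ** 2))
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    return np.trace(gram(X) @ H @ gram(Y) @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 2))
B = rng.standard_normal((100, 2))            # independent of A
dep = hsic(A, A + 0.05 * rng.standard_normal(A.shape))  # near-duplicate view
indep = hsic(A, B)
```

Dependent views score much higher than independent ones, so *minimizing* HSIC between view-specific representations pushes them toward complementary, non-redundant information.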
7. Fan J, Cao X, Wang Q, Yap PT, Shen D. Adversarial learning for mono- or multi-modal registration. Med Image Anal 2019; 58:101545. PMID: 31557633. PMCID: PMC7455790. DOI: 10.1016/j.media.2019.101545.
Abstract
This paper introduces an unsupervised adversarial similarity network for image registration. Unlike existing deep learning registration methods, our approach can train a deformable registration network without the need for ground-truth deformations or specific similarity metrics. We connect a registration network and a discrimination network with a deformable transformation layer. The registration network is trained with feedback from the discrimination network, which is designed to judge whether a pair of registered images is sufficiently similar. Through adversarial training, the registration network learns to predict deformations accurate enough to fool the discrimination network. The proposed method is thus a general registration framework that can be applied to both mono-modal and multi-modal image registration. Experiments on four brain MRI datasets and a multi-modal pelvic image dataset indicate that our method yields promising registration performance in terms of accuracy, efficiency, and generalizability compared with state-of-the-art registration methods, including those based on deep learning.
Affiliation(s)
- Jingfan Fan: Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Xiaohuan Cao: Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Qian Wang: Institute for Medical Imaging Technology, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Pew-Thian Yap: Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Dinggang Shen: Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
8.
Abstract
Object tracking has long been an essential research topic in computer vision, and the model update mechanism is a key component whose robustness strongly influences the tracking quality on a sequence. This review analyses recent tracking model update strategies: we first discuss when the target model should be updated, then give a detailed discussion of target-model update strategies within the mainstream tracking frameworks, and finally discuss background update frameworks. We list the experimental performance of recent trackers on specific sequences, discuss their strengths and failure cases, and draw conclusions from those results. A crucial point is that the design of a proper background model, together with its update strategy, ought to be taken into consideration. A cascade update of the template corresponding to each deep network layer, weighted by each layer's contribution to target recognition, can also help locate the target more accurately, and target saliency information can be utilized for state estimation.
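The simplest strategy in the family this review surveys, a confidence-gated running average of the target template, can be sketched as follows (the gating rule, learning rate, and threshold are illustrative, not taken from any particular tracker):

```python
import numpy as np

def update_template(template, observation, alpha=0.05,
                    confidence=1.0, threshold=0.5):
    """Running-average template update, gated by tracking confidence.

    Freezing the model on low-confidence frames is a common guard
    against drift caused by occlusion or tracking failure.
    """
    if confidence < threshold:
        return template                      # skip update when unreliable
    return (1 - alpha) * template + alpha * observation

T = np.ones((4, 4))                          # current target template
obs = np.zeros((4, 4))                       # appearance in the new frame
T1 = update_template(T, obs, alpha=0.1, confidence=0.9)  # update applied
T2 = update_template(T, obs, alpha=0.1, confidence=0.2)  # update skipped
```

The update occasion (the `confidence` gate) and the update rule (the blend) are exactly the two axes along which the surveyed strategies differ.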
9. Zhou T, Liu M, Thung KH, Shen D. Latent Representation Learning for Alzheimer's Disease Diagnosis With Incomplete Multi-Modality Neuroimaging and Genetic Data. IEEE Trans Med Imaging 2019; 38:2411-2422. PMID: 31021792. PMCID: PMC8034601. DOI: 10.1109/tmi.2019.2913158.
Abstract
The fusion of complementary information contained in multi-modality data [e.g., magnetic resonance imaging (MRI), positron emission tomography (PET), and genetic data] has advanced the progress of automated Alzheimer's disease (AD) diagnosis. However, multi-modality based AD diagnostic models are often hindered by missing data, i.e., not all subjects have complete multi-modality data. One simple solution used by many previous studies is to discard samples with missing modalities. However, this significantly reduces the number of training samples, thus leading to a suboptimal classification model. Furthermore, when building the classification model, most existing methods simply concatenate features from different modalities into a single feature vector without considering their underlying associations. As features from different modalities are often closely related (e.g., MRI and PET features are extracted from the same brain region), utilizing their inter-modality associations may improve the robustness of the diagnostic model. To this end, we propose a novel latent representation learning method for multi-modality based AD diagnosis. Specifically, we use all the available samples (including samples with incomplete modality data) to learn a latent representation space. Within this space, we not only use samples with complete multi-modality data to learn a common latent representation, but also use samples with incomplete multi-modality data to learn independent modality-specific latent representations. We then project the latent representations to the label space for AD diagnosis. We perform experiments on 737 subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, and the experimental results verify the effectiveness of our proposed method.
Affiliation(s)
- Tao Zhou: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Inception Institute of Artificial Intelligence, Abu Dhabi 51133, United Arab Emirates
- Mingxia Liu: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Kim-Han Thung: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Dinggang Shen: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
10. Chen G, Dong B, Zhang Y, Lin W, Shen D, Yap PT. XQ-SR: Joint x-q space super-resolution with application to infant diffusion MRI. Med Image Anal 2019; 57:44-55. PMID: 31279215. PMCID: PMC6764426. DOI: 10.1016/j.media.2019.06.010.
Abstract
Diffusion MRI (DMRI) is a powerful tool for studying early brain development and disorders. However, the typically low spatio-angular resolution of DMRI diminishes structural details and limits quantitative analysis to simple diffusion models. This problem is aggravated for infant DMRI since (i) the infant brain is significantly smaller than that of an adult, demanding higher spatial resolution to capture subtle structures; and (ii) the typically limited scan time of unsedated infants poses significant challenges to DMRI acquisition with high spatio-angular resolution. Post-acquisition super-resolution (SR) is an important alternative for increasing the resolution of DMRI data without prolonging acquisition times. However, most existing methods focus on the SR of only either the spatial domain (x-space) or the diffusion wavevector domain (q-space). For more effective resolution enhancement, we propose a framework for joint SR in both spatial and wavevector domains. More specifically, we first establish the signal relationships in x-q space using a robust neighborhood matching technique. We then harness the signal relationships to regularize the ill-posed inverse problem associated with the recovery of high-resolution data from their low-resolution counterpart. Extensive experiments on synthetic, adult, and infant DMRI data demonstrate that our method is able to recover high-resolution DMRI data with remarkably improved quality.
Affiliation(s)
- Geng Chen: Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, NC, USA
- Bin Dong: Beijing International Center for Mathematical Research, Peking University, Beijing, China
- Yong Zhang: Vancouver Research Center, Huawei, Burnaby, Canada
- Weili Lin: Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, NC, USA
- Dinggang Shen: Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, NC, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
- Pew-Thian Yap: Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, NC, USA
11. Zhang Y, Nam CS, Zhou G, Jin J, Wang X, Cichocki A. Temporally Constrained Sparse Group Spatial Patterns for Motor Imagery BCI. IEEE Trans Cybern 2019; 49:3322-3332. PMID: 29994667. DOI: 10.1109/tcyb.2018.2841847.
Abstract
Common spatial pattern (CSP)-based spatial filtering is the most popular approach to electroencephalogram (EEG) feature extraction for motor imagery (MI) classification in brain-computer interface (BCI) applications. The effectiveness of CSP is highly affected by the frequency band and time window of the EEG segments. Although numerous algorithms have been designed to optimize the spectral bands of CSP, most of them select the time window heuristically. This is likely to result in suboptimal feature extraction, since the time period in which the brain's response to the mental task occurs may not be accurately detected. In this paper, we propose a novel algorithm, temporally constrained sparse group spatial pattern (TSGSP), for the simultaneous optimization of filter bands and time windows within CSP to further boost the classification accuracy of MI EEG. Specifically, spectrum-specific signals are first derived by bandpass filtering the raw EEG data at a set of overlapping filter bands. Each spectrum-specific signal is further segmented into multiple subseries using a sliding-window approach. We then devise a joint sparse optimization of filter bands and time windows with a temporal smoothness constraint to extract robust CSP features under a multitask learning framework. A linear support vector machine classifier is trained on the optimized EEG features to accurately identify the MI tasks. An experimental study on three public EEG datasets (BCI Competition III dataset IIIa, BCI Competition IV dataset IIa, and BCI Competition IV dataset IIb) validates the effectiveness of TSGSP in comparison with several competing methods. The superior classification performance (average accuracies of 88.5%, 83.3%, and 84.3% on the three datasets, respectively) confirms that the proposed algorithm is a promising candidate for improving the performance of MI-based BCIs.
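The CSP step that TSGSP builds on reduces to whitening the composite covariance and then diagonalizing one class covariance; a minimal sketch on synthetic two-class signals (the toy data, channel counts, and filter counts are illustrative, not the paper's pipeline):

```python
import numpy as np

def csp_filters(trials_a, trials_b, n_filters=1):
    """Common spatial patterns via whitening + eigendecomposition.

    trials_*: (n_trials, n_channels, n_samples) arrays, assumed already
    bandpass-filtered and windowed (the steps TSGSP optimizes jointly).
    """
    def avg_cov(trials):
        return np.mean([t @ t.T / np.trace(t @ t.T) for t in trials], axis=0)

    Ca, Cb = avg_cov(trials_a), avg_cov(trials_b)
    # Whiten the composite covariance Ca + Cb ...
    d, U = np.linalg.eigh(Ca + Cb)
    P = U @ np.diag(1.0 / np.sqrt(d)) @ U.T
    # ... then diagonalize the whitened class-A covariance; eigenvalues near
    # 1 favor class A, eigenvalues near 0 favor class B.
    d2, V = np.linalg.eigh(P @ Ca @ P.T)
    W_full = V.T @ P                          # rows are spatial filters
    idx = np.argsort(d2)
    pick = np.r_[idx[:n_filters], idx[-n_filters:]]
    return W_full[pick]

rng = np.random.default_rng(0)

def make_class(strong_channel, n_trials=20, n_channels=4, n_samples=256):
    t = rng.standard_normal((n_trials, n_channels, n_samples))
    t[:, strong_channel, :] *= 5.0            # this class is strong here
    return t

W = csp_filters(make_class(0), make_class(3))
```

Log-variances of the filtered signals then serve as features for the downstream classifier; TSGSP's contribution is selecting which band/window versions of this computation to trust, sparsely and jointly.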
12. Fan J, Yang J, Wang Y, Yang S, Ai D, Huang Y, Song H, Wang Y, Shen D. Deep feature descriptor based hierarchical dense matching for X-ray angiographic images. Comput Methods Programs Biomed 2019; 175:233-242. PMID: 31104711. DOI: 10.1016/j.cmpb.2019.04.006.
Abstract
Background and Objective: X-ray angiography, a powerful technique for blood vessel visualization, is widely used for interventional diagnosis of coronary artery disease because of its fast imaging speed and perspective inspection ability. Matching feature points in angiographic images is a considerably challenging task due to repetitive weak-textured regions.
Methods: In this paper, we propose an angiographic image matching method based on a hierarchical dense matching framework, in which a novel deep feature descriptor is designed to compute multilevel correlation maps. In particular, the deep feature descriptor is computed by a deep learning model specifically designed and trained for angiographic images, making the correlation maps more distinctive for corresponding feature points in different angiographic images. Moreover, point correspondences are hierarchically extracted from the multilevel correlation maps with the highest similarity responses, which is relatively robust and accurate. To overcome the lack of training samples, the convolutional neural network (designed for the deep feature descriptor) is initially trained on samples from natural images and then fine-tuned on manually annotated angiographic images. Finally, a dense matching completion method, based on the distance between deep feature descriptors, is proposed to generate dense matches between images.
Results: The proposed method has been evaluated on the number and accuracy of extracted matches and on the performance of subtraction images. Experiments on a variety of angiographic images show promising matching accuracy compared with state-of-the-art methods.
Conclusions: The proposed angiographic image matching method is shown to be accurate and effective for feature matching in angiographic images, and further achieves good performance in image subtraction.
Affiliation(s)
- Jingfan Fan: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China; Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Jian Yang: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Yachen Wang: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Siyuan Yang: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Danni Ai: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Yong Huang: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Hong Song: School of Software, Beijing Institute of Technology, Beijing 100081, China
- Yongtian Wang: Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Dinggang Shen: Department of Radiology and BRIC, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
13. Subspace structural constraint-based discriminative feature learning via nonnegative low rank representation. PLoS One 2019; 14:e0215450. PMID: 31063497. PMCID: PMC6504107. DOI: 10.1371/journal.pone.0215450.
Abstract
Feature subspace learning plays a significant role in pattern recognition, and many efforts have been made to generate increasingly discriminative learning models. Recently, several discriminative feature learning methods based on a representation model have been proposed; they have attracted considerable attention and achieved success in practical applications. Nevertheless, these methods construct the learning model simply from the class labels of the training instances and fail to consider the essential subspace structural information hidden in the data. In this paper, we propose a robust feature subspace learning approach based on a low-rank representation. In our approach, the low-rank representation coefficients are used as weights to construct the constraint term for feature learning, which introduces a subspace structural similarity constraint into the proposed learning model, facilitating data adaptation and robustness. Moreover, by placing subspace learning and low-rank representation into a unified framework, the two can benefit each other during the iteration process to reach an overall optimum. To achieve extra discrimination, linear regression is also incorporated into our model to enforce the projected features to lie close to their label-based centers. Furthermore, an iterative numerical scheme is designed to solve our proposed objective function and ensure convergence. Extensive experimental results obtained on several public image datasets demonstrate the advantages and effectiveness of our novel approach compared with existing methods.
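Iterative solvers for low-rank representation models like this one typically rely on singular value thresholding, the proximal operator of the nuclear norm; a minimal sketch (the surrounding ALM/ADMM machinery is omitted, and the threshold is an illustrative choice):

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*,
    the core step of ALM/ADMM solvers for low-rank representation."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = np.maximum(s - tau, 0.0)             # shrink singular values
    return U @ np.diag(s) @ Vt, int((s > 0).sum())

rng = np.random.default_rng(0)
# A rank-2 matrix corrupted by small dense noise.
low_rank = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
M = low_rank + 0.01 * rng.standard_normal((50, 40))
X, rank = svt(M, tau=1.0)                    # noise singular values are cut off
```

The thresholding suppresses the small singular values contributed by noise while keeping the dominant structure, which is why the recovered representation is low rank.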
14. Han Y, Deng C, Zhao B, Tao D. State-aware Anti-drift Object Tracking. IEEE Trans Image Process 2019; 28:4075-4086. PMID: 30892207. DOI: 10.1109/tip.2019.2905984.
Abstract
Correlation filter (CF)-based trackers have attracted increasing attention in the visual tracking field due to their superior performance on several datasets while maintaining high running speed. For each frame, an ideal filter is trained to discriminate the target from its surrounding background. Considering that the target undergoes external and internal interference during tracking, the trained tracker should not only be able to judge the current state when failure occurs, but also resist the model drift caused by challenging distractions. To this end, we present a State-aware Anti-drift Tracker (SAT), which jointly models discrimination and reliability information in filter learning. Specifically, global context patches are incorporated into the filter training stage to better distinguish the target from the background. Meanwhile, a color-based reliability mask is learned to encourage the filter to focus on regions more suitable for tracking. We show that the proposed optimization problem can be solved efficiently using the Alternating Direction Method of Multipliers and carried out fully in the Fourier domain. Furthermore, a kurtosis-based updating scheme is advocated to reveal the tracking condition and guarantee high-confidence template updating. Extensive experiments on the OTB-100 and UAV-20L datasets compare the SAT tracker with other relevant state-of-the-art methods; both quantitative and qualitative evaluations demonstrate the effectiveness and robustness of the proposed work.
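The closed-form, Fourier-domain filter training that underlies CF trackers can be sketched with a MOSSE-style single-channel baseline; this is a generic illustration of the principle, not the SAT tracker's full model with context patches and reliability masks:

```python
import numpy as np

def train_filter(patches, target_response, lam=1e-2):
    """MOSSE-style correlation filter: per-frequency ridge regression,
    H = sum(G * conj(F)) / (sum(F * conj(F)) + lam), solved in closed form."""
    G = np.fft.fft2(target_response)
    num = np.zeros_like(G)
    den = np.full_like(G, lam)               # lam regularizes each frequency
    for p in patches:
        F = np.fft.fft2(p)
        num += G * np.conj(F)
        den += F * np.conj(F)
    return num / den

def respond(H, patch):
    """Correlation response; its peak gives the estimated target location."""
    return np.real(np.fft.ifft2(H * np.fft.fft2(patch)))

rng = np.random.default_rng(0)
size = 32
# Desired response: a sharp Gaussian peak at the patch centre.
yy, xx = np.mgrid[:size, :size]
g = np.exp(-((yy - size // 2) ** 2 + (xx - size // 2) ** 2) / (2 * 2.0 ** 2))

patch = rng.standard_normal((size, size))
H = train_filter([patch], g)
peak = np.unravel_index(respond(H, patch).argmax(), (size, size))
```

Because the solve and the evaluation are both element-wise in the Fourier domain, training and detection run in O(n log n) per frame, which is the source of CF trackers' speed.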