1. Complementary Segmentation of Primary Video Objects with Reversible Flows. Appl Sci (Basel) 2022. [DOI: 10.3390/app12157781]
Abstract
Segmenting primary objects in a video is an important yet challenging problem in intelligent video surveillance, as it exhibits various levels of foreground/background ambiguity. To reduce such ambiguity, we propose a novel formulation that exploits foreground and background context as well as their complementary constraint. Under this formulation, a unified objective function is further defined to encode each cue. For implementation, we design a complementary segmentation network (CSNet) with two separate branches, which can simultaneously encode the foreground and background information along with joint spatial constraints. The CSNet is trained end-to-end on massive images with manually annotated salient objects. By applying CSNet to each video frame, the spatial foreground and background maps can be initialized. To enforce temporal consistency effectively and efficiently, we divide each frame into superpixels and construct a neighborhood reversible flow that reflects the most reliable temporal correspondences between superpixels in far-away frames. With such a flow, the initialized foregroundness and backgroundness can be propagated along the temporal dimension so that primary video objects gradually pop out and distractors are well suppressed. Extensive experimental results on three video datasets show that the proposed approach achieves impressive performance in comparison with 22 state-of-the-art models.
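The temporal-propagation step described in this abstract can be sketched as a mutual nearest-neighbour matching between per-superpixel descriptors of two frames; the function names, the descriptor inputs, and the simple averaging rule are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def reversible_links(feat_a, feat_b, k=2):
    """Keep a correspondence (i, j) only if j is among the k nearest
    neighbours of superpixel i in frame B AND i is among the k nearest
    neighbours of j in frame A ("neighbourhood reversibility")."""
    d = np.linalg.norm(feat_a[:, None, :] - feat_b[None, :, :], axis=2)
    nn_ab = np.argsort(d, axis=1)[:, :k]    # for each i: k nearest j
    nn_ba = np.argsort(d, axis=0)[:k, :].T  # for each j: k nearest i
    return [(i, j) for i in range(len(feat_a)) for j in nn_ab[i]
            if i in nn_ba[j]]

def propagate(scores_a, links, n_b):
    """Initialise frame B's foregroundness by averaging frame A's
    scores over the reversible links; unlinked superpixels stay 0."""
    out, cnt = np.zeros(n_b), np.zeros(n_b)
    for i, j in links:
        out[j] += scores_a[i]
        cnt[j] += 1
    return np.divide(out, cnt, out=out, where=cnt > 0)
```

Requiring the match to hold in both directions is what suppresses unreliable one-way correspondences before any score is propagated.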
2. Chen X, Huang H, Heidari AA, Sun C, Lv Y, Gui W, Liang G, Gu Z, Chen H, Li C, Chen P. An efficient multilevel thresholding image segmentation method based on the slime mould algorithm with bee foraging mechanism: A real case with lupus nephritis images. Comput Biol Med 2022; 142:105179. [DOI: 10.1016/j.compbiomed.2021.105179]
3. Cuevas E, Becerra H, Luque A, Elaziz MA. Fast multi-feature image segmentation. Appl Math Model 2021; 90:742-757. [DOI: 10.1016/j.apm.2020.09.008]
5. Fang Y, Zhang C, Li J, Lei J, Perreira Da Silva M, Le Callet P. Visual Attention Modeling for Stereoscopic Video: A Benchmark and Computational Model. IEEE Trans Image Process 2017; 26:4684-4696. [PMID: 28678707] [DOI: 10.1109/tip.2017.2721112]
Abstract
In this paper, we investigate visual attention modeling for stereoscopic video from the following two aspects. First, we build a large-scale eye-tracking database as the benchmark of visual attention modeling for stereoscopic video. The database includes 47 video sequences and their corresponding eye-fixation data. Second, we propose a novel computational model of visual attention for stereoscopic video based on Gestalt theory. In the proposed model, we extract low-level features, including luminance, color, texture, and depth, from discrete cosine transform coefficients, which are used to calculate feature contrast for the spatial saliency computation. The temporal saliency is calculated from the motion contrast of the planar and depth motion features in the stereoscopic video sequences. The final saliency is estimated by fusing the spatial and temporal saliency with uncertainty weighting, which is estimated by the laws of proximity, continuity, and common fate in Gestalt theory. Experimental results show that the proposed method outperforms the state-of-the-art stereoscopic video saliency detection models on our large-scale eye-tracking database and on one other database (DML-ITRACK-3D).
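The uncertainty-weighted fusion of spatial and temporal saliency can be illustrated with a small sketch. The inverse-uncertainty weighting below is an assumption about the general form; the Gestalt-based uncertainty estimates themselves are treated as plain scalar inputs:

```python
import numpy as np

def fuse_uncertainty(spatial, temporal, u_s, u_t, eps=1e-8):
    """Inverse-uncertainty weighting: the cue judged more reliable
    (lower uncertainty) receives the larger fusion weight. In the
    paper the uncertainties come from the proximity/continuity/
    common-fate measures; here u_s and u_t are plain inputs."""
    w_s, w_t = 1.0 / (u_s + eps), 1.0 / (u_t + eps)
    return (w_s * spatial + w_t * temporal) / (w_s + w_t)
```

With equal uncertainties this reduces to a plain average; as one cue's uncertainty grows, its contribution smoothly vanishes.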
7. Zhao X, Turk M, Li W, Lien KC, Wang G. A multilevel image thresholding segmentation algorithm based on two-dimensional K–L divergence and modified particle swarm optimization. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2016.07.016]
8. Liu D, Chang F, Liu C. Salient object detection fusing global and local information based on nonsubsampled contourlet transform. J Opt Soc Am A 2016; 33:1430-1441. [PMID: 27505640] [DOI: 10.1364/josaa.33.001430]
Abstract
The nonsubsampled contourlet transform (NSCT) has the properties of multiresolution, localization, directionality, and anisotropy. The directionality property permits it to resolve the intrinsic directional features that characterize the analyzed image. In this paper, we present a bottom-up salient object detection approach that fuses global and local information based on the NSCT. Images are first decomposed by applying the NSCT. The coefficients of the bandpass subbands are categorized and optimized accordingly to obtain a better representation. Feature maps are then obtained by performing the inverse NSCT on these optimized coefficients. The global and local saliency maps are generated from these feature maps: global saliency is obtained by utilizing the likelihood of features, while local saliency is measured by calculating the local self-information. Finally, the saliency map is computed by fusing the global and local saliency maps. Experimental results on MSRA-10K demonstrate the effectiveness and promising performance of the proposed method.
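The local self-information measure mentioned above can be sketched as follows. For illustration it is applied to raw intensities rather than NSCT feature maps, and the histogram-based probability estimate is an assumption:

```python
import numpy as np

def local_self_information(image, bins=16):
    """Self-information sketch: rarer values carry more information
    (-log p), so pixels with uncommon responses stand out. The paper
    computes this on NSCT feature maps; raw intensities in [0, 1] are
    used here purely for illustration."""
    hist, edges = np.histogram(image, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    # Map each pixel to its histogram bin, then score it by -log p.
    idx = np.clip(np.digitize(image, edges[1:-1]), 0, bins - 1)
    return -np.log(p[idx] + 1e-12)
```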
9. Zhang D, Han J, Li C, Wang J, Li X. Detection of Co-salient Objects by Looking Deep and Wide. Int J Comput Vis 2016. [DOI: 10.1007/s11263-016-0907-4]
10. Zhang L, Yang L, Luo T. Unified Saliency Detection Model Using Color and Texture Features. PLoS One 2016; 11:e0149328. [PMID: 26889826] [PMCID: PMC4758633] [DOI: 10.1371/journal.pone.0149328]
Abstract
Saliency detection has attracted the attention of many researchers and has become a very active area of research. Recently, many saliency detection models have been proposed and have achieved excellent performance in various fields. However, most of these models consider only low-level features. This paper proposes a novel saliency detection model that uses both color and texture features and incorporates higher-level priors. The SLIC superpixel algorithm is applied to form an over-segmentation of the image. The color and texture saliency maps are calculated based on the region-contrast method and adaptive weights. Higher-level priors, including a location prior and a color prior, are incorporated into the model to achieve better performance, and a full-resolution saliency map is obtained by up-sampling. Experimental results on three datasets demonstrate that the proposed saliency detection model outperforms the state-of-the-art models.
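The region-contrast computation with adaptive (spatially decaying) weights can be sketched as below, assuming per-region mean colors and normalized centroids (e.g., from SLIC) are already available; the Gaussian spatial weighting and the `sigma` value are illustrative assumptions:

```python
import numpy as np

def region_contrast_saliency(region_colors, region_centers, sigma=0.4):
    """Global region contrast: each region's saliency is its colour
    distance to every other region, down-weighted by spatial distance
    so that nearby contrasts count more (the adaptive-weight idea)."""
    cd = np.linalg.norm(region_colors[:, None, :] - region_colors[None, :, :], axis=2)
    sd = np.linalg.norm(region_centers[:, None, :] - region_centers[None, :, :], axis=2)
    w = np.exp(-(sd ** 2) / (2 * sigma ** 2))   # spatial weight
    sal = (w * cd).sum(axis=1)
    rng = sal.max() - sal.min()
    return (sal - sal.min()) / (rng + 1e-12)    # normalise to [0, 1]
```

The region whose colour differs most from the rest of the image ends up with the top score, which is the behaviour the region-contrast method relies on.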
Affiliation(s)
- Libo Zhang, Lin Yang, Tiejian Luo - School of Computer and Control, University of Chinese Academy of Sciences, Beijing, China
11. Borji A, Cheng MM, Jiang H, Li J. Salient Object Detection: A Benchmark. IEEE Trans Image Process 2015; 24:5706-5722. [PMID: 26452281] [DOI: 10.1109/tip.2015.2487833]
Abstract
We extensively compare, qualitatively and quantitatively, 41 state-of-the-art models (29 salient object detection, 10 fixation prediction, 1 objectness, and 1 baseline) over seven challenging data sets for the purpose of benchmarking salient object detection and segmentation methods. From the results obtained so far, our evaluation shows consistent, rapid progress over the last few years in terms of both accuracy and running time. The top contenders in this benchmark significantly outperform the models identified as the best in the previous benchmark conducted three years ago. We find that the models designed specifically for salient object detection generally work better than models in closely related areas, which in turn provides a precise definition and suggests an appropriate treatment of this problem that distinguishes it from other problems. In particular, we analyze the influence of center bias and scene complexity on model performance, which, along with the hard cases for the state-of-the-art models, provides useful hints toward constructing more challenging large-scale data sets and better saliency models. Finally, we propose probable solutions for tackling several open problems, such as evaluation scores and data set bias, which also suggest future research directions in the rapidly growing field of salient object detection.
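A typical evaluation score in salient-object-detection benchmarks of this kind is the F-measure with beta^2 = 0.3; a minimal sketch follows (the adaptive-threshold rule shown is a common convention in this literature, not necessarily the exact one used in this benchmark):

```python
import numpy as np

def f_measure(pred, gt, beta2=0.3):
    """F-measure for a saliency map `pred` against a binary ground
    truth `gt`. beta^2 = 0.3 emphasises precision over recall, the
    usual choice in salient-object-detection evaluation."""
    pred_bin = pred >= 2 * pred.mean()   # common adaptive threshold
    tp = np.logical_and(pred_bin, gt).sum()
    precision = tp / max(pred_bin.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)
```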
12. Li J, Duan LY, Chen X, Huang T, Tian Y. Finding the Secret of Image Saliency in the Frequency Domain. IEEE Trans Pattern Anal Mach Intell 2015; 37:2428-2440. [PMID: 26539848] [DOI: 10.1109/tpami.2015.2424870]
Abstract
There are two sides to every story of visual saliency modeling in the frequency domain. On the one hand, image saliency can be effectively estimated by applying simple operations to the frequency spectrum. On the other hand, it is still unclear which part of the frequency spectrum contributes most to popping out targets and suppressing distractors. Toward this end, this paper tentatively explores the secret of image saliency in the frequency domain. From the results obtained in several qualitative and quantitative experiments, we find that the secret of visual saliency may mainly hide in the phases of intermediate frequencies. To explain this finding, we reinterpret the concept of the discrete Fourier transform from the perspective of template-based contrast computation and thus develop several principles for designing a saliency detector in the frequency domain. Following these principles, we propose a novel approach to designing the saliency detector with the assistance of prior knowledge obtained through both unsupervised and supervised learning processes. Experimental results on a public image benchmark show that the learned saliency detector outperforms 18 state-of-the-art approaches in predicting human fixations.
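The role of the phase spectrum can be illustrated with the classic phase-only reconstruction baseline that this line of work builds on (this is the simple baseline, not the learned detector the paper proposes):

```python
import numpy as np

def phase_saliency(image):
    """Phase-only saliency: keep the Fourier phase, flatten the
    amplitude to 1, invert, and square. Structure not explained by
    the globally dominant amplitudes pops out. In practice the result
    is usually smoothed with a Gaussian; that step is omitted here."""
    f = np.fft.fft2(image)
    phase_only = np.fft.ifft2(np.exp(1j * np.angle(f)))
    sal = np.abs(phase_only) ** 2
    return sal / (sal.max() + 1e-12)
```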
13. Li Q, Chen X, Song Y, Zhang Y, Jin X, Zhao Q. Geodesic propagation for semantic labeling. IEEE Trans Image Process 2014; 23:4812-4825. [PMID: 25248182] [DOI: 10.1109/tip.2014.2358193]
Abstract
This paper presents a semantic labeling framework with geodesic propagation (GP). Under the same framework, three algorithms are proposed: GP, supervised GP (SGP) for images, and hybrid GP (HGP) for video. In these algorithms, we resort to the recognition proposal map and select confident pixels with maximum probability as the initial propagation seeds. From these seeds, the GP algorithm iteratively updates the weights of geodesic distances until the semantic labels are propagated to all pixels. In contrast, the SGP algorithm further exploits contextual information to guide the direction of propagation, leading to better performance but higher computational complexity than the GP. For video labeling, we further propose the HGP algorithm, in which the geodesic metric is used in both the spatial and temporal spaces. Experiments on four public data sets show that our algorithms outperform several state-of-the-art methods. With the GP framework, convincing results for both image and video semantic labeling can be obtained.
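The core GP idea, propagating seed labels along minimum-cost geodesic paths, can be sketched as a Dijkstra-style sweep on a 4-connected grid; the edge cost (absolute intensity difference) and the data layout are illustrative assumptions, not the paper's exact formulation:

```python
import heapq
import numpy as np

def geodesic_propagation(weights, seeds):
    """Assign each pixel the label of the seed it can reach at the
    lowest accumulated edge cost. `weights` is a per-pixel cost map;
    `seeds` maps (row, col) -> integer label."""
    h, w = weights.shape
    dist = np.full((h, w), np.inf)
    labels = np.full((h, w), -1, dtype=int)
    heap = []
    for (r, c), lab in seeds.items():
        dist[r, c] = 0.0
        labels[r, c] = lab
        heapq.heappush(heap, (0.0, r, c, lab))
    while heap:
        d, r, c, lab = heapq.heappop(heap)
        if d > dist[r, c]:          # stale heap entry
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + abs(weights[nr, nc] - weights[r, c])
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    labels[nr, nc] = lab
                    heapq.heappush(heap, (nd, nr, nc, lab))
    return labels
```

Labels flow cheaply through homogeneous regions and stop at high-contrast boundaries, which is the behaviour geodesic propagation exploits.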
14. Tian H, Fang Y, Zhao Y, Lin W, Ni R, Zhu Z. Salient region detection by fusing bottom-up and top-down features extracted from a single image. IEEE Trans Image Process 2014; 23:4389-4398. [PMID: 25163061] [DOI: 10.1109/tip.2014.2350914]
Abstract
Recently, some global contrast-based salient region detection models have been proposed based only on the low-level feature of color. It is necessary to consider both color and orientation features to overcome their limitations, and thus improve the performance of salient region detection for images with low contrast in color and high contrast in orientation. In addition, the existing fusion methods for different feature maps, such as simple averaging and selective fusion, are not sufficiently effective. To overcome these limitations of existing salient region detection models, we propose a novel salient region model based on bottom-up and top-down mechanisms: color contrast and orientation contrast are adopted to calculate the bottom-up feature maps, while the top-down cue of depth-from-focus from the same single image is used to guide the generation of the final salient regions, since depth-from-focus reflects the photographer's preference and knowledge of the task. A more general and effective fusion method is designed to combine the bottom-up feature maps. According to the degree of scattering and the eccentricities of the feature maps, the proposed fusion method can assign adaptive weights to different feature maps to reflect the confidence level of each. The depth-from-focus of the image, as a significant top-down feature for visual attention, is used to guide the salient regions during the fusion process; with its aid, the proposed fusion method can filter out the background and highlight the salient regions. Experimental results show that the proposed model outperforms state-of-the-art models on three publicly available data sets.
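The degree-of-scattering fusion weight can be sketched as below; the spatial-variance measure and the inverse weighting are illustrative stand-ins for the paper's exact formulation:

```python
import numpy as np

def scatter_weight(feature_map, eps=1e-8):
    """Treat the normalised map as a 2-D distribution and measure the
    spatial variance of its mass. A compact map (low scatter) is
    judged more confident and receives a larger fusion weight."""
    p = feature_map / (feature_map.sum() + eps)
    ys, xs = np.mgrid[0:feature_map.shape[0], 0:feature_map.shape[1]]
    my, mx = (p * ys).sum(), (p * xs).sum()
    scatter = (p * ((ys - my) ** 2 + (xs - mx) ** 2)).sum()
    return 1.0 / (scatter + eps)

def fuse(maps):
    """Weighted average of feature maps with scatter-based weights."""
    ws = np.array([scatter_weight(m) for m in maps])
    ws = ws / ws.sum()
    return sum(w * m for w, m in zip(ws, maps))
```

A map whose high responses form one tight blob dominates the fusion, while a diffuse, uncommitted map is effectively ignored.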