1
Li S, Liu F, Jiao L, Chen P, Liu X, Li L. MFNet: A Novel GNN-Based Multi-Level Feature Network With Superpixel Priors. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:7306-7321. [PMID: 36383578 DOI: 10.1109/tip.2022.3220057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Since superpixel segmentation aggregates pixels based on similarity, the boundaries of some superpixels indicate the outline of the object, and the superpixels provide prerequisites for learning structure-aware features. It is worthwhile to research how to utilize these superpixel priors effectively. In this work, by constructing a graph within each superpixel and a graph among superpixels, we propose a novel Multi-level Feature Network (MFNet) based on graph neural networks that uses these superpixel priors. In our MFNet, we learn three-level features in a hierarchical way: from pixel-level features to superpixel-level features, and then to image-level features. To solve the problem that existing methods cannot represent superpixels well, we propose a superpixel representation method based on a graph neural network, which takes the graph constructed from a single superpixel as input to extract the feature of that superpixel. To reflect the versatility of our MFNet, we apply it to an image-level prediction task and a pixel-level prediction task by designing different prediction modules. An attention linear classifier prediction module is proposed for image-level prediction tasks, such as image classification. An FC-based superpixel prediction module and a Decoder-based pixel prediction module are proposed for pixel-level prediction tasks, such as salient object detection. Our MFNet achieves competitive results on a number of datasets when compared with related methods. The visualization shows that the object boundaries and outlines in the saliency maps predicted by our proposed MFNet are more refined and capture more detail.
2
Yan T, Huang X, Zhao Q. Hierarchical Superpixel Segmentation by Parallel CRTrees Labeling. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:4719-4732. [PMID: 35797313 DOI: 10.1109/tip.2022.3187563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
This paper proposes a hierarchical superpixel segmentation method that represents an image as a hierarchy of 1-nearest neighbor (1-NN) graphs with pixels/superpixels denoting the graph vertices. The 1-NN graphs are built from the pixel/superpixel adjacency matrices to ensure connectivity. To determine the next-level superpixel hierarchy, inspired by FINCH clustering, the weakly connected components (WCCs) of the 1-NN graph are labeled as superpixels. We reveal that the WCCs of a 1-NN graph consist of a forest of cycle-root-trees (CRTrees). The forest-like structure inspires us to propose a two-stage parallel CRTrees labeling which first links the child vertices to the cycle-roots and then labels all the vertices by the cycle-roots. We also propose an inter-inner superpixel distance penalization and a Lab color lightness penalization based on the property that the distance of a CRTree decreases monotonically from the child to root vertices. Experiments show the parallel CRTrees labeling is several times faster than recent advanced sequential and parallel connected components labeling algorithms. The proposed hierarchical superpixel segmentation has comparable performance to the best performer ETPS (state of the art) on the BSDS500, NYUV2, and Fash datasets. At the same time, it can achieve 200 FPS for 480P video streams.
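The core labeling idea in this abstract, treating each weakly connected component of a 1-NN graph as one superpixel, can be illustrated with a small sequential sketch. This is not the authors' two-stage parallel CRTrees labeling; it swaps in a plain union-find pass, and the function name `label_1nn_components` and the toy neighbor array are assumptions made only for illustration.

```python
def label_1nn_components(nearest):
    """Label weakly connected components of a 1-NN graph.

    nearest[i] is the index of vertex i's nearest neighbor, so every
    vertex has exactly one outgoing edge i -> nearest[i].
    """
    parent = list(range(len(nearest)))

    def find(x):
        # Path halving keeps the union-find trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(nearest):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # edge direction is ignored: weak connectivity

    # Relabel roots as consecutive integers in order of first appearance.
    labels, root_to_label = [], {}
    for i in range(len(nearest)):
        root = find(i)
        labels.append(root_to_label.setdefault(root, len(root_to_label)))
    return labels


# Hypothetical toy graph: vertices 0 and 1 point at each other (a cycle-root
# pair) with child 2 attached; vertices 3 and 4 form a second component.
toy_nearest = [1, 0, 1, 4, 3]
toy_labels = label_1nn_components(toy_nearest)
```

In the toy graph, vertices 0, 1, and 2 end up in one component and vertices 3 and 4 in another, which mirrors how each cycle-root-tree becomes one next-level superpixel.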
3
Giraldo JH, Javed S, Bouwmans T. Graph Moving Object Segmentation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:2485-2503. [PMID: 33296300 DOI: 10.1109/tpami.2020.3042093] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
Moving Object Segmentation (MOS) is a fundamental task in computer vision. Due to undesirable variations in the background scene, MOS becomes very challenging for static and moving camera sequences. Several deep learning methods have been proposed for MOS with impressive performance. However, these methods show performance degradation on unseen videos, and deep learning models usually require large amounts of data to avoid overfitting. Recently, graph learning has attracted significant attention in many computer vision applications since it provides tools to exploit the geometrical structure of data. In this work, concepts of graph signal processing are introduced for MOS. First, we propose a new algorithm that is composed of segmentation, background initialization, graph construction, unseen sampling, and a semi-supervised learning method inspired by the theory of recovery of graph signals. Second, theoretical developments are introduced, showing one bound for the sample complexity in semi-supervised learning, and two bounds for the condition number of the Sobolev norm. Our algorithm has the advantage of requiring less labeled data than deep learning methods while having competitive results on both static and moving camera videos. Our algorithm is also adapted for Video Object Segmentation (VOS) tasks and is evaluated on six publicly available datasets, outperforming several state-of-the-art methods in challenging conditions.
4
Sky and Ground Segmentation in the Navigation Visions of the Planetary Rovers. SENSORS 2021; 21:s21216996. [PMID: 34770302 PMCID: PMC8588092 DOI: 10.3390/s21216996] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/10/2021] [Revised: 10/15/2021] [Accepted: 10/18/2021] [Indexed: 11/16/2022]
Abstract
Sky and ground are two essential semantic components in computer vision, robotics, and remote sensing, and sky and ground segmentation has become increasingly popular. This research proposes a sky and ground segmentation framework for rover navigation vision by adopting weak supervision and transfer learning technologies. A new sky and ground segmentation neural network (network in U-shaped network (NI-U-Net)) and a conservative annotation method are proposed. The pre-training process achieves the best results on a popular open benchmark (the Skyfinder dataset) across seven metrics when compared to the state of the art. These seven metrics reach 99.232%, 99.211%, 99.221%, 99.104%, 0.0077, 0.0427, and 98.223% for accuracy, precision, recall, dice score (F1), misclassification rate (MCR), root mean squared error (RMSE), and intersection over union (IoU), respectively. The conservative annotation method achieves superior performance with limited manual intervention. The NI-U-Net can operate at 40 frames per second (FPS) to maintain the real-time property. The proposed framework successfully fills the gap between laboratory results (with rich ideal data) and practical application (in the wild). The achievement can provide essential semantic information (sky and ground) for rover navigation vision.
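For readers unfamiliar with the seven metrics listed in this abstract, the sketch below shows how they are commonly computed for a binary mask. It is a generic illustration, not the paper's evaluation code: the function name `binary_segmentation_metrics` and the toy masks are assumptions, and RMSE is computed here on hard 0/1 labels, whereas the paper may compute it on soft network outputs.

```python
import math


def binary_segmentation_metrics(pred, truth):
    """Common metrics for flat binary masks (1 = sky, 0 = ground).

    Assumes both classes occur, so no denominator is zero.
    """
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    n = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / n,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "mcr": (fp + fn) / n,  # misclassification rate = 1 - accuracy
        "rmse": math.sqrt(sum((p - t) ** 2 for p, t in pairs) / n),
        "iou": tp / (tp + fp + fn),
    }


# Hypothetical 8-pixel masks used only to exercise the function.
toy_pred = [1, 1, 0, 0, 1, 0, 0, 1]
toy_truth = [1, 1, 0, 0, 0, 0, 1, 1]
toy_metrics = binary_segmentation_metrics(toy_pred, toy_truth)
```

Note that the reported accuracy (99.232%) and MCR (0.0077) in the abstract are consistent with the identity MCR = 1 - accuracy used above.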
5
Superpixel Segmentation Based on Grid Point Density Peak Clustering. SENSORS 2021; 21:s21196374. [PMID: 34640692 PMCID: PMC8512046 DOI: 10.3390/s21196374] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/18/2021] [Revised: 09/14/2021] [Accepted: 09/22/2021] [Indexed: 11/17/2022]
Abstract
Superpixel segmentation is one of the key image preprocessing steps in object recognition and detection methods. However, over-segmentation of smoothly connected homogeneous regions in an image is a key problem, because it produces redundant, complex, jagged textures. In this paper, density peak clustering is used to reduce the redundant superpixels and highlight the primary textures and contours of the salient objects. Firstly, the grid pixels are extracted as feature points, and the density of each feature point is defined. Secondly, the cluster centers are extracted from the density peaks. Finally, all the feature points are clustered by the density peaks. The pixel blocks obtained by the above steps are the superpixels. The method is evaluated on the BSDS500 dataset, and the experimental results show that the Boundary Recall (BR) and Achievable Segmentation Accuracy (ASA) are 95.0% and 96.3%, respectively. In addition, the proposed method has better performance in efficiency (30 fps). The comparison experiments show that not only do the superpixel boundaries have good adhesion to the primary textures and contours of the salient objects, but the method can also effectively reduce the redundant superpixels in homogeneous regions.
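The three steps named in this abstract (define a density per feature point, pick density peaks as centers, assign the remaining points) follow the general density peak clustering recipe, which the sketch below illustrates on plain 2-D points. It is not the paper's method: the cutoff-count density, the `rho * delta` center score, the function name `density_peak_clusters`, and the toy points are all assumptions chosen for a minimal example.

```python
import math


def density_peak_clusters(points, cutoff, n_clusters):
    """Cluster points by density peaks; n_clusters must be at least 1."""
    n = len(points)
    dist = [[math.dist(a, b) for b in points] for a in points]
    # Step 1: local density = number of other points closer than the cutoff.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    # Visit points from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n   # distance to the nearest point visited earlier
    parent = [-1] * n   # index of that nearest earlier point
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # convention for the densest point
            continue
        parent[i] = min(order[:rank], key=lambda j: dist[i][j])
        delta[i] = dist[i][parent[i]]
    # Step 2: centers are the points with the largest rho * delta.
    centers = sorted(order, key=lambda i: -(rho[i] * delta[i]))[:n_clusters]
    labels = [-1] * n
    for label, center in enumerate(centers):
        labels[center] = label
    # Step 3: every other point inherits the label of its parent.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Hypothetical toy data: two well-separated 2x2 blocks of grid points.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
toy_clusters = density_peak_clusters(toy_points, cutoff=1.5, n_clusters=2)
```

In the superpixel setting described above, the points would be grid pixels with color and position features rather than bare coordinates, and each resulting cluster would correspond to one pixel block.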
6
Sahu A, Chowdhury AS. Together Recognizing, Localizing and Summarizing Actions in Egocentric Videos. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:4330-4340. [PMID: 33830922 DOI: 10.1109/tip.2021.3070732] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Analysis of egocentric video has recently drawn the attention of researchers in the computer vision and multimedia communities. In this paper, we propose a weakly supervised superpixel-level joint framework for localization, recognition, and summarization of actions in an egocentric video. We first recognize and localize single as well as multiple action(s) in each frame of an egocentric video and then construct a summary of these detected actions. The superpixel-level solution helps in precise localization of actions in addition to improving the recognition accuracy. Superpixels are extracted within the central regions of the egocentric video frames; these central regions are determined through a previously developed center-surround model. A sparse spatio-temporal video representation graph is constructed in the deep feature space with the superpixels as nodes. A weakly supervised solution using random walks yields action labels for each superpixel. After determining action label(s) for each frame from its constituent superpixels, we apply a fractional knapsack type formulation for obtaining a summary (of actions). Experimental comparisons on publicly available ADL, GTEA, EGTEA Gaze+, EgoGesture, and EPIC-Kitchens datasets show the effectiveness of the proposed solution.
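The abstract mentions a fractional knapsack type formulation for building the summary but does not spell it out. As background only, the sketch below shows the classic greedy solution to the fractional knapsack problem; how the paper maps actions to values and weights is not stated here, so the item tuples, the function name `fractional_knapsack`, and the toy numbers are hypothetical.

```python
def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack.

    items: list of (name, value, weight) with weight > 0.
    Returns (total_value, picks), where picks lists (name, fraction taken).
    """
    picks = []
    total = 0.0
    remaining = capacity
    # Take items in decreasing order of value per unit weight.
    by_ratio = sorted(items, key=lambda it: it[1] / it[2], reverse=True)
    for name, value, weight in by_ratio:
        if remaining <= 0:
            break
        fraction = min(1.0, remaining / weight)
        picks.append((name, fraction))
        total += value * fraction
        remaining -= weight * fraction
    return total, picks


# Hypothetical action segments: (label, importance score, duration).
toy_segments = [("pour", 60, 10), ("stir", 100, 20), ("wash", 120, 30)]
toy_total, toy_picks = fractional_knapsack(toy_segments, capacity=50)
```

With a budget of 50 units, the greedy pass takes "pour" and "stir" whole and two thirds of "wash", which is optimal for the fractional variant of the problem.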
7
Chai D. Correction to: Rooted Spanning Superpixels. Int J Comput Vis 2020. [DOI: 10.1007/s11263-020-01391-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
8
Martins SB, Telea AC, Falcão AX. Investigating the impact of supervoxel segmentation for unsupervised abnormal brain asymmetry detection. Comput Med Imaging Graph 2020; 85:101770. [PMID: 32854021 DOI: 10.1016/j.compmedimag.2020.101770] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/06/2019] [Revised: 07/27/2020] [Accepted: 07/31/2020] [Indexed: 11/26/2022]
Abstract
Several brain disorders are associated with abnormal brain asymmetries (asymmetric anomalies). Several computer-based methods aim to detect such anomalies automatically. Recent advances in this area use automatic unsupervised techniques that extract pairs of symmetric supervoxels in the hemispheres, model normal brain asymmetries for each pair from healthy subjects, and treat outliers as anomalies. Yet, there is no deep understanding of the impact of the supervoxel segmentation quality on abnormal asymmetry detection, especially for small anomalies, nor of the added value of using a specialized model for each supervoxel pair instead of a single global appearance model. We aim to answer these questions by a detailed evaluation of different scenarios for supervoxel segmentation and classification for detecting abnormal brain asymmetries. Experimental results on 3D MR-T1 brain images of stroke patients confirm the importance of high-quality supervoxels that fit anomalies and of using a specific classifier for each supervoxel. Next, we present a refinement of the detection method that reduces the number of false-positive supervoxels, thereby making the detection method easier to use for visual inspection and analysis of the found anomalies.
Affiliation(s)
- Samuel B Martins: Laboratory of Image Data Science (LIDS), Institute of Computing, University of Campinas, Brazil; Bernoulli Institute, University of Groningen, The Netherlands; Federal Institute of São Paulo, Campinas, Brazil
- Alexandru C Telea: Department of Information and Computing Sciences, Utrecht University, The Netherlands
- Alexandre X Falcão: Laboratory of Image Data Science (LIDS), Institute of Computing, University of Campinas, Brazil
9
Liu H, Wang H, Wu Y, Xing L. Superpixel Region Merging Based on Deep Network for Medical Image Segmentation. ACM T INTEL SYST TEC 2020. [DOI: 10.1145/3386090] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/10/2023]
Abstract
Automatic and accurate semantic segmentation of pathological structures in medical images is challenging because of noisy disturbance, deformable shapes of pathology, and low contrast between soft tissues. Classical superpixel-based classification algorithms suffer from edge leakage due to the complexity and heterogeneity inherent in medical images. Therefore, we propose a deep U-Net with superpixel region merging processing incorporated for edge enhancement to facilitate and optimize segmentation. Our approach combines three innovations: (1) unlike deep learning-based image segmentation, the segmentation evolves from superpixel region merging via U-Net training, which captures rich semantic information in addition to gray similarity; (2) a bilateral filtering module was adopted at the beginning of the network to eliminate external noise and enhance soft tissue contrast at edges of pathology; and (3) a normalization layer was inserted after the convolutional layer at each feature scale, to prevent overfitting and increase the sensitivity to model parameters. This model was validated on lung CT, brain MR, and coronary CT datasets. Different superpixel methods and cross validation show the effectiveness of this architecture. The hyperparameter settings were empirically explored to achieve a good trade-off between performance and efficiency, where a four-layer network achieves the best result in precision, recall, F-measure, and running speed. It was demonstrated that our method outperformed state-of-the-art networks, including FCN-16s, SegNet, PSPNet, DeepLabv3, and traditional U-Net, both quantitatively and qualitatively. Source code for the complete method is available at https://github.com/Leahnawho/Superpixel-network.
Affiliation(s)
- Hui Liu: Shandong University of Finance and Economics and Stanford University, Jinan, Shandong Province, China
- Haiou Wang: Shandong University of Finance and Economics, Jinan, Shandong Province, China
- Yan Wu: Stanford University, CA, USA
10
Bejar HH, Ferzoli Guimaraes SJ, Miranda PA. Efficient hierarchical graph partitioning for image segmentation by optimum oriented cuts. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2020.01.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
11
Martins SB, Telea AC, Falcao AX. Extending Supervoxel-based Abnormal Brain Asymmetry Detection to the Native Image Space. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2020; 2019:450-453. [PMID: 31945935 DOI: 10.1109/embc.2019.8857447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
Abstract
Most neurological diseases are associated with abnormal brain asymmetries. Recent advances in automatic unsupervised techniques model normal brain asymmetries from healthy subjects only and treat anomalies as outliers. Outlier detection is usually done in a common standard coordinate space, which limits its usability. To alleviate the problem, we extend a recent fully unsupervised supervoxel-based approach (SAAD) for abnormal asymmetry detection to the native image space of MR brain images. Experimental results using our new method, called N-SAAD, show that it can achieve higher detection accuracy with considerably fewer false positives than a method based on unsupervised deep learning for a large set of MR-T1 images.