1. Anthwal S, Ganotra D. An overview of optical flow-based approaches for motion segmentation. The Imaging Science Journal 2019. [DOI: 10.1080/13682199.2019.1641316]
Affiliation(s)
- Shivangi Anthwal, Department of Applied Science and Humanities, Indira Gandhi Delhi Technical University for Women, Delhi, India
- Dinesh Ganotra, Department of Applied Science and Humanities, Indira Gandhi Delhi Technical University for Women, Delhi, India
2. Ullman S, Dorfman N, Harari D. A model for discovering 'containment' relations. Cognition 2019; 183:67-81. [PMID: 30419508] [PMCID: PMC6331663] [DOI: 10.1016/j.cognition.2018.11.001]
Abstract
Rapid progress in learning and object recognition has been achieved by developing and using methods that learn from large numbers of labeled image examples. However, such methods cannot explain infants' learning of new concepts from their visual experience, in particular the ability to learn complex concepts without external guidance, or the natural order in which related concepts are acquired. A remarkable example of early visual learning is the category of 'containers' and the notion of 'containment'. Surprisingly, this is one of the earliest spatial relations to be learned, starting already around 3 months of age and preceding other common relations (e.g., 'support', 'in-between'). In this work we present a model that explains infants' capacity to learn 'containment' and related concepts by 'just looking', together with their empirical developmental trajectory. Learning in the model occurs quickly and without external guidance, relying only on perceptual processes that are present in the first months of life. Instead of labeled training examples, the system provides its own internal supervision to guide the learning process. We show how the detection of so-called 'paradoxical occlusion' provides natural internal supervision, which guides the system to gradually acquire a range of useful containment-related concepts. Similar mechanisms of implicit internal supervision can have broad application in other cognitive domains as well as in artificial intelligence systems, because they alleviate the need for extensive external supervision, and because they can guide the learning process to extract concepts that are meaningful to the observer even if they are not by themselves obvious or salient in the input.
Affiliation(s)
- Shimon Ullman, Weizmann Institute of Science, Department of Computer Science and Applied Mathematics, 234 Herzl Street, Rehovot 7610001, Israel
- Nimrod Dorfman, Weizmann Institute of Science, Department of Computer Science and Applied Mathematics, 234 Herzl Street, Rehovot 7610001, Israel
- Daniel Harari, Weizmann Institute of Science, Department of Computer Science and Applied Mathematics, 234 Herzl Street, Rehovot 7610001, Israel
3. Chen J, Chau LP. A rain pixel recovery algorithm for videos with highly dynamic scenes. IEEE Transactions on Image Processing 2014; 23:1097-1104. [PMID: 24240000] [DOI: 10.1109/tip.2013.2290595]
Abstract
Rain removal is a useful and important technique in applications such as security surveillance and movie editing. Several rain removal algorithms have been proposed in recent years, in which photometric, chromatic, and probabilistic properties of rain are exploited to detect and remove its effect. Current methods generally work well with light rain and relatively static scenes; when dealing with heavier rainfall in dynamic scenes, they give very poor visual results. The proposed algorithm is based on motion segmentation of the dynamic scene. After applying photometric and chromatic constraints for rain detection, rain removal filters are applied to pixels such that their dynamic properties as well as motion occlusion cues are considered; both spatial and temporal information is then adaptively exploited during rain pixel recovery. Results show that the proposed algorithm performs much better on rainy scenes with large motion than existing algorithms.
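The photometric constraint mentioned above lends itself to a compact sketch: a rain streak brightens a pixel for roughly one frame, producing a temporal intensity spike. The following is a minimal, illustrative implementation of that spike test plus temporal recovery under a static-scene assumption; the function names and threshold `c` are my own, not from the paper (which adds chromatic constraints and motion-occlusion handling for dynamic regions).

```python
import numpy as np

def detect_rain_pixels(prev_frame, cur_frame, next_frame, c=3.0):
    """Photometric rain test: flag pixels that exceed the SAME pixel
    in both the previous and the next frame by at least c grey levels,
    i.e. a brief symmetric intensity spike."""
    prev_f = prev_frame.astype(np.float64)
    cur_f = cur_frame.astype(np.float64)
    next_f = next_frame.astype(np.float64)
    return ((cur_f - prev_f) >= c) & ((cur_f - next_f) >= c)

def recover_rain_pixels(prev_frame, cur_frame, next_frame, mask):
    """Replace flagged pixels with the temporal mean of the adjacent
    frames (valid only where the background is static)."""
    out = cur_frame.astype(np.float64).copy()
    avg = (prev_frame.astype(np.float64) + next_frame.astype(np.float64)) / 2
    out[mask] = avg[mask]
    return out

# synthetic static background with one rain-spiked pixel
prev = np.full((4, 4), 100.0)
nxt = prev.copy()
cur = prev.copy()
cur[1, 2] = 120.0
mask = detect_rain_pixels(prev, cur, nxt)
clean = recover_rain_pixels(prev, cur, nxt, mask)
```

For moving regions a temporal mean like this would ghost the moving object, which is exactly why the paper switches to spatially informed recovery there.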
4. Ayvaci A, Soatto S. Detachable object detection: segmentation and depth ordering from short-baseline video. IEEE Transactions on Pattern Analysis and Machine Intelligence 2012; 34:1942-1951. [PMID: 22201065] [DOI: 10.1109/tpami.2011.271]
Abstract
We describe an approach for segmenting a moving image into regions that correspond to surfaces in the scene that are partially surrounded by the medium. It integrates both appearance and motion statistics into a cost functional that is seeded with occluded regions and minimized efficiently by solving a linear programming problem. Where a short observation time is insufficient to determine whether the object is detachable, the results of the minimization can be used to seed a more costly optimization based on a longer sequence of video data. The result is an entirely unsupervised scheme to detect and segment an arbitrary and unknown number of objects. We test our scheme to highlight the potential, as well as limitations, of our approach.
Affiliation(s)
- Alper Ayvaci, Department of Computer Science, University of California, Los Angeles, Boelter Hall, 405 Hilgard Ave, Los Angeles, CA 90095, USA
5. Jacobson N, Freund Y, Nguyen TQ. An online learning approach to occlusion boundary detection. IEEE Transactions on Image Processing 2012; 21:252-261. [PMID: 21788193] [DOI: 10.1109/tip.2011.2162420]
Abstract
We propose a novel online learning-based framework for occlusion boundary detection in video sequences. This approach does not require any prior training and instead "learns" occlusion boundaries by updating a set of weights for the online learning Hedge algorithm at each frame instance. Whereas previous training-based methods perform well only on data similar to the trained examples, the proposed method is well suited for any video sequence. We demonstrate the performance of the proposed detector both for the CMU data set, which includes hand-labeled occlusion boundaries, and for a novel video sequence. In addition to occlusion boundary detection, the proposed algorithm is capable of classifying occlusion boundaries by angle and by whether the occluding object is covering or uncovering the background.
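Hedge is the standard multiplicative-weights scheme from online learning: each "expert" (here, a boundary cue) is down-weighted in proportion to its loss each round. A minimal sketch follows; the learning rate `eta` and the toy losses are illustrative, not the paper's settings.

```python
import numpy as np

def hedge_update(weights, losses, eta=0.5):
    """One Hedge round: scale each expert's weight by exp(-eta * loss),
    then renormalise so the weights remain a probability distribution."""
    w = weights * np.exp(-eta * np.asarray(losses, dtype=np.float64))
    return w / w.sum()

# toy run with two experts: expert 0 keeps incurring loss,
# so probability mass shifts to expert 1 over the rounds
w = np.ones(2) / 2
for _ in range(5):
    w = hedge_update(w, losses=[1.0, 0.0])
```

Because the weights adapt frame by frame, no offline training pass is needed, which matches the paper's motivation for an online detector.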
Affiliation(s)
- Natan Jacobson, Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA 92093, USA
6. Zhang Q, Ngan KN. Segmentation and tracking multiple objects under occlusion from multiview video. IEEE Transactions on Image Processing 2011; 20:3308-3313. [PMID: 21659028] [DOI: 10.1109/tip.2011.2159228]
Abstract
In this paper, we present a multiview approach to segmenting foreground objects consisting of a group of people into individual human objects and tracking them across the video sequence. Depth and occlusion information recovered from multiple views of the scene is integrated into the object detection, segmentation, and tracking processes. Adaptive background penalty with occlusion reasoning is proposed to separate the foreground regions from the background in the initial frame. Multiple cues are employed to segment individual human objects from the group. To propagate the segmentation through the video, each object region is independently tracked by motion compensation and uncertainty refinement, and motion occlusion is handled as a layer transition. Experimental results on both our own sequences and others' demonstrate the algorithm's efficiency in terms of subjective performance, and objective comparison with a state-of-the-art algorithm validates the superior performance of our method quantitatively.
7. Jung JH, Hong K, Park G, Chung I, Park JH, Lee B. Reconstruction of three-dimensional occluded object using optical flow and triangular mesh reconstruction in integral imaging. Optics Express 2010; 18:26373-26387. [PMID: 21164988] [DOI: 10.1364/oe.18.026373]
Abstract
We propose a reconstruction method for the occluded region of a three-dimensional (3D) object using depth extraction based on optical flow and triangular mesh reconstruction in integral imaging. The depth information of sub-images from the acquired elemental image set is extracted using optical flow with sub-pixel accuracy, which alleviates the depth quantization problem. The extracted depth maps of the sub-image array are segmented by a depth threshold obtained from histogram-based segmentation and represented as point clouds. The point clouds are projected to the viewpoint of the center sub-image and reconstructed by triangular mesh reconstruction. The experimental results support the validity of the proposed method, with high peak signal-to-noise ratio and normalized cross-correlation in 3D image recognition.
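Once per-pixel disparities between neighbouring sub-images are available from optical flow, converting them to depth is ordinary triangulation, with the elemental-lens pitch playing the role of the stereo baseline. A minimal sketch; the focal length, pitch, and disparity values below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def depth_from_disparity(disparity, f, b):
    """Triangulation z = f * b / d. Sub-pixel (fractional) disparities
    from optical flow yield a continuous depth map rather than the
    quantised depth planes of integer-pixel matching."""
    d = np.asarray(disparity, dtype=np.float64)
    return np.where(d > 0, f * b / np.maximum(d, 1e-9), np.inf)

# fractional disparities in pixels; f = 10 px, b = 1 (lens-pitch unit)
z = depth_from_disparity([2.0, 2.5, 4.0], f=10.0, b=1.0)
```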
Affiliation(s)
- Jae-Hyun Jung, School of Electrical Engineering, Seoul National University, Gwanak-Gu Gwanakro 599, Seoul 151-744, Korea
9. Beck C, Ognibeni T, Neumann H. Object segmentation from motion discontinuities and temporal occlusions: a biologically inspired model. PLoS One 2008; 3:e3807. [PMID: 19043613] [PMCID: PMC2586919] [DOI: 10.1371/journal.pone.0003807]
Abstract
Background
Optic flow is an important cue for object detection. Humans are able to perceive objects in a scene using only kinetic boundaries, and can perform the task even when other shape cues are not provided. These kinetic boundaries are characterized by the presence of motion discontinuities in a local neighbourhood. In addition, temporal occlusions appear along the boundaries as the object in front covers the background and the objects that are spatially behind it.
Methodology/Principal Findings
From a technical point of view, the detection of motion boundaries for segmentation based on optic flow is a difficult task, because flow detected along such boundaries is generally not reliable. We propose a model derived from mechanisms found in visual areas V1, MT, and MSTl of the human and primate cortex that achieves robust detection along motion boundaries. It includes two separate mechanisms for the detection of motion discontinuities and of occlusion regions, based on how neurons respond to spatial and temporal contrast, respectively. The mechanisms are embedded in a biologically inspired architecture that integrates information from different model components of visual processing via feedback connections. In particular, mutual interactions between the detection of motion discontinuities and temporal occlusions allow a considerable improvement of kinetic boundary detection.
Conclusions/Significance
A new model is proposed that uses optic flow cues to detect motion discontinuities and object occlusions. We suggest that by combining the results for motion discontinuities and object occlusion, object segmentation within the model can be improved. This idea could also be applied in other models for object segmentation. In addition, we discuss how this model is related to neurophysiological findings. The model was successfully tested with both artificial and real sequences, including self-motion and object motion.
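As a rough functional stand-in for the motion-discontinuity mechanism described above (not the model's neural implementation), kinetic-boundary candidates can be found by thresholding the spatial contrast of a flow field; the threshold and the test field below are illustrative.

```python
import numpy as np

def motion_discontinuities(u, v, thresh=1.0):
    """Mark pixels where the optic-flow field (u, v) changes abruptly:
    finite-difference gradients of both flow components, thresholded
    on the combined gradient magnitude."""
    du_y, du_x = np.gradient(u)
    dv_y, dv_x = np.gradient(v)
    contrast = np.sqrt(du_x**2 + du_y**2 + dv_x**2 + dv_y**2)
    return contrast > thresh

# two layers translating at different speeds: the kinetic boundary
# lies between columns 2 and 3
u = np.zeros((5, 5)); u[:, 3:] = 5.0
v = np.zeros((5, 5))
mask = motion_discontinuities(u, v)
```

The model's point, of course, is that flow near such boundaries is unreliable, which is why it couples this detection with occlusion evidence via feedback instead of trusting a single threshold.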
Affiliation(s)
- Cornelia Beck, Institute for Neural Information Processing, University of Ulm, Ulm, Germany
10. Kim NG. Dynamic Occlusion and Optical Flow From Corrugated Surfaces. Ecological Psychology 2008. [DOI: 10.1080/10407410802189166]
11. Feldman D, Weinshall D. Motion segmentation and depth ordering using an occlusion detector. IEEE Transactions on Pattern Analysis and Machine Intelligence 2008; 30:1171-1185. [PMID: 18550901] [DOI: 10.1109/tpami.2007.70766]
Abstract
We present a novel method for motion segmentation and depth ordering from a video sequence in general motion. We first compute motion segmentation based on differential properties of the spatio-temporal domain and scale-space integration. Given a motion boundary, we describe two algorithms to determine depth ordering from two- and three-frame sequences. A remarkable characteristic of our method is its ability to compute depth ordering from only two frames. The segmentation and depth ordering algorithms are shown to give good results on six real sequences taken in general motion. We use synthetic data to show robustness to high levels of noise and illumination changes; we also include cases where no intensity edge exists at the location of the motion boundary, or where no parametric motion model can describe the data. Finally, we describe human experiments showing that people, like our algorithm, can compute depth ordering from only two frames, even when the boundary between the layers is not visible in a single frame.
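A well-known cue behind two-frame depth ordering, sketched here as an illustrative heuristic rather than the paper's occlusion-detector algorithm: a motion boundary moves with the occluding layer, so the segment whose flow matches the boundary's own motion is the nearer one.

```python
def nearer_segment(flow_a, flow_b, boundary_flow):
    """Return 'a' if segment A's flow is closer (L1 distance) to the
    boundary's own motion than segment B's flow, else 'b'. The boundary
    belongs to the occluder, which is therefore the nearer layer."""
    da = abs(flow_a[0] - boundary_flow[0]) + abs(flow_a[1] - boundary_flow[1])
    db = abs(flow_b[0] - boundary_flow[0]) + abs(flow_b[1] - boundary_flow[1])
    return 'a' if da < db else 'b'

# the boundary translates with segment A, so A occludes B
front = nearer_segment(flow_a=(2.0, 0.0), flow_b=(0.0, 0.0),
                       boundary_flow=(2.0, 0.0))
```

Note that this decision needs only two frames, consistent with the paper's (and its human subjects') two-frame capability.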
Affiliation(s)
- Doron Feldman, School of Computer Science and Engineering, Hebrew University of Jerusalem, Jerusalem, Israel
12. Pundlik SJ, Birchfield ST. Real-time motion segmentation of sparse feature points at any speed. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 2008; 38:731-742. [PMID: 18558538] [DOI: 10.1109/tsmcb.2008.919229]
Abstract
We present a real-time incremental approach to motion segmentation operating on sparse feature points. In contrast to previous work, the algorithm allows for a variable number of image frames to affect the segmentation process, thus enabling an arbitrary number of objects traveling at different relative speeds to be detected. Feature points are detected and tracked throughout an image sequence, and the features are grouped using a spatially constrained expectation-maximization (EM) algorithm that models the interactions between neighboring features using the Markov assumption. The primary parameter used by the algorithm is the amount of evidence that must accumulate before features are grouped. A statistical goodness-of-fit test monitors the change in the motion parameters of a group over time in order to automatically update the reference frame. Experimental results on a number of challenging image sequences demonstrate the effectiveness and computational efficiency of the technique.
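The grouping step can be caricatured without the EM machinery. Below is a greedy stand-in (my own simplification, not the paper's spatially constrained EM): it clusters sparse features whose displacement vectors agree within a tolerance, which is the kind of motion-consistency evidence the full algorithm accumulates over frames before committing to a group.

```python
def group_features(displacements, tol=0.5):
    """Assign each feature to the first group whose running-mean
    displacement lies within tol (Euclidean distance); otherwise open
    a new group. Returns one integer label per feature."""
    groups = []   # per group: (sum_dx, sum_dy, count)
    labels = []
    for dx, dy in displacements:
        for g, (sx, sy, n) in enumerate(groups):
            if ((dx - sx / n) ** 2 + (dy - sy / n) ** 2) ** 0.5 <= tol:
                groups[g] = (sx + dx, sy + dy, n + 1)
                labels.append(g)
                break
        else:
            groups.append((dx, dy, 1))
            labels.append(len(groups) - 1)
    return labels

# two objects moving differently -> two groups
labels = group_features([(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.1)])
```

Unlike this one-shot sketch, the paper's incremental formulation lets slowly separating objects split into groups only once enough evidence has accumulated, regardless of their relative speeds.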
Affiliation(s)
- Shrinivas J Pundlik, Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634-5124, USA
13. Liu S, Kang K, Tarel JP, Cooper DB. Free-form object reconstruction from silhouettes, occluding edges and texture edges: a unified and robust operator based on duality. IEEE Transactions on Pattern Analysis and Machine Intelligence 2008; 30:131-146. [PMID: 18000330] [DOI: 10.1109/tpami.2007.1143]
Abstract
In this paper, the duality in differential form is developed between a 3D primal surface and its dual manifold formed by the surface's tangent planes, i.e., each tangent plane of the primal surface is represented as a four-dimensional vector that constitutes a point on the dual manifold. The iterated dual theorem shows that each tangent plane of the dual manifold corresponds to a point on the original 3D surface, i.e., the dual of the dual goes back to the primal. This theorem can be used directly to reconstruct a 3D surface from image edges by estimating the dual manifold from those edges. In this paper we further develop the work in our original conference papers, resulting in a robust differential dual operator. We argue that the operator makes good use of the information available in the image data by using both points of intensity discontinuity and their edge directions. We provide a simple physical interpretation of what the abstract algorithm actually estimates and why it makes sense in terms of estimation accuracy. Our algorithm operates on all edges in the images, including silhouette edges, self-occlusion edges, and texture edges, without distinguishing their types, thus improving accuracy and handling locally concave surface estimation when texture edges are present. The algorithm automatically handles various degeneracies, and it incorporates new methodologies for implementing the required operations, such as appropriately relating edges in pairs of images and evaluating and using the algorithm's sensitivity to noise to determine the accuracy of an estimated 3D point. Experiments with both synthetic and real images demonstrate that the operator is accurate, robust to degeneracies and noise, and general for reconstructing free-form objects from occluding edges and texture edges detected in calibrated images or video sequences.
Affiliation(s)
- Shubao Liu, Division of Engineering, Brown University, Providence, RI 02912, USA