1
Pang Y, Cao J, Li X. Learning Sampling Distributions for Efficient Object Detection. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:117-129. [PMID: 26742154 DOI: 10.1109/tcyb.2015.2508603] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 06/05/2023]
Abstract
Object detection is an important task in computer vision and machine intelligence systems. Multistage particle windows (MPW), proposed by Gualdi et al., is an algorithm for fast and accurate object detection. By sampling particle windows (PWs) from a proposal distribution (PD), MPW avoids exhaustively scanning the image. Despite its success, it is unknown how to determine the number of stages and the number of PWs in each stage. Moreover, it has to generate too many PWs in the initialization step and it unnecessarily regenerates too many PWs around object-like regions. In this paper, we attempt to solve the problems of MPW. An important fact we use is that there is a large probability for a randomly generated PW not to contain the object, because the object is a sparse event relative to the huge number of candidate windows. Therefore, we design a PD so as to efficiently reject the huge number of nonobject windows. Specifically, we propose the concepts of rejection, acceptance, and ambiguity windows and regions. Then, the concepts are used to form and update a dented uniform distribution and a dented Gaussian distribution. This contrasts with MPW, which utilizes only the region of support. The PD of MPW is acceptance-oriented whereas the PD of our method (called iPW) is rejection-oriented. Experimental results on human and face detection demonstrate the efficiency and the effectiveness of the iPW algorithm. The source code is publicly accessible.
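The idea of a rejection-oriented proposal distribution can be illustrated with a small sketch. It is not the authors' iPW code: it assumes that rejection regions are axis-aligned rectangles over window centers and that a "dented uniform" distribution is simply uniform over the image with those rectangles removed, sampled here by plain rejection sampling. The names `in_any_region` and `sample_dented_uniform` are hypothetical.

```python
import random

def in_any_region(x, y, regions):
    """Return True if (x, y) lies inside any (x0, y0, x1, y1) rectangle."""
    return any(x0 <= x < x1 and y0 <= y < y1 for (x0, y0, x1, y1) in regions)

def sample_dented_uniform(width, height, rejection_regions, n,
                          rng=None, max_tries=100000):
    """Draw up to n window centers uniformly from the image area
    that is not covered by the rejection regions (assumed rectangles)."""
    if rng is None:
        rng = random.Random()
    samples = []
    tries = 0
    while len(samples) < n and tries < max_tries:
        tries += 1
        x = rng.uniform(0, width)
        y = rng.uniform(0, height)
        # A candidate that falls in a "dent" is discarded without
        # ever being passed to the (expensive) classifier.
        if not in_any_region(x, y, rejection_regions):
            samples.append((x, y))
    return samples

if __name__ == "__main__":
    # Hypothetical example: the left half of a 100x80 image was rejected.
    centers = sample_dented_uniform(100, 80, [(0, 0, 50, 80)], 5,
                                    rng=random.Random(0))
    print(centers)
```

As more windows are classified as nonobject, more rectangles would be added to `rejection_regions`, so later samples concentrate on the remaining ambiguous area.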
2
Biswas SK, Milanfar P. One Shot Detection with Laplacian Object and Fast Matrix Cosine Similarity. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2016; 38:546-562. [PMID: 27046497 DOI: 10.1109/tpami.2015.2453950] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/05/2023]
Abstract
One shot, generic object detection involves searching for a single query object in a larger target image. Relevant approaches have benefited from features that typically model the local similarity patterns. In this paper, we combine local similarity (encoded by local descriptors) with a global context (i.e., a graph structure) of pairwise affinities among the local descriptors, embedding the query descriptors into a low-dimensional but discriminatory subspace. Unlike principal components, which preserve the global structure of the feature space, we seek a linear approximation to the Laplacian eigenmap that permits a locality-preserving embedding of high-dimensional region descriptors. Our second contribution is an accelerated but exact computation of matrix cosine similarity as the decision rule for detection, obviating the computationally expensive sliding window search. We leverage the power of the Fourier transform combined with the integral image to achieve superior runtime efficiency, which allows us to test multiple hypotheses (for pose estimation) within a reasonably short time. Our approach to one shot detection is training-free, and experiments on the standard data sets confirm the efficacy of our model. In addition, the low computational cost of the proposed (codebook-free) object detector facilitates rather straightforward query detection in large data sets, including movie videos.
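A minimal sketch of how a Fourier transform and an integral image can together give an exact cosine-similarity map without an explicit sliding-window loop. It is an illustration only, not the authors' implementation: it assumes a single-channel 2-D feature map (the paper works with embedded region descriptors), and the function names `window_sums` and `matrix_cosine_map` are hypothetical. The numerator (Frobenius inner product of the query with every window) comes from an FFT cross-correlation; the per-window norm comes from an integral image of squared values.

```python
import numpy as np

def window_sums(a, h, w):
    """Sum of `a` over every h-by-w window, via an integral image."""
    ii = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    ii[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    return ii[h:, w:] - ii[:-h, w:] - ii[h:, :-w] + ii[:-h, :-w]

def matrix_cosine_map(target, query, eps=1e-12):
    """Cosine similarity <Q, T_window> / (||Q|| * ||T_window||)
    for every window position where the query fits entirely."""
    H, W = target.shape
    h, w = query.shape
    # Circular cross-correlation in the frequency domain; the top-left
    # (H-h+1) x (W-w+1) block is free of wrap-around effects.
    spec = np.fft.rfft2(target) * np.conj(np.fft.rfft2(query, s=target.shape))
    corr = np.fft.irfft2(spec, s=target.shape)[:H - h + 1, :W - w + 1]
    # Squared Frobenius norm of each window from the integral image.
    energy = np.maximum(window_sums(target ** 2, h, w), 0.0)
    return corr / (np.linalg.norm(query) * np.sqrt(energy) + eps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.standard_normal((20, 24))
    query = target[5:9, 7:12].copy()      # query cut from a known location
    sim = matrix_cosine_map(target, query)
    print(np.unravel_index(np.argmax(sim), sim.shape))
```

Because both terms are computed exactly, the result matches a brute-force loop over windows up to floating-point error.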
3
Jedynak B, Frazier PI, Sznitman R. Twenty Questions with Noise: Bayes Optimal Policies for Entropy Loss. J Appl Probab 2016. [DOI: 10.1239/jap/1331216837] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Indexed: 11/18/2022]
Abstract
We consider the problem of twenty questions with noisy answers, in which we seek to find a target by repeatedly choosing a set, asking an oracle whether the target lies in this set, and obtaining an answer corrupted by noise. Starting with a prior distribution on the target's location, we seek to minimize the expected entropy of the posterior distribution. We formulate this problem as a dynamic program and show that any policy optimizing the one-step expected reduction in entropy is also optimal over the full horizon. Two such Bayes optimal policies are presented: one generalizes the probabilistic bisection policy due to Horstein and the other asks a deterministic set of questions. We study the structural properties of the latter, and illustrate its use in a computer vision application.
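The probabilistic bisection idea mentioned in this abstract can be sketched on a discrete grid. This is an illustrative simplification, not the policy analyzed in the paper: it assumes a finite set of candidate locations, a symmetric noise model in which each answer is correct with a known probability `p_correct`, and questions of the form "is the target index at or below the posterior median?". All function names are hypothetical.

```python
import numpy as np

def bisection_step(posterior, answer_fn, p_correct):
    """One noisy-bisection update on a discrete posterior.

    answer_fn(m) returns the (possibly wrong) answer to
    "is the target index <= m?".
    """
    cdf = np.cumsum(posterior)
    # Question the posterior median: smallest index with cdf >= 0.5.
    m = min(int(np.searchsorted(cdf, 0.5)), posterior.size - 1)
    says_left = bool(answer_fn(m))
    is_left = np.arange(posterior.size) <= m
    # Bayes rule: locations consistent with the answer get weight
    # p_correct, the others get 1 - p_correct.
    likelihood = np.where(is_left == says_left, p_correct, 1.0 - p_correct)
    updated = posterior * likelihood
    return updated / updated.sum()

def entropy(posterior):
    """Shannon entropy in bits of a discrete distribution."""
    nz = posterior[posterior > 0]
    return float(-(nz * np.log2(nz)).sum())

def locate(n, target, p_correct, steps, rng):
    """Run `steps` noisy questions starting from a uniform prior."""
    posterior = np.full(n, 1.0 / n)

    def noisy_oracle(m):
        truth = target <= m
        return truth if rng.random() < p_correct else not truth

    for _ in range(steps):
        posterior = bisection_step(posterior, noisy_oracle, p_correct)
    return posterior

if __name__ == "__main__":
    post = locate(n=100, target=37, p_correct=0.8, steps=200,
                  rng=np.random.default_rng(0))
    print(int(np.argmax(post)), round(entropy(post), 3))
```

Each question is chosen greedily from the current posterior, which mirrors the abstract's statement that a one-step entropy-reducing policy is also optimal over the full horizon.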
4
Moreno JC, Surya Prasath VB, Santos G, Proença H. Robust Periocular Recognition by Fusing Sparse Representations of Color and Geometry Information. JOURNAL OF SIGNAL PROCESSING SYSTEMS 2015. [DOI: 10.1007/s11265-015-1023-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/07/2023]
5
Kovashka A, Parikh D, Grauman K. WhittleSearch: Interactive Image Search with Relative Attribute Feedback. Int J Comput Vis 2015. [DOI: 10.1007/s11263-015-0814-0] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Indexed: 11/25/2022]
6
7
Serradell E, Pinheiro MA, Sznitman R, Kybic J, Moreno-Noguer F, Fua P. Non-Rigid Graph Registration Using Active Testing Search. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2015; 37:625-638. [PMID: 26353266 DOI: 10.1109/tpami.2014.2343235] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 06/05/2023]
Abstract
We present a new approach for matching sets of branching curvilinear structures that form graphs embedded in R² or R³ and may be subject to deformations. Unlike earlier methods, ours does not rely on local appearance similarity, nor does it require a good initial alignment. Furthermore, it can cope with non-linear deformations, topological differences, and partial graphs. To handle arbitrary non-linear deformations, we use Gaussian process regressions to represent the geometrical mapping relating the two graphs. In the absence of appearance information, we iteratively establish correspondences between points, update the mapping accordingly, and use it to estimate where to find the most likely correspondences that will be used in the next step. To make the computation tractable for large graphs, the set of new potential matches considered at each iteration is not selected at random as with many RANSAC-based algorithms. Instead, we introduce a so-called Active Testing Search strategy that performs a priority search to favor the most likely matches and speed up the process. We demonstrate the effectiveness of our approach first on synthetic cases and then on angiography data, retinal fundus images, and microscopy image stacks acquired at very different resolutions.
8
Pang Y, Zhang K, Yuan Y, Wang K. Distributed object detection with linear SVMs. IEEE TRANSACTIONS ON CYBERNETICS 2014; 44:2122-2133. [PMID: 25330474 DOI: 10.1109/tcyb.2014.2301453] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 06/04/2023]
Abstract
In vision and learning, low computational complexity and high generalization are two important goals for video object detection. Low computational complexity here means not only fast speed but also less energy consumption. The sliding window object detection method with linear support vector machines (SVMs) is a general object detection framework. Its computational cost is mainly incurred in complex feature extraction and inner-product-based classification. This paper first develops a distributed object detection framework (DOD) by making the best use of spatial-temporal correlation, where the process of feature extraction and classification is distributed over the current frame and several previous frames. In each frame, only subfeature vectors are extracted and the response of a partial linear classifier (i.e., a subdecision value) is computed. To reduce the dimension of the traditional block-based histograms of oriented gradients (BHOG) feature vector, this paper proposes a cell-based HOG (CHOG) algorithm, where the features in one cell are not shared with overlapping blocks. Using CHOG as the feature descriptor, we develop CHOG-DOD as an instance of the DOD framework. Experimental results on detection of hand, face, and pedestrian in video show the superiority of the proposed method.
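The decomposition that makes this distribution possible is that a linear SVM score is a sum of inner products over disjoint pieces of the feature vector. The sketch below shows only that identity. It is not the DOD implementation: it assumes one fixed feature vector split into equal-sized parts, whereas the paper extracts each subfeature vector from a different video frame. The function names are hypothetical.

```python
import numpy as np

def subdecision_values(w, x, num_parts):
    """Split weights and features into matching parts and return
    the partial inner product (subdecision value) of each part."""
    w_parts = np.array_split(w, num_parts)
    x_parts = np.array_split(x, num_parts)
    return [float(wp @ xp) for wp, xp in zip(w_parts, x_parts)]

def distributed_decision(w, b, x, num_parts):
    """Full linear SVM decision value w.x + b, accumulated from
    subdecision values (one part per frame in the assumed setting)."""
    return sum(subdecision_values(w, x, num_parts)) + b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(36)   # hypothetical classifier weights
    x = rng.standard_normal(36)   # hypothetical feature vector
    print(distributed_decision(w, 0.5, x, num_parts=4), float(w @ x) + 0.5)
```

In a video setting, each call that produces one subdecision value would run on a different frame, so the per-frame cost is roughly the full cost divided by the number of parts.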
9
The Ignorant Led by the Blind: A Hybrid Human–Machine Vision System for Fine-Grained Categorization. Int J Comput Vis 2014. [DOI: 10.1007/s11263-014-0698-4] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 11/25/2022]
10
Benitez-Quiroz CF, Rivera S, Gotardo PF, Martinez AM. Salient and Non-Salient Fiducial Detection using a Probabilistic Graphical Model. PATTERN RECOGNITION 2014; 47:10.1016/j.patcog.2013.06.013. [PMID: 24187386 PMCID: PMC3810992 DOI: 10.1016/j.patcog.2013.06.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/02/2023]
Abstract
Deformable shape detection is an important problem in computer vision and pattern recognition. However, standard detectors are typically limited to locating only a few salient landmarks, such as landmarks near edges or areas of high contrast, often conveying insufficient shape information. This paper presents a novel statistical pattern recognition approach to locate a dense set of salient and non-salient landmarks in images of a deformable object. We explore the fact that several object classes exhibit a homogeneous structure such that each landmark position provides some information about the position of the other landmarks. In our model, the relationship between all pairs of landmarks is naturally encoded as a probabilistic graph. Dense landmark detections are then obtained with a new sampling algorithm that, given a set of candidate detections, selects the most likely positions so as to maximize the probability of the graph. Our experimental results demonstrate accurate, dense landmark detections within and across different databases.
Affiliation(s)
- C. Fabian Benitez-Quiroz
- Samuel Rivera
- Paulo F.U. Gotardo
- Aleix M. Martinez
11
Sznitman R, Richa R, Taylor RH, Jedynak B, Hager GD. Unified detection and tracking of instruments during retinal microsurgery. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2013; 35:1263-1273. [PMID: 23520263 DOI: 10.1109/tpami.2012.209] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 06/01/2023]
Abstract
Methods for tracking an object have generally fallen into two groups: tracking by detection and tracking through local optimization. The advantage of detection-based tracking is its ability to deal with target appearance and disappearance, but it does not naturally take advantage of target motion continuity during detection. The advantage of local optimization is efficiency and accuracy, but it requires additional algorithms to initialize tracking when the target is lost. To bridge these two approaches, we propose a framework for unified detection and tracking as a time-series Bayesian estimation problem. The basis of our approach is to treat both detection and tracking as a sequential entropy minimization problem, where the goal is to determine the parameters describing a target in each frame. To do this we integrate the Active Testing (AT) paradigm with Bayesian filtering, and this results in a framework capable of both detecting and tracking robustly in situations where the target object enters and leaves the field of view regularly. We demonstrate our approach on a retinal tool tracking problem and show through extensive experiments that our method provides an efficient and robust tracking solution.
Affiliation(s)
- Raphael Sznitman
- EPFL IC ISIM CVLAB, BC 309 (Batiment BC), Station 14, Lausanne, Switzerland.
12
Active Testing Search for Point Cloud Matching. LECTURE NOTES IN COMPUTER SCIENCE 2013; 23:572-83. [DOI: 10.1007/978-3-642-38868-2_48] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 02/11/2023]
13
14
Sznitman R, Basu A, Richa R, Handa J, Gehlbach P, Taylor RH, Jedynak B, Hager GD. Unified detection and tracking in retinal microsurgery. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION : MICCAI ... INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION 2011; 14:1-8. [PMID: 22003593 DOI: 10.1007/978-3-642-23623-5_1] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 12/14/2022]
Abstract
Traditionally, tool tracking involves two subtasks: (i) detecting the tool in the initial image in which it appears, and (ii) predicting and refining the configuration of the detected tool in subsequent images. With retinal microsurgery in mind, we propose a unified tool detection and tracking framework, removing the need for two separate systems. The basis of our approach is to treat both detection and tracking as a sequential entropy minimization problem, where the goal is to determine the parameters describing a surgical tool in each frame. The resulting framework is capable of both detecting and tracking in situations where the tool enters and leaves the field of view regularly. We demonstrate the benefits of this method in the context of retinal tool tracking. Through extensive experimentation on a phantom eye, we show that this method provides efficient and robust tool tracking and detection.
15
Sznitman R, Gupta M, Hager GD, Arratia PE, Sznitman J. Multi-environment model estimation for motility analysis of Caenorhabditis elegans. PLoS One 2010; 5:e11631. [PMID: 20661478 PMCID: PMC2908547 DOI: 10.1371/journal.pone.0011631] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 05/18/2010] [Accepted: 06/23/2010] [Indexed: 11/30/2022] Open
Abstract
The nematode Caenorhabditis elegans is a well-known model organism used to investigate fundamental questions in biology. Motility assays of this small roundworm are designed to study the relationships between genes and behavior. Commonly, motility analysis is used to classify nematode movements and characterize them quantitatively. Over the past years, C. elegans' motility has been studied across a wide range of environments, including crawling on substrates, swimming in fluids, and locomoting through microfluidic substrates. However, each environment often requires customized image processing tools relying on heuristic parameter tuning. In the present study, we propose a novel Multi-Environment Model Estimation (MEME) framework for automated image segmentation that is versatile across various environments. The MEME platform is constructed around the concept of Mixture of Gaussian (MOG) models, where statistical models for both the background environment and the nematode appearance are explicitly learned and used to accurately segment a target nematode. Our method is designed to simplify the burden often imposed on users; here, only a single image which includes a nematode in its environment must be provided for model learning. In addition, our platform enables the extraction of nematode ‘skeletons’ for straightforward motility quantification. We test our algorithm on various locomotive environments and compare performances with an intensity-based thresholding method. Overall, MEME outperforms the threshold-based approach for the overwhelming majority of cases examined. Ultimately, MEME provides researchers with an attractive platform for C. elegans' segmentation and ‘skeletonizing’ across a wide range of motility assays.
Affiliation(s)
- Raphael Sznitman
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Manaswi Gupta
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Gregory D. Hager
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Paulo E. Arratia
- Department of Mechanical Engineering and Applied Mechanics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Josué Sznitman
- Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, New Jersey, United States of America