Learn More
This paper addresses unsupervised discovery and local-ization of dominant objects from a noisy image collection with multiple object classes. The setting of this problem is fully unsupervised, without even image-level annotations or any assumption of a single dominant class. This is far more general than typical colocalization, cosegmentation, or(More)
We propose a novel background subtraction algorithm for the videos captured by a moving camera. In our technique , foreground and background appearance models in each frame are constructed and propagated sequentially by Bayesian filtering. We estimate the posterior of appearance , which is computed by the product of the image likelihood in the current frame(More)
We propose a novel algorithm to detect occlusion for visual tracking through learning with observation likelihoods. In our technique, target is divided into regular grid cells and the state of occlusion is determined for each cell using a classifier. Each cell in the target is associated with many small patches, and the patch likelihoods observed during(More)
We propose an online visual tracking algorithm by learning discriminative saliency map using Convolutional Neural Network (CNN). Given a CNN pre-trained on a large-scale image repository in offline, our algorithm takes outputs from hidden layers of the network as feature descriptors since they show excellent representation performance in various general(More)
We propose a novel offline tracking algorithm based on model-averaged posterior estimation through patch matching across frames. Contrary to existing online and offline tracking methods, our algorithm is not based on temporally-ordered estimates of target state but attempts to select easy-to-track frames first out of the remaining ones without exploiting(More)
We present an online data association algorithm for multi-object tracking using structured prediction. This problem is formulated as a bipartite matching and solved by a generalized classification, specifically , Structural Support Vector Machines (S-SVM). Our structural clas-sifier is trained based on matching results given the similarities between all(More)
We present a novel approach to representing and recognizing composite video events. A composite event is specified by a scenario, which is based on primitive events and their temporal-logical relations, to constrain the arrangements of the primitive events in the composite event. We propose a new scenario description method to represent composite events(More)
This paper addresses the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision. We formulate the problem as a combination of two complementary processes: discovery and tracking. The first one establishes correspondences between prominent regions across videos, and(More)
We present a joint estimation technique of event local-ization and role assignment when the target video event is described by a scenario. Specifically, to detect multi-agent events from video, our algorithm identifies agents involved in an event and assigns roles to the participating agents. Instead of iterating through all possible agent-role combinations(More)
We present an online video segmentation algorithm based on a novel nonparametric Bayesian clustering method called Bayesian Split-Merge Clustering (BSMC). BSMC can efficiently cluster dynamically changing data through split and merge processes at each time step, where the decision for splitting and merging is made by approximate posterior distributions over(More)