In this paper we propose an approach to anomaly detection and localization in video surveillance applications, based on spatio-temporal features that capture scene dynamic statistics together with appearance. Real-time anomaly detection is performed with an unsupervised approach using non-parametric modeling, directly evaluating multi-scale local...
Retrieval by content of 3-D models is becoming more and more important due to the advancements in 3-D hardware and software technologies for acquisition, authoring and display of 3-D objects, their ever-increasing availability at affordable costs, and the establishment of open standards for 3-D data interchange. In this paper, we present a new method,...
While understanding the semantic meaning of video content is immediate for humans, it's far from immediate for a computer. This discrepancy is commonly referred to as the semantic gap. A recent trend in the effort to bridge this gap is to define a large set of semantic concept detectors, each of which automatically detects the presence of a semantic...
Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems (i.e., image tag assignment, refinement, and tag-based image retrieval) is presented. While existing works vary in terms of...
Automatic semantic annotation of video streams makes it possible both to extract significant clips for production logging and to index video streams for posterity logging. Automatic annotation for production logging is particularly demanding, as it is applied to non-edited video streams and must rely only on visual information. Moreover, annotation must be computed in...
Video databases require that clips be represented in a compact and discriminative way, so that documents of interest can be matched and retrieved efficiently. We present a method to obtain a video representation suitable for this task, and show how to use this representation in a matching scheme. In contrast with existing works, the proposed...
In semantic video adaptation, measures of performance must consider the impact of errors in the automatic annotation on the adaptation, in relation to the preferences and expectations of the user. In this paper, we define two new performance measures, Viewing Quality Loss and Bit-rate Cost Increase, which are obtained from classical peak...
In this paper we propose a new method for human action categorization that represents spatio-temporal interest points by effectively combining a new 3D gradient descriptor with an optic flow descriptor. These points are used to represent video sequences using a bag of spatio-temporal visual words, following the successful results achieved in...
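The bag-of-visual-words representation mentioned in this abstract can be illustrated with a minimal, generic sketch. It is not the authors' implementation: the function names `nearest_word` and `bag_of_words`, the toy 2-D descriptors, and the precomputed `codebook` are all hypothetical, and Euclidean nearest-neighbour assignment with L1 normalisation is an assumed (though common) choice.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bag_of_words(descriptors, codebook):
    """Build an L1-normalised histogram of visual-word assignments.

    `descriptors` stands in for the descriptors computed at the interest
    points of one video; `codebook` is assumed to have been learned
    beforehand (for example by clustering training descriptors).
    """
    total = len(descriptors)
    if total == 0:
        # No interest points detected: return an all-zero histogram.
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts.get(i, 0) / total for i in range(len(codebook))]


# Toy example with hypothetical 2-D descriptors and a 3-word codebook.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = bag_of_words(descriptors, codebook)
print(histogram)
```

Each video is thus reduced to a fixed-length vector regardless of how many interest points it contains, which is what makes it usable as input to a standard classifier.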
Semantic detection and recognition of objects and events contained in a video stream has to be performed in order to provide content-based annotation and retrieval of videos. This annotation makes it possible to reuse the video material at a later stage, e.g. to produce new TV programmes. A typical example is that of sports videos, where videos...