Marco Bertini

Automatic semantic annotation of video streams makes it possible both to extract significant clips for production logging and to index video streams for posterity logging. Automatic annotation for production logging is particularly demanding, as it is applied to non-edited video streams and must rely only on visual information. Moreover, annotation must be computed in …
In this paper we propose an approach for anomaly detection and localization in video surveillance applications, based on spatio-temporal features that capture scene dynamic statistics together with appearance. Real-time anomaly detection is performed with an unsupervised approach using nonparametric modeling, directly evaluating multi-scale local …
Retrieval by content of 3-D models is becoming more and more important due to the advancements in 3-D hardware and software technologies for acquisition, authoring and display of 3-D objects, their ever-increasing availability at affordable costs, and the establishment of open standards for 3-D data interchange. In this paper, we present a new method, …
Effective retrieval and browsing by content of videos is based on the association of high-level information with visual data. Automatic extraction of high-level content descriptors requires exploiting the technical characteristics of video types. This paper presents a complete system for content-based retrieval and browsing of news reports; the annotation …
In soccer videos, the most significant actions are usually followed by close-up shots of the players who take part in the action itself. Automatically annotating the identity of the players present in these shots would be considerably valuable for indexing and retrieval applications. Due to high variations in pose and illumination across shots, however, current …
In this paper we describe a system for detection and retrieval of trademarks appearing in sports videos. We propose a compact representation of trademarks and video frame content based on SIFT feature points. This representation can be used to robustly detect, localize, and retrieve trademarks as they appear in a variety of different sports video types.
Video databases require that clips are represented in a compact and discriminative way, in order to perform efficient matching and retrieval of documents of interest. We present a method to obtain a video representation suitable for this task, and show how to use this representation in a matching scheme. In contrast with existing works, the proposed …
Event recognition is a crucial task for providing a high-level semantic description of video content. The bag-of-words (BoW) approach has proven to be successful for the categorization of objects and scenes in images, but it is unable to model temporal information between consecutive frames. In this paper we present a method to introduce temporal information …