This paper focuses on the integration of multimodal features for sport video structure analysis. The method relies on a statistical model which takes into account both the shot content and the interleaving of shots. This stochastic modelling is performed in the global framework of Hidden Markov Models (HMMs) that can be efficiently applied to merge audio …
This paper focuses on the use of Hidden Markov Models (HMMs) for structure analysis of videos, and demonstrates how they can be efficiently applied to merge audio and visual cues. Our approach is validated in the particular domain of tennis videos. The basic temporal unit is the video shot. Visual features are used to characterize the type of shot view. …
We introduce a new image coder which uses the Iteration Tuned and Aligned Dictionary (ITAD) as a transform to code image blocks taken over a regular grid. We establish experimentally that the ITAD structure results in lower-complexity representations that enjoy greater sparsity when compared to other recent dictionary structures. We show that this superior …
This work aims at recovering the temporal structure of a broadcast tennis video from an analysis of the raw footage. Our method relies on a statistical model of the interleaving of shots, in order to group shots into predefined classes representing structural elements of a tennis video. This stochastic modeling is performed in the global framework of Hidden …
Content-Based Image Retrieval Systems used in forensics-related contexts require very good image recognition capabilities. Therefore, they often use the SIFT local-feature description scheme, as its robustness against a large spectrum of image distortions has been assessed. In contrast, the security of SIFT is still largely unexplored. We show in this …
Many content-based image retrieval systems (CBIRS) describe images using the SIFT local features because of their very robust recognition capabilities. While SIFT features have proved to cope with a wide spectrum of general-purpose image distortions, their security has not yet been fully assessed. In one of their scenarios, Hsu et al. [2] show that very …
Given the huge amount of multimedia information available today, it has become necessary to develop efficient methods for accessing, searching, structuring, and representing it. Multimedia retrieval systems, especially in the case of video, should support users in all of these tasks. Therefore, specialized systems that focus on each of these aspects …
We present a new, block-based image codec based on sparse representations using a learned, structured dictionary called the Iteration-Tuned and Aligned Dictionary (ITAD). The question of selecting the number of atoms used in the representation of each image block is addressed with a new, global (image-wide), rate-distortion-based sparsity selection …