We address the problem of person detection and tracking in crowded video scenes. While the detection of individual objects has improved significantly in recent years, crowd scenes remain particularly challenging for detection and tracking due to heavy occlusions, high person densities and significant variation in people's appearance. To…
How should a video be represented? We propose a new representation for videos based on mid-level discriminative spatio-temporal patches. These spatio-temporal patches might correspond to a primitive human action, a semantic object, or perhaps a random but informative spatio-temporal patch in the video. What defines these spatio-temporal patches is their…
This paper presents a target tracking framework for unstructured crowded scenes. Unstructured crowded scenes are defined as scenes where the motion of the crowd appears random, with different participants moving in different directions over time. This means each spatial location in such scenes supports more than one crowd behavior, that is, the behavior at that location is multi-modal.…
In this work we present a new crowd analysis algorithm powered by behavior priors that are learned from a large database of crowd videos gathered from the Internet. The algorithm first learns a set of crowd behavior priors off-line. During testing, crowd patches are matched to the database and the behavior priors are transferred. We adhere to the…
In this paper we present a method for 3D human body pose reconstruction from images and video, in the context of sports legacy recovery. Legacy video and image content can include camera motion, several players, considerable partial occlusions, motion blur and image noise, and is recorded with non-calibrated cameras, which further increases the difficulty of…
Action recognition is one of the most difficult problems in computer vision, since it combines several uncertain attributes, such as the subtle variability of individual human behavior and the challenges that come with viewpoint variations, scale changes and different temporal extents. Nevertheless, action…
We present a demonstration of a multi-modal 3D capturing platform coupled to a motion comparison system. This work focuses on the preservation of Traditional Sports and Games, namely the Gaelic sports from Ireland and the Basque sports from France and Spain. Users can learn, compare and compete in the performance of sporting gestures and compare themselves…
What is a good vector representation of an object? We believe that it should be generative in 3D, in the sense that it can produce new 3D objects; as well as be predictable from 2D, in the sense that it can be perceived from 2D images. We propose a novel architecture, called the TL-embedding network, to learn an embedding space with these properties. The…
Sports are a key part of cultural identity, and it is necessary to preserve them as important intangible Cultural Heritage, especially the human motion techniques specific to individual sports. In this paper we present a method for extracting 3D athlete motion from video broadcast sources, providing an important tool for preserving the heritage represented…