Michele Fenzi

We present a novel method for multiple people tracking that leverages a generalized model for capturing interactions among individuals. At the core of our model lies a learned dictionary of interaction feature strings which capture relationships between the motions of targets. These feature strings, created from low-level image features, lead to a much…
In this paper, we propose a method for learning a class representation that can return a continuous value for the pose of an unknown class instance using only 2D data and weak 3D labeling information. Our method is based on generative feature models, i.e., regression functions learned from local descriptors of the same patch collected under different…
Pose estimation for object classes is central to many computer vision tasks. Many approaches have been proposed to estimate the pose of an unknown object from a given category, and those based on local features have proven very effective. While some use 3D information obtained through CAD models [4] or 3D reconstructions [2], others have shown that…
We present a feature-based framework that combines spatial feature clustering, guided sampling for pose generation, and model updating for 3D object recognition and pose estimation. Existing methods fail in the presence of repeated patterns or multiple instances of the same object, as they rely only on feature discriminability for matching and on the estimator…
In this paper, we treat continuous pose estimation for object categories as a regression problem that uses only 2D training information. While regression is a natural framework for continuous problems, regression methods have so far achieved results inferior to those of 3D-based and 2D-based classification-and-refinement approaches. This…
In this paper, we propose a method to consistently recover the pose of an object from a known class in a video sequence. As individual poses estimated from monocular images are rather noisy, we optimally aggregate pose evidence over all video frames. We construct a graph where nodes are values sampled from the pose posterior distributions computed by a…
We present a feature-based surveillance pipeline which, in contrast to traditional image-based methods, makes it possible to learn a detailed description of the observed background as well as of foreground objects. The pipeline consists of motion segmentation of feature trajectories and subsequent tracking-by-recognition with updates. Furthermore, 3D object…
Many complex maneuvers involving aircraft, vehicles, and persons are carried out on airport aprons. Manual video surveillance used for safety and security purposes is inefficient, and privacy protection must be guaranteed. In this paper, we propose a system named ASEV that automatically assesses situations for airport surveillance. It combines four main…