Alexandre Alahi

A large number of vision applications rely on matching keypoints across images. The last decade featured an arms race towards faster and more robust keypoints and association algorithms: Scale Invariant Feature Transform (SIFT) [17], Speeded-Up Robust Features (SURF) [4], and more recently Binary Robust Invariant Scalable Keypoints (BRISK) [16], to name a few.
We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a per-pixel loss between the output and ground-truth images. Parallel work has shown that high-quality images can be generated by defining and optimizing ...
Pedestrians follow different trajectories to avoid obstacles and accommodate fellow pedestrians. Any autonomous vehicle navigating such a scene should be able to foresee the future positions of pedestrians and accordingly adjust its path to avoid collisions. This problem of trajectory prediction can be viewed as a sequence generation task, where we are ...
Online Multi-Object Tracking (MOT) has wide applications in time-critical video analysis scenarios, such as robot navigation and autonomous driving. In tracking-by-detection, a major challenge of online MOT is how to robustly associate noisy object detections on a new video frame with previously tracked objects. In this work, we formulate the online MOT ...
This paper addresses the problem of localizing people in low- and high-density crowds with a network of heterogeneous cameras. The problem is recast as a linear inverse problem. It relies on deducing the discretized occupancy vector of people on the ground from the noisy binary silhouettes observed as foreground pixels in each camera. This inverse problem ...
Humans navigate crowded spaces such as a university campus by following common-sense rules based on social etiquette. In this paper, we argue that in order to enable the design of new target tracking or trajectory forecasting methods that can take full advantage of these rules, we need to have access to better data in the first place. To that end, we ...
We propose to evaluate our sparsity-driven people localization framework on crowded, complex scenes. The problem is recast as a linear inverse problem. It relies on deducing an occupancy vector, i.e., the discretized occupancy of people on the ground, from the noisy binary silhouettes observed as foreground pixels in each camera. This inverse problem is ...
Most multi-camera systems assume a well-structured environment to detect and track objects across cameras. Cameras need to be fixed and calibrated, or only objects present in the training data can be detected (e.g., pedestrians only). In this work, a master-slave system is presented to detect and track any object in a network of uncalibrated fixed and mobile ...
A generic approach is presented to detect and track people with a network of fixed and omnidirectional cameras given severely degraded foreground silhouettes. The problem is formulated as a sparsity-constrained inverse problem. A dictionary made of atoms representing the silhouettes of a person at a given location is used within the problem formulation. A ...
Local Binary Descriptors are becoming increasingly popular for image matching tasks, especially on mobile platforms. While they are extensively studied in this context, their ability to carry enough information to infer the original image is seldom addressed. In this work, we leverage an inverse problem approach to show that it is possible to ...