Nicolas Thome

In this paper, we address the challenging problem of categorizing video sequences composed of dynamic natural scenes. In contrast to previous methods that rely on handcrafted descriptors, we propose to represent videos using unsupervised learning of motion features. Our method encompasses three main contributions: 1) Based on the Slow Feature Analysis …
Automatic detection of a falling person in video sequences has interesting applications in video surveillance and is an important part of future pervasive home monitoring systems. In this paper, we propose a multiview approach to achieve this goal, where motion is modeled using a layered hidden Markov model (LHMM). The posture classification is performed …
Recently, the coding of local features (e.g. SIFT) for image categorization tasks has been extensively studied. Incorporated within the Bag of Words (BoW) framework, these techniques optimize the projection of local features into the visual codebook, leading to state-of-the-art performance on many benchmark datasets. In this work, we propose a novel visual …
In this work, we introduce SnooperTrack, an algorithm for the automatic detection and tracking of text objects (such as store names, traffic signs, license plates, and advertisements) in videos of outdoor scenes. The purpose is to improve the performance of text detection in still images by taking advantage of the temporal coherence …
In this work, we propose BossaNova, a novel representation for content-based concept detection in images and videos, which enriches the Bag-of-Words model. Relying on the quantization of highly discriminant local descriptors by a codebook, and the aggregation of those quantized descriptors into a single pooled feature vector, the Bag-of-Words model has …
License Plate Recognition (LPR) is largely regarded as a solved problem. However, robust solutions able to face real-world scenarios still need to be proposed. Most systems are designed for a specific country, which lets them (artificially) reach high recognition rates. This option, however, strictly limits their applicability. In this paper, we propose an …
In image classification, the most powerful statistical learning approaches are based on the Bag-of-Words paradigm. In this article, we propose an extension of this formalism. Considering the Bag-of-Features, dictionary coding and pooling steps, we propose to focus on the pooling step. Instead of using the classical sum or max pooling strategies, we …
We discuss the use of histogram of oriented gradients (HOG) descriptors as an effective tool for text description and recognition. Specifically, we propose a HOG-based texture descriptor (THOG) that uses a partition of the image into overlapping horizontal cells with gradual boundaries, to characterize single-line texts in outdoor scenes. The input of our …
This paper proposes a modification to the restricted Boltzmann machine (RBM) learning algorithm to incorporate inductive biases. These latent activation biases are ideal solutions of the latent activity and may be designed either by modeling neural phenomena or inductive principles of the task. In this paper, we design activation biases for sparseness and …
Automatic detection of a falling person in video sequences is an important part of future pervasive home monitoring systems. We propose here a robust method to achieve this goal. Motion is modeled by a hierarchical hidden Markov model (HHMM) whose first layer states are related to the orientation of the tracked person. Finding a consistent way for robustly …