Clément Farabet

Learn More
Scene labeling consists of labeling each pixel in an image with the category of the object it belongs to. We propose a method that uses a multiscale convolutional network trained from raw pixels to extract dense feature vectors that encode regions of multiple sizes centered on each pixel. The method alleviates the need for engineered features, and produces(More)
Torch7 is a versatile numeric computing framework and machine learning library that extends Lua. Its goal is to provide a flexible environment to design and train learning machines. Flexibility is obtained via Lua, an extremely lightweight scripting language. High performance is obtained via efficient OpenMP/SSE and CUDA implementations of low-level numeric(More)
This work addresses multi-class segmentation of indoor scenes with RGB-D inputs. While this area of research has gained much attention recently, most works still rely on hand-crafted features. In contrast, we apply a multiscale convolutional network to learn features directly from the images and the depth information. We obtain state-of-the-art on the(More)
Intelligent tasks, such as visual perception, auditory perception, and language understanding require the construction of good internal representations of the world (or ”features”), which must be invariant to irrelevant variations of the input while, preserving relevant information. A major question for Machine Learning is how to learn such good features(More)
Convolutional Networks (ConvNets) are biologicallyinspired hierarchical architectures that can be trained to perform a variety of detection, recognition and segmentation tasks. ConvNets have a feed-forward architecture consisting of multiple linear convolution filters interspersed with pointwise non-linear squashing functions. This paper presents an(More)
Scene parsing, or semantic segmentation, consists in labeling each pixel in an image with the category of the object it belongs to. It is a challenging task that involves the simultaneous detection, segmentation and recognition of all the objects in the image. The scene parsing method proposed here starts by computing a tree of segments from a graph of(More)
In this paper we present a scalable dataflow hardware architecture optimized for the computation of generalpurpose vision algorithms—neuFlow—and a dataflow compiler—luaFlow—that transforms high-level flow-graph representations of these algorithms into machine code for neuFlow. This system was designed with the goal of providing real-time detection,(More)
This paper presents a bio-inspired vision systemon-a-chip neuFlow SoC implemented in the IBM 45 nm SOI process. The neuFlow SoC was designed to accelerate neural networks and other complex vision algorithms based on large numbers of convolutions and matrix-to-matrix operations. Postlayout characterization shows that the system delivers up to 320 GOPS with(More)
In this paper we present a scalable hardware architecture to implement large-scale convolutional neural networks and state-of-the-art multi-layered artificial vision systems. This system is fully digital and is a modular vision engine with the goal of performing real-time detection, recognition and segmentation of mega-pixel images. We present a performance(More)