Learn More
Many successful models for scene or object recognition transform low-level descriptors (such as Gabor filter responses or SIFT descriptors) into richer representations of intermediate complexity. This process can often be broken down into two steps: (1) a coding step, which performs a pointwise transformation of the descriptors into a representation …
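A minimal sketch of the two-step coding/pooling pipeline described above, assuming hard nearest-codeword assignment for the coding step and average pooling for the second step; the descriptors, dictionary, and sizes below are illustrative placeholders, not the exact method of the paper.

```python
import numpy as np

def hard_assignment_coding(descriptors, dictionary):
    """Code each descriptor as a one-hot indicator of its nearest codeword."""
    # descriptors: (n, d), dictionary: (k, d)
    dists = np.linalg.norm(descriptors[:, None, :] - dictionary[None, :, :], axis=2)
    codes = np.zeros((descriptors.shape[0], dictionary.shape[0]))
    codes[np.arange(descriptors.shape[0]), dists.argmin(axis=1)] = 1.0
    return codes

def average_pooling(codes):
    """Pool the pointwise codes over the whole image into one bag-of-features vector."""
    return codes.mean(axis=0)

# Toy usage: 100 random 128-D "SIFT-like" descriptors, a 32-word dictionary.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(100, 128))
dictionary = rng.normal(size=(32, 128))
image_feature = average_pooling(hard_assignment_coding(descriptors, dictionary))
```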
We present an unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The resulting feature extractor consists of multiple convolution filters, followed by a point-wise sigmoid non-linearity, and a feature-pooling layer that computes the max of each filter output within adjacent windows. …
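A rough sketch of one stage of the feature extractor described above: convolution filters, a point-wise sigmoid, and max pooling over adjacent windows. The filter values, filter count, and window size here are illustrative assumptions; in the paper the filters are learned, whereas here they are random.

```python
import numpy as np
from scipy.signal import correlate2d

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feature_stage(image, filters, pool=2):
    """Apply each filter, squash with a sigmoid, then max-pool in pool x pool windows."""
    maps = []
    for f in filters:
        response = sigmoid(correlate2d(image, f, mode="valid"))
        h, w = response.shape
        h, w = h - h % pool, w - w % pool                 # crop to a multiple of the pool size
        pooled = response[:h, :w].reshape(h // pool, pool, w // pool, pool).max(axis=(1, 3))
        maps.append(pooled)
    return np.stack(maps)

rng = np.random.default_rng(0)
image = rng.normal(size=(32, 32))
filters = rng.normal(size=(8, 5, 5))                      # 8 random 5x5 filters (learned in practice)
features = feature_stage(image, filters)                  # shape (8, 14, 14)
```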
Unsupervised learning algorithms aim to discover the structure hidden in the data and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties …
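A minimal sketch of the reconstruction-plus-constraint idea: a one-layer tied-weight autoencoder trained to reconstruct its input while an L1 penalty pushes the code toward sparsity. The sizes, penalty weight, learning rate, and choice of ReLU encoder are assumptions for illustration, not the specific model of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 64))             # toy data: 256 samples, 64 dimensions
W = rng.normal(scale=0.1, size=(64, 16))   # tied encoder/decoder weights
lam, lr = 0.1, 0.01

for _ in range(200):
    Z = np.maximum(X @ W, 0.0)             # code (ReLU encoder)
    X_hat = Z @ W.T                        # linear decoder
    err = X_hat - X
    # (Sub)gradient of 0.5*||X_hat - X||^2 + lam*||Z||_1 with respect to W (tied weights)
    dZ = err @ W + lam * np.sign(Z)
    dZ *= (Z > 0)                          # ReLU mask for the encoder path
    grad = X.T @ dZ + err.T @ Z            # encoder path + decoder path
    W -= lr * grad / X.shape[0]
```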
Many modern visual recognition algorithms incorporate a step of spatial 'pooling', where the outputs of several nearby feature detectors are combined into a local or global 'bag of features', in a way that preserves task-related information while removing irrelevant details. Pooling is used to achieve invariance to image transformations, more compact …
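A small sketch of spatial pooling as described above: the responses of several feature detectors inside a local region are combined either by averaging or by taking the max. The feature maps here are random placeholders, and the region size is an arbitrary choice.

```python
import numpy as np

def pool_region(feature_maps, kind="max"):
    """Combine per-location responses (k, h, w) into one k-dimensional vector per region."""
    if kind == "max":
        return feature_maps.max(axis=(1, 2))   # keep the strongest activation per detector
    return feature_maps.mean(axis=(1, 2))      # average pooling, closer to a histogram count

rng = np.random.default_rng(0)
responses = rng.random(size=(16, 8, 8))        # 16 feature detectors over an 8x8 region
max_pooled = pool_region(responses, "max")
avg_pooled = pool_region(responses, "mean")
```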
We propose an unsupervised method for learning multi-stage hierarchies of sparse convolutional features. While sparse coding has become an increasingly popular method for learning visual features, it is most often trained at the patch level. Applying the resulting filters convolutionally results in highly redundant codes because overlapping patches are …
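A toy sketch of a convolutional sparse-coding objective in the spirit of the snippet above: the image is approximated by a sum of filters convolved with sparse feature maps, and the maps are inferred with a few ISTA (iterative shrinkage-thresholding) steps. The filters, step size, sparsity weight, and iteration count are illustrative assumptions; the paper's own training procedure may differ.

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

def infer_sparse_maps(image, filters, lam=0.1, step=0.01, iters=50):
    k, fh, fw = filters.shape
    maps = np.zeros((k, image.shape[0] - fh + 1, image.shape[1] - fw + 1))
    for _ in range(iters):
        # Current reconstruction: sum_k filter_k * map_k, convolved back up to image size.
        recon = sum(convolve2d(maps[i], filters[i], mode="full") for i in range(k))
        resid = recon - image
        for i in range(k):
            # Gradient of 0.5*||recon - image||^2 w.r.t. map_i: correlate the residual with filter_i.
            grad = correlate2d(resid, filters[i], mode="valid")
            update = maps[i] - step * grad
            # Soft-thresholding enforces sparsity of the feature maps.
            maps[i] = np.sign(update) * np.maximum(np.abs(update) - step * lam, 0.0)
    return maps

rng = np.random.default_rng(0)
image = rng.normal(size=(16, 16))
filters = rng.normal(size=(4, 5, 5))
codes = infer_sparse_maps(image, filters)
```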
This supplemental material contains numerical results for the experiments plotted in the main paper, together with results on the Caltech-256 and Scenes datasets that were omitted due to space constraints. Accuracy is reported as a function of whether clustering is performed before (Pre) or after (Post) the encoding, of the dictionary size K, and of the number of configuration-space bins P.
Affective valence lies on a spectrum ranging from punishment to reward. The coding of such spectra in the brain almost always involves opponency between pairs of systems or structures. There is ample evidence for the role of dopamine in the appetitive half of this spectrum, but little agreement about the existence, nature, or role of putative aversive …
We introduce a view of unsupervised learning that integrates probabilistic and non-probabilistic methods for clustering, dimensionality reduction, and feature extraction in a unified framework. In this framework, an energy function associates low energies to input points that are similar to training samples, and high energies to unobserved points. …
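A minimal sketch of the energy-based view described above, under a simplifying assumption: the energy of a point is its reconstruction error with respect to a PCA subspace fit to the training set, so points resembling the training data get low energy and points far from it get high energy. The data, subspace dimension, and PCA choice are stand-ins, not the framework's actual models.

```python
import numpy as np

rng = np.random.default_rng(0)
proj = rng.normal(size=(10, 50))
train = rng.normal(size=(500, 10)) @ proj        # training data lying near a 10-D subspace of R^50

mean = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
basis = Vt[:10]                                  # top-10 principal directions

def energy(x):
    """Low energy = well reconstructed by the learned subspace (similar to training points)."""
    centered = x - mean
    recon = centered @ basis.T @ basis
    return np.sum((centered - recon) ** 2, axis=-1)

on_manifold = rng.normal(size=(1, 10)) @ proj    # looks like the training data
off_manifold = rng.normal(size=(1, 50)) * 5.0    # does not
print(energy(on_manifold), energy(off_manifold)) # the first energy should be much lower
```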
Many different situations related to self-control involve competition between two routes to decisions: one default and frugal, the other more resource-intensive. Examples include habits versus deliberative decisions, fatigue versus cognitive effort, and Pavlovian versus instrumental decision making. We propose that these situations are linked by a strikingly …