Acquiring Visual Classifiers from Human Imagination

@article{Vondrick2014AcquiringVC,
  title={Acquiring Visual Classifiers from Human Imagination},
  author={Carl Vondrick and Hamed Pirsiavash and Aude Oliva and Antonio Torralba},
  journal={ArXiv},
  year={2014},
  volume={abs/1410.4627}
}
Abstract: The human mind can remarkably imagine objects that it has never seen, touched, or heard, all in vivid detail. Motivated by the desire to harness this rich source of information from the human mind, this paper investigates how to extract classifiers from the human visual system and leverage them in a machine. We introduce a method that, inspired by well-known tools in human psychophysics, estimates the classifier that the human visual system might use for recognition but in computer…
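The psychophysics tool the abstract alludes to is closely related to the classification-image (reverse-correlation) method reviewed below: show observers noise images, then subtract the average noise on "no" trials from the average on "yes" trials to estimate the internal template. A minimal sketch with a simulated linear observer (the trial count, image size, and observer model here are illustrative assumptions, not the paper's actual setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each "trial" shows a noise image and the observer
# answers yes/no ("does this look like the target category?").
n_trials, dim = 5000, 64  # e.g. 8x8 noise images, flattened

# A hidden linear template stands in for the observer's internal classifier.
true_template = rng.normal(size=dim)

noise = rng.normal(size=(n_trials, dim))
responses = noise @ true_template > 0  # simulated yes/no decisions

# Classification image: mean of "yes" noise minus mean of "no" noise.
# Under a linear-observer model this recovers the template up to scale.
classification_image = (noise[responses].mean(axis=0)
                        - noise[~responses].mean(axis=0))

# The estimate should correlate strongly with the hidden template.
corr = np.corrcoef(classification_image, true_template)[0, 1]
```

With enough trials, `corr` approaches 1; the paper's contribution is to run this kind of estimation in a feature space a machine classifier can use directly, rather than in raw pixel space.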

Using human brain activity to guide machine learning

TLDR
This work demonstrates a new paradigm of “neurally-weighted” machine learning, which takes fMRI measurements of human brain activity from subjects viewing images, and infuses these data into the training process of an object recognition learning algorithm to make it more consistent with the human brain.

Biologically Inspired Keypoints

TLDR
This chapter describes methods to extract and represent biologically inspired keypoints and shows how to reconstruct a keypoint descriptor to qualitatively analyze its behavior.

Do Distributed Semantic Models Dream of Electric Sheep? Visualizing Word Representations through Image Synthesis

TLDR
This work introduces the task of visualizing distributed semantic representations by generating images from word vectors by means of a cross-modal mapping function, and proposes a baseline dream synthesis method based on averaging pictures whose visual representations are topologically close to the mapped vector.

Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation

TLDR
Language-driven image generation, the task of generating an image visualizing the semantic contents of a word embedding, is introduced, and a simple method based on two mapping functions is implemented to generate the target image.

Identification des indices acoustiques utilisés lors de la compréhension de la parole dégradée [Identification of the acoustic cues used in the comprehension of degraded speech]

Although there is broad consensus in the scientific community on the role of acoustic cues in speech comprehension, the exact mechanisms enabling the transformation of a…

References

Showing 1–10 of 42 references

Visual Recognition with Humans in the Loop

TLDR
The results demonstrate that incorporating user input drives up recognition accuracy to levels that are good enough for practical applications, while at the same time, computer vision reduces the amount of human interaction required.

Culture Shapes How We Look at Faces

TLDR
These results demonstrate that face processing can no longer be considered to arise from a universal series of perceptual events, and that the strategy employed to extract visual information from faces differs across cultures.

Undoing the Damage of Dataset Bias

TLDR
Overall, this work finds that it is beneficial to explicitly account for bias when combining multiple datasets, and proposes a discriminative framework that directly exploits dataset bias during training.

Cultural variation in eye movements during scene perception.

TLDR
Eye movements of American and Chinese participants while they viewed photographs with a focal object on a complex background indicated that Americans fixated more on focal objects than did the Chinese, and the Americans tended to look at the focal object more quickly.

HOGgles: Visualizing Object Detection Features

TLDR
Algorithms to visualize the feature spaces used by object detectors allow a human to put on "HOG goggles" and perceive the visual world as a HOG-based object detector sees it, enabling new ways of analyzing object detection systems and new insight into detectors' failures.

Classification images: A review.

R. Murray · Computer Science · Journal of Vision · 2011
TLDR
Key developments in classification image methods are described, including use of optimal weighted sums based on the linear observer model, formulation of classification images in terms of the generalized linear model, development of statistical tests, and use of priors to reduce dimensionality.

One-shot learning of object categories

TLDR
It is found that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.

DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition

TLDR
DeCAF, an open-source implementation of deep convolutional activation features, is released along with all associated network parameters, enabling vision researchers to experiment with deep representations across a range of visual concept learning paradigms.

Unbiased look at dataset bias

TLDR
A comparison study of popular datasets is presented, evaluating them on a number of criteria including relative data bias, cross-dataset generalization, effects of the closed-world assumption, and sample value.