From Large Scale Image Categorization to Entry-Level Categories

@inproceedings{Ordonez2013FromLS,
  title={From Large Scale Image Categorization to Entry-Level Categories},
  author={Vicente Ordonez and Jia Deng and Yejin Choi and Alexander C. Berg and Tamara L. Berg},
  booktitle={2013 IEEE International Conference on Computer Vision},
  year={2013},
  pages={2768--2775}
}
Entry-level categories - the labels people will use to name an object - were originally defined and studied by psychologists in the 1980s. In this paper we study entry-level categories at a large scale and learn the first models for predicting entry-level categories for images. Our models combine visual recognition predictions with proxies for word "naturalness" mined from the enormous amounts of text on the web. We demonstrate the usefulness of our models for predicting nouns (entry-level…

Predicting Entry-Level Categories
TLDR
Results for category mapping and entry-level category prediction for images show promise for producing more natural, human-like labels, and the potential applicability of the results to the task of image description generation is demonstrated.
Learning to name objects
TLDR
This paper looks at the problem of predicting category labels that mimic how human observers would name objects, related to the concept of entry-level categories first introduced by psychologists in the 1970s and 1980s.
Choosing Basic-Level Concept Names Using Visual and Language Context
TLDR
This study proposes methods for predicting basic-level names using a series of classification and ranking tasks, producing the first large scale catalogue of basic-level names for hundreds of thousands of images depicting thousands of visual concepts.
Open-vocabulary Object Retrieval
TLDR
This paper introduces a novel object retrieval method that can combine category- and instance-level semantics in a common representation and shows that the approach can accurately retrieve objects based on extremely varied open-vocabulary queries.
Understanding object descriptions in robotics by open-vocabulary object retrieval and detection
TLDR
This work addresses the problem of retrieving and detecting objects based on open-vocabulary natural language queries by introducing a novel object retrieval method and proposes a method for handling open vocabularies, that is, words not contained in the training data.
Language and Perceptual Categorization in Computational Visual Recognition
VICENTE ORDÓÑEZ ROMÁN: LANGUAGE AND PERCEPTUAL CATEGORIZATION IN COMPUTATIONAL VISUAL RECOGNITION. (Under the direction of Tamara L. Berg.) Computational visual recognition or giving computers the
Diverse Concept-Level Features for Multi-Object Classification
TLDR
This work uses existing human knowledge, the application context itself, and the human categorization mechanism to reflect complex relations between concepts, giving a good representation for image classification even if some important concepts fail to be recognized.
Automatic Concept Discovery from Parallel Text and Visual Corpora
TLDR
An automatic visual concept discovery algorithm is proposed using parallel text and visual corpora; it filters text terms based on the visual discriminative power of the associated images, groups them into concepts using visual and semantic similarities, and achieves state-of-the-art performance in the retrieval task.
From Visual Attributes to Adjectives through Decompositional Distributional Semantics
TLDR
This work shows that it is possible to tag images with attribute-denoting adjectives even when no training data containing the relevant annotation are available, and automatically constructs attribute-centric representations that significantly improve performance in supervised object recognition.
Large Scale Retrieval and Generation of Image Descriptions
TLDR
The end result is two simple, but effective, methods for harnessing the power of big data to produce image captions that are altogether more general, relevant, and human-like than previous attempts.