Full interpretation of minimal images

@article{Benyosef2018FullIO,
  title={Full interpretation of minimal images},
  author={Guy Ben-Yosef and Liav Assif and Shimon Ullman},
  journal={Cognition},
  year={2018},
  volume={171},
  pages={65--84}
}

Structured learning and detailed interpretation of minimal object images

TLDR
This work models the process of human full interpretation of object images, namely the ability to identify and localize all semantic features and parts that are recognized by human observers, by considering reduced local regions that are minimal in the sense that further reduction would render them unrecognizable and uninterpretable.

Image interpretation above and below the object level

TLDR
Recent directions toward human-like image interpretation, beyond the reach of current schemes, are described, based on human and computer vision studies: interpretation below the object level, and some aspects of interpretation at the level of meaningful configurations beyond the recognition of individual objects, in particular interactions between two people in close contact.

A model for interpreting social interactions in local image regions

TLDR
This work discusses the integration of minimal configurations in recognizing social interactions in a detailed, high-resolution image.

What can human minimal videos tell us about dynamic recognition models?

On the Minimal Recognizable Image Patch

TLDR
This work proposes characterizing, empirically, the algorithmic limits by finding a minimal recognizable patch (MRP) that is by itself sufficient to recognize the image.

Oculo-retinal dynamics can explain the perception of minimal recognizable configurations

TLDR
This work recorded eye movements of participants attempting to recognize images that are just above and below the threshold of human recognition and modeled the activation patterns resulting from the continuous interactions of eye movements with the viewed image to suggest that vision is mediated by continuous interactions between eye movements and the environment, resulting in dynamic oculo-retinal coding.

Complex Relations in a Deep Structured Prediction Model for Fine Image Segmentation

TLDR
This work incorporates two relations that were shown to be useful to human object identification (containment and attachment) into the energy term of the CRF and shows that the segmentation of fine parts is positively affected by the addition of these two relations, and can be further influenced by complex structural features.

What takes the brain so long: Object recognition at the level of minimal images develops for up to seconds of presentation time

TLDR
The time trajectory of the recognition process at the level of minimal recognizable images (termed MIRC) is studied, finding that in the masked conditions, recognition rates develop gradually over an extended period (e.g., an average of 18% for 200 ms exposure and 45% for 500 ms) and increase significantly with longer exposure, even above 2 s.

Robot Manipulation in Open Environments: New Perspectives

TLDR
The case for a new approach to the problem of performing everyday manipulation tasks robustly in open environments is presented, based on three mutually dependent ideas: highly transferable manipulation skills; choice of representation: a scene can be modeled in several different ways; and top-down processes by which the robot’s task can influence the bottom-up processes interpreting a scene.

A Study of Dramatic Action and Emotion Using a Systematic Scan of Stick Figure Configurations

Comprehending the meaning of body postures is essential for social organisms such as humans. For example, it is important to understand at a glance whether two people seen at a distance are in a

References

A model for full local image interpretation

TLDR
A computational model of humans' ability to provide a detailed interpretation of a scene’s components is described, which suggests that detailed interpretation is beyond the scope of current models of visual recognition.

Recognition-by-components: a theory of human image understanding.

TLDR
Recognition-by-components (RBC) provides a principled account of the heretofore undecided relation between the classic principles of perceptual organization and pattern recognition.

How many pixels make an image?

  • A. Torralba
  • Environmental Science
    Visual Neuroscience
  • 2009
TLDR
It is shown that very small thumbnail images at the spatial resolution of 32 × 32 color pixels provide enough information to identify the semantic category of real-world scenes and permit observers to report four to five of the objects that the scene contains, despite the fact that some of these objects are unrecognizable in isolation.

Visual routines

Extracting Subimages of an Unknown Category from a Set of Images

  • S. Todorovic, N. Ahuja
  • Mathematics
    2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)
  • 2006
TLDR
This paper aims at simultaneously solving the following related problems: unsupervised identification of photometric, geometric, and topological properties of multiscale regions defining objects in the category and learning a region-based structural model of the category in terms of these properties from a set of training images.

Representation and recognition of the spatial organization of three-dimensional shapes

  • D. Marr, H. Nishihara
  • Computer Science
    Proceedings of the Royal Society of London. Series B. Biological Sciences
  • 1978
The human visual process can be studied by examining the computational problems associated with deriving useful information from retinal images. In this paper, we apply this approach to the problem

Formation of visual “objects” in the early computation of spatial relations

  • J. Feldman
  • Psychology
    Perception & psychophysics
  • 2007
TLDR
A study of the spatial factors and time course of the development of objects over the first few hundred milliseconds of visual processing, reporting a vivid picture of the chronology of object formation.

Atoms of recognition in human and computer vision

TLDR
This work shows by combining a novel method (minimal images) and simulations that the human recognition system uses features and learning processes, which are critical for recognition, but are not used by current models.

Pictorial Structures for Object Recognition

TLDR
A computationally efficient framework for part-based modeling and recognition of objects, motivated by the pictorial structure models introduced by Fischler and Elschlager, that allows for qualitative descriptions of visual appearance and is suitable for generic recognition problems.

Unsupervised Learning of Probabilistic Grammar-Markov Models for Object Categories

TLDR
A probabilistic grammar-Markov model (PGMM), which couples probabilistic context-free grammars and Markov random fields, is introduced; it is generally comparable with the current state of the art, and inference is performed in less than five seconds.
...