• Corpus ID: 2273536

A model for full local image interpretation

@article{Benyosef2015AMF,
  title={A model for full local image interpretation},
  author={Guy Ben-yosef and Liav Assif and Daniel Harari and Shimon Ullman},
  journal={ArXiv},
  year={2015},
  volume={abs/2110.08744}
}
We describe a computational model of humans' ability to provide a detailed interpretation of a scene's components. Humans can identify meaningful components almost everywhere in an image, and identifying these components is an essential part of the visual process and of understanding the surrounding scene and its potential meaning to the viewer. Detailed interpretation is beyond the scope of current models of visual recognition. Our model suggests that this is a fundamental limitation, related… 
Full interpretation of minimal images
TLDR
This work models the process of 'full interpretation' of object images, which is the ability to identify and localize all semantic features and parts that are recognized by human observers, by identifying primitive components and relations that play a useful role in local interpretation by humans.
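The interpretation process summarized above assigns detected image elements (points, contours, regions) to parts of a stored model and scores the assignment using relations between the elements. Below is a minimal sketch of that idea, assuming a simple additive score over unary evidence and pairwise relations; the element names, relation tests, thresholds, and weights are illustrative rather than the papers' actual feature set.

```python
import numpy as np

# Hypothetical candidate elements in a local image region: two contours
# (sampled as point sequences) and one point of interest.
candidates = {
    "contour_a": np.array([[10, 5], [12, 9], [15, 14]]),
    "contour_b": np.array([[15, 14], [18, 20], [20, 26]]),
    "point_tip": np.array([10, 5]),
}

def endpoint_connectivity(c1, c2, tol=2.0):
    """Relation: do two contours share an endpoint (up to a tolerance)?"""
    ends1, ends2 = c1[[0, -1]], c2[[0, -1]]
    d = np.linalg.norm(ends1[:, None, :] - ends2[None, :, :], axis=-1)
    return float(d.min() < tol)

def point_on_contour(p, c, tol=2.0):
    """Relation: does a point lie on (near) a contour?"""
    return float(np.linalg.norm(c - p, axis=1).min() < tol)

# Additive interpretation score: unary evidence plus weighted pairwise relations.
unary = {"contour_a": 0.8, "contour_b": 0.7, "point_tip": 0.9}  # e.g. detector scores
score = (
    sum(unary.values())
    + 1.5 * endpoint_connectivity(candidates["contour_a"], candidates["contour_b"])
    + 1.0 * point_on_contour(candidates["point_tip"], candidates["contour_a"])
)
print(f"interpretation score: {score:.2f}")
```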
Image interpretation above and below the object level
TLDR
Recent directions, based on human and computer vision studies, toward human-like image interpretation beyond the reach of current schemes are described: interpretation below the object level, as well as aspects of interpretation at the level of meaningful configurations beyond the recognition of individual objects, in particular interactions between two people in close contact.
Memo No. 061, July 31, 2017: Full interpretation of minimal images
The goal in this work is to model the process of ‘full interpretation’ of object images, which is the ability to identify and localize all semantic features and parts that are recognized by human observers.
A model for interpreting social interactions in local image regions
TLDR
This work discusses the integration of minimal configurations in recognizing social interactions in a detailed, high-resolution image.
CBMM Memo No. 061, February 8, 2017: Full interpretation of minimal images
The goal in this work is to model the process of ‘full interpretation’ of object images, which is the ability to identify and localize all semantic features and parts that are recognized by human observers.
Vision by alignment
TLDR
This work proposes an alignment model for vision, in which computational specialists eagerly share state with their neighbors during ongoing computations, availing themselves of neighbors’ partial results in order to fill gaps in evolving descriptions, and predicts that this alignment process accounts for vision’s robust attributes.
Object contour completion by combining object recognition and local edge cues
TLDR
It is argued that the top-down/bottom-up interaction architecture has plausible neurological correlates and has the advantage of not requiring boundaries to be learned from large datasets.

References

SHOWING 1-10 OF 26 REFERENCES
A Boundary-Fragment-Model for Object Detection
TLDR
The BFM detector is able to represent and detect object classes principally defined by their shape, rather than their appearance, and to achieve this with less supervision (e.g., fewer training images).
HOP: Hierarchical object parsing
  • I. Kokkinos, A. Yuille
  • Computer Science
    2009 IEEE Conference on Computer Vision and Pattern Recognition
  • 2009
TLDR
This paper uses a hierarchical object model that recursively decomposes an object into simple structures, and exploits this hierarchical representation to efficiently compute a coarse solution to the object parsing problem, which is then used to guide search at a finer level.
Learning AND-OR Templates for Object Recognition and Detection
TLDR
This paper shows that both the structures and parameters of the AOT model can be learned in an unsupervised way from images using an information projection principle, and proposes a number of ways to evaluate the performance of the learned AOTs through both synthesized examples and real-world images.
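The AND-OR template (AOT) idea composes parts with AND nodes and selects among structural alternatives with OR nodes. Below is a minimal scoring sketch under that reading, with illustrative node names and evidence values; the learned templates, image features, and information-projection learning of the paper are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Toy AND-OR graph: AND nodes compose parts, OR nodes choose among alternatives,
# leaves read local evidence (e.g. filter responses). Node names are illustrative.
@dataclass
class Node:
    kind: str                      # "AND", "OR", or "LEAF"
    children: List["Node"] = field(default_factory=list)
    name: str = ""

def score(node: Node, evidence: Dict[str, float]) -> float:
    if node.kind == "LEAF":
        return evidence.get(node.name, 0.0)
    child_scores = [score(c, evidence) for c in node.children]
    if node.kind == "AND":         # composition: all parts contribute
        return sum(child_scores)
    return max(child_scores)       # OR: best structural alternative wins

leaf = lambda n: Node("LEAF", name=n)
template = Node("AND", [
    Node("OR", [leaf("upright_ear"), leaf("folded_ear")]),
    leaf("muzzle"),
])
print(score(template, {"upright_ear": 0.2, "folded_ear": 0.9, "muzzle": 0.7}))  # 1.6
```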
Aggregating local descriptors into a compact image representation
TLDR
This work proposes a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation, and shows how to jointly optimize the dimension reduction and the indexing algorithm.
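The compact representation proposed in this reference (VLAD) sums, per visual word, the residuals of local descriptors relative to their nearest k-means centroid and normalizes the result. A small NumPy sketch follows; the vocabulary size, descriptor dimension, and power-law normalization step are illustrative choices.

```python
import numpy as np

def vlad(descriptors: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Aggregate local descriptors into one compact vector (VLAD-style).

    descriptors: (n, d) local descriptors from one image (e.g. SIFT).
    centroids:   (k, d) visual-word centers from k-means on a training set.
    Returns a (k*d,) L2-normalized vector of per-word residual sums.
    """
    # Assign each descriptor to its nearest visual word.
    dists = np.linalg.norm(descriptors[:, None, :] - centroids[None, :, :], axis=-1)
    assign = dists.argmin(axis=1)

    k, d = centroids.shape
    v = np.zeros((k, d))
    for i in range(k):
        members = descriptors[assign == i]
        if len(members):
            v[i] = (members - centroids[i]).sum(axis=0)   # residuals to the center

    v = v.reshape(-1)
    v = np.sign(v) * np.sqrt(np.abs(v))                   # power normalization
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

rng = np.random.default_rng(0)
desc = rng.normal(size=(200, 128))        # stand-in for 200 SIFT descriptors
cent = rng.normal(size=(16, 128))         # stand-in for a 16-word vocabulary
print(vlad(desc, cent).shape)             # (2048,)
```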
Semantic contours from inverse detectors
TLDR
A simple yet effective method for combining generic object detectors with bottom-up contours to identify object contours is presented and a principled way of combining information from different part detectors and across categories is provided.
ImageNet Large Scale Visual Recognition Challenge
TLDR
The creation of this benchmark dataset and the advances in object recognition that have been possible as a result are described, and state-of-the-art computer vision accuracy is compared with human accuracy.
Responses to contour features in macaque area V4.
TLDR
The results suggest that V4 processes information about contour features as a step toward complex shape recognition, and a strong bias toward convex features implies a neural basis for the well-known perceptual dominance of convexity.
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
TLDR
This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF).
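The combination described above feeds per-pixel class scores from the network into a fully connected CRF whose pairwise terms depend on pixel position and color, and runs mean-field inference. Below is a toy sketch of such a mean-field refinement using a brute-force O(N²) pairwise kernel; the actual method relies on efficient high-dimensional filtering, and the kernel widths and weight here are illustrative.

```python
import numpy as np

def meanfield_refine(unary, positions, colors, n_iter=5,
                     theta_pos=3.0, theta_col=10.0, w_pair=1.0):
    """Toy mean-field inference for a fully connected Potts CRF.

    unary:     (N, L) negative log class scores (e.g. from the last DCNN layer).
    positions: (N, 2) pixel coordinates; colors: (N, 3) pixel colors.
    Brute-force O(N^2) pairwise kernel, so only suitable for tiny inputs.
    """
    dp = ((positions[:, None, :] - positions[None, :, :]) ** 2).sum(-1)
    dc = ((colors[:, None, :] - colors[None, :, :]) ** 2).sum(-1)
    K = np.exp(-dp / (2 * theta_pos**2) - dc / (2 * theta_col**2))
    np.fill_diagonal(K, 0.0)                       # no self-messages

    Q = np.exp(-unary)
    Q /= Q.sum(1, keepdims=True)
    for _ in range(n_iter):
        msg = K @ Q                                # aggregate neighbors' beliefs
        pairwise = w_pair * (msg.sum(1, keepdims=True) - msg)  # Potts penalty
        Q = np.exp(-unary - pairwise)
        Q /= Q.sum(1, keepdims=True)
    return Q.argmax(1)                             # refined per-pixel labels

rng = np.random.default_rng(0)
N, L = 50, 3                                       # 50 pixels, 3 classes (toy sizes)
labels = meanfield_refine(rng.normal(size=(N, L)),
                          rng.uniform(0, 10, size=(N, 2)),
                          rng.uniform(0, 255, size=(N, 3)))
print(labels.shape)                                # (50,)
```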
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
TLDR
This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.