Visual attention strategies for target object detection

  • Ibrahim M. H. Rahman
The human visual attention (HVA) system encompasses a set of interconnected neurological modules responsible for analyzing visual stimuli by attending to salient regions. Two contrasting biological mechanisms exist in the HVA system: bottom-up, data-driven attention and top-down, task-driven attention. The former is mostly responsible for low-level instinctive behaviors, while the latter is responsible for performing complex visual tasks such as target object detection…
Salient Motion Features for Visual Attention Models
This paper integrates motion features with a visual computational attention model to detect salient moving objects using arithmetic operations, and concludes that multiplication is the better operation for feature integration.
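As a minimal illustration of the integration scheme summarized above, the following sketch fuses a static saliency map with a motion map by pointwise multiplication (the function names and the normalization step are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def normalize(m):
    """Scale a feature map to [0, 1]; flat maps become all zeros."""
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def integrate(static_map, motion_map, op="multiply"):
    """Combine a static saliency map with a motion map.

    Multiplication keeps only locations salient in BOTH maps,
    suppressing regions that fire in just one feature channel;
    addition keeps locations salient in either map.
    """
    s, m = normalize(static_map), normalize(motion_map)
    return s * m if op == "multiply" else normalize(s + m)
```

With multiplication, a region that is conspicuous in the static map but shows no motion is suppressed, which matches the intuition behind preferring it over addition for moving-target detection.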


An Attentional System Combining Top-Down and Bottom-Up Influences
A model that learns an optimal representation of the influences of task and context, thereby constructing a biased saliency map representing the top-down information. It is applied to search tasks in single images as well as in real scenes, in the latter case using an active vision system capable of shifting its gaze.
Top-Down Biasing and Modulation for Object-Based Visual Attention
This work presents a new object-based visual attention model with bottom-up and top-down features, comprising five main modules responsible for visual feature extraction, image segmentation, object recognition, the object-saliency map, and object selection.
Beyond bottom-up: Incorporating task-dependent influences into a computational model of spatial attention
  • R. Peters, L. Itti
  • Psychology, Biology
    2007 IEEE Conference on Computer Vision and Pattern Recognition
  • 2007
This study demonstrates the advantages of integrating BU factors derived from a saliency map and TD factors learned from image and task contexts in predicting where humans look while performing complex visually-guided behavior.
Bayesian Modeling of Visual Attention
This paper proposes a Bayesian approach that explains the optimal integration of top-down cues and bottom-up cues and demonstrates that the proposed visual saliency effectively predicts human gaze in free-viewing of natural scenes.
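The Bayesian fusion of cues described above can be sketched as a pointwise product of a bottom-up likelihood map and a top-down prior, normalized into a posterior over locations (a simplified illustration of Bayesian cue integration, not the paper's actual model):

```python
import numpy as np

def bayesian_saliency(bottom_up, top_down_prior):
    """Pointwise Bayesian fusion: posterior ∝ likelihood × prior.

    bottom_up:      data-driven conspicuity per location (likelihood role)
    top_down_prior: task-driven expectation of where the target appears
    Returns a map that sums to 1, interpretable as gaze probabilities.
    """
    posterior = bottom_up * top_down_prior
    z = posterior.sum()
    if z == 0:
        # No evidence anywhere: fall back to a uniform distribution.
        return np.full_like(posterior, 1.0 / posterior.size)
    return posterior / z
```

Locations ruled out by the task prior receive zero posterior mass regardless of their bottom-up conspicuity, which is the sense in which top-down cues modulate bottom-up saliency in such models.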
VOCUS: A Visual Attention System for Object Detection and Goal-Directed Search
  • S. Frintrop
  • Computer Science
    Lecture Notes in Computer Science
  • 2006
This monograph introduces the biologically motivated computational attention system VOCUS (Visual Object detection with a Computational attention System), which detects regions of interest in images and provides a powerful approach to improving existing vision systems by concentrating computational resources on regions that are more likely to contain relevant information.
Modeling Visual Attention via Selective Tuning
An Integrated Model of Top-Down and Bottom-Up Attention for Optimizing Detection Speed
  • Vidhya Navalpakkam, L. Itti
  • Psychology
    2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)
  • 2006
Testing on 750 artificial and natural scenes shows that the model's predictions are consistent with a large body of available literature on the human psychophysics of visual search, suggesting that it may provide a good approximation of how humans combine bottom-up and top-down cues.
A coherent computational approach to model bottom-up visual attention
This paper presents a coherent computational approach to modeling bottom-up visual attention, based mainly on the current understanding of HVS behavior, which includes contrast sensitivity functions, perceptual decomposition, visual masking, and center-surround interactions.
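The center-surround interactions mentioned above are commonly approximated with a difference-of-Gaussians filter: an excitatory narrow center minus an inhibitory wide surround. A minimal sketch follows (kernel size, sigmas, and function names are illustrative assumptions, not this paper's parameters):

```python
import numpy as np

def dog_kernel(size=9, sigma_c=1.0, sigma_s=3.0):
    """Difference-of-Gaussians kernel: narrow excitatory center
    minus wide inhibitory surround, each normalized to unit sum."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    def g(sigma):
        k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
        return k / k.sum()
    return g(sigma_c) - g(sigma_s)

def center_surround_map(image, size=9, sigma_c=1.0, sigma_s=3.0):
    """Valid (no-padding) convolution with the DoG kernel;
    peaks mark locations of high local contrast."""
    k = dog_kernel(size, sigma_c, sigma_s)
    h, w = image.shape
    out = np.zeros((h - size + 1, w - size + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + size, j:j + size] * k)
    return out
```

An isolated bright pixel on a dark background produces a strong positive response at its location, while uniform regions produce a response near zero, because the center and surround cancel.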
Interactions of Top-Down and Bottom-Up Mechanisms in Human Visual Cortex
The findings suggest that the strength of attentional modulation in the visual system is constrained by the degree to which competitive interactions have been resolved by bottom-up processes related to the segmentation of scenes into candidate objects.
Predicting visual fixations on video based on low-level visual features