Multimodal saliency-based attention for object-based scene analysis
In this paper, we propose a system architecture that extends the current state of the art in computational visual attention by incorporating the biological concept of ventral attention. According to recent findings on the neurobiological foundations of attention, the human brain has two separate but interacting attention systems: the dorsal attention system and the ventral attention system. The well-known computational concepts of bottom-up and top-down saliency both correspond to the dorsal attention system. The ventral attention system, in contrast, is sensitive to behavior-relevant stimuli that are unexpected (i.e., not top-down salient), independent of their perceptual saliency (i.e., their bottom-up saliency). The proposed architecture therefore features a dynamic interplay between top-down saliency, bottom-up saliency, and ventral attention, which enables the system to redirect its focus of attention to important stimuli while it is absorbed in a task, even if their perceptual saliency is low. Our implementation of the proposed architecture integrates several state-of-the-art methods into a coherent system and concentrates on unexpected motion as a first technical account of ventral attention. In our experiments, we demonstrate that ventral attention enables our system to detect and reorient to important situations in real-world traffic environments that are relevant to driving behavior.
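The interplay described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the linear weighting of top-down and bottom-up saliency, the use of an absolute mismatch between observed and expected motion as the "unexpectedness" signal, and the threshold value are all assumptions made for illustration only.

```python
from typing import List, Tuple

Map = List[List[float]]  # row-major 2D map; values assumed to lie in [0, 1]


def argmax2d(m: Map) -> Tuple[int, int]:
    """Return the (row, col) index of the largest entry of a 2D map."""
    best = (0, 0)
    for r, row in enumerate(m):
        for c, value in enumerate(row):
            if value > m[best[0]][best[1]]:
                best = (r, c)
    return best


def select_focus(
    bottom_up: Map,
    top_down: Map,
    observed_motion: Map,
    expected_motion: Map,
    td_weight: float = 0.7,           # hypothetical weighting, not from the paper
    surprise_threshold: float = 0.5,  # hypothetical threshold, not from the paper
) -> Tuple[Tuple[int, int], str]:
    """Pick the focus of attention and report which pathway selected it."""
    # Ventral pathway (assumed form): unexpected motion is modeled as the
    # mismatch between observed and expected motion at each location.
    surprise = [
        [abs(o - e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed_motion, expected_motion)
    ]
    r, c = argmax2d(surprise)
    if surprise[r][c] > surprise_threshold:
        # Strong surprise interrupts the task, regardless of how
        # perceptually salient the location is.
        return (r, c), "ventral"

    # Dorsal pathway (assumed form): weighted sum of top-down and
    # bottom-up saliency.
    dorsal = [
        [td_weight * t + (1.0 - td_weight) * b for t, b in zip(td_row, bu_row)]
        for td_row, bu_row in zip(top_down, bottom_up)
    ]
    return argmax2d(dorsal), "dorsal"


# Toy 2x2 scene: the task favors the top-left cell.
bottom_up = [[0.1, 0.9], [0.1, 0.1]]
top_down = [[0.8, 0.0], [0.0, 0.0]]
no_motion = [[0.0, 0.0], [0.0, 0.0]]
sudden_motion = [[0.0, 0.0], [0.0, 0.9]]  # unexpected motion, bottom-right

calm = select_focus(bottom_up, top_down, no_motion, no_motion)
alert = select_focus(bottom_up, top_down, sudden_motion, no_motion)
```

In this toy scene, the focus stays on the task-relevant cell while motion matches expectation, and it is redirected to the bottom-right cell when unexpected motion appears there, even though that cell has low bottom-up saliency.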