Automatic foveation for video compression using a neurobiological model of visual attention

@article{Itti2004AutomaticFF,
  title={Automatic foveation for video compression using a neurobiological model of visual attention},
  author={Laurent Itti},
  journal={IEEE Transactions on Image Processing},
  year={2004},
  volume={13},
  pages={1304-1318}
}
  • L. Itti
  • Published 1 October 2004
  • Computer Science
  • IEEE Transactions on Image Processing
We evaluate the applicability of a biologically-motivated algorithm to select visually-salient regions of interest in video streams for multiply-foveated video compression. Regions are selected based on a nonlinear integration of low-level visual cues, mimicking processing in primate occipital and posterior parietal cortex. A dynamic foveation filter then blurs every frame, increasingly with distance from salient locations. Sixty-three variants of the algorithm (varying number and shape of… 
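The foveation step described in the abstract (blur that increases with distance from salient locations) can be illustrated with a minimal sketch. This is not the paper's filter: the abstract is truncated and does not specify it. The sketch assumes a 2-D grayscale frame, fixation points given as (row, column) pairs, a small set of box-blur radii, and a fixed distance step per blur level. The names `box_blur`, `foveate`, `radii`, and `step` are hypothetical, chosen only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def box_blur(img, radius):
    """Box blur with edge padding; radius 0 returns an unblurred float copy."""
    img = np.asarray(img, dtype=float)
    if radius == 0:
        return img.copy()
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    # Windows have shape (H, W, k, k); average over each k-by-k neighborhood.
    return sliding_window_view(padded, (k, k)).mean(axis=(2, 3))


def foveate(frame, fixations, radii=(0, 1, 2, 4), step=8.0):
    """Blur a grayscale frame more strongly the farther a pixel is from
    its nearest fixation point (assumed scheme, not the paper's)."""
    frame = np.asarray(frame, dtype=float)
    h, w = frame.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Distance from every pixel to its nearest fixation (multiple foveas).
    dist = np.min(
        [np.hypot(yy - fy, xx - fx) for fy, fx in fixations], axis=0
    )
    # Quantize distance into blur levels, capped at the strongest blur.
    level = np.clip((dist / step).astype(int), 0, len(radii) - 1)
    # Precompute one blurred copy per level, then pick per pixel.
    stack = np.stack([box_blur(frame, r) for r in radii])
    return np.take_along_axis(stack, level[None, :, :], axis=0)[0]


# Example: two foveas on a synthetic 64x64 frame.
frame = np.random.default_rng(0).random((64, 64))
foveated = foveate(frame, [(16, 16), (48, 40)])
```

Pixels within `step` of a fixation are left untouched, and each further `step` of distance moves to the next blur radius. A smoother result would interpolate between levels rather than switching abruptly; that refinement is omitted here to keep the sketch short.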

Visual saliency guided video compression algorithm
Bayesian Integration of Face and Low-Level Cues for Foveated Video Coding
TLDR
A Bayesian model is presented that automatically generates fixations/foveations and can be exploited for compression; it is evaluated with respect to both the perceived quality of foveated video clips and the compression gain.
Foveated mean squared error—a novel video quality metric
TLDR
A new video quality metric called Foveated Mean Squared Error (FMSE) is proposed that takes into account a variable resolution of the HVS across the visual field, and utilizes the effect of additional spatial acuity reduction due to motion in a video sequence.
Effect of compressed offline foveated video on viewing behavior and subjective quality
TLDR
Results showed that, although offline foveation prior to encoding with H.264 yielded data reductions up to 52% (20% average) on the tested videos, it had little or no effect on where people looked, their intersubject dispersion, fixation duration, saccade amplitude, or the experienced quality during first-time viewing.
Attention-based video streaming
Predicting the region of interest for dynamic foveated streaming
TLDR
A prediction model is proposed that uses the streaming client's gaze locations on a set of frames to predict the fovea region on future frames; it achieves 10× higher prediction accuracy than the offline model.
Video Processing for Human Perceptual Visual Quality-Oriented Video Coding
TLDR
A video processing method that achieves human perceptual visual quality-oriented video coding and shows reliable improvements in the perceptual quality for various sequences and at various bandwidths, compared to existing saliency-based video coding methods.
Visual Saliency in Video Compression and Transmission
TLDR
A computationally-efficient method for visual saliency estimation in digital images and videos is developed, which approximates one of the most well-known visual saliency models.
Spatiotemporal Visual Considerations for Video Coding
TLDR
A visual measure for video compression is proposed that combines a motion attention model, a spatiovelocity visual sensitivity model incorporating unconstrained eye movements, and a visual masking model; it is shown to improve coding performance without degrading picture quality.

References

Real-time foveated multiresolution system for low-bandwidth video communication
TLDR
This work has developed a foveated multiresolution pyramid video coder/decoder which runs in real-time on a general purpose computer and includes zero-tree coding.
Automatic detection of regions of interest in complex video sequences
TLDR
A novel model of visual attention designed to provide an accurate and robust prediction of a viewer's locus of attention across a wide range of typical video content is described.
Foveation scalable video coding with automatic fixation selection
TLDR
A foveation scalable video coding (FSVC) algorithm which supplies good quality-compression performance as well as effective rate scalability, and is adaptable to different applications, such as knowledge-based video coding and video communications over time-varying, multiuser and interactive networks.
Feature combination strategies for saliency-based visual attention systems
TLDR
Four combination strategies are compared using three databases of natural color images and it is found that strategy (4) and its simplified, computationally efficient approximation yielded significantly better performance than (1), with up to fourfold improvement, while preserving generality.
Perceptually based quantization technique for MPEG encoding
TLDR
A technique for controlling the adaptive quantization process in an MPEG encoder, which improves upon the commonly used TM5 rate controller, and indicates a subjective improvement in picture quality, in comparison to the TM5 method.
Preattentive considerations for gaze-contingent image processing
TLDR
This paper presents a simple multiresolution image processing approach that can be utilized for gaze-contingent processing and evaluates three variants of this approach: a linear degradation function, a nonlinear function, and a function matching human visual system (HVS) acuity.
Implementation of a foveated image coding system for image bandwidth reduction
TLDR
A preliminary version of a foveated imaging system, implemented on a general purpose computer, which greatly reduces the transmission bandwidth of images, based on the fact that the spatial resolution of the human eye is space variant, decreasing with increasing eccentricity from the point of gaze.
Algorithms for Defining Visual Regions-of-Interest: Comparison with Eye Fixations
TLDR
This paper investigates and develops a methodology that serves to automatically identify a subset of aROIs (algorithmically detected ROIs) using different image processing algorithms (IPAs) and appropriate clustering procedures, and compares aROIs with hROIs (human-identified ROIs) as a criterion for evaluating and selecting bottom-up, context-free algorithms.
Shifts in selective visual attention: towards the underlying neural circuitry.
TLDR
This study addresses the question of how simple networks of neuron-like elements can account for a variety of phenomena associated with this shift of selective visual attention and suggests a possible role for the extensive back-projection from the visual cortex to the LGN.