Corpus ID: 5971386

CAT2000: A Large Scale Fixation Dataset for Boosting Saliency Research

@article{Borji2015CAT2000AL,
  title={CAT2000: A Large Scale Fixation Dataset for Boosting Saliency Research},
  author={Ali Borji and Laurent Itti},
  journal={ArXiv},
  year={2015},
  volume={abs/1505.03581}
}
  • A. Borji, L. Itti
  • Published 14 May 2015
  • Computer Science
  • ArXiv
Saliency modeling has been an active research area in computer vision for about two decades. Existing state-of-the-art models perform very well at predicting where people look in natural scenes. There is, however, a risk that these models have been overfitting to the available small-scale, biased datasets, trapping progress in a local minimum. To gain deeper insight into current issues in saliency modeling and to better gauge progress, we recorded eye movements of… 
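Progress on fixation datasets such as CAT2000 is typically gauged with fixation-based scores. As one concrete example, below is a minimal NumPy sketch of Normalized Scanpath Saliency (NSS), a standard metric that averages the z-scored prediction at fixated pixels; the code and names are illustrative, not taken from the paper.

import numpy as np

def nss(saliency_map, fixation_map):
    # Z-score the predicted saliency map, then average it at the
    # pixels where observers actually fixated.
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-8)
    return s[fixation_map.astype(bool)].mean()

# Toy usage: score a random prediction against three fixated pixels.
pred = np.random.rand(480, 640)
fix = np.zeros((480, 640))
fix[240, 320] = fix[100, 50] = fix[400, 600] = 1
print(nss(pred, fix))

Higher NSS is better; a chance-level prediction scores near zero.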

Citations

Where Should Saliency Models Look Next?
It is argued that to continue to approach human-level performance, saliency models will need to discover higher-level concepts in images: text, objects of gaze and action, locations of motion, and expected locations of people in images.
Saliency Prediction in the Deep Learning Era: Successes and Limitations
  • A. Borji
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2021
A large number of image and video saliency models are reviewed and compared over two image benchmarks and two large-scale video datasets, and factors that contribute to the gap between models and humans are identified.
How is Gaze Influenced by Image Transformations? Dataset and Model
A novel saliency model based on generative adversarial networks (dubbed GazeGAN) is introduced; it combines classic skip connections with a novel center-surround connected (CSC) module that mitigates trivial artifacts while emphasizing semantically salient regions and increases model nonlinearity, demonstrating better robustness against transformations.
Leverage eye-movement data for saliency modeling: Invariance Analysis and a Robust New Model
A novel saliency model based on a generative adversarial network (dubbed GazeGAN) is introduced, which combines classic skip connections with a novel center-surround connection to leverage multi-level features; a histogram loss based on the Alternative Chi-Square Distance (ACS HistLoss) is proposed to refine the luminance distribution of the saliency map.
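The ACS HistLoss mentioned above compares the luminance histogram of a predicted saliency map against that of the ground truth. The exact Alternative Chi-Square form is defined in the paper; the NumPy sketch below assumes the common symmetric chi-square distance between normalized histograms, so treat the function and its parameters as hypothetical stand-ins.

import numpy as np

def hist_chi_square(pred_map, gt_map, bins=32, eps=1e-8):
    # Histogram both maps over [0, 1], normalize to probability
    # distributions, and compare with a symmetric chi-square distance.
    h1, _ = np.histogram(pred_map, bins=bins, range=(0.0, 1.0))
    h2, _ = np.histogram(gt_map, bins=bins, range=(0.0, 1.0))
    h1 = h1 / (h1.sum() + eps)
    h2 = h2 / (h2.sum() + eps)
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))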
DeepFix: A Fully Convolutional Neural Network for Predicting Human Eye Fixations
This paper proposes DeepFix, a fully convolutional neural network that models the bottom-up mechanism of visual attention via saliency prediction; evaluated on multiple challenging saliency datasets, the model achieves state-of-the-art results.
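To make the fully convolutional formulation concrete, here is a minimal PyTorch sketch of an image-in, saliency-map-out network. It illustrates only the input/output structure; it is not the DeepFix architecture itself, which is far deeper and VGG-based.

import torch
import torch.nn as nn

class TinyFCNSaliency(nn.Module):
    # RGB image in, one-channel saliency map out at the same resolution.
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1),
        )

    def forward(self, x):
        return torch.sigmoid(self.body(x))

model = TinyFCNSaliency()
saliency = model(torch.rand(1, 3, 224, 224))  # shape: (1, 1, 224, 224)

Because every layer is convolutional, the same network accepts images of any spatial size.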
A Locally Weighted Fixation Density-Based Metric for Assessing the Quality of Visual Saliency Predictions
A new metric is proposed that uses local weights based on fixation density; it overcomes shortcomings of existing metrics and outperforms popular existing metrics at assessing the quality of saliency predictions.
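The idea of local weighting can be sketched by estimating fixation density with a Gaussian blur of the binary fixation map and penalizing prediction error more where fixations are dense. This is an illustration of the concept only; the paper defines its own exact metric, and the names and sigma here are illustrative.

import numpy as np
from scipy.ndimage import gaussian_filter

def density_weighted_error(saliency_map, fixation_map, sigma=25.0):
    # Estimate local fixation density, normalize both maps to
    # probability distributions, then weight the per-pixel error
    # by that density.
    density = gaussian_filter(fixation_map.astype(float), sigma)
    density /= density.sum() + 1e-8
    s = saliency_map / (saliency_map.sum() + 1e-8)
    return np.sum(density * np.abs(s - density))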
Predicting Human Eye Fixations via an LSTM-Based Saliency Attentive Model
This paper presents a novel model that predicts accurate saliency maps by incorporating neural attentive mechanisms and shows, through an extensive evaluation, that the proposed architecture outperforms the current state of the art on public saliency prediction datasets.

References

Showing 1–10 of 12 references
Quantitative Analysis of Human-Model Agreement in Visual Saliency Modeling: A Comparative Study
This study allows one to assess the state of the art in visual saliency modeling, helps to organize this rapidly growing field, and sets a unified comparison framework for gauging future efforts, similar to the PASCAL VOC challenge in the object recognition and detection domains.
State-of-the-Art in Visual Attention Modeling
  • A. Borji, L. Itti
  • Psychology, Biology
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2013
A taxonomy of nearly 65 models of attention provides a critical comparison of approaches, their capabilities, and shortcomings, and addresses several challenging issues with models, including biological plausibility of the computations, correlation with eye movement datasets, bottom-up and top-down dissociation, and constructing meaningful performance measures.
On the relationship between optical variability, visual saliency, and eye fixations: a computational approach.
A hierarchical definition of optical variability is proposed that links physical magnitudes to visual saliency, yields a more reductionist interpretation than previous approaches, and explains quantitative results related to a visual illusion observed for images of corners that does not involve eye movements.
Unbiased look at dataset bias
A comparison study using a set of popular datasets is presented, evaluating them on criteria including relative data bias, cross-dataset generalization, effects of the closed-world assumption, and sample value.
Graph-Based Visual Saliency
A new bottom-up visual saliency model, Graph-Based Visual Saliency (GBVS), is proposed, which powerfully predicts human fixations on 749 variations of 108 natural images, achieving 98% of the ROC area of a human-based control, whereas the classical algorithms of Itti & Koch achieve only 84%.
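GBVS builds a fully connected graph over feature-map locations, weights each edge by feature dissimilarity modulated by spatial distance, and reads activation off the equilibrium distribution of the induced Markov chain. The NumPy sketch below shows that core step at small scale; parameter values are illustrative.

import numpy as np

def gbvs_activation(feature_map, sigma=5.0, iters=100):
    # Nodes are feature-map locations; edge weight from i to j is the
    # log-ratio dissimilarity of their feature values times a Gaussian
    # falloff in spatial distance.
    h, w = feature_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    f = feature_map.ravel().astype(float)
    dissim = np.abs(np.log(f[:, None] + 1e-8) - np.log(f[None, :] + 1e-8))
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    wmat = dissim * np.exp(-d2 / (2.0 * sigma ** 2))
    wmat /= wmat.sum(axis=1, keepdims=True) + 1e-12  # row-stochastic
    # Power iteration toward the chain's equilibrium distribution; mass
    # concentrates at locations that differ from their surroundings.
    p = np.full(h * w, 1.0 / (h * w))
    for _ in range(iters):
        p = p @ wmat
    return p.reshape(h, w)

activation = gbvs_activation(np.random.rand(16, 16))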
Saliency Detection: A Spectral Residual Approach
A simple method for visual saliency detection is presented, independent of features, categories, or other forms of prior knowledge of the objects, together with a fast method to construct the corresponding saliency map in the spatial domain.
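The spectral residual method is compact enough to state in a few lines: take the FFT of a downscaled grayscale image, subtract a local average from the log-amplitude spectrum, and invert with the original phase. A NumPy/SciPy sketch following the published recipe (the 64-pixel working width follows the paper; the smoothing sigma is an illustrative choice):

import numpy as np
from scipy.ndimage import gaussian_filter, uniform_filter, zoom

def spectral_residual_saliency(gray, width=64):
    # Work at low resolution, as the method prescribes.
    img = zoom(gray.astype(float), width / gray.shape[1])
    spec = np.fft.fft2(img)
    log_amp = np.log(np.abs(spec) + 1e-8)
    # Spectral residual: log amplitude minus its 3x3 local average.
    residual = log_amp - uniform_filter(log_amp, size=3)
    # Reconstruct with the original phase; the smoothed squared
    # magnitude is the saliency map.
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * np.angle(spec)))) ** 2
    return gaussian_filter(sal, sigma=2.5)

saliency = spectral_residual_saliency(np.random.rand(480, 640))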
Fixations on low-resolution images.
It is found that fixations on lower-resolution images can predict fixations on higher-resolution images, that human fixations are biased toward the center at all resolutions, and that this bias is stronger at lower resolutions.
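That center bias is itself a strong baseline: a single Gaussian centered on the image often rivals early saliency models. A minimal NumPy sketch (the spread parameter is an illustrative choice, not taken from the paper):

import numpy as np

def center_bias_map(h, w, sigma_frac=0.25):
    # Gaussian centered on the image, one sigma per axis, normalized
    # to sum to one so it can be used as a probability map.
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sy, sx = sigma_frac * h, sigma_frac * w
    g = np.exp(-0.5 * (((ys - cy) / sy) ** 2 + ((xs - cx) / sx) ** 2))
    return g / g.sum()

baseline = center_bias_map(480, 640)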
How do humans sketch objects?
This paper is the first large-scale exploration of human sketches, developing a bag-of-features sketch representation and using multi-class support vector machines, trained on the sketch dataset, to classify sketches.
A Model of Saliency-Based Visual Attention for Rapid Scene Analysis
A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented, which breaks down the complex problem of scene understanding by rapidly selecting conspicuous locations to be analyzed in detail.
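The model's core operation is the center-surround difference computed across scales of a feature pyramid. The NumPy/SciPy sketch below approximates it for a single intensity channel with differences of Gaussian blurs; the real model also has color and orientation channels, a dyadic pyramid, and a normalization operator, and the scale choices here are illustrative.

import numpy as np
from scipy.ndimage import gaussian_filter

def center_surround_intensity(intensity,
                              center_sigmas=(1, 2),
                              surround_sigmas=(4, 8)):
    # Sum absolute differences between fine-scale (center) and
    # coarse-scale (surround) blurred versions of the intensity map.
    sal = np.zeros_like(intensity, dtype=float)
    for c in center_sigmas:
        for s in surround_sigmas:
            sal += np.abs(gaussian_filter(intensity, c)
                          - gaussian_filter(intensity, s))
    return sal / (sal.max() + 1e-8)

conspicuity = center_surround_intensity(np.random.rand(240, 320))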
Human action recognition by learning bases of action attributes and parts
This work proposes to use attributes and parts for recognizing human actions in still images by learning a set of sparse bases that are shown to carry much semantic meaning, and shows that this dual sparsity provides a theoretical guarantee for the basis learning and feature reconstruction approach.