• Corpus ID: 32429255

Faster gaze prediction with dense networks and Fisher pruning

@article{Theis2018FasterGP,
  title={Faster gaze prediction with dense networks and Fisher pruning},
  author={Lucas Theis and Iryna Korshunova and Alykhan Tejani and Ferenc Husz{\'a}r},
  journal={ArXiv},
  year={2018},
  volume={abs/1801.05787}
}
Predicting human fixations from images has recently seen large improvements by leveraging deep representations which were pretrained for object recognition. However, as we show in this paper, these networks are highly overparameterized for the task of fixation prediction. We first present a simple yet principled greedy pruning method which we call Fisher pruning. Through a combination of knowledge distillation and Fisher pruning, we obtain much more runtime-efficient architectures for saliency… 
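As a rough illustration of the greedy method described in the abstract (a sketch, not the paper's implementation): Fisher pruning scores each parameter by the estimated increase in loss from removing it, using the diagonal empirical Fisher approximation, and greedily removes the lowest-scoring parameters. The function names and array shapes below are illustrative.

```python
import numpy as np

def fisher_importance(theta, grads):
    """Estimate the increase in loss from zeroing each parameter.

    Uses the diagonal empirical Fisher approximation:
        Delta_k ~= 0.5 * F_kk * theta_k**2,
    where F_kk is the mean over examples of the squared gradient.

    theta: (P,) current parameter values
    grads: (N, P) per-example gradients of the loss w.r.t. theta
    """
    fisher_diag = np.mean(grads ** 2, axis=0)   # (P,) empirical Fisher diagonal
    return 0.5 * fisher_diag * theta ** 2       # (P,) estimated loss increase

def greedy_fisher_prune(theta, grads, n_prune):
    """Zero out the n_prune parameters with the lowest estimated
    loss increase (a single pass; in practice pruning is typically
    interleaved with continued training)."""
    importance = fisher_importance(theta, grads)
    idx = np.argsort(importance)[:n_prune]
    pruned = theta.copy()
    pruned[idx] = 0.0
    return pruned, idx
```

In a real network the same score is accumulated per feature map rather than per scalar weight, so that whole channels can be removed for runtime savings.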

Citations of this paper

Saliency Prediction in the Deep Learning Era: Successes and Limitations
  • A. Borji
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2021
TLDR
A large number of image and video saliency models are reviewed and compared over two image benchmarks and two large scale video datasets and factors that contribute to the gap between models and humans are identified.
The impact of temporal regularisation in egocentric saliency prediction
TLDR
The results indicate that the NSS saliency metric improves during task-driven activities but clearly drops during free viewing.
Saliency Prediction in the Deep Learning Era: An Empirical Investigation
TLDR
A large number of image and video saliency models are reviewed and compared over two image benchmarks and two large scale video datasets and factors that contribute to the gap between models and humans are identified.
Saliency Prediction in the Deep Learning Era: Successes, Limitations, and Future Challenges
TLDR
A large number of image and video saliency models are reviewed and compared over two image benchmarks and two large scale video datasets and factors that contribute to the gap between models and humans are identified.
A Compact Deep Architecture for Real-time Saliency Prediction
Temporal Saliency Adaptation in Egocentric Videos
TLDR
This work adapts a deep neural model for image saliency prediction to the temporal domain of egocentric video, and indicates that the temporal adaptation is beneficial when the viewer is not moving and observing the scene from a narrow field of view.
On-Device Saliency Prediction Based on Pseudoknowledge Distillation
TLDR
A pseudoknowledge distillation (PKD) training method for creating a compact real-time saliency prediction model, which transfers knowledge from a computationally expensive once-for-all teacher network to an early-exit student model found by an evolutionary algorithm, combining knowledge distillation with pseudolabeling.
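The distillation step in approaches like this can be sketched generically: the student's saliency map is trained toward the teacher's map used as a soft pseudo-label. Below is a minimal KL-divergence saliency distillation loss in NumPy; it illustrates the general distillation objective, not the exact PKD loss.

```python
import numpy as np

def kd_saliency_loss(student_map, teacher_map, eps=1e-8):
    """KL divergence between teacher (pseudo-label) and student
    saliency maps, each normalized into a spatial probability
    distribution. A generic distillation loss sketch."""
    p = teacher_map / (teacher_map.sum() + eps)   # teacher distribution
    q = student_map / (student_map.sum() + eps)   # student distribution
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))
```

The loss is zero when the two maps agree (up to normalization) and positive otherwise, so minimizing it pulls the student's predicted fixation distribution toward the teacher's.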
Composition of Saliency Metrics for Pruning with a Myopic Oracle
TLDR
It is shown how to compose a set of these saliency metrics into a much more robust (albeit still heuristic) saliency metric that exploits the cases where the different base metrics do well and, by switching to a different metric, avoids the cases where they do poorly.
Composition of Saliency Metrics for Channel Pruning with a Myopic Oracle
TLDR
This work proposes a method to compose several primitive pruning saliencies, to exploit the cases where each saliency measure does well, and shows that the composition of saliencies avoids many poor pruning choices identified by individual saliencies.
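A toy sketch of this composition idea (the metrics, loss, and function names below are illustrative assumptions, not the paper's code): at each pruning step every base metric nominates its lowest-saliency live channel, and a "myopic oracle" (a direct evaluation of the loss with each candidate channel zeroed) arbitrates among the nominations.

```python
import numpy as np

def compose_with_oracle(metrics, weights, loss_fn, n_prune):
    """Greedy channel pruning arbitrated by a myopic oracle.

    metrics: list of functions mapping (C, D) weights -> (C,) saliency scores
    weights: (C, D) array, one row of weights per channel
    loss_fn: function mapping weights -> scalar validation loss
    """
    w = weights.copy()
    alive = set(range(w.shape[0]))
    pruned = []
    for _ in range(n_prune):
        # Each base metric nominates its lowest-saliency live channel.
        candidates = set()
        for metric in metrics:
            order = np.argsort(metric(w))
            candidates.add(next(int(c) for c in order if int(c) in alive))
        # Myopic oracle: actually zero each candidate and measure the loss.
        def loss_if_pruned(c):
            trial = w.copy()
            trial[c] = 0.0
            return loss_fn(trial)
        best = min(candidates, key=loss_if_pruned)
        w[best] = 0.0
        alive.remove(best)
        pruned.append(best)
    return w, pruned
```

When a base metric makes a poor nomination, the oracle's direct loss measurement rejects it in favor of a better candidate from another metric, which is the robustness property these papers argue for.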
Taxonomy of Saliency Metrics for Channel Pruning
TLDR
A taxonomy of saliency metrics based on four mostly-orthogonal principal components is proposed, and it is found that some of the constructed metrics can outperform the best existing state-of-the-art metrics for convolutional neural network channel pruning.

References

SHOWING 1-10 OF 35 REFERENCES
Predicting Human Eye Fixations via an LSTM-Based Saliency Attentive Model
TLDR
This paper presents a novel model which can predict accurate saliency maps by incorporating neural attentive mechanisms, and shows, through an extensive evaluation, that the proposed architecture outperforms the current state-of-the-art on public saliency prediction datasets.
Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet
TLDR
This work presents a novel way of reusing existing neural networks that have been pretrained on the task of object recognition in models of fixation prediction that significantly outperforms all state-of-the-art models on the MIT Saliency Benchmark.
DeepFix: A Fully Convolutional Neural Network for Predicting Human Eye Fixations
TLDR
This paper proposes DeepFix, a fully convolutional neural network that models the bottom-up mechanism of visual attention via saliency prediction, and evaluates the model on multiple challenging saliency data sets, showing that it achieves state-of-the-art results.
CAT2000: A Large Scale Fixation Dataset for Boosting Saliency Research
TLDR
This work records eye movements of 120 observers while they freely viewed a large number of naturalistic and artificial images; the resulting dataset opens new challenges for the next generation of saliency models and helps conduct behavioral studies on bottom-up visual attention.
Shallow and Deep Convolutional Networks for Saliency Prediction
TLDR
This paper addresses the problem with a completely data-driven approach by training a convolutional neural network (convnet) and proposes two designs: a shallow convnet trained from scratch, and a deeper one whose first three layers are adapted from another network trained for classification.
SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks
TLDR
This paper presents a focused study to narrow the semantic gap with an architecture based on Deep Neural Network (DNN), which leverages the representational power of high-level semantics encoded in DNNs pretrained for object recognition.
A Benchmark of Computational Models of Saliency to Predict Human Fixations
TLDR
A benchmark data set containing 300 natural images with eye-tracking data from 39 observers is proposed to compare model performances, and it is shown that human performance increases with the number of observers, up to a limit.
Large-Scale Optimization of Hierarchical Features for Saliency Prediction in Natural Images
  • E. Vig, M. Dorr, D. Cox
  • Computer Science
    2014 IEEE Conference on Computer Vision and Pattern Recognition
  • 2014
TLDR
This work identifies those instances of a richly-parameterized bio-inspired model family (hierarchical neuromorphic networks) that successfully predict image saliency and uses automated hyperparameter optimization to efficiently guide the search.
DeepGaze II: Reading fixations from deep features trained on object recognition
TLDR
The model uses the features from the VGG-19 deep neural network trained to identify objects in images for saliency prediction with no additional fine-tuning and achieves top performance in area under the curve metrics on the MIT300 hold-out benchmark.
Learning to predict where humans look
TLDR
This paper collects eye tracking data of 15 viewers on 1003 images and uses this database as training and testing examples to learn a model of saliency based on low, middle and high-level image features.