Corpus ID: 244773491

CYBORG: Blending Human Saliency Into the Loss Improves Deep Learning

@article{Boyd2021CYBORGBH,
  title={CYBORG: Blending Human Saliency Into the Loss Improves Deep Learning},
  author={Aidan Boyd and Patrick Tinsley and K. Bowyer and Adam Czajka},
  journal={ArXiv},
  year={2021},
  volume={abs/2112.00686}
}
Can deep learning models achieve greater generalization if their training is guided by reference to human perceptual abilities? And how can we implement this in a practical manner? This paper proposes a training strategy to ConveY Brain Oversight to Raise Generalization (CYBORG). This new approach incorporates human-annotated saliency maps into a CYBORG loss function that guides the model’s learning towards features from image regions that humans find salient for the task. The Class Activation…
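
The abstract is cut off before it defines the loss, so the following is only a minimal sketch of what "blending human saliency into the loss" could look like, not the authors' formulation. It assumes a weighted sum of a classification term (cross-entropy) and a saliency term (mean squared error between a min-max normalized human saliency map and a model saliency map such as a class activation map). The function names, the `alpha` weight, and the choice of MSE are all hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class (clamped to avoid log(0))."""
    return -math.log(max(probs[label], 1e-12))


def normalize(sal_map):
    """Min-max scale a 2D map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in sal_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in sal_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in sal_map]


def saliency_mse(human_map, model_map):
    """Mean squared error between two normalized maps of the same shape."""
    h = normalize(human_map)
    m = normalize(model_map)
    diffs = [(a - b) ** 2 for hr, mr in zip(h, m) for a, b in zip(hr, mr)]
    return sum(diffs) / len(diffs)


def blended_loss(probs, label, human_map, model_map, alpha=0.5):
    """Assumed blend: (1 - alpha) * classification term + alpha * saliency term."""
    return ((1.0 - alpha) * cross_entropy(probs, label)
            + alpha * saliency_mse(human_map, model_map))


# Hypothetical 2x2 maps: the model attends to exactly the wrong region.
human = [[0.0, 1.0], [0.0, 0.0]]
model = [[1.0, 0.0], [1.0, 1.0]]
loss = blended_loss([0.5, 0.5], 0, human, model, alpha=0.5)
```

Under these assumptions, `alpha = 0` reduces to ordinary cross-entropy training, while larger values penalize the model more heavily when its saliency disagrees with the human annotation.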
