Testing Deep Neural Networks on the Same-Different Task

Nicola Messina, Giuseppe Amato, Fabio Carrara, F. Falchi, Claudio Gennaro
  • Published 1 September 2019
  • Computer Science
  • 2019 International Conference on Content-Based Multimedia Indexing (CBMI)
Developing abstract reasoning abilities in neural networks is an important goal towards achieving human-like performance on many tasks. To date, some works have tackled this problem by developing ad-hoc architectures, reaching overall good generalization performance. In this work we try to understand to what extent state-of-the-art convolutional neural networks for image classification can deal with a challenging abstract problem, the so-called same-different task. This… 
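The same-different task can be made concrete with a small synthetic example: generate an image containing two shapes and label it by whether the shapes are identical. The sketch below is only illustrative (the canvas size, patch size, and random binary patterns are assumptions, not the actual SVRT stimuli used in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_patch(size=5):
    """Random binary pattern standing in for a 'shape'."""
    return (rng.random((size, size)) > 0.5).astype(np.uint8)

def make_sample(same, canvas=32, size=5):
    """Place two patches on a blank canvas; label 1 if they are identical.

    For same=False the two patches are drawn independently, so with tiny
    probability they may still coincide -- acceptable for a sketch.
    """
    img = np.zeros((canvas, canvas), dtype=np.uint8)
    a = random_patch(size)
    b = a.copy() if same else random_patch(size)
    # Non-overlapping positions: left and right halves of the canvas.
    ya, xa = rng.integers(0, canvas - size), rng.integers(0, canvas // 2 - size)
    yb, xb = rng.integers(0, canvas - size), rng.integers(canvas // 2, canvas - size)
    img[ya:ya + size, xa:xa + size] = a
    img[yb:yb + size, xb:xb + size] = b
    return img, int(same)

img, label = make_sample(same=True)
```

A classifier trained on such images must compare the two patches rather than memorize local texture, which is what makes the task a probe of relational reasoning.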


Solving the Same-Different Task with Convolutional Neural Networks

Recurrent Vision Transformer for Solving Visual Reasoning Problems

The Recurrent Vision Transformer (RViT) model is introduced; it achieves competitive results on same-different visual reasoning problems from the SVRT dataset while using far fewer free parameters and only 28k training samples.

Evaluating the progress of deep learning for visual relational concepts

The authors hypothesise that iterative processing of the input, together with attention that shifts between iterations, is needed to efficiently and reliably solve real-world relational concept learning.

The Notorious Difficulty of Comparing Human and Machine Perception

It is shown that, despite their ability to solve closed-contour tasks, neural networks use decision-making strategies different from those of humans, and that neural networks do experience a "recognition gap" on minimal recognizable images.

Combining EfficientNet and Vision Transformers for Video Deepfake Detection

This study focuses on video deepfake detection on faces, combining various types of Vision Transformers with a convolutional EfficientNet B0 used as a feature extractor, and obtains results comparable to those of some very recent methods that use Vision Transformers.

Five points to check when comparing visual perception in humans and machines

A checklist for comparative studies of visual reasoning in humans and machines is presented, and it is found that a previously observed difference in object recognition does not hold when the experiment is adapted to make conditions more equitable between humans and machines.

A dataset and architecture for visual reasoning with a working memory

This work developed an artificial, configurable visual question-and-answer dataset (COG) to parallel experiments in humans and animals, and proposes a deep learning architecture that performs competitively on other diagnostic VQA datasets (e.g., CLEVR) as well as on easy settings of the COG dataset.

Going deeper with convolutions

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC14).

Not-So-CLEVR: Visual Relations Strain Feedforward Neural Networks

Motivated by the comparable success of biological vision, it is argued that feedback mechanisms including working memory and attention are the key computational components underlying abstract visual reasoning.

A simple neural network module for relational reasoning

This work shows how a deep learning architecture equipped with a Relation Network (RN) module can implicitly discover and learn to reason about entities and their relations.
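The core idea of a Relation Network is simple: score every ordered pair of object embeddings with a shared MLP g, sum the scores (making the result order-invariant), and pass the sum through a second MLP f. A minimal NumPy sketch with random weights, shown only to illustrate the shapes and data flow (the layer sizes are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, w2):
    """Two-layer MLP with ReLU, applied row-wise."""
    return np.maximum(x @ w1, 0) @ w2

def relation_network(objects, g_w1, g_w2, f_w1, f_w2):
    """Score all ordered object pairs with g, sum, then apply f.

    objects: (n, d) array of object embeddings.
    """
    n, d = objects.shape
    # Build every ordered pair (o_i, o_j) as one concatenated vector.
    pairs = np.concatenate(
        [np.repeat(objects, n, axis=0), np.tile(objects, (n, 1))], axis=1)
    g_out = mlp(pairs, g_w1, g_w2)   # one relation vector per pair
    pooled = g_out.sum(axis=0)       # order-invariant aggregation
    return mlp(pooled[None, :], f_w1, f_w2)[0]

# Random weights just to exercise the shapes: 4 objects of dimension 8.
d, h, out = 8, 16, 2
objs = rng.normal(size=(4, d))
y = relation_network(objs,
                     rng.normal(size=(2 * d, h)), rng.normal(size=(h, h)),
                     rng.normal(size=(h, h)), rng.normal(size=(h, out)))
```

Because the pairwise MLP is shared across all pairs and the sum is permutation-invariant, the module generalizes across scenes with different object arrangements.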

25 Years of CNNs: Can We Compare to Human Abstraction Capabilities?

The performance of LeNet is compared to that of GoogLeNet at classifying randomly generated images differentiated by an abstract property, showing that there is still work to do before networks can solve vision problems that humans solve without much difficulty.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

Measuring abstract reasoning in neural networks

A dataset and challenge designed to probe abstract reasoning, inspired by a well-known human IQ test, is proposed and ways to both measure and induce stronger abstract reasoning in neural networks are introduced.

CORnet: Modeling the Neural Mechanisms of Core Object Recognition

The current best ANN model derived from this approach (CORnet-S) is among the top models on Brain-Score, a composite benchmark for comparing models to the brain, but is simpler than other deep ANNs in terms of the number of convolutions performed along the longest path of information processing in the model.

Very deep convolutional neural network based image classification using small training sample size

The results show that very deep CNNs can fit small datasets with simple, appropriate modifications, without re-designing specialized small networks.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
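The residual idea is that a block computes y = ReLU(x + F(x)) instead of y = ReLU(F(x)), so the block can fall back to (near-)identity when the residual branch F contributes nothing, which eases optimization of very deep stacks. A minimal NumPy sketch (the two-layer fully connected branch is an illustrative stand-in for the convolutional branch used in ResNets):

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, w1, w2):
    """y = ReLU(x + F(x)): the skip connection lets the block
    approximate the identity when F's weights are near zero."""
    f = np.maximum(x @ w1, 0) @ w2   # residual branch F(x)
    return np.maximum(x + f, 0)      # add skip connection, then activate

d = 8
x = rng.normal(size=(4, d))
# With zero weights the residual branch vanishes and the block
# reduces to ReLU(x) -- the identity on non-negative inputs.
y = residual_block(x, np.zeros((d, d)), np.zeros((d, d)))
```

This degenerate case is exactly why depth hurts less with skip connections: an extra block never has to "unlearn" the identity mapping, only refine it.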