• Corpus ID: 232092291

# Coordination Among Neural Modules Through a Shared Global Workspace

@article{Goyal2021CoordinationAN,
title={Coordination Among Neural Modules Through a Shared Global Workspace},
author={Anirudh Goyal and Aniket Didolkar and Alex Lamb and Kartikeya Badola and Nan Rosemary Ke and Nasim Rahaman and Jonathan Binas and Charles Blundell and Michael C. Mozer and Yoshua Bengio},
journal={ArXiv},
year={2021},
volume={abs/2103.01197}
}
• Published 1 March 2021
• Computer Science
• ArXiv
Deep learning has seen a movement away from representing examples with a monolithic hidden state towards a richly structured state. For example, Transformers segment by position, and object-centric architectures decompose images into entities. In all these architectures, interactions between different elements are modeled via pairwise interactions: Transformers make use of self-attention to incorporate information from other positions and object-centric architectures make use of graph neural…
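The shared-workspace mechanism that this paper proposes as an alternative to all-pairs attention can be made concrete. Below is a minimal PyTorch sketch, not the authors' reference implementation: the names `SharedWorkspace`, `n_slots`, and `topk` are mine, and the top-k write competition is one plausible way to realize the limited-capacity bottleneck the paper describes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedWorkspace(nn.Module):
    """Specialists compete to write into a few shared slots; the updated
    slots are then broadcast back to every specialist via attention."""

    def __init__(self, d_model: int, n_slots: int = 4, topk: int = 2):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(n_slots, d_model))
        self.write_q = nn.Linear(d_model, d_model)    # slots query specialists
        self.write_kv = nn.Linear(d_model, 2 * d_model)
        self.read_q = nn.Linear(d_model, d_model)     # specialists query slots
        self.read_kv = nn.Linear(d_model, 2 * d_model)
        self.topk = topk                              # assumes topk <= n_specialists

    def forward(self, h):                  # h: (batch, n_specialists, d_model)
        slots = self.slots.unsqueeze(0).expand(h.size(0), -1, -1)

        # Write phase: each slot attends over the specialists, but only the
        # top-k highest-scoring specialists per slot get through, so writing
        # is a limited-capacity bottleneck that induces competition.
        q = self.write_q(slots)
        k, v = self.write_kv(h).chunk(2, dim=-1)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        kth = scores.topk(self.topk, dim=-1).values[..., -1:]
        scores = scores.masked_fill(scores < kth, float("-inf"))
        slots = slots + F.softmax(scores, dim=-1) @ v

        # Broadcast phase: every specialist reads from the updated slots.
        q = self.read_q(h)
        k, v = self.read_kv(slots).chunk(2, dim=-1)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        return h + F.softmax(scores, dim=-1) @ v
```

The key contrast with self-attention is capacity: with only a handful of slots, specialists cannot all write at once, whereas pairwise attention lets every element exchange information with every other element.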

## Citations

• Computer Science
NeurIPS
• 2021
Luna is proposed, a linear unified nested attention mechanism that approximates softmax attention with two nested linear attention functions, yielding only linear time and space complexity (a sketch of this nested-attention pattern appears after this list).
This survey covers systematic generalization and how machine learning addresses it, and looks into systematic generalization in the language, vision, and VQA fields.
• Computer Science
ICLR
• 2022
This work proposes a novel attention mechanism, called Compositional Attention, that replaces the standard head structure, and demonstrates that it outperforms standard multi-head attention on a variety of tasks, including some out-of-distribution settings.
• Computer Science
ArXiv
• 2022
It is formally proven that the SEM representation leads to better generalization than an unnormalized representation, and it is empirically demonstrated that SSL methods trained with SEMs have improved generalization on natural image datasets such as CIFAR-100 and ImageNet.
• Computer Science
NeurIPS
• 2021
This work introduces local module composition (LMC), an approach to modular CL where each module is provided a local structural component that estimates a module's relevance to the input, and demonstrates that agnosticity to task identities (IDs) arises from (local) structural learning that is module-specific, as opposed to task- and/or model-specific as in previous works.
• Economics
ArXiv
• 2022
The need for coordination among the agents further aggravates the problem of learning in multi-agent settings.
• Computer Science
ArXiv
• 2022
The proposed approach aims to retain the expressiveness of the Transformer while encouraging better compression and structuring of representations in the slow stream, and shows the benefits of the method in terms of improved sample efficiency and generalization performance compared to various competitive baselines.
• Computer Science
ArXiv
• 2022
This paper proposes an alternative approach whereby agents communicate through an intelligent facilitator that learns to sift through and interpret signals provided by all agents to improve the agents’ collective performance.
• Computer Science
• 2021
The experiments show that discrete-valued neural communication (DVNC) substantially improves systematic generalization in a variety of architectures (Transformers, modular architectures, and graph neural networks), and that DVNC is robust to the choice of hyperparameters, making the method useful in practice (a discretization sketch follows after this list).
• Computer Science
NeurIPS
• 2021
The hypothesis that restricting the transmitted information among components to discrete representations is a beneficial bottleneck is explored and a theoretical justification of the discretization process is established, proving that it has the ability to increase noise robustness and reduce the underlying dimensionality of the model.
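The Luna entry above ("two nested linear attention functions") can be illustrated with a short sketch: attention is routed through a fixed-length auxiliary sequence, so each of the two attention calls is linear in the input length. The names `LunaBlock`, `p_len`, and `attend` are mine, and the real model adds projections, causal variants, and feed-forward layers; this only shows the complexity argument.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def attend(q, k, v):
    """Softmax attention; cost is O(len(q) * len(k))."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v

class LunaBlock(nn.Module):
    def __init__(self, d_model: int, p_len: int = 16):
        super().__init__()
        # Fixed-length "pack" sequence; p_len does not grow with the input.
        self.p = nn.Parameter(torch.randn(p_len, d_model))

    def forward(self, x):                    # x: (batch, seq_len, d_model)
        p = self.p.unsqueeze(0).expand(x.size(0), -1, -1)
        packed = attend(p, x, x)             # pack:   O(p_len * seq_len)
        return attend(x, packed, packed)     # unpack: O(seq_len * p_len)
```

Because `p_len` is a constant, the total cost is linear in `seq_len` rather than quadratic.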
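The two discretization entries above both pass inter-module messages through a shared codebook. A minimal sketch of that bottleneck with a straight-through gradient estimator follows; `DiscreteBottleneck` and `codebook_size` are illustrative names, and the codebook/commitment losses used to train such quantizers in practice are omitted for brevity.

```python
import torch
import torch.nn as nn

class DiscreteBottleneck(nn.Module):
    def __init__(self, d_model: int, codebook_size: int = 64):
        super().__init__()
        self.codebook = nn.Parameter(torch.randn(codebook_size, d_model))

    def forward(self, z):                         # z: (batch, n, d_model)
        flat = z.reshape(-1, z.size(-1))
        dists = torch.cdist(flat, self.codebook)  # distance to each code vector
        idx = dists.argmin(dim=-1)                # nearest-neighbor lookup
        q = self.codebook[idx].reshape_as(z)      # quantized messages
        # Straight-through estimator: the forward pass emits the discrete
        # codes q, while gradients flow through z unchanged.
        return z + (q - z).detach()
```

Restricting messages to a finite set of code vectors is what the entries above identify as the source of improved noise robustness and systematic generalization.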

## References


• Computer Science
NeurIPS
• 2018
A new memory module, a Relational Memory Core (RMC), is used which employs multi-head dot product attention to allow memories to interact, and achieves state-of-the-art results on the WikiText-103, Project Gutenberg, and GigaWord datasets.
• Computer Science
ICML
• 2018
This work generalizes a recently proposed model architecture based on self-attention, the Transformer, to a sequence modeling formulation of image generation with a tractable likelihood, and significantly increases the size of images the model can process in practice while maintaining significantly larger receptive fields per layer than typical convolutional neural networks.
• Computer Science
NIPS
• 2017
A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, applying successfully to English constituency parsing with both large and limited training data (the core attention equation is reproduced after this reference list).
• Computer Science
ICLR
• 2015
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework (the update rule is transcribed after this reference list).
• Computer Science
ArXiv
• 2021
This work proposes Transformers with Independent Mechanisms (TIM), a new Transformer layer which divides the hidden representation and parameters into multiple mechanisms, which only exchange information through attention, and proposes a competition mechanism which encourages these mechanisms to specialize over time steps, and thus be more independent.
• Computer Science
ICLR
• 2021
Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
• Computer Science
NeurIPS
• 2019
This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.
• Computer Science
AAMAS
• 2019
The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem, and an open-source deep multi-agent RL framework including state-of-the-art algorithms is released.
• Computer Science
ICML
• 2019
This work presents an attention-based neural network module, the Set Transformer, specifically designed to model interactions among elements in the input set, which reduces the computation time of self-attention from quadratic to linear in the number of elements in the set.
• Biology, Psychology
Science
• 2017
It is argued that despite their recent successes, current machines are still mostly implementing computations that reflect unconscious processing in the human brain, and the word “consciousness” conflates two different types of information-processing computations in the brain.
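For reference, the scaled dot-product attention at the heart of the Transformer reference above (and of the shared-workspace model that cites it) is

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V,$$

where $d_k$ is the key dimension; multi-head attention applies this in parallel over learned projections, $\mathrm{head}_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V})$.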
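Likewise, the Adam reference admits a compact transcription of its published update rule. The sketch below is a plain-NumPy illustration, not a production optimizer.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, 2015). t is the 1-based step count."""
    m = b1 * m + (1 - b1) * grad          # biased first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2     # biased second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias-corrected moments
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```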