# Neural Arithmetic Units

@article{Madsen2020NeuralAU, title={Neural Arithmetic Units}, author={Andreas Madsen and Alexander Rosenberg Johansen}, journal={ArXiv}, year={2020}, volume={abs/2001.05016} }

Neural networks can approximate complex functions, but they struggle to perform exact arithmetic operations over real numbers. The lack of inductive bias for arithmetic operations leaves neural networks without the underlying logic necessary to extrapolate on tasks such as addition, subtraction, and multiplication. We present two new neural network components: the Neural Addition Unit (NAU), which can learn exact addition and subtraction; and the Neural Multiplication Unit (NMU) that can…
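As a rough illustration of the two components named in the abstract, the sketch below shows only a forward pass in plain Python. The formulas are an assumption drawn from the commonly cited formulation of the paper, not from the truncated abstract above: the NAU is taken to be a linear layer with weights clamped to [-1, 1], and the NMU a product over inputs gated by weights clamped to [0, 1]. The function names `nau_forward` and `nmu_forward` are hypothetical, and training details such as regularization are omitted.

```python
from math import prod


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def nau_forward(W, x):
    """Assumed NAU forward pass: z_j = sum_i W[j][i] * x[i], W clamped to [-1, 1].

    A weight of 1 adds an input, -1 subtracts it, and 0 ignores it.
    """
    return [sum(clamp(w, -1.0, 1.0) * xi for w, xi in zip(row, x)) for row in W]


def nmu_forward(W, x):
    """Assumed NMU forward pass: z_j = prod_i (W[j][i] * x[i] + 1 - W[j][i]).

    With W clamped to [0, 1], a weight of 1 multiplies the input in and a
    weight of 0 replaces it with the multiplicative identity 1.
    """
    out = []
    for row in W:
        gates = [clamp(w, 0.0, 1.0) for w in row]
        out.append(prod(g * xi + 1.0 - g for g, xi in zip(gates, x)))
    return out


x = [2.0, 3.0, 5.0]
print(nau_forward([[1.0, -1.0, 0.0]], x))  # x0 - x1 -> [-1.0]
print(nmu_forward([[1.0, 1.0, 0.0]], x))   # x0 * x1 -> [6.0]
```

With weights at the discrete values {-1, 0, 1} for the NAU and {0, 1} for the NMU, the outputs are exact sums and products of the selected inputs, which is the kind of behaviour that allows extrapolation beyond the training range.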

## 32 Citations

### Neural Power Units

- Computer Science
- NeurIPS
- 2020

The Neural Power Unit (NPU) is introduced, which operates on the full domain of real numbers and can learn arbitrary power functions in a single layer, fixing the shortcomings of existing arithmetic units and extending their expressivity.

### Learning Division with Neural Arithmetic Logic Modules

- Computer Science
- ArXiv
- 2021

It is shown that robustly learning division in a systematic manner remains a challenge even at the simplest level of dividing two numbers, and two novel approaches for division are proposed: the Neural Reciprocal Unit (NRU) and the Neural Multiplicative Reciprocal Unit (NMRU).

### Neural Status Registers

- Computer Science
- ArXiv
- 2020

The Neural Status Register (NSR) is introduced, inspired by physical status registers. At its heart are arithmetic comparisons between inputs that allow end-to-end differentiation, and it learns such comparisons reliably.

### Exploring the Learning Mechanisms of Neural Division Modules

- Computer Science, Mathematics
- 2022

It is shown that robustly learning division in a systematic manner remains a challenge even at the simplest level of dividing two numbers, and novel approaches to division are proposed: the Neural Reciprocal Unit (NRU) and the Neural Multiplicative Reciprocal Unit (NMRU).

### A Primer for Neural Arithmetic Logic Modules

- Computer Science
- ArXiv
- 2021

Focusing on the shortcomings of NALU, an in-depth analysis is provided to reason about the design choices of recent units and to highlight inconsistencies in a fundamental experiment that prevent direct comparison across papers.

### Fast Neural Models for Symbolic Regression at Scale

- Computer Science
- 2020

This work introduces OccamNet, a neural network model that finds interpretable, compact, and sparse solutions for fitting data, à la Occam’s razor, and introduces a two-step optimization method that samples functions and updates the weights with backpropagation based on cross-entropy matching in an evolutionary strategy.

### Evolutionary Training and Abstraction Yields Algorithmic Generalization of Neural Computers

- Computer Science
- Nat. Mach. Intell.
- 2020

The Neural Harvard Computer is presented, a memory-augmented, network-based architecture that employs abstraction by decoupling algorithmic operations from data manipulations, realized by splitting the information flow and using separate modules, to enable the learning of robust and scalable algorithmic solutions.

### How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks

- Computer Science
- ICLR
- 2021

The success of GNNs in extrapolating algorithmic tasks to new data relies on encoding task-specific non-linearities in the architecture or features, and a hypothesis is suggested for which theoretical and empirical evidence is provided.

### Learning Arithmetic Operations With A Multistep Deep Learning

- Computer Science
- 2020 International Joint Conference on Neural Networks (IJCNN)
- 2020

It is shown that this mechanism, applied to a simple multilayer perceptron, can significantly improve its performance when learning either multi-digit addition or multiplication, which are simple yet challenging operations to learn.

### Transformers discover an elementary calculation system exploiting local attention and grid-like problem representation

- Computer Science
- 2022 International Joint Conference on Neural Networks (IJCNN)
- 2022

It is shown that universal transformers equipped with local attention and adaptive halting mechanisms can learn to exploit an external, grid-like memory to carry out multi-digit addition.

## References

### Neural Arithmetic Logic Units

- Computer Science
- NeurIPS
- 2018

Experiments show that NALU-enhanced neural networks can learn to track time, perform arithmetic over images of numbers, translate numerical language into real-valued scalars, execute computer code, and count objects in images.

### Measuring Arithmetic Extrapolation Performance

- Computer Science
- NeurIPS 2019
- 2019

It is found that consistently learning arithmetic extrapolation is challenging, in particular for multiplication, in the first extensive evaluation with respect to convergence of the NALU and its sub-units.

### Neural GPUs Learn Algorithms

- Computer Science
- ICLR
- 2016

It is shown that the Neural GPU can be trained on short instances of an algorithmic task and successfully generalize to long instances, and a technique for training deep recurrent networks, parameter sharing relaxation, is introduced.

### Understanding the difficulty of training deep feedforward neural networks

- Computer Science
- AISTATS
- 2010

The objective here is to understand better why standard gradient descent from random initialization is doing so poorly with deep neural networks, to better understand these recent relative successes and help design better algorithms in the future.

### Neural Arithmetic Expression Calculator

- Computer Science
- ArXiv
- 2018

This paper presents a pure neural solver for the arithmetic expression calculation (AEC) problem, which includes addition, subtraction, multiplication, division, and bracketing operations, and treats arithmetic expression calculation as a hierarchical reinforcement learning problem.

### Grid Long Short-Term Memory

- Computer Science
- ICLR
- 2016

The Grid LSTM is used to define a novel two-dimensional translation model, the Reencoder, and it is shown that it outperforms a phrase-based reference system on a Chinese-to-English translation task.

### Adam: A Method for Stochastic Optimization

- Computer Science
- ICLR
- 2015

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

### On Evaluating the Generalization of LSTM Models in Formal Languages

- Computer Science
- ArXiv
- 2018

This paper empirically evaluates the inductive learning capabilities of Long Short-Term Memory networks, a popular extension of simple RNNs, to learn simple formal languages.

### Improving the Neural GPU Architecture for Algorithm Learning

- Computer Science
- ArXiv
- 2017

The proposed architecture is the first capable of learning decimal multiplication end-to-end, and a new, generally applicable technique for active-memory models is introduced: hard nonlinearities with saturation costs.

### Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks

- Computer Science
- ICML
- 2018

This paper introduces the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences, and tests the zero-shot generalization capabilities of a variety of recurrent neural networks trained on SCAN with sequence-to-sequence methods.