Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers

Georgios P. Spithourakis and Sebastian Riedel
Numeracy is the ability to understand and work with numbers. It is a necessary skill for composing and understanding documents in clinical, scientific, and other technical domains. In this paper, we explore different strategies for modelling numerals with language models, such as memorisation and digit-by-digit composition, and propose a novel neural architecture that uses a continuous probability density function to model numerals from an open vocabulary. Our evaluation on clinical and… 
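The abstract's key idea, modelling numerals with a continuous probability density rather than a fixed softmax vocabulary, can be sketched as a mixture of Gaussians whose parameters would, in the paper's setting, be predicted from the language model's hidden state. The parameter values and the "blood pressure" context below are illustrative assumptions, not the paper's implementation.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a single Gaussian component."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_density(x, weights, mus, sigmas):
    """Mixture-of-Gaussians density over numeral values. In the paper's
    setting the weights, means, and scales would be emitted by the LM
    conditioned on context; here they are fixed for illustration."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))

# Hypothetical parameters for a context such as "blood pressure is __":
weights, mus, sigmas = [0.6, 0.4], [120.0, 80.0], [10.0, 8.0]

# Score candidate numerals from an open vocabulary by their density;
# any number can be scored, not just numerals seen during training.
candidates = [80, 118, 120, 250]
scores = {c: mixture_density(c, weights, mus, sigmas) for c in candidates}
best = max(scores, key=scores.get)
```

Because the density is defined over the whole real line, out-of-vocabulary numerals such as 118 still receive sensible probability mass, which is the advantage over pure memorisation.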


Numeracy enhances the Literacy of Language Models

A significant improvement in masked word prediction (MWP) is found for sentences containing numbers; exponent embeddings are the best number encoders, yielding an over 2-point jump in prediction accuracy over a BERT baseline; and these enhanced literacy skills also generalize to contexts without annotated numbers.
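The exponent embeddings mentioned above can be sketched as a lookup table keyed by a number's base-10 order of magnitude. The embedding dimension, exponent range, and random initialisation below are illustrative assumptions; in practice the table would be trained jointly with the language model.

```python
import math
import random

random.seed(0)
EXP_RANGE = range(-5, 13)  # assumed range of supported exponents
DIM = 4                    # illustrative embedding dimension
embedding_table = {e: [random.gauss(0, 1) for _ in range(DIM)] for e in EXP_RANGE}

def exponent_embedding(x):
    """Look up the embedding for floor(log10(|x|)); 0 maps to exponent 0."""
    e = 0 if x == 0 else math.floor(math.log10(abs(x)))
    e = max(min(e, max(EXP_RANGE)), min(EXP_RANGE))  # clip out-of-range exponents
    return embedding_table[e]

# Numbers of the same order of magnitude share one embedding:
same_scale = exponent_embedding(2500) == exponent_embedding(9100)   # both ~10^3
diff_scale = exponent_embedding(2500) != exponent_embedding(25.0)   # 10^3 vs 10^1
```

Collapsing every numeral to its exponent discards mantissa information by design; the trade-off is a tiny, dense representation of magnitude that the encoder can exploit.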

Predicting Numerals in Natural Language Text Using a Language Model Considering the Quantitative Aspects of Numerals

This paper measures the NCS acquired by existing neural language models using a masked numeral prediction task and proposes methods that reflect not only the symbolic aspect but also the quantitative aspect of numerals during training, using a loss function that depends on the magnitudes of the numerals and a regression model for the masked numeral prediction task.
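A magnitude-dependent loss of the kind described above can be illustrated as a squared error in log10 space, so that an error of one order of magnitude costs the same regardless of scale. This is a sketch of the general idea, not the paper's exact objective.

```python
import math

def log_magnitude_loss(predicted, target, eps=1e-9):
    """Illustrative magnitude-aware regression loss: squared error in
    log10 space. Mispredicting 100 as 1000 costs the same as
    mispredicting 1 as 10; eps guards against log(0)."""
    return (math.log10(abs(predicted) + eps) - math.log10(abs(target) + eps)) ** 2

# An off-by-one-order-of-magnitude error costs ~1.0 at any scale:
small = log_magnitude_loss(10, 1)
large = log_magnitude_loss(1000, 100)
```

Contrast this with a symbolic cross-entropy loss, which penalises predicting "1000" for a target of "100" exactly as harshly as predicting "7" - the quantitative closeness of numerals is invisible to it.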

NQuAD: 70,000+ Questions for Machine Comprehension of the Numerals in Text

A Numeral-related Question Answering Dataset, NQuAD, for fine-grained numeracy is presented, several baselines for future work are proposed, and NQuAD is shown to be more challenging than the numeral-related questions in other datasets.

Pre-training and evaluation of numeracy-oriented language model

This work proposes two numerical pre-training methods with objectives that encourage the LM to understand the magnitude and value of numbers and encode the dependency between a number and its context, and applies the proposed methods on BERT.

Do NLP Models Know Numbers? Probing Numeracy in Embeddings

This work investigates the numerical reasoning capabilities of a state-of-the-art question answering model on the DROP dataset and finds that this model excels on questions that require numerical reasoning, i.e., it already captures numeracy.

Methods for Numeracy-Preserving Word Embeddings

DICE outperforms a wide range of pre-trained word embedding models on two sets of tasks: capturing numeration and magnitude, and performing list maximum, decoding, and addition.
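The deterministic-embedding idea behind DICE can be sketched in a simplified two-dimensional form: map a number to an angle so that the cosine similarity between two embeddings decreases monotonically as the gap between the values grows. The value range and the 2-D reduction below are illustrative assumptions, not the paper's full construction.

```python
import math

def dice_2d(x, lo=0.0, hi=100.0):
    """Simplified 2-D sketch of a deterministic numeral embedding:
    place each number on an arc so that embedding similarity tracks
    closeness in value. The range [lo, hi] is an assumption."""
    theta = math.pi * (x - lo) / (hi - lo)
    return (math.cos(theta), math.sin(theta))

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return dot / (norm_u * norm_v)

# Nearby numbers receive more similar embeddings than distant ones:
sim_near = cosine(dice_2d(10), dice_2d(12))
sim_far = cosine(dice_2d(10), dice_2d(90))
```

Because the mapping is deterministic and corpus-independent, every number gets an embedding without ever appearing in training data, which is the property the evaluation tasks above probe.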

Exploring Numeracy in Word Embeddings

Inspired by cognitive studies on how humans perceive numbers, an analysis framework is developed to test how well word embeddings capture two essential properties of numbers: magnitude and numeration.

Just Add Functions: A Neural-Symbolic Language Model

A general methodology to enhance the inductive bias of NNLMs by incorporating simple functions into a neural architecture to form a hierarchical neural-symbolic language model (NSLM), in which the functions explicitly encode deterministic symbolic relationships to form probability distributions over words.

Representing Numbers in NLP: a Survey and a Vision

This work synthesizes best practices for representing numbers in text and articulates a vision for holistic numeracy in NLP, comprising design trade-offs and a unified evaluation.

Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments

This paper attempts to answer the question of whether neural network models can learn numeracy, which is the ability to predict the magnitude of a numeral at some specific position in a text description, through comprehensive experiments.

Clinical Text Prediction with Numerically Grounded Conditional Language Models

This paper investigates how grounded and conditional extensions to standard neural language models can bring improvements in the tasks of word prediction and completion and performs a qualitative investigation of how models with lower perplexity occasionally fare better at the tasks.

Learning To Use Formulas To Solve Simple Arithmetic Problems

A novel method to learn to use formulas to solve simple arithmetic word problems, which beats the state of the art by solving 86.07% of the problems in a corpus of standard primary school test questions.

Strategies for Training Large Vocabulary Neural Language Models

A systematic comparison of strategies to represent and train large vocabularies, including softmax, hierarchical softmax, target sampling, noise contrastive estimation, and self-normalization; the work extends self-normalization to be a proper estimator of likelihood and introduces an efficient variant of softmax.

GloVe: Global Vectors for Word Representation

A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.

Learning to Solve Arithmetic Word Problems with Verb Categorization

The paper analyzes the arithmetic word problem "genre", identifying seven categories of verbs used in such problems, reports the first learning results on this task without reliance on predefined templates, and makes the data publicly available.

Neural Text Generation from Structured Data with Application to the Biography Domain

A neural model for concept-to-text generation is introduced that scales to large, rich domains and significantly outperforms a classical Kneser-Ney language model adapted to this task by nearly 15 BLEU.

Deep Neural Language Models for Machine Translation

It is demonstrated that deep NLMs with three or four layers outperform those with fewer layers in terms of both the perplexity and the translation quality.

Analysing a simple language model: some general conclusions for language models for speech recognition

  • J. Ueberla, Comput. Speech Lang., 1994
A method to measure which parts of a language model perform particularly well or poorly is introduced, which can be a valuable help in steering future research efforts.

Numerically Grounded Language Models for Semantic Error Correction

Grounding language models in the numbers that appear within the text improves perplexity and improves F1 for semantic error correction by 5 points compared to ungrounded approaches, and conditioning on a knowledge base yields further improvements.

Pointer Sentinel Mixture Models

The pointer sentinel-LSTM model achieves state-of-the-art language modeling performance on the Penn Treebank while using far fewer parameters than a standard softmax LSTM; the freely available WikiText corpus is also introduced.