# NaturalProofs: Mathematical Theorem Proving in Natural Language

```bibtex
@article{Welleck2021NaturalProofsMT,
  title   = {NaturalProofs: Mathematical Theorem Proving in Natural Language},
  author  = {Sean Welleck and Jiacheng Liu and Ronan Le Bras and Hannaneh Hajishirzi and Yejin Choi and Kyunghyun Cho},
  journal = {ArXiv},
  year    = {2021},
  volume  = {abs/2104.01112}
}
```

Understanding and creating mathematics using natural mathematical language – the mixture of symbolic and natural language used by humans – is a challenging and important problem for driving progress in machine learning. As a step in this direction, we develop NATURALPROOFS, a multi-domain corpus of mathematical statements and their proofs, written in natural mathematical language. NATURALPROOFS unifies broad coverage, deep coverage, and low-resource mathematical sources, allowing for evaluating…


## 7 Citations

### NaturalProver: Grounded Mathematical Proof Generation with Language Models

- Computer Science, ArXiv
- 2022

NaturalProver is capable of proving some theorems that require short (2–6 step) proofs and of providing next-step suggestions that are rated as correct and useful over 40% of the time, which is, to the authors' knowledge, the first demonstration of these capabilities using neural language models.

### Solving Quantitative Reasoning Problems with Language Models

- Computer Science, ArXiv
- 2022

Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. Nevertheless, state-of-the-art models have generally struggled with tasks…

### Towards Grounded Natural Language Proof Generation

- Computer Science
- 2021

An initial study of two generation tasks in natural mathematical language: suggesting the next step in a proof, and full-proof generation, and finds that conditioning on retrieved or ground-truth knowledge greatly improves generations.

### On the Paradox of Learning to Reason from Data

- Computer Science, ArXiv
- 2022

This study provides an explanation for this paradox: instead of learning to emulate the correct reasoning function, BERT has in fact learned statistical features that inherently exist in logical reasoning problems.

### AbductionRules: Training Transformers to Explain Unexpected Inputs

- Computer Science, FINDINGS
- 2022

AbductionRules, a group of natural-language datasets designed to train and test generalisable abduction over natural-language knowledge bases, is presented; the authors find that the models learned generalisable abductive techniques but also learned to exploit the structure of the data.

### A Survey in Mathematical Language Processing

- Computer Science, ArXiv
- 2022

This work tracks the development of informal mathematical language processing approaches across strategic sub-areas in recent years, highlighting the prevailing successful methodological elements along with existing limitations.

### MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics

- Computer Science, ICLR
- 2022

The miniF2F benchmark currently targets Metamath, Lean, and Isabelle and consists of 488 problem statements drawn from the AIME, AMC, and the International Mathematical Olympiad, as well as material from high-school and undergraduate mathematics courses.

## References

Showing 1–10 of 62 references

### Proof Artifact Co-training for Theorem Proving with Language Models

- Computer Science, ICLR
- 2022

PACT is proposed, a general methodology for extracting abundant self-supervised data from kernel-level proof terms for co-training alongside the usual tactic prediction objective and applied to Lean, an interactive proof assistant which hosts some of the most sophisticated formalized mathematics to date.

### IsarStep: a Benchmark for High-level Mathematical Reasoning

- Computer Science, ICLR
- 2021

A benchmark for high-level mathematical reasoning is presented and the reasoning capabilities of neural sequence-to-sequence models are studied and a hierarchical transformer is designed that outperforms the transformer baseline.

### Premise Selection in Natural Language Mathematical Texts

- Computer Science, ACL
- 2020

This paper proposes an approach to solve the natural language premise selection task as a link prediction problem, using Deep Convolutional Graph Neural Networks and shows that a graph structure can provide higher F1-score, especially when considering multi-hop premise selection.

### ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language

- Computer Science, FINDINGS
- 2021

This work shows that a generative model, called ProofWriter, can reliably generate both implications of a theory and the natural language proofs that support them, and shows that generative techniques can perform a type of abduction with high precision.

### Generative Language Modeling for Automated Theorem Proving

- Computer Science, ArXiv
- 2020

This work presents an automated prover and proof assistant, GPT-f, for the Metamath formalization language and analyzes its performance. GPT-f found new short proofs that were accepted into the main Metamath library, which is, to the authors' knowledge, the first time a deep-learning-based system has contributed proofs adopted by a formal mathematics community.

### Exploration of neural machine translation in autoformalization of mathematics in Mizar

- Computer Science, CPP
- 2020

A custom type-elaboration mechanism is developed and integrated into the supervised translation of informal mathematics into formal mathematics, evaluated against three established neural machine translation models known to deliver competitive results when translating between natural languages.

### Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems

- Computer Science, ACL
- 2017

Experimental results show that indirect supervision of program learning via answer rationales is a promising strategy for inducing arithmetic programs.

### Analysing Mathematical Reasoning Abilities of Neural Models

- Computer Science, ICLR
- 2019

This paper conducts a comprehensive analysis of models from two broad classes of the most powerful sequence-to-sequence architectures and finds notable differences in their ability to resolve mathematical problems and generalize their knowledge.

### Mathematical Reasoning via Self-supervised Skip-tree Training

- Computer Science, ICLR
- 2021

It is found that models trained on the skip-tree task show surprisingly strong mathematical reasoning abilities and outperform models trained on standard skip-sequence tasks.

### Learning to Prove Theorems via Interacting with Proof Assistants

- Computer Science, ICML
- 2019

ASTactic, a deep learning-based model that generates tactics as programs in the form of abstract syntax trees (ASTs), can generate effective tactics and can be used to prove new theorems not previously provable by automated methods.