Corpus ID: 235422465

MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education

@article{Shen2021MathBERTAP,
  title={MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education},
  author={Jia Tracy Shen and Michiharu Yamashita and Ethan Prihar and Neil T. Heffernan and Xintao Wu and Dongwon Lee},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.07340}
}
Since the introduction of the original BERT (i.e., BASE BERT), researchers have developed various customized BERT models with improved performance for specific domains and tasks by exploiting the benefits of transfer learning. Due to the nature of mathematical texts, which often use domain-specific vocabulary along with equations and math symbols, we posit that the development of a new BERT model for mathematics would be useful for many mathematical downstream tasks. In this resource paper, we…
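As a rough illustration of how a domain-adapted BERT checkpoint like this is typically probed, the sketch below queries it with the masked-language-model objective it was pre-trained on. It is a minimal sketch using the Hugging Face transformers library; the checkpoint name tbs17/MathBERT is assumed to be the authors' public release, and any BERT-compatible checkpoint can be substituted.

```python
# Minimal sketch: masked-token prediction with a domain-adapted BERT checkpoint.
# The checkpoint name "tbs17/MathBERT" is assumed to be the public release for
# this paper; substitute any BERT-compatible checkpoint if it is unavailable.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="tbs17/MathBERT")
for pred in fill_mask("The derivative of x^2 with respect to x is [MASK]."):
    print(f"{pred['token_str']:>12}  p={pred['score']:.3f}")
```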
Training Verifiers to Solve Math Word Problems
It is demonstrated that verification significantly improves performance on GSM8K, and there is strong empirical evidence that verification scales more effectively with increased data than a finetuning baseline.

References

Showing 1-10 of 40 references
Classifying Math Knowledge Components via Task-Adaptive Pre-Trained BERT
This work significantly improves on prior research on auto-labeling educational content by expanding the input types to include KC descriptions, instructional video titles, and problem descriptions, and by proposing a simple evaluation measure with which 56-73% of mispredicted KC labels can be recovered.
Generalizability of Methods for Imputing Mathematical Skills Needed to Solve Problems from Texts
This work identified two major issues that caused the original model to overfit to the training set; addressing them significantly improved test accuracy, although classification accuracy remains far from usable in a real-world application.
Mathematical Language Processing: Automatic Grading and Feedback for Open Response Mathematical Questions
This paper studies the problem of automatically grading the kinds of open-response mathematical questions that figure prominently in STEM courses, and develops two clustering approaches: one that leverages generic clustering algorithms and one based on Bayesian nonparametrics.
Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks
It is consistently found that multi-phase adaptive pretraining offers large gains in task performance, and it is shown that adapting to a task corpus augmented using simple data selection strategies is an effective alternative, especially when resources for domain-adaptive pretraining might be unavailable.
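As a minimal sketch of the domain-adaptive pretraining recipe this reference discusses, the code below continues masked-language-model training of a generic BERT checkpoint on an unlabeled in-domain corpus; the checkpoint, corpus path, and hyperparameters are placeholders rather than values from the paper.

```python
# Sketch of domain-adaptive pretraining: continue masked-LM training on raw
# in-domain text. "domain_corpus.txt" (one document per line) is a placeholder.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(output_dir="dapt-bert",
                         per_device_train_batch_size=16, num_train_epochs=1)
Trainer(model=model, args=args, train_dataset=dataset,
        data_collator=collator).train()
```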
E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce
A unified pre-training framework, E-BERT, is proposed that preserves phrase-level knowledge by allowing the model to adaptively switch from learning preliminary word knowledge to learning complex phrases, based on the fitting progress of the two modes.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; it can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
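A minimal sketch of that fine-tuning pattern, a pre-trained encoder plus a single added output layer, is shown below; the checkpoint, label count, learning rate, and single optimization step are illustrative placeholders, not settings from the paper.

```python
# Sketch: fine-tuning a pre-trained BERT encoder with one added classification
# layer. A real run would loop over a labeled dataset; this shows one step.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # adds a randomly initialized head

batch = tokenizer(["An example sentence to classify."],
                  return_tensors="pt", padding=True, truncation=True)
labels = torch.tensor([1])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=labels).loss  # cross-entropy from the new head
loss.backward()
optimizer.step()
```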
SciBERT: A Pretrained Language Model for Scientific Text
SciBERT leverages unsupervised pretraining on a large multi-domain corpus of scientific publications to improve performance on downstream scientific NLP tasks and demonstrates statistically significant improvements over BERT.
A Memory-Augmented Neural Model for Automated Grading
An efficient memory-network-powered automated grading model for essay writing and open-ended assignments that learns to predict a score for an ungraded response by computing the relevance between the ungraded response and each selected response in memory.
Universal Language Model Fine-tuning for Text Classification
This work proposes Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and introduces techniques that are key for fine-tuning a language model.
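One of those techniques, discriminative fine-tuning, gives lower layers geometrically smaller learning rates. The sketch below applies the parameter-group idea to a BERT encoder rather than the AWD-LSTM language model used by ULMFiT; the base learning rate and the 2.6 divisor follow the paper's suggestion, but the rest is an illustrative assumption.

```python
# Sketch of ULMFiT-style discriminative fine-tuning: one learning rate per
# layer, decayed geometrically from the top of the network downwards.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
base_lr, decay = 2e-5, 2.6  # 2.6 is the per-layer divisor suggested by ULMFiT

layers = [model.bert.embeddings] + list(model.bert.encoder.layer)
param_groups = [{"params": layer.parameters(), "lr": base_lr / decay ** depth}
                for depth, layer in enumerate(reversed(layers))]  # top layer first
param_groups.append({"params": list(model.bert.pooler.parameters())
                               + list(model.classifier.parameters()),
                     "lr": base_lr})  # task head trains at the full base rate

optimizer = torch.optim.AdamW(param_groups)
```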
FinBERT: A Pre-trained Financial Language Representation Model for Financial Text Mining
This work presents FinBERT (BERT for Financial Text Mining), a domain-specific language model pre-trained on large-scale financial corpora, which outperforms all current state-of-the-art models.