Pointer Sentinel Mixture Models

Abstract

Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies. Even then, they struggle to predict rare or unseen words, even when the context makes the prediction unambiguous. We introduce the pointer sentinel mixture architecture for neural…

Statistics

[Chart: Citations per Year, 2016 to 2018]

89 Citations

Semantic Scholar estimates that this publication has 89 citations based on the available data.

Cite this paper

@article{Merity2016PointerSM,
  title={Pointer Sentinel Mixture Models},
  author={Stephen Merity and Caiming Xiong and James Bradbury and Richard Socher},
  journal={CoRR},
  year={2016},
  volume={abs/1609.07843}
}