Strategies for Training Large Vocabulary Neural Language Models

Wenlin Chen, David Grangier, Michael Auli
Training neural network language models over large vocabularies remains computationally costly compared to count-based models such as Kneser-Ney. At the same time, neural language models are gaining popularity in applications such as speech recognition and machine translation, whose success depends on scalability. We present a systematic comparison of strategies to represent and train large vocabularies, including softmax, hierarchical softmax, target sampling, noise contrastive…
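To make the abstract's comparison concrete, below is a minimal sketch of one of the listed strategies, target sampling: instead of normalizing over the full vocabulary, the softmax normalizer is approximated over the target word plus a small random subset of negative classes. This is an illustrative simplification, not the paper's exact formulation (a proper sampled softmax also corrects the sampled logits for the proposal distribution); all function names here are hypothetical.

```python
import math
import random

def full_softmax_loss(logits, target):
    """Exact cross-entropy: the normalizer sums over the whole vocabulary."""
    z = sum(math.exp(s) for s in logits)
    return -math.log(math.exp(logits[target]) / z)

def sampled_softmax_loss(logits, target, num_samples, rng):
    """Target sampling: approximate the normalizer with the target class
    plus `num_samples` uniformly drawn negative classes.

    Cost per example drops from O(|V|) to O(num_samples)."""
    negatives = rng.sample(
        [i for i in range(len(logits)) if i != target], num_samples
    )
    support = [target] + negatives
    z = sum(math.exp(logits[i]) for i in support)
    return -math.log(math.exp(logits[target]) / z)

# Toy comparison on random scores for a 10k-word vocabulary.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
exact = full_softmax_loss(logits, target=42)
approx = sampled_softmax_loss(logits, target=42, num_samples=500, rng=rng)
```

Because the sampled normalizer omits most classes, it can only underestimate the true normalizer, so the sampled loss lower-bounds the exact loss; practical implementations subtract the log proposal probability from each sampled logit to keep the gradient estimate unbiased.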
This paper has 83 citations and has been referenced on Twitter 92 times.





