Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing
@inproceedings{Fu2019CyclicalAS,
  title     = {Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing},
  author    = {Hao Fu and C. Li and Xiaodong Liu and Jianfeng Gao and A. Çelikyilmaz and L. Carin},
  booktitle = {NAACL},
  year      = {2019}
}
Variational autoencoders (VAEs) with an auto-regressive decoder have been applied to many natural language processing (NLP) tasks. The VAE objective consists of two terms, the KL regularization term and the reconstruction term, balanced by a weighting hyper-parameter β. One notorious training difficulty is that the KL term tends to vanish. In this paper we study different scheduling schemes for β, and show that KL vanishing is caused by the lack of good latent codes in training the decoder at …
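The cyclical schedule the paper proposes splits training into M cycles; within each cycle, β ramps linearly from 0 to 1 over the first fraction R of the cycle's steps and is then held at 1 for the remainder. A minimal sketch of that schedule (the function name and default values are illustrative, not from the paper's code):

```python
def cyclical_beta(step, total_steps, n_cycles=4, ratio=0.5):
    """Cyclical annealing schedule for the KL weight beta.

    Training is split into `n_cycles` cycles of equal length. Within each
    cycle, beta increases linearly from 0 to 1 over the first `ratio`
    fraction of steps, then stays clamped at 1 until the cycle ends.
    """
    period = total_steps / n_cycles
    tau = (step % period) / period  # progress within the current cycle, in [0, 1)
    return min(tau / ratio, 1.0)

# Example: 100 training steps, 4 cycles, ramp over the first half of each cycle.
# beta restarts at 0 every 25 steps, reaching 1 midway through each cycle.
betas = [cyclical_beta(t, 100) for t in range(100)]
```

Periodically resetting β to 0 gives the model repeated "free" phases in which the decoder can learn to exploit latent codes before the KL penalty is reapplied, which is the mechanism the abstract attributes the fix to.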
81 Citations
- Discretized Bottleneck in VAE: Posterior-Collapse-Free Sequence-to-Sequence Learning. arXiv, 2020. (Highly Influenced)
- Enhancing Variational Autoencoders with Mutual Information Neural Estimation for Text Generation. EMNLP-IJCNLP, 2019. (Highly Influenced)
- Discrete Auto-regressive Variational Attention Models for Text Modeling. 2020.
- Preventing Posterior Collapse with Levenshtein Variational Autoencoder. arXiv, 2020. (Highly Influenced)
- Discrete Variational Attention Models for Language Generation. arXiv, 2020. (Highly Influenced)
- Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space. EMNLP, 2020.
- Preventing Posterior Collapse in Sequence VAEs with Pooling. arXiv, 2019. (Highly Influenced)
- A Surprisingly Effective Fix for Deep Latent Variable Modeling of Text. EMNLP-IJCNLP, 2019. (Highly Influenced)
- A Batch Normalized Inference Network Keeps the KL Vanishing Away. ACL, 2020. (Highly Influenced)
References (showing 1-10 of 44)
- Improved Variational Autoencoders for Text Modeling using Dilated Convolutions. ICML, 2017.
- InfoVAE: Information Maximizing Variational Autoencoders. arXiv, 2017.
- Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders. ACL, 2017. (Highly Influential)
- beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. ICLR, 2017. (Highly Influential)
- Lagging Inference Networks and Posterior Collapse in Variational Autoencoders. ICLR, 2019.
- Isolating Sources of Disentanglement in Variational Autoencoders. NeurIPS, 2018.
- Generating Sentences from a Continuous Space. CoNLL, 2016. (Highly Influential)
- Language as a Latent Variable: Discrete Generative Models for Sentence Compression. EMNLP, 2016.