# Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing

@inproceedings{Fu2019CyclicalAS,
title={Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing},
author={Hao Fu and C. Li and Xiaodong Liu and Jianfeng Gao and A. Çelikyilmaz and L. Carin},
booktitle={NAACL},
year={2019}
}
Variational autoencoders (VAEs) with an auto-regressive decoder have been applied to many natural language processing (NLP) tasks. The VAE objective consists of two terms, the KL regularization term and the reconstruction term, balanced by a weighting hyper-parameter $\beta$. One notorious training difficulty is that the KL term tends to vanish. In this paper we study different scheduling schemes for $\beta$, and show that KL vanishing is caused by the lack of good latent codes in training decoder at…
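To make the idea of scheduling the KL weight concrete, here is a minimal Python sketch of one possible cyclical schedule together with the β-weighted objective described in the abstract. It is an illustration under stated assumptions, not necessarily the exact schedule from the paper: the function names `cyclical_beta` and `weighted_objective`, the linear ramp shape, and the defaults `n_cycles=4`, `ratio=0.5`, and `beta_max=1.0` are all hypothetical choices made for this example.

```python
def cyclical_beta(step, total_steps, n_cycles=4, ratio=0.5, beta_max=1.0):
    """Return the KL weight for a given training step.

    Assumed schedule (illustrative only): training is split into
    `n_cycles` equal cycles. Within each cycle, beta rises linearly
    from 0 to `beta_max` over the first `ratio` fraction of the cycle
    (ratio must be > 0), then stays at `beta_max` until the cycle ends.
    """
    cycle_len = total_steps / n_cycles
    # Position inside the current cycle, in [0, 1).
    pos = (step % cycle_len) / cycle_len
    return beta_max * min(1.0, pos / ratio)


def weighted_objective(recon_loss, kl, beta):
    """Loss to minimize: reconstruction term plus beta-weighted KL term."""
    return recon_loss + beta * kl


if __name__ == "__main__":
    total = 100
    for step in (0, 6, 13, 24, 25, 50):
        print(step, cyclical_beta(step, total))
```

Because β returns to 0 at the start of every cycle, the KL penalty is periodically switched off and then reintroduced, in contrast to a monotonic schedule that ramps β up only once.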
