SideControl: Controlled Open-domain Dialogue Generation via Additive Side Networks

Transformer-based pre-trained language models boost the performance of open-domain dialogue systems. Prior work uses these models to generate text with desired attributes through two general approaches: (1) gradient-based methods, which update all latent representations of the pre-trained model with gradients from attribute models; and (2) weighted-decoding methods, which re-rank beam candidates from the pre-trained model with attribute functions. However, gradient-based…

We set the batch size to 2 and the number of training epochs to 10, and we automatically evaluate the model on the validation set every 1000 iterations.
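The schedule described above can be sketched as a plain training loop. This is a minimal illustration, not the authors' implementation: the names `batches`, `train`, `train_step`, and `evaluate` are hypothetical, and we assume that one "iteration" means one update on one batch and that the iteration counter runs across epochs without resetting.

```python
from typing import Callable, Iterator, List, Sequence

BATCH_SIZE = 2      # batch size stated in the text
NUM_EPOCHS = 10     # number of training epochs stated in the text
EVAL_EVERY = 1000   # validation interval in iterations, stated in the text


def batches(examples: Sequence, batch_size: int) -> Iterator[List]:
    """Yield consecutive batches; the last one may be smaller."""
    for start in range(0, len(examples), batch_size):
        yield list(examples[start:start + batch_size])


def train(
    examples: Sequence,
    train_step: Callable[[List], None],   # hypothetical: one update on one batch
    evaluate: Callable[[], None],         # hypothetical: one validation pass
    batch_size: int = BATCH_SIZE,
    num_epochs: int = NUM_EPOCHS,
    eval_every: int = EVAL_EVERY,
) -> List[int]:
    """Run the loop and return the iterations at which validation ran."""
    eval_points: List[int] = []
    iteration = 0  # assumption: counted across epochs, never reset
    for _ in range(num_epochs):
        for batch in batches(examples, batch_size):
            train_step(batch)
            iteration += 1
            if iteration % eval_every == 0:
                evaluate()
                eval_points.append(iteration)
    return eval_points
```

Under these assumptions, a training set of 500 examples gives 250 iterations per epoch and 2500 iterations in total, so validation would run twice, at iterations 1000 and 2000.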

