SideControl: Controlled Open-domain Dialogue Generation via Additive Side Networks

@article{Du2021SideControlCO,
  title={SideControl: Controlled Open-domain Dialogue Generation via Additive Side Networks},
  author={Wanyu Du and Yangfeng Ji},
  journal={ArXiv},
  year={2021},
  volume={abs/2109.01958}
}
Transformer-based pre-trained language models boost the performance of open-domain dialogue systems. Prior works leverage Transformer-based pre-trained language models to generate texts with desired attributes via two general approaches: (1) gradient-based methods, which update all latent representations of pre-trained models with gradients from attribute models; (2) weighted-decoding methods, which re-rank beam candidates from pre-trained models with attribute functions. However, gradient-based…
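
The title's additive side-network alternative can be summarized in a few lines: a small trainable network whose output is added to a frozen pre-trained model's representations, so gradients never touch the base model. Below is a minimal PyTorch sketch under that assumption; the combination point (logits) and the layer shapes are illustrative, not the paper's exact architecture.

import torch
import torch.nn as nn

class AdditiveSideNetwork(nn.Module):
    """Small trainable network added on top of a frozen base LM.

    Only these parameters are updated; the pre-trained model's
    weights and latent representations stay untouched (sketch,
    not the paper's actual code).
    """

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.side = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, vocab_size),
        )

    def forward(self, base_logits: torch.Tensor, base_hidden: torch.Tensor) -> torch.Tensor:
        # Additive combination: the side network contributes a learned
        # correction to the frozen model's next-token logits.
        return base_logits + self.side(base_hidden)

Training would then optimize only the side-network parameters, e.g. torch.optim.Adam(side_net.parameters(), lr=1e-4), leaving the base model frozen.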

PLANET: Dynamic Content Planning in Autoregressive Transformers for Long-form Text Generation

PLANET is proposed, a novel generation framework leveraging the autoregressive self-attention mechanism to conduct content planning and surface realization dynamically, and it introduces a new coherence-based contrastive learning objective to further improve the coherence of the output.

Few-Shot Bot: Prompt-Based Learning for Dialogue Systems

An end-to-end chatbot named the Few-Shot Bot is created, which automatically selects the most appropriate conversational skill, queries different knowledge bases or the internet, and uses the retrieved knowledge to generate a human-like response, all using only a few dialogue examples per skill.

References

Showing 1-10 of 37 references

FUDGE: Controlled Text Generation With Future Discriminators

This work proposes Future Discriminators for Generation (FUDGE), a flexible and modular method for controlled text generation that enables conditioning on a desired attribute a while requiring access only to G’s output logits.
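
A minimal sketch of that decoding rule, assuming greedy decoding and a precomputed attr_log_probs holding log P(attribute | prefix + candidate) for every candidate token; in practice FUDGE scores only a top-k shortlist with the future discriminator, which is omitted here.

import torch

def fudge_step(lm_logits: torch.Tensor, attr_log_probs: torch.Tensor,
               weight: float = 1.0) -> torch.Tensor:
    # Combine the base generator G's next-token log-probabilities with
    # the future discriminator's per-candidate attribute scores, then
    # pick the best candidate (greedy decoding for simplicity).
    lm_log_probs = torch.log_softmax(lm_logits, dim=-1)
    return torch.argmax(lm_log_probs + weight * attr_log_probs, dim=-1)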

Recipes for Building an Open-Domain Chatbot

Human evaluations show the best models outperform existing approaches in multi-turn dialogue on engagingness and humanness measurements, and the limitations of this work are discussed by analyzing failure cases of the models.

Neural Conversation Model Controllable by Given Dialogue Act Based on Adversarial Learning and Label-aware Objective

An adversarial learning framework is introduced for generating conditional responses, with a new label-aware objective for the discriminator: it explicitly distinguishes sentences by their labels, which strongly encourages the generation of label-conditioned sentences.

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

The Plug and Play Language Model (PPLM) for controllable language generation is proposed, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM.
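
A sketch of the activation-level update this summary describes, where attr_loss_fn is a hypothetical stand-in for an attribute classifier's loss over perturbed latents; the KL and fluency terms of the full method are omitted.

import torch

def pplm_perturb(latents: torch.Tensor, attr_loss_fn,
                 step_size: float = 0.02, steps: int = 3) -> torch.Tensor:
    # Perturb the LM's latent activations along gradients from the
    # attribute model; the LM weights themselves stay frozen.
    delta = torch.zeros_like(latents, requires_grad=True)
    for _ in range(steps):
        loss = attr_loss_fn(latents + delta)
        loss.backward()
        with torch.no_grad():
            # Normalized step that lowers the attribute loss (i.e. raises
            # attribute likelihood when the loss is a negative log-likelihood).
            delta -= step_size * delta.grad / (delta.grad.norm() + 1e-10)
        delta.grad.zero_()
    return latents + delta.detach()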

The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents

dodecaDialogue is introduced, a set of 12 tasks that measures whether a conversational agent can communicate engagingly with personality and empathy; multi-tasking across the set provides gains to both text- and image-based tasks on several metrics, in both the fine-tune and task-transfer settings.

Decoupled Weight Decay Regularization

This work proposes a simple modification to recover the original formulation of weight decay regularization by decoupling the weight decay from the optimization steps taken w.r.t. the loss function, and provides empirical evidence that this modification substantially improves Adam's generalization performance.
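
The modification fits in one line: the decay is subtracted from the parameter directly instead of being folded into the gradient as an L2 term. A sketch, where adam_delta stands for the bias-corrected Adam direction (see the Adam entry below) computed without any weight-decay term in the gradient:

def adamw_step(p, adam_delta, lr=1e-3, wd=0.01):
    # Decoupled weight decay: the decay acts on the parameter itself,
    # not through the adaptively rescaled gradient.
    return p - lr * adam_delta - lr * wd * p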

Get To The Point: Summarization with Pointer-Generator Networks

A novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways: a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information while retaining the ability to produce novel words through the generator, and a coverage mechanism that keeps track of what has been summarized, discouraging repetition.
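
A minimal sketch of the hybrid distribution at one decoding step, assuming PyTorch tensors; src_ids maps each source position to a vocabulary id, and all names are illustrative.

import torch

def pointer_generator_dist(p_vocab: torch.Tensor, attention: torch.Tensor,
                           src_ids: torch.Tensor, p_gen: float,
                           vocab_size: int) -> torch.Tensor:
    # Copy distribution: scatter attention weights onto the vocabulary
    # ids of the source tokens (src_ids must be int64, same shape as
    # attention).
    copy_dist = torch.zeros(vocab_size)
    copy_dist.scatter_add_(0, src_ids, attention)
    # Mix generating from the vocabulary and copying from the source
    # via the learned switch p_gen.
    return p_gen * p_vocab + (1 - p_gen) * copy_dist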

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
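
The adaptive moment estimates reduce to a few lines. An elementwise sketch of one Adam step with the paper's default hyperparameters, where t is the 1-indexed step count used for bias correction:

def adam_update(p, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient (first moment)
    # and its square (second moment).
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return p - lr * m_hat / (v_hat ** 0.5 + eps), m, v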

2019 (reference title lost in extraction). Citation context from the paper: "…, 2019) with learning rate 0.0001. We set the batch size to 2, the total training epochs to 10, and automatically evaluate the model on the validation set every 1000 iterations."

Plug-and-Blend: A Framework for Controllable Story Generation with Blended Control Codes

This work describes a Plug-and-Play controllable language generation framework that allows a human user to input multiple control codes (topics), and shows that the framework steers generation toward the given continuous-weighted control codes while keeping the generated sentences fluent, demonstrating strong blending capability.
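
A hedged sketch of the blending step, assuming each control code yields its own guided logits (e.g. from a plug-and-play attribute model) plus a user-supplied continuous weight; the interface is illustrative, not the paper's API.

import torch

def blend_logits(base_logits: torch.Tensor,
                 guided_logits: list[torch.Tensor],
                 weights: list[float]) -> torch.Tensor:
    # Each control code contributes a weighted shift away from the
    # unconditioned base distribution; the weights blend the topics.
    blended = base_logits.clone()
    for g, w in zip(guided_logits, weights):
        blended += w * (g - base_logits)
    return blended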