Stochastic Language Generation in Dialogue using Factored Language Models

@article{Mairesse2014StochasticLG,
  title={Stochastic Language Generation in Dialogue using Factored Language Models},
  author={François Mairesse and Steve J. Young},
  journal={Computational Linguistics},
  year={2014},
  volume={40},
  pages={763--799}
}
Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of pre-generated utterances, or (b) using statistics to determine the generation decisions of an existing generator. Both approaches rely on the existence of a handcrafted generation component, which is likely to limit their scalability to new domains. The first contribution of this article is to present Bagel, a fully data-driven generation method that treats the language generation task as a search for the most likely sequence of semantic concepts and realization phrases…
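
To make the generation-as-search formulation concrete, below is a minimal Python sketch in the spirit of Bagel: choose the realization-phrase sequence that maximizes a product of factored conditional probabilities. The semantic stacks, phrase inventory, and probability tables are toy assumptions for illustration, not Bagel's trained factored language models, and exhaustive enumeration stands in for the Viterbi/beam decoding a real system would use.

```python
# Illustrative generation-as-search: score phrase sequences with toy
# factored probabilities P(phrase | stack) and P(phrase | previous phrase),
# then pick the argmax. All tables below are assumptions, not trained models.
import math
from itertools import product

# Toy mandatory semantic stacks for a dialogue act inform(name=X, food=Y).
stacks = ["inform_name", "inform_food"]

# Candidate realization phrases per stack (assumed inventory).
phrases = {
    "inform_name": ["X is", "the restaurant X is"],
    "inform_food": ["a Y restaurant", "serving Y food"],
}

p_phrase_given_stack = {
    ("X is", "inform_name"): 0.6,
    ("the restaurant X is", "inform_name"): 0.4,
    ("a Y restaurant", "inform_food"): 0.7,
    ("serving Y food", "inform_food"): 0.3,
}
p_phrase_bigram = {
    ("X is", "a Y restaurant"): 0.8,
    ("X is", "serving Y food"): 0.2,
    ("the restaurant X is", "a Y restaurant"): 0.5,
    ("the restaurant X is", "serving Y food"): 0.5,
}

def score(seq):
    """Log-probability of a phrase sequence under the toy factored model."""
    logp = sum(math.log(p_phrase_given_stack[(p, s)]) for s, p in zip(stacks, seq))
    logp += sum(math.log(p_phrase_bigram[(a, b)]) for a, b in zip(seq, seq[1:]))
    return logp

best = max(product(*(phrases[s] for s in stacks)), key=score)
print(" ".join(best))  # -> "X is a Y restaurant"
```
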
Citations

Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking
TLDR
Results of an evaluation by human judges indicate that the new statistical language generator, based on a joint recurrent and convolutional neural network structure, produces not only high-quality but also linguistically varied utterances that are preferred over n-gram and rule-based systems.
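
As a rough illustration of the overgenerate-and-rerank pattern this summary describes, the sketch below samples candidates from a stub generator and reranks them with a stub slot-consistency scorer. Both stubs stand in for the paper's RNN generator and convolutional reranker and are assumptions for illustration only.

```python
# Overgenerate-and-rerank scaffold with stub models.
import random

def sample_candidates(dialogue_act, n=10):
    """Stub generator: the real system samples word-by-word from an RNN
    conditioned on the dialogue act; here we draw from fixed templates."""
    templates = [
        "X is a Y restaurant",
        "X serves Y food",
        "X is a restaurant",                   # misses the food slot
        "X is a Y restaurant serving Y food",  # realizes the food slot twice
    ]
    return random.choices(templates, k=n)

def rerank_score(candidate, required_slots):
    """Stub reranker: rewards realizing each required slot exactly once,
    standing in for the CNN sentence-level consistency score."""
    return -sum(abs(candidate.count(slot) - 1) for slot in required_slots)

slots = ["X", "Y"]  # from inform(name=X, food=Y)
candidates = sample_candidates("inform(name=X, food=Y)")
print(max(candidates, key=lambda c: rerank_score(c, slots)))
```
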
RNN Based Language Generation Models for a Hindi Dialogue System
TLDR
Models based on the Recurrent Neural Network Language Generation (RNNLG) framework are presented, along with an analysis of how they capture intended meaning in terms of content planning and surface realization on a proposed unaligned Hindi dataset.
Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems
TLDR
A statistical language generator based on a semantically controlled Long Short-term Memory (LSTM) structure that can learn from unaligned data by jointly optimising sentence planning and surface realisation using a simple cross-entropy training criterion; language variation can be achieved simply by sampling from output candidates.
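
A minimal PyTorch sketch of the semantically controlled LSTM idea follows: a standard LSTM cell augmented with a dialogue-act (DA) vector that is consumed through a learned reading gate, letting the network track which slots remain to be realized. The layer sizes, the `alpha` weighting, and the single-step usage are assumptions for illustration, not the published configuration.

```python
# Semantically conditioned LSTM cell sketch: the DA vector d shrinks via a
# reading gate as slots are realized, and feeds into the cell state.
import torch
import torch.nn as nn

class SCLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size, da_size, alpha=0.5):
        super().__init__()
        # Standard LSTM gates (input, forget, output, candidate) in one linear map.
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        # Reading gate that gradually switches off realized DA slots.
        self.w_r = nn.Linear(input_size, da_size, bias=False)
        self.h_r = nn.Linear(hidden_size, da_size, bias=False)
        # Projects the remaining DA vector into the cell state.
        self.w_d = nn.Linear(da_size, hidden_size, bias=False)
        self.alpha = alpha

    def forward(self, x, h, c, d):
        i, f, o, g = self.gates(torch.cat([x, h], dim=-1)).chunk(4, dim=-1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        r = torch.sigmoid(self.w_r(x) + self.alpha * self.h_r(h))
        d = r * d                                  # keep only unrealized slots
        c = f * c + i * g + torch.tanh(self.w_d(d))
        h = o * torch.tanh(c)
        return h, c, d

# One decoding step with toy sizes; d starts as a 1-hot-style DA encoding.
cell = SCLSTMCell(input_size=8, hidden_size=16, da_size=5)
x, h, c, d = torch.randn(1, 8), torch.zeros(1, 16), torch.zeros(1, 16), torch.ones(1, 5)
h, c, d = cell(x, h, c, d)
```
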
Multi-domain Neural Network Language Generation for Spoken Dialogue Systems
TLDR
This paper proposes a procedure to train multi-domain, Recurrent Neural Network-based (RNN) language generators via multiple adaptation steps, and shows that the proposed procedure can achieve competitive performance in terms of BLEU score and slot error rate while significantly reducing the data needed to train generators in new, unseen domains.
A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation
TLDR
An ensemble neural language generator is described, and several novel methods for data representation and augmentation that yield improved results in the model are presented.
Data-driven natural language generation using statistical machine translation and discriminative learning (The discriminative approach to speech generation)
TLDR
It is shown that automatic corpus extension by means of paraphrase extraction and validation is just as effective as crowd-sourcing, while being less costly in terms of development time and resources.
Data-driven language understanding for spoken dialogue systems
TLDR
The proposed Neural Belief Tracking model forsakes the use of standard one-hot n-gram representations used in Natural Language Processing in favour of distributed representations of user utterances, dialogue context and domain ontologies, and the proposed ATTRACT-REPEL algorithm boosts the semantic content of existing word vectors while simultaneously inducing high-quality cross-lingual word vector spaces.
End-to-End Content and Plan Selection for Natural Language Generation
TLDR
This paper utilizes several extensions to the general-purpose sequence-to-sequence (S2S) architecture to model the latent content selection process, particularly different variants of copy attention and coverage decoding, and proposes a new training method based on diverse ensembling to encourage the model to learn latent plans during training.
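
To unpack the copy-attention component mentioned here, below is a small NumPy sketch of a pointer-generator-style mixture: the output distribution blends generating from the vocabulary with copying source tokens weighted by attention. The vocabulary, attention weights, and `p_gen` value are toy assumptions, and this is one common formulation of copy attention rather than the paper's exact model.

```python
# Copy-attention mixture: p(w) = p_gen * p_vocab(w)
#                              + (1 - p_gen) * sum of attention on copies of w
import numpy as np

vocab = {"a": 0, "restaurant": 1, "X": 2, "pizza": 3}
source_ids = [2, 3]                        # source mention: "X pizza"

p_vocab = np.array([0.5, 0.3, 0.1, 0.1])   # decoder's generation distribution
attention = np.array([0.8, 0.2])           # attention over source positions
p_gen = 0.4                                # probability of generating vs copying

p_final = p_gen * p_vocab
for attn, tok in zip(attention, source_ids):
    p_final[tok] += (1 - p_gen) * attn     # route copy mass to source tokens

assert np.isclose(p_final.sum(), 1.0)
print(p_final)                             # "X" gains most mass from copying
```
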
Reinforcement adaptation of an attention-based neural natural language generator for spoken dialogue systems
TLDR
It is shown that by appropriately defining the rewards as a linear combination of expected payoffs and the costs of acquiring the new data provided by the user, a system designer can balance improving the system's performance towards a better match with the user's preferences against the burden associated with acquiring that data.
…

References

Showing 1–10 of 65 references
Stochastic Language Generation in a Dialogue System: Toward a Domain Independent Generator
TLDR
It is shown that a written-text language model can be used to predict dialogue utterances from an over-generated word forest, supported by a human-oriented evaluation in an emergency planning domain.
Phrase-Based Statistical Language Generation Using Graphical Models and Active Learning
TLDR
Bagel is presented, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators, and can generate natural and informative utterances from unseen inputs in the information presentation domain.
Natural Language Generation as Planning Under Uncertainty for Spoken Dialogue Systems
TLDR
A new model for Natural Language Generation (NLG) in Spoken Dialogue Systems, based on statistical planning given noisy feedback from the current generation context, is presented and evaluated, and is shown to significantly outperform all prior approaches.
Empirical Methods in Natural Language Generation: Data-oriented Methods and Empirical Evaluation
TLDR
Topics include assessing the trade-off between system-building cost and output quality in data-to-text generation, and the First Challenge on Generating Instructions in Virtual Environments.
A Statistical NLG Framework for Aggregated Planning and Realization
TLDR
It is argued that the statistical approach to NLG reduces the need for complicated knowledge-based architectures and readily adapts to different domains with reduced development time.
Controlling User Perceptions of Linguistic Style: Trainable Generation of Personality Traits
TLDR
Personage is described, a highly parameterizable language generator whose parameters are based on psychological findings about the linguistic reflexes of personality, and a novel SNLG method which uses parameter estimation models trained on personality-annotated data to predict the generation decisions required to convey any combination of scalar values along the five main dimensions of personality.
Instance-Based Natural Language Generation
TLDR
This work develops an efficient search technique for identifying the optimal candidate based on a novel extension of the A* algorithm, and details the annotation scheme, the grammar induction algorithm, and the efficiency and output of the generator.
…