Mixture Content Selection for Diverse Sequence Generation

@article{Cho2019MixtureCS,
  title={Mixture Content Selection for Diverse Sequence Generation},
  author={Jaemin Cho and Minjoon Seo and Hannaneh Hajishirzi},
  journal={ArXiv},
  year={2019},
  volume={abs/1909.01953}
}
Generating diverse sequences is important in many NLP applications such as question generation or summarization, which exhibit semantically one-to-many relationships between the source and target sequences. [...] Key Method: The diversification stage uses a mixture of experts to sample different binary masks on the source sequence for diverse content selection. The generation stage uses a standard encoder-decoder model given each selected content from the source sequence.
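As a rough illustration of the two-stage pipeline described above, the sketch below samples a different binary focus mask from each selector expert and hands the focused tokens to a shared generator. The selector and generator here are toy stand-ins (a random projection and an echo function), not the paper's trained modules.

```python
import numpy as np

rng = np.random.default_rng(0)

def selector_probs(source_ids, expert_weights):
    # Per-token focus probabilities for one selector expert.
    # A real selector is a learned network; a random projection
    # over token ids stands in for it here.
    return 1.0 / (1.0 + np.exp(-expert_weights[source_ids]))

def generate(source_tokens, mask):
    # Stand-in for the shared encoder-decoder: echo the focused tokens.
    return [t for t, m in zip(source_tokens, mask) if m]

source = "the cat sat on the mat".split()
vocab_size = 1000
source_ids = np.array([abs(hash(t)) % vocab_size for t in source])

num_experts = 3
experts = [rng.normal(size=vocab_size) for _ in range(num_experts)]

for k, w in enumerate(experts):
    probs = selector_probs(source_ids, w)
    mask = rng.random(len(source)) < probs  # sample a binary focus mask
    print(f"expert {k}: focus = {generate(source, mask)}")
```

Each expert induces a different focus, so the downstream generator produces a different output per expert rather than relying on decoding noise for diversity.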
Diversify Question Generation with Continuous Content Selectors and Question Type Modeling
TLDR
This paper relates contextual focuses with content selectors, which are modeled by a continuous latent variable with the technique of conditional variational auto-encoder (CVAE), and achieves a better trade-off between generation quality and diversity compared with existing approaches.
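A minimal sketch of the continuous-selector idea this entry describes: each reparameterized draw of the latent variable z from the CVAE's recognition distribution acts as a different content selector. The dimensions and Gaussian parameters below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_selector(mu, logvar):
    # Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

mu, logvar = np.zeros(4), np.zeros(4)  # toy posterior parameters
for _ in range(3):
    # Each draw of z would condition the decoder on a different focus.
    print(sample_selector(mu, logvar))
```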
Exploring Explainable Selection to Control Abstractive Generation
TLDR
This paper uses a select-and-generate paradigm to enhance the selection of explainable content and to guide and control abstractive generation, and proposes a newly designed pair-wise extractor to capture sentence-pair interactions and their centrality.
SA-HAVE: A Self-Attention based Hierarchical VAEs Network for Abstractive Summarization
  • Xia Wan, Shenggen Ju
  • Physics
  • Journal of Physics: Conference Series
  • 2021
The abstractive automatic summarization task is to summarize the main content of the article with short sentences, which is an important research direction in natural language generation. [...]
Simple or Complex? Complexity-Controllable Question Generation with Soft Templates and Deep Mixture of Experts Model
  • Sheng Bi, Xiya Cheng, +5 authors Yinlin Jiang
  • Computer Science
  • EMNLP
  • 2021
TLDR
This paper proposes an end-to-end neural complexity-controllable question generation model, which incorporates a mixture of experts as the selector of soft templates to improve the accuracy of complexity control and the quality of generated questions.
Focus Attention: Promoting Faithfulness and Diversity in Summarization
TLDR
Focus Attention Mechanism is introduced, a simple yet effective method to encourage decoders to proactively generate tokens that are similar or topical to the input document, and a Focus Sampling method is proposed to enable generation of diverse summaries, an area currently understudied in summarization.
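The sketch below gives one plausible reading of the sampling step this entry mentions: draw a restricted subset of the vocabulary from a topical "focus" distribution and decode only over it, so different draws yield different summaries. The distribution and subset size are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def focus_sample_vocab(focus_probs, k):
    # Sample k distinct token ids in proportion to their focus probability;
    # decoding restricted to this subset varies from draw to draw.
    vocab = np.arange(len(focus_probs))
    return rng.choice(vocab, size=k, replace=False, p=focus_probs)

focus_probs = np.array([0.4, 0.25, 0.15, 0.1, 0.05, 0.05])  # toy focus distribution
for _ in range(2):
    print(sorted(focus_sample_vocab(focus_probs, k=3)))  # a different decoding vocabulary each draw
```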
Focused Questions and Answer Generation by Key Content Selection
TLDR
This paper proposes a method of automatically generating answers and diversified sequences corresponding to those answers by introducing a new module called the "Focus Generator", which guides the decoder in an existing "encoder-decoder" model to generate questions based on selected focus contents. Expand
Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy
  • Shaolei Zhang, Yang Feng
  • Computer Science
  • EMNLP
  • 2021
TLDR
This paper proposes a universal SiMT model with Mixture-of-Experts Wait-k Policy to achieve the best translation quality under arbitrary latency with only one trained model, and outperforms all the strong baselines under different latency, including the state-of-the-art adaptive policy.
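To make the wait-k part concrete: under a wait-k policy the decoder may only read g(i) = min(i + k, |x|) source tokens when emitting target token i. In the mixture-of-experts variant this entry summarizes, each expert head is tied to a different k; the per-expert latencies and mixture weights below are hypothetical.

```python
def g(i, k, src_len):
    # Source tokens readable when emitting target token i (0-indexed) under wait-k.
    return min(i + k, src_len)

ks = [1, 3, 5, 7]                # hypothetical per-expert latencies
weights = [0.1, 0.2, 0.3, 0.4]   # hypothetical learned mixture weights

src_len = 10
for i in range(4):
    reads = [g(i, k, src_len) for k in ks]
    mixed = sum(w * r for w, r in zip(weights, reads))
    print(f"target step {i}: per-expert reads={reads}, mixture-weighted reads={mixed:.1f}")
```

Re-weighting the experts at test time shifts the effective latency, which is how a single trained model can serve arbitrary latency budgets.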
Text Generation by Learning from Demonstrations
TLDR
It is found that GOLD outperforms the baselines according to automatic and human evaluation on summarization, question generation, and machine translation, including attaining state-of-the-art results for CNN/DailyMail summarization.
Sentence-Permuted Paragraph Generation
TLDR
A novel framework PermGen is proposed whose objective is to maximize the expected log-likelihood of output paragraph distributions with respect to all possible sentence orders, to improve the content diversity of multi-sentence paragraphs.
CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering
TLDR
A simple yet effective framework that leverages question generation (QG) to synthesize QA pairs on new clinical contexts and boosts QA models without requiring manual annotations is proposed, and a seq2seq-based question phrase prediction (QPP) module is introduced that can be used together with most existing QG models to diversify their generation.

References

SHOWING 1-10 OF 62 REFERENCES
Mixture Models for Diverse Machine Translation: Tricks of the Trade
TLDR
It is found that disabling dropout noise in responsibility computation is critical to successful training, and certain types of mixture models are more robust and offer the best trade-off between translation quality and diversity compared to variational models and diverse decoding approaches.
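"Responsibility computation" here is the posterior p(z = k | x, y) over experts, obtained by normalizing the per-expert likelihoods; the trick this entry reports is to turn dropout off for that pass. A numerically stable sketch with made-up per-expert losses:

```python
import numpy as np

def responsibilities(per_expert_nll, log_prior=None):
    # p(z=k | x, y) from per-expert negative log-likelihoods.
    # Run the experts with dropout disabled before computing this.
    logp = -np.asarray(per_expert_nll, dtype=float)
    if log_prior is not None:
        logp = logp + log_prior
    logp -= logp.max()               # stabilize the softmax
    p = np.exp(logp)
    return p / p.sum()

print(responsibilities([2.3, 1.1, 4.0]))  # the middle expert takes most responsibility
```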
Get To The Point: Summarization with Pointer-Generator Networks
TLDR
A novel architecture augments the standard sequence-to-sequence attentional model in two orthogonal ways: a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information while retaining the ability to produce novel words through the generator, and a coverage mechanism that discourages repetition.
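The copy mechanism this entry summarizes mixes a generation distribution with the attention-induced copy distribution: P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (attention mass on source positions where w occurs). A toy sketch with invented numbers:

```python
import numpy as np

def pointer_generator_dist(p_vocab, attention, src_ids, p_gen):
    # Final distribution: generate with prob p_gen, copy with prob 1 - p_gen.
    final = p_gen * np.asarray(p_vocab, dtype=float)
    for a, tok in zip(attention, src_ids):
        final[tok] += (1.0 - p_gen) * a  # scatter-add copy mass onto source tokens
    return final

p_vocab = np.full(6, 1 / 6)   # toy generation distribution over a 6-word vocabulary
attention = [0.7, 0.2, 0.1]   # attention over a 3-token source
src_ids = [2, 4, 2]           # vocabulary ids of the source tokens
print(pointer_generator_dist(p_vocab, attention, src_ids, p_gen=0.6))
```

The result still sums to 1, and source words (even out-of-vocabulary ones, given an extended vocabulary) can receive probability directly from attention.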
Guiding Generation for Abstractive Text Summarization Based on Key Information Guide Network
TLDR
A guiding generation model that combines the extractive method and the abstractive method, and introduces a Key Information Guide Network (KIGN), which encodes the keywords to the key information representation, to guide the process of generation.
Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations
TLDR
This work proposes an objective that transfers supervision from neighboring examples, and develops a method to evaluate using standard task-specific metrics and measures of output diversity, finding consistent improvements over standard maximum likelihood training and other baselines.
Bottom-Up Abstractive Summarization
TLDR
This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, and shows that this approach improves the ability to compress text, while still generating fluent summaries.
Diverse Beam Search for Improved Description of Complex Scenes
TLDR
Diverse Beam Search is proposed, a diversity promoting alternative to BS for approximate inference that produces sequences that are significantly different from each other by incorporating diversity constraints within groups of candidate sequences during decoding; moreover, it achieves this with minimal computational or memory overhead.
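The core of Diverse Beam Search is a group-wise penalty: later groups subtract a dissimilarity term (commonly a Hamming penalty on tokens already chosen by earlier groups at the same step) before taking their argmax. A single-step sketch with an invented penalty weight:

```python
import numpy as np

def diverse_scores(logprobs, tokens_chosen_by_earlier_groups, lam=0.5):
    # Penalize tokens that earlier groups already selected at this time step.
    scores = logprobs.copy()
    for tok in tokens_chosen_by_earlier_groups:
        scores[tok] -= lam
    return scores

logprobs = np.log(np.array([0.4, 0.3, 0.15, 0.1, 0.05]))
first = int(np.argmax(logprobs))                          # group 1 picks token 0
second = int(np.argmax(diverse_scores(logprobs, [first])))
print(first, second)                                      # group 2 is pushed to token 1
```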
Sequence to Sequence Mixture Model for Diverse Machine Translation
TLDR
A novel sequence to sequence mixture (S2SMIX) model that improves both translation diversity and quality by adopting a committee of specialized translation models rather than a single translation model is developed.
Analyzing Uncertainty in Neural Machine Translation
TLDR
This study proposes tools and metrics to assess how uncertainty in the data is captured by the model distribution and how it affects search strategies that generate translations, and shows that search works remarkably well but that models tend to spread too much probability mass over the hypothesis space.
Multiple Choice Learning: Learning to Produce Multiple Structured Outputs
TLDR
This work addresses the problem of generating multiple hypotheses for structured prediction tasks that involve interaction with users or successive components in a cascaded architecture, by formulating this task as a multiple-output structured-output prediction problem with a loss function that effectively captures the setup of the problem.
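The multiple-output objective this entry refers to is often called an oracle (min) loss: per example, only the best-scoring hypothesis head receives the gradient, which drives the heads to specialize on different modes. A minimal sketch with invented per-head losses:

```python
import numpy as np

def mcl_oracle_loss(per_head_losses):
    # Only the lowest-loss head "wins" the example and would be updated.
    losses = np.asarray(per_head_losses, dtype=float)
    winner = int(losses.argmin())
    return losses[winner], winner

loss, winner = mcl_oracle_loss([1.7, 0.4, 2.2])
print(f"head {winner} wins with loss {loss}")
```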
Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders
TLDR
A novel framework based on conditional variational autoencoders that captures the discourse-level diversity in the encoder, uses latent variables to learn a distribution over potential conversational intents, and generates diverse responses using only greedy decoders is presented.