Corpus ID: 237605155

Conditional Poisson Stochastic Beam Search

Clara Meister, Afra Amini, Tim Vieira, Ryan Cotterell
Beam search is the default decoding strategy for many sequence generation tasks in NLP. The set of approximate K-best items returned by the algorithm is a useful summary of the distribution for many applications; however, the candidates typically exhibit high overlap and may give a highly biased estimate for expectations under our model. These problems can be addressed by instead using stochastic decoding strategies. In this work, we propose a new method for turning beam search into a… 

Mastering Spatial Graph Prediction of Road Networks

A graph-based framework is introduced that simulates the addition of sequences of graph edges using a reinforcement learning (RL) approach, demonstrating enhanced performance and increased high-level reasoning about the graph topology when using a tree-based search.

Best-First Beam Search

This work shows that the standard implementation of beam search can be made up to 10x faster in practice, and devises effective monotonic approximations to popular non-monotonic scoring functions, including length normalization and mutual information decoding.

Determinantal Beam Search

Determinantal beam search is proposed, a reformulation of beam search that offers competitive performance against other diverse set generation strategies in the context of language generation, while providing a more general approach to optimizing for diversity.

Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement

It is shown that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy.
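The trick itself is simple to state: perturb each item's log-probability with independent Gumbel(0, 1) noise and keep the k items with the largest perturbed values; the result is distributed as a sample of k items without replacement from the underlying categorical distribution. A minimal sketch over a finite set of logits (the function name `gumbel_top_k` is ours, not from the paper):

```python
import math
import random
from collections import Counter

def gumbel_top_k(logits, k, rng=random):
    """Sample k distinct indices without replacement from the categorical
    distribution defined by `logits`, via the Gumbel-top-k trick: add
    independent Gumbel(0, 1) noise to each logit and keep the k largest."""
    keys = [
        # -log(-log(U)) with U ~ Uniform(0, 1) is a Gumbel(0, 1) draw
        (logit - math.log(-math.log(rng.random())), i)
        for i, logit in enumerate(logits)
    ]
    keys.sort(reverse=True)  # largest perturbed logits first
    return [i for _, i in keys[:k]]

rng = random.Random(0)
sample = gumbel_top_k([0.0, 1.0, 2.0], 2, rng)  # two distinct indices
```

With k = 1 this reduces to ordinary categorical sampling, which is one quick sanity check: repeated top-1 draws should favor the argmax logit with the expected frequency.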

If Beam Search Is the Answer, What Was the Question?

It is found that beam search enforces uniform information density in text, a property motivated by cognitive science. A set of decoding objectives that explicitly enforce this property is proposed, and exact decoding with these objectives is found to alleviate the problems encountered when decoding poorly calibrated language generation models.

Diverse Beam Search for Improved Description of Complex Scenes

Diverse Beam Search is proposed, a diversity-promoting alternative to beam search (BS) for approximate inference that produces sequences that are significantly different from each other by incorporating diversity constraints within groups of candidate sequences during decoding; moreover, it achieves this with minimal computational or memory overhead.

Pareto Sampling versus Sampford and Conditional Poisson Sampling

Pareto sampling was introduced by Rosén in the late 1990s. It is a simple method to obtain a fixed-size πps sample, though with inclusion probabilities only approximately as desired. Sampford …

Determinantal Point Processes for Machine Learning

Determinantal Point Processes for Machine Learning provides a comprehensible introduction to DPPs, focusing on the intuitions, algorithms, and extensions that are most relevant to the machine learning community, and shows how they can be applied to real-world applications.
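For context, a DPP with kernel L assigns each subset S of the ground set a probability proportional to det(L_S), the principal minor of L indexed by S; high off-diagonal entries (similar items) shrink that determinant, so diverse subsets are favored. A brute-force sketch for tiny ground sets (the function name `dpp_subset_probs` is ours; real DPP libraries use eigendecomposition-based sampling rather than enumeration):

```python
from itertools import combinations

import numpy as np

def dpp_subset_probs(L):
    """Exact subset probabilities under a DPP with PSD kernel L:
    P(S) = det(L_S) / det(L + I), where L_S is the principal submatrix
    of L indexed by S. Enumerates all 2^n subsets, so tiny n only."""
    n = L.shape[0]
    Z = np.linalg.det(L + np.eye(n))  # normalizer: sum of det(L_S) over all S
    probs = {}
    for r in range(n + 1):
        for S in combinations(range(n), r):
            L_S = L[np.ix_(S, S)]
            # np.linalg.det of the empty (0x0) matrix is 1, covering S = ()
            probs[S] = float(np.linalg.det(L_S)) / Z
    return probs
```

For a 2x2 kernel with large off-diagonal similarity, the pair {0, 1} receives noticeably less mass than either singleton, which is the repulsion property the abstract above alludes to.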

Incremental Sampling Without Replacement for Sequence Models

It is shown that incremental sampling without replacement is applicable to many domains, e.g., program synthesis and combinatorial optimization, and is efficient even for exponentially-large output spaces.

Analyzing Uncertainty in Neural Machine Translation

This study proposes tools and metrics to assess how uncertainty in the data is captured by the model distribution and how it affects the search strategies that generate translations, and shows that search works remarkably well but that models tend to spread too much probability mass over the hypothesis space.

Algorithms to Find Exact Inclusion Probabilities for Conditional Poisson Sampling and Pareto πps Sampling Designs

The conditional Poisson sampling design, as developed by Hájek, may be defined as Poisson sampling conditioned on the requirement that the sample has fixed size. In this paper, an algorithm is …
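The definition above admits a direct, if inefficient, implementation by rejection: draw independent Bernoulli(p_i) inclusion indicators (an ordinary Poisson sample) and accept the first draw whose realized size is exactly k. A sketch under that definition (the function name is ours; practical implementations use exact recursive algorithms of the kind this paper develops, not rejection):

```python
import random

def conditional_poisson_rejection(probs, k, rng=random, max_tries=100_000):
    """Fixed-size-k sample from the conditional Poisson design over
    len(probs) units, by rejection: repeat independent Bernoulli(p_i)
    draws (Poisson sampling) until the realized sample size equals k."""
    for _ in range(max_tries):
        sample = [i for i, p in enumerate(probs) if rng.random() < p]
        if len(sample) == k:
            return sample
    raise RuntimeError("rejection sampling did not produce a size-k sample")

rng = random.Random(0)
sample = conditional_poisson_rejection([0.5] * 5, 2, rng)  # two distinct units
```

Note that conditioning on fixed size changes the inclusion probabilities away from the p_i themselves, which is precisely why exact algorithms for computing them, as in this paper, are needed.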