Approximation-Aware Dependency Parsing by Belief Propagation

@article{gormley2015approximation,
  title={Approximation-Aware Dependency Parsing by Belief Propagation},
  author={Matthew R. Gormley and Mark Dredze and Jason Eisner},
  journal={Transactions of the Association for Computational Linguistics},
  year={2015}
}
We show how to train the fast dependency parser of Smith and Eisner (2008) for improved accuracy. This parser can consider higher-order interactions among edges while retaining O(n^3) runtime. It outputs the parse with maximum expected recall, but for speed, this expectation is taken under a posterior distribution that is constructed only approximately, using loopy belief propagation through structured factors. We show how to adjust the model parameters to compensate for the errors introduced by…

Edge-Linear First-Order Dependency Parsing with Undirected Minimum Spanning Tree Inference

This work proposes an inference algorithm for first-order models that encodes the problem as a minimum spanning tree (MST) problem in an undirected graph; its run time is O(m) in expectation and with very high probability.
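The undirected-MST encoding is easy to sketch. The snippet below is a minimal illustration only: the 4-node graph and edge scores are made up, and it uses Kruskal's algorithm with union-find rather than the paper's O(m) expected-time method, since the encoding rather than the speed is the point here.

```python
# Hypothetical toy example: recover a maximum-scoring spanning tree over
# scored unordered word pairs, the core object of undirected MST inference.

def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def max_spanning_tree(n, scored_edges):
    """Return a maximum-scoring spanning tree over nodes 0..n-1.

    scored_edges: list of (score, u, v) tuples."""
    parent = list(range(n))
    tree = []
    for score, u, v in sorted(scored_edges, reverse=True):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:            # edge joins two components: keep it
            parent[ru] = rv
            tree.append((u, v))
    return tree

edges = [(4.0, 0, 1), (1.0, 0, 2), (3.0, 1, 2), (2.0, 2, 3), (0.5, 1, 3)]
print(max_spanning_tree(4, edges))  # → [(0, 1), (1, 2), (2, 3)]
```

Negating scores would turn the same routine into a minimum spanning tree, which is the formulation the title uses.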

Inside-Outside and Forward-Backward Algorithms Are Just Backprop (tutorial paper)

This pedagogical paper carefully spells out the construction of algorithms such as inside-outside and forward-backward as instances of backpropagation, relating them to both traditional and nontraditional views of these algorithms.
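The backprop view can be checked concretely on a toy example. Below, for a made-up linear-chain model (two states, length three; all potentials invented for illustration), the posterior marginals produced by forward-backward equal the gradient of log Z with respect to the log potentials, which we verify by finite differences.

```python
import math

T = [[1.0, 2.0], [0.5, 1.5]]              # transition potentials T[s][s']
E = [[2.0, 1.0], [1.0, 3.0], [0.5, 2.0]]  # emission potentials E[t][s]

def log_partition(E):
    alpha = E[0][:]                        # forward pass
    for t in range(1, len(E)):
        alpha = [sum(alpha[s] * T[s][sp] for s in range(2)) * E[t][sp]
                 for sp in range(2)]
    return math.log(sum(alpha))

def marginals(E):
    n = len(E)
    alpha = [E[0][:]]                      # forward pass
    for t in range(1, n):
        alpha.append([sum(alpha[-1][s] * T[s][sp] for s in range(2)) * E[t][sp]
                      for sp in range(2)])
    beta = [[1.0, 1.0]]                    # backward pass, built front-to-back
    for t in range(n - 2, -1, -1):
        beta.insert(0, [sum(T[s][sp] * E[t + 1][sp] * beta[0][sp]
                            for sp in range(2)) for s in range(2)])
    Z = sum(alpha[-1])
    return [[alpha[t][s] * beta[t][s] / Z for s in range(2)] for t in range(n)]

# d log Z / d log E[1][0] should equal the posterior p(state 0 at time 1).
eps = 1e-6
E_plus = [row[:] for row in E]
E_plus[1][0] *= math.exp(eps)              # nudge one log-potential by eps
grad = (log_partition(E_plus) - log_partition(E)) / eps
print(abs(grad - marginals(E)[1][0]) < 1e-4)  # True
```

Running the backward pass by hand and differentiating the forward pass thus give the same numbers, which is the paper's point in miniature.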

Effective Greedy Inference for Graph-based Non-Projective Dependency Parsing

This work proposes a simple greedy search approximation for high-order graph-based non-projective dependency parsing which improves the run time of the parser by a factor of 1.43 while losing 1% in UAS on average across languages.
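The greedy scheme lends itself to a short sketch. The setup below is hypothetical: invented first-order arc scores arc[h][m] (head h, modifier m, 0 = root) plus a toy second-order bonus for modifiers sharing a head. Each word starts at its best-scoring head, then single heads are greedily reattached while the total score improves; the tree constraints that an actual parser enforces are omitted to keep the sketch short.

```python
# Toy scores, made up for illustration only.
arc = {0: {1: 2.0, 2: 0.2, 3: 0.1},
       1: {2: 1.5, 3: 1.0},
       2: {1: 0.5, 3: 1.4},
       3: {1: 0.1, 2: 1.0}}
sib = {(2, 3): 1.2}  # bonus if words 2 and 3 end up with the same head

def total_score(heads):
    s = sum(arc[heads[m]][m] for m in heads)      # first-order part
    for (m1, m2), bonus in sib.items():           # second-order part
        if heads[m1] == heads[m2]:
            s += bonus
    return s

def greedy_parse(n):
    # First-order initialization: best head per word, ignoring interactions.
    heads = {m: max((h for h in range(n + 1) if h != m),
                    key=lambda h: arc[h].get(m, -1e9))
             for m in range(1, n + 1)}
    improved = True
    while improved:                               # hill-climb on head changes
        improved = False
        for m in range(1, n + 1):
            best_h, best_s = heads[m], total_score(heads)
            for h in range(n + 1):
                if h == m:
                    continue
                heads[m] = h
                if total_score(heads) > best_s:
                    best_h, best_s = h, total_score(heads)
                    improved = True
            heads[m] = best_h
    return heads

print(greedy_parse(3))  # → {1: 0, 2: 1, 3: 1}
```

In this toy run the sibling bonus pulls word 3 away from its best first-order head so that it shares a head with word 2, which is exactly the kind of higher-order correction the greedy search is after.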

Modeling Label Correlations for Second-Order Semantic Dependency Parsing with Mean-Field Inference

This work leverages tensor decomposition techniques and, interestingly, shows that the large second-order score tensors need not be materialized during mean-field inference, thereby reducing the computational complexity from cubic to quadratic.

TensorLog: A Differentiable Deductive Database

TensorLog is a probabilistic deductive database in which reasoning uses a differentiable process; it is shown that these inference functions can be composed recursively to perform inference in non-trivial logical theories containing multiple interrelated clauses and predicates.

Second-Order Semantic Dependency Parsing with End-to-End Neural Networks

This paper proposes a second-order semantic dependency parser, which takes into consideration not only individual dependency edges but also interactions between pairs of edges, and shows that second-order parsing can be approximated using mean field variational inference or loopy belief propagation.

Graphical Models with Structured Factors, Neural Factors, and Approximation-aware Training

This thesis introduces a general framework for modeling with four ingredients: latent variables, structural constraints, learned (neural) feature representations of the inputs, and training that takes the approximations made during inference into account.

Structured Attention Networks

This work shows that structured attention networks are simple extensions of the basic attention procedure, and that they allow for extending attention beyond the standard soft-selection approach, such as attending to partial segmentations or to subtrees.

Structured Alignment Networks for Matching Sentences

This work introduces a model of structured alignments between sentences, showing how to compare two sentences by matching their latent structures, and finds that modeling latent tree structures results in superior performance.

Headed-Span-Based Projective Dependency Parsing

The score of a projective dependency tree is decomposed into the scores of its headed spans, and a novel O(n^3) dynamic programming algorithm is designed to enable global training and exact inference.

Turbo Parsers: Dependency Parsing by Approximate Variational Inference

A unified view of two state-of-the-art non-projective dependency parsers, both approximate, is presented and a new aggressive online algorithm to learn the model parameters is proposed, which makes use of the underlying variational representation.

Relaxed Marginal Inference and its Application to Dependency Parsing

This work shows how to extend the relaxation approach to marginal inference, which is used in conditional likelihood training, posterior decoding, confidence estimation, and other tasks; the approach is general enough to be applied with any marginal inference method in the inner loop.

Neural CRF Parsing

This paper presents a parsing model that combines the exact dynamic programming of CRF parsing with the rich nonlinear featurization of neural net approaches, using nonlinear potentials computed via a feedforward neural network.

Fast Inference in Phrase Extraction Models with Belief Propagation

This work first shows that the phrase extraction model can be approximated using structured belief propagation, with a gain in alignment quality stemming from the use of marginals in decoding, and then considers a more flexible, non-ITG matching constraint which is less efficient for exact inference but more efficient for BP.

Structured Learning for Taxonomy Induction with Belief Propagation

This model incorporates heterogeneous relational evidence about both hypernymy and siblinghood, captured by semantic features based on patterns and statistics from Web n-grams and Wikipedia abstracts, and uses loopy belief propagation along with a directed spanning tree algorithm for the core hypernym factor.

Loopy Belief Propagation for Approximate Inference: An Empirical Study

This paper compares the marginals computed using loopy propagation to the exact ones in four Bayesian network architectures, including two real-world networks: ALARM and QMR, and finds that the loopy beliefs often converge and when they do, they give a good approximation to the correct marginals.
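The paper's comparison can be reproduced in miniature. The sketch below uses a three-variable binary cycle, the smallest loopy graph, with made-up potentials rather than the ALARM or QMR networks: loopy sum-product beliefs are compared against exact marginals obtained by brute-force enumeration.

```python
import itertools

# Toy pairwise MRF on a 3-cycle; all potentials invented for illustration.
pair = {(0, 1): [[1.2, 0.8], [0.8, 1.2]],
        (1, 2): [[1.3, 0.7], [0.7, 1.3]],
        (0, 2): [[1.1, 0.9], [0.9, 1.1]]}
unary = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]

def psi(i, j, xi, xj):
    """Pairwise potential, looked up symmetrically."""
    if (i, j) in pair:
        return pair[(i, j)][xi][xj]
    return pair[(j, i)][xj][xi]

def exact_marginals():
    # Brute force over all 2^3 configurations.
    probs = [[0.0, 0.0] for _ in range(3)]
    for x in itertools.product((0, 1), repeat=3):
        w = 1.0
        for i in range(3):
            w *= unary[i][x[i]]
        for (i, j) in pair:
            w *= pair[(i, j)][x[i]][x[j]]
        for i in range(3):
            probs[i][x[i]] += w
    Z = sum(probs[0])
    return [[p / Z for p in row] for row in probs]

def loopy_bp(iters=100):
    # Normalized messages m[(i, j)][x_j], one per directed edge of the cycle.
    msg = {(i, j): [1.0, 1.0] for i in range(3) for j in range(3) if i != j}
    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            k = 3 - i - j  # the only other neighbor of i in a 3-cycle
            m = [sum(unary[i][xi] * psi(i, j, xi, xj) * msg[(k, i)][xi]
                     for xi in (0, 1)) for xj in (0, 1)]
            s = sum(m)
            new[(i, j)] = [v / s for v in m]
        msg = new
    beliefs = []
    for j in range(3):
        b = [unary[j][xj] for xj in (0, 1)]
        for i in range(3):
            if i != j:
                b = [b[xj] * msg[(i, j)][xj] for xj in (0, 1)]
        s = sum(b)
        beliefs.append([v / s for v in b])
    return beliefs

ex, bp = exact_marginals(), loopy_bp()
gap = max(abs(ex[i][s] - bp[i][s]) for i in range(3) for s in (0, 1))
print(gap)  # small: with these weak couplings the loopy beliefs converge
```

With the weak couplings chosen here the messages converge and the beliefs sit close to the exact marginals, mirroring the paper's empirical finding; stronger couplings on larger loopy graphs can behave much worse.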

Grammarless Parsing for Joint Inference

This paper proposes an alternative and novel method in which constituency parse constraints are imposed on the model via combinatorial factors in a Markov random field, guaranteeing that a variable configuration forms a valid tree.

A Comparison of Loopy Belief Propagation and Dual Decomposition for Integrated CCG Supertagging and Parsing

This work designs a single model with both supertagging and parsing features, rather than separating them into distinct models chained together in a pipeline, and compares loopy belief propagation and dual decomposition as means of overcoming the resulting increase in complexity.

Vine Pruning for Efficient Multi-Pass Dependency Parsing

A multi-pass coarse-to-fine architecture for dependency parsing using linear-time vine pruning and structured prediction cascades is proposed, which achieves accuracies comparable to those of their unpruned counterparts, while exploring only a fraction of the search space.

Experiments with a Higher-Order Projective Dependency Parser

In the multilingual exercise of the CoNLL-2007 shared task (Nivre et al., 2007), the system obtains the best accuracy for English, and the second best accuracies for Basque and Czech.