Corpus ID: 235422251

Constraining Linear-chain CRFs to Regular Languages

@article{Papay2021ConstrainingLC,
  title={Constraining Linear-chain CRFs to Regular Languages},
  author={Sean Papay and Roman Klinger and Sebastian Pad{\'o}},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.07306}
}
A major challenge in structured prediction is to represent the interdependencies within output structures. When outputs are structured as sequences, linear-chain conditional random fields (CRFs) are a widely used model class which can learn local dependencies in the output. However, the CRF’s Markov assumption makes it impossible for CRFs to represent distributions with nonlocal dependencies, and standard CRFs are unable to respect nonlocal constraints of the data (such as global arity… 
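
The central construction lends itself to a compact sketch: intersect the CRF's tag sequences with a deterministic finite automaton (DFA) for the regular language L by decoding over product states of (DFA state, tag), so that any path leaving L is pruned. The code below is a minimal illustration under assumed inputs (score matrices `emissions` and `transitions`, and a partial DFA transition table), not the authors' implementation:

```python
import numpy as np

def constrained_viterbi(emissions, transitions, dfa_trans, start_state, accept_states):
    """Viterbi over the product of tags and a DFA for the regular language L.
    emissions: (T, K) per-position tag scores; transitions: (K, K) tag-pair scores;
    dfa_trans: dict (dfa_state, tag) -> next dfa_state (missing = disallowed)."""
    T, K = emissions.shape
    # scores[(q, y)] = best score of a prefix ending in tag y with DFA state q
    scores = {}
    back = {}
    for y in range(K):
        q = dfa_trans.get((start_state, y))
        if q is not None:
            scores[(q, y)] = emissions[0, y]
    for t in range(1, T):
        new_scores = {}
        for (q, y), s in scores.items():
            for y2 in range(K):
                q2 = dfa_trans.get((q, y2))
                if q2 is None:
                    continue  # this label transition would leave L
                cand = s + transitions[y, y2] + emissions[t, y2]
                if cand > new_scores.get((q2, y2), float("-inf")):
                    new_scores[(q2, y2)] = cand
                    back[(t, q2, y2)] = (q, y)
        scores = new_scores
    # only sequences ending in an accepting DFA state are in L
    valid = [k for k in scores if k[0] in accept_states]
    if not valid:
        return None  # no label sequence of length T lies in L
    q, y = max(valid, key=lambda k: scores[k])
    path = [y]
    for t in range(T - 1, 0, -1):
        q, y = back[(t, q, y)]
        path.append(y)
    return path[::-1]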

References

Showing 1-10 of 49 references

Weighting Finite-State Transductions With Neural Context

This work proposes to keep the traditional architecture, which uses a finite-state transducer to score all possible output strings, but to augment the scoring function with recurrent networks, defining a probability distribution over aligned output strings in the form of a weighted finite-state automaton.
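
As a toy illustration of the scoring side (with fixed arc weights standing in for the paper's neurally computed, context-dependent ones), the total log-weight a WFSA assigns to an output string can be computed with the forward algorithm:

```python
import numpy as np

def wfsa_logweight(arcs, start, finals, string):
    """Total log-weight of all accepting paths for `string`.
    arcs: dict (state, symbol) -> list of (next_state, log_weight);
    finals: dict state -> final log_weight."""
    alpha = {start: 0.0}  # log-weight of reaching each state so far
    for sym in string:
        nxt = {}
        for state, w in alpha.items():
            for succ, aw in arcs.get((state, sym), []):
                prev = nxt.get(succ)
                nxt[succ] = w + aw if prev is None else float(np.logaddexp(prev, w + aw))
        alpha = nxt
    totals = [alpha[s] + fw for s, fw in finals.items() if s in alpha]
    return float(np.logaddexp.reduce(totals)) if totals else None
```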

A Linear Programming Formulation for Global Inference in Natural Language Tasks

This work develops a linear programming formulation for this problem and evaluates it in the context of simultaneously learning named entities and relations, efficiently incorporating domain- and task-specific constraints at decision time and yielding significant improvements in the accuracy and the "human-like" quality of the inferences.
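
A toy version of this decision-time inference idea (my own construction using SciPy's mixed-integer solver, not the paper's exact program): hypothetical local classifier scores are combined under a hard consistency constraint that the relation works_for requires a PER first argument and an ORG second argument.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

# binary indicators: [span1=PER, span1=ORG, span2=PER, span2=ORG, rel=works_for]
scores = np.array([2.0, 1.5, 0.8, 1.2, 1.0])  # hypothetical local classifier scores
A = np.array([
    [1, 1, 0, 0, 0],    # span1 receives exactly one label
    [0, 0, 1, 1, 0],    # span2 receives exactly one label
    [-1, 0, 0, 0, 1],   # works_for implies span1 = PER
    [0, 0, 0, -1, 1],   # works_for implies span2 = ORG
])
lb = [1, 1, -np.inf, -np.inf]
ub = [1, 1, 0, 0]
res = milp(-scores,  # milp minimizes, so negate to maximize total score
           constraints=LinearConstraint(A, lb, ub),
           integrality=np.ones(5), bounds=Bounds(0, 1))
print(res.x)  # -> [1, 0, 0, 1, 1]: PER, ORG, and the relation, jointly consistent
```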

Semi-Markov Conditional Random Fields for Information Extraction

Intuitively, a semi-CRF on an input sequence x outputs a "segmentation" of x, in which labels are assigned to segments rather than to individual elements x_i of x, and transitions within a segment can be non-Markovian.
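
A minimal decoding sketch under an assumed segment-scoring interface `seg_score(i, j, label)` (hypothetical, not the paper's code) makes the segment-level dynamics concrete:

```python
def semi_crf_decode(n, labels, seg_score, max_len):
    """Best segmentation of x[0:n]; seg_score(i, j, y) scores labeling the
    whole segment x[i:j] as y, so segment-internal features may be
    non-Markovian (label-to-label transition scores omitted for brevity)."""
    best = [float("-inf")] * (n + 1)  # best[j] = best score of a segmentation of x[:j]
    best[0] = 0.0
    back = [None] * (n + 1)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            for y in labels:
                s = best[i] + seg_score(i, j, y)
                if s > best[j]:
                    best[j] = s
                    back[j] = (i, y)
    segments, j = [], n
    while j > 0:
        i, y = back[j]
        segments.append((i, j, y))
        j = i
    return segments[::-1]  # list of (start, end, label) segments
```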

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
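
The quantity at the heart of such parameter estimation is the log-partition function; a minimal sketch of computing it for a linear-chain CRF with the forward algorithm (assumed dense score matrices, not the paper's original algorithms):

```python
import numpy as np

def crf_log_partition(emissions, transitions):
    """log Z for a linear-chain CRF via the forward algorithm.
    emissions: (T, K) scores; transitions: (K, K) scores."""
    alpha = emissions[0]  # (K,) log-scores of all length-1 prefixes
    for t in range(1, len(emissions)):
        # log-sum-exp over the previous label for each current label
        alpha = emissions[t] + np.logaddexp.reduce(alpha[:, None] + transitions, axis=0)
    return float(np.logaddexp.reduce(alpha))
```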

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
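
A common realization of the "one additional output layer" recipe, shown here with the Hugging Face transformers library for token-level labeling as in the CRF setting above (an illustration, not code from the paper):

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
# the "one additional output layer": a randomly initialized per-token head
model = AutoModelForTokenClassification.from_pretrained("bert-base-cased", num_labels=9)

inputs = tokenizer("Sebastian works in Stuttgart", return_tensors="pt")
logits = model(**inputs).logits  # (1, seq_len, 9): per-token label scores
```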

CoNLL-2012 Shared Task: Modeling Multilingual Unrestricted Coreference in OntoNotes

This paper describes the OntoNotes annotation (coreference and other layers) and the parameters of the shared task, including the format, pre-processing information, and evaluation criteria, and presents and discusses the results achieved by the participating systems.

Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search

Experiments show that GBS can provide large improvements in translation quality in interactive scenarios and that, even without any user input, it can achieve significant performance gains in domain adaptation scenarios.
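
A heavily simplified toy of the grid idea (single-token constraints only; `next_scores` is an assumed stand-in for a model's next-token log-probabilities, not any real decoder API): beams are grouped by how many constraint tokens they already cover, so a finished hypothesis drawn from the last group is guaranteed to satisfy every constraint.

```python
def grid_beam_search(next_scores, vocab, constraints, max_len, beam=4):
    C = len(constraints)
    # grid[c] holds hypotheses (score, prefix) covering c constraint tokens
    grid = {c: [(0.0, [])] if c == 0 else [] for c in range(C + 1)}
    for _ in range(max_len):
        new_grid = {c: [] for c in range(C + 1)}
        for c, hyps in grid.items():
            for score, prefix in hyps:
                probs = next_scores(prefix)
                for tok in vocab:  # "generate": extend freely, coverage unchanged
                    new_grid[c].append((score + probs[tok], prefix + [tok]))
                if c < C:          # "start": force the next unmet constraint token
                    tok = constraints[c]
                    new_grid[c + 1].append((score + probs[tok], prefix + [tok]))
        # keep only the best `beam` hypotheses in every grid cell
        grid = {c: sorted(h, key=lambda x: x[0], reverse=True)[:beam]
                for c, h in new_grid.items()}
    # best hypothesis covering all constraints (requires max_len >= C)
    return max(grid[C], key=lambda x: x[0])
```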

Interactive Information Extraction with Constrained Conditional Random Fields

This work applies constrained Viterbi decoding, which finds the optimal field assignments consistent with the fields explicitly specified or corrected by the user, together with a mechanism for estimating the confidence of each extracted field, so that low-confidence extractions can be highlighted.
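
A sketch of the decoding half under assumed score matrices (not the paper's code): user-specified positions are pinned to their labels, and the Viterbi search is restricted accordingly.

```python
import numpy as np

def pinned_viterbi(emissions, transitions, pins):
    """Best label sequence with user-specified positions fixed.
    emissions: (T, K); transitions: (K, K); pins: dict position -> label."""
    T, K = emissions.shape
    def allowed(t):
        return [pins[t]] if t in pins else range(K)
    score = np.full(K, -np.inf)
    for y in allowed(0):
        score[y] = emissions[0, y]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        new = np.full(K, -np.inf)
        for y in allowed(t):
            prev = score + transitions[:, y]  # scores via each predecessor tag
            back[t, y] = int(np.argmax(prev))
            new[y] = prev[back[t, y]] + emissions[t, y]
        score = new
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

Confidence estimation for each extracted field would then use marginals from a similarly constrained forward-backward pass rather than the single best path.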

SparseMAP: Differentiable Sparse Structured Inference

This work introduces SparseMAP, a new method for sparse structured inference, and its natural loss function; the resulting models show competitive accuracy, improved interpretability, and the ability to capture natural language ambiguities, which is attractive for pipeline systems.

Dependency or Span, End-to-End Uniform Semantic Role Labeling

This paper presents an end-to-end model for both dependency and span SRL with a unified argument representation, handling the two different types of argument annotation in a uniform fashion and jointly predicting all predicates and arguments.