Improved Latent Tree Induction with Distant Supervision via Span Constraints

@article{Xu2021ImprovedLT,
  title={Improved Latent Tree Induction with Distant Supervision via Span Constraints},
  author={Zhiyang Xu and Andrew Drozdov and Jay Yoon Lee and Timothy J. O'Gorman and Subendhu Rongali and Dylan Finkbeiner and Shilpa Suresh and Mohit Iyyer and Andrew McCallum},
  journal={ArXiv},
  year={2021},
  volume={abs/2109.05112}
}
For over thirty years, researchers have developed and analyzed methods for latent tree induction as an approach for unsupervised syntactic parsing. Nonetheless, modern systems still do not perform well enough compared to their supervised counterparts to have any practical use as structural annotation of text. In this work, we present a technique that uses distant supervision in the form of span constraints (i.e., phrase bracketing) to improve performance in unsupervised constituency parsing…
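
The core signal is easy to illustrate: a span constraint (i, j) obtained from distant supervision (for example, an entity mention) is satisfied when the induced tree contains (i, j) as a constituent. The sketch below is an illustration under assumed tree and constraint representations, not the paper's code; it computes the fraction of constraints a predicted binary tree satisfies, a quantity that could serve as a distant-supervision signal.

```python
# Illustrative sketch only: span-constraint agreement for a nested-list binary
# tree whose leaves are tokens. Representations and names are assumptions,
# not the paper's implementation.

def tree_spans(tree, start=0):
    """Return (end, spans): every (start, end) constituent span, end exclusive."""
    if isinstance(tree, str):                     # leaf token
        return start + 1, [(start, start + 1)]
    end, spans = start, []
    for child in tree:
        end, child_spans = tree_spans(child, end)
        spans.extend(child_spans)
    spans.append((start, end))                    # span covered by this node
    return end, spans

def constraint_agreement(tree, constraints):
    """Fraction of distant-supervision spans that appear as constituents."""
    spans = set(tree_spans(tree)[1])
    if not constraints:
        return 1.0
    return sum(1 for c in constraints if tuple(c) in spans) / len(constraints)

# "the quick brown fox jumps", with a constraint on the span (0, 4)
tree = [["the", ["quick", ["brown", "fox"]]], "jumps"]
print(constraint_agreement(tree, [(0, 4)]))       # 1.0: "the quick brown fox" is a constituent
print(constraint_agreement(tree, [(1, 3)]))       # 0.0: "quick brown" is not
```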

Citations

Learning with Latent Structures in Natural Language Processing: A Survey

This work surveys three main families of methods for learning with latent structures (surrogate gradients, continuous relaxation, and marginal likelihood maximization via sampling) that incorporate better inductive biases for improved end-task performance and better interpretability.

Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs

This work uses tensor rank decomposition (a.k.a. CPD) to reduce the computational complexity of inference for a subset of factor graph grammars (FGGs) that subsumes HMMs and PCFGs, and conducts experiments on HMM language modeling and unsupervised PCFG parsing, showing better performance.
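
As a rough illustration of the rank-space idea (an assumed toy HMM example, not the paper's FGG or PCFG formulation): with a low-rank transition matrix A = U Vᵀ, one step of the forward recursion can be routed through the rank dimension, costing O(nr) instead of O(n²).

```python
# Toy sketch under assumed shapes: HMM forward recursion with a low-rank
# transition matrix A = U @ V.T, where U, V have shape [n, r] and r << n.
import numpy as np

n, r, T = 1000, 16, 50
rng = np.random.default_rng(0)
U, V = rng.random((n, r)), rng.random((n, r))    # implicit transition A = U @ V.T
emit = rng.random((T, n))                        # toy emission scores p(x_t | z_t)

alpha = np.full(n, 1.0 / n) * emit[0]            # initial forward message
for t in range(1, T):
    rank_msg = alpha @ U                         # project message into rank space: O(n*r)
    alpha = (rank_msg @ V.T) * emit[t]           # back to state space, apply emission: O(n*r)
# alpha.sum() gives the (unnormalized) sequence score under this toy model
```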

References

Showing 1-10 of 73 references

Do latent tree learning models identify meaningful structure in sentences?

This paper replicates two latent tree learning models in a shared codebase and finds that only one of these models outperforms conventional tree-structured models on sentence classification, and its parsing strategies are not especially consistent across random restarts.

Distant supervision for relation extraction without labeled data

This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.

Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing

It is demonstrated that derived constraints aid grammar induction by training Klein and Manning's Dependency Model with Valence (DMV) on this data set: parsing accuracy on Section 23 (all sentences) of the Wall Street Journal corpus jumps to 50.4%, beating previous state-of-the-art by more than 5%.

Unsupervised Parsing via Constituency Tests

An unsupervised parser is designed by specifying a set of transformations and using an unsupervised neural acceptability model to make grammaticality decisions, and the refined model achieves 62.8 F1 on the Penn Treebank test set, an absolute improvement of 7.6 points over the previous best published result.

Gradient-Based Inference for Networks with Output Constraints

This paper presents an inference method for neural networks that enforces deterministic constraints on outputs without performing rule-based post-processing or expensive discrete search, and studies the efficacy of gradient-based inference (GBI) on three tasks with hard constraints: semantic role labeling, syntactic parsing, and sequence transduction.
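
A minimal sketch of the gradient-based inference idea, under assumed interfaces (the model, constraint_loss, and satisfied callables are hypothetical, not from the paper): at test time, copy the trained network and take a few gradient steps on a differentiable constraint-violation penalty for the given input, stopping once the hard constraint holds.

```python
# Hedged sketch of test-time gradient-based inference (GBI); interfaces are assumptions.
import copy
import torch

def gbi(model, x, constraint_loss, satisfied, steps=10, lr=1e-3):
    model = copy.deepcopy(model)                  # leave the trained weights untouched
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        y = model(x)
        if satisfied(y):                          # hard output constraint met: stop early
            break
        opt.zero_grad()
        constraint_loss(y).backward()             # differentiable violation penalty
        opt.step()
    with torch.no_grad():
        return model(x)                           # prediction from the nudged copy
```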

Grammar Induction with Neural Language Models: An Unusual Replication

It is found that this model represents the first empirical success for latent tree learning, and that neural network language modeling warrants further study as a setting for grammar induction.

Unsupervised Multilingual Grammar Induction

A generative Bayesian model is formulated which seeks to explain the observed parallel data through a combination of bilingual and monolingual parameters, and loosely binds parallel trees while allowing language-specific syntactic structure.

Syntactic Structure Distillation Pretraining for Bidirectional Encoders

A knowledge distillation strategy for injecting syntactic biases into BERT pretraining is proposed, distilling the syntactically informative predictions of a hierarchical, albeit harder to scale, syntactic language model.

Unsupervised Parsing with S-DIORA: Single Tree Encoding for Deep Inside-Outside Recursive Autoencoders

S-DIORA, an improved variant of DIORA that encodes a single tree rather than a softly-weighted mixture of trees by employing a hard argmax operation and a beam at each cell in the chart, is introduced.
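
The single-tree encoding can be illustrated with a toy inside pass (an assumed sketch, not S-DIORA itself; compose() and score() are hypothetical stand-ins for DIORA's composition and scoring functions, and the per-cell beam is omitted): each chart cell keeps only its best split rather than a softly-weighted mixture over splits.

```python
# Toy sketch of hard-argmax (single-tree) chart filling; names are assumptions.
def inside_pass_hard(leaves, compose, score):
    n = len(leaves)
    chart = {(i, i + 1): leaves[i] for i in range(n)}        # width-1 cells
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            # keep only the best split (hard argmax) instead of a weighted mixture
            chart[(i, j)] = max(
                (compose(chart[(i, k)], chart[(k, j)]) for k in range(i + 1, j)),
                key=score,
            )
    return chart[(0, n)]                                     # representation of the root
```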

Structured learning with constrained conditional models

This paper presents Constrained Conditional Models (CCMs), a framework that augments linear models with declarative constraints as a way to support decisions in an expressive output space while maintaining modularity and tractability of training, and proposes CoDL, a constraint-driven learning algorithm that uses constraints to guide semi-supervised learning.
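
A compact sketch of CCM-style decoding under assumed inputs (the candidate set, feature function, and constraint predicates below are illustrative, not the paper's notation): each candidate structure receives its linear-model score minus a penalty per violated declarative constraint, and the highest-scoring candidate is returned.

```python
# Illustrative CCM-style decoding; all names and inputs are assumptions.
def ccm_decode(candidates, features, weights, constraints, rho=1.0):
    """Pick the candidate maximizing linear score minus constraint penalties."""
    def score(y):
        linear = sum(weights.get(f, 0.0) * v for f, v in features(y).items())
        violations = sum(1 for holds in constraints if not holds(y))
        return linear - rho * violations
    return max(candidates, key=score)
```
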
...