Probing for Labeled Dependency Trees

@inproceedings{MullerEberstein2022ProbingFL,
  title={Probing for Labeled Dependency Trees},
  author={Max Müller-Eberstein and Rob van der Goot and Barbara Plank},
  booktitle={ACL},
  year={2022}
}
Probing has become an important tool for analyzing representations in Natural Language Processing (NLP). For graphical NLP tasks such as dependency parsing, linear probes are currently limited to extracting undirected or unlabeled parse trees which do not capture the full task. This work introduces DepProbe, a linear probe which can extract labeled and directed dependency parse trees from embeddings while using fewer parameters and compute than prior methods. Leveraging its full task coverage… 
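As a rough illustration of the idea, the sketch below shows one way a purely linear probe can produce both tree structure and dependency labels: a linear transform whose pairwise squared distances approximate tree distances (in the style of structural probes), plus a per-token linear classifier over relation labels. This is a minimal sketch under those assumptions, not the authors' actual DepProbe implementation; the class name, dimensions, and label inventory are illustrative.

# Minimal sketch of a linear labeled-dependency probe (illustrative, not DepProbe itself).
import torch
import torch.nn as nn

class LinearDependencyProbe(nn.Module):
    def __init__(self, emb_dim: int, probe_rank: int, num_labels: int):
        super().__init__()
        # Linear map whose pairwise squared distances should mirror tree distances.
        self.distance_transform = nn.Linear(emb_dim, probe_rank, bias=False)
        # Linear map scoring the dependency relation of each token to its head.
        self.label_scorer = nn.Linear(emb_dim, num_labels, bias=False)

    def forward(self, embeddings: torch.Tensor):
        # embeddings: (seq_len, emb_dim) for a single sentence
        projected = self.distance_transform(embeddings)            # (seq_len, rank)
        diffs = projected.unsqueeze(0) - projected.unsqueeze(1)    # pairwise differences
        squared_distances = (diffs ** 2).sum(-1)                   # (seq_len, seq_len)
        label_logits = self.label_scorer(embeddings)               # (seq_len, num_labels)
        return squared_distances, label_logits

if __name__ == "__main__":
    probe = LinearDependencyProbe(emb_dim=768, probe_rank=128, num_labels=37)
    sentence = torch.randn(6, 768)                 # e.g. BERT vectors for 6 tokens
    distances, labels = probe(sentence)
    print(distances.shape, labels.argmax(-1))      # (6, 6) distances, one label per token

A directed tree could then be decoded from the distances, for instance a minimum spanning tree oriented away from the token whose predicted label is "root"; the exact decoding used by DepProbe is described in the paper itself.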

References

Showing 1-10 of 42 references.
Deep Biaffine Attention for Neural Dependency Parsing
TLDR
This paper uses a larger but more thoroughly regularized parser than other recent BiLSTM-based approaches, with biaffine classifiers to predict arcs and labels, and shows which hyperparameter choices had a significant effect on parsing accuracy, allowing it to achieve large gains over other graph-based approaches.
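As a rough sketch of the biaffine arc scoring described here, the following scores every dependent-head pair with a single bilinear form; the MLP sizes, initialization, and bias handling are simplified assumptions rather than the paper's exact configuration.

# Minimal sketch of a biaffine arc scorer (illustrative; hyperparameters are assumptions).
import torch
import torch.nn as nn

class BiaffineArcScorer(nn.Module):
    def __init__(self, hidden_dim: int, arc_dim: int = 256):
        super().__init__()
        self.head_mlp = nn.Sequential(nn.Linear(hidden_dim, arc_dim), nn.ReLU())
        self.dep_mlp = nn.Sequential(nn.Linear(hidden_dim, arc_dim), nn.ReLU())
        # The extra column acts as a bias term on the head side.
        self.U = nn.Parameter(0.01 * torch.randn(arc_dim, arc_dim + 1))

    def forward(self, states: torch.Tensor) -> torch.Tensor:
        # states: (seq_len, hidden_dim), e.g. BiLSTM outputs for one sentence
        heads = self.head_mlp(states)                                  # (seq_len, arc_dim)
        deps = self.dep_mlp(states)                                    # (seq_len, arc_dim)
        heads = torch.cat([heads, torch.ones(heads.size(0), 1)], dim=-1)
        # scores[i, j]: score of token j being the head of token i
        return deps @ self.U @ heads.t()

if __name__ == "__main__":
    scorer = BiaffineArcScorer(hidden_dim=400)
    print(scorer(torch.randn(7, 400)).shape)   # torch.Size([7, 7])

In the paper, a second biaffine classifier with its own reduced representations handles the relation labels.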
A Structural Probe for Finding Syntax in Word Representations
TLDR
A structural probe is proposed, which evaluates whether syntax trees are embedded in a linear transformation of a neural network’s word representation space, and shows that such transformations exist for both ELMo and BERT but not in baselines, providing evidence that entire syntax trees are embedded implicitly in deep models’ vector geometry.
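For context, the probe described here learns a single matrix B and trains it so that squared distances in the transformed space match distances in the gold parse tree. Paraphrasing the paper's formulation:

\[
d_B(\mathbf{h}_i, \mathbf{h}_j)^2 = \bigl(B(\mathbf{h}_i - \mathbf{h}_j)\bigr)^{\top}\bigl(B(\mathbf{h}_i - \mathbf{h}_j)\bigr)
\]

where h_i and h_j are the contextual vectors of words w_i and w_j, and the training target is the number of edges between the two words in the gold tree.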
A Tale of a Probe and a Parser
TLDR
To explore whether syntactic probes would do better to make use of existing techniques, this work compares the structural probe to a more traditional parser with an identical lightweight parameterisation.
Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing - A Tale of Two Parsers Revisited
TLDR
It is argued that deep contextualized word embeddings, which allow parsers to pack information about global sentence structure into local feature representations, benefit transition-based parsers more than graph-based parsers, making the two approaches virtually equivalent in terms of both accuracy and error profile.
BERT Rediscovers the Classical NLP Pipeline
TLDR
This work finds that the model represents the steps of the traditional NLP pipeline in an interpretable and localizable way, and that the regions responsible for each step appear in the expected sequence: POS tagging, parsing, NER, semantic roles, then coreference.
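This finding rests on probing individual layers. The sketch below is not the paper's edge-probing setup, only a minimal stand-in for the general recipe: fit the same linear classifier on each layer's token representations and compare accuracies to see where a task becomes linearly decodable. The layer representations and labels here are synthetic placeholders.

# Layer-wise linear probing, illustrated on synthetic data (not the paper's setup).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
num_tokens, emb_dim, num_layers, num_tags = 2000, 64, 4, 5
labels = rng.integers(0, num_tags, size=num_tokens)
class_directions = rng.normal(size=(num_tags, emb_dim))
# Pretend deeper layers encode the tag signal more strongly.
layers = [
    rng.normal(size=(num_tokens, emb_dim))
    + strength * np.eye(num_tags)[labels] @ class_directions
    for strength in np.linspace(0.1, 1.0, num_layers)
]

for layer_idx, features in enumerate(layers):
    X_tr, X_te, y_tr, y_te = train_test_split(features, labels, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"layer {layer_idx}: probe accuracy = {probe.score(X_te, y_te):.2f}")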
A Root of a Problem: Optimizing Single-Root Dependency Parsing
TLDR
Two approaches to single-root dependency parsing are presented that yield significant speed-ups; one of them is fully correct and identifies the highest-scoring parse tree.
Universal Dependencies for Finnish
TLDR
This work details the mapping of a previously introduced annotation to the UD standard, describing specific challenges and their resolution, and presents parsing experiments comparing the performance of a state-of-the-art parser trained on a language-specific annotation schema to its performance on the corresponding UD annotation.
Finding Universal Grammatical Relations in Multilingual BERT
TLDR
An unsupervised analysis method is presented that provides evidence that mBERT learns representations of syntactic dependency labels, in the form of clusters which largely agree with the Universal Dependencies taxonomy, suggesting that even without explicit supervision, multilingual masked language models learn certain linguistic universals.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
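As a concrete illustration of the "one additional output layer" recipe, the snippet below uses the Hugging Face transformers library; that tooling is an assumption for the example and not part of the original work. A pretrained BERT encoder is loaded together with a freshly initialized classification head, and the whole model can then be fine-tuned on a downstream task.

# Illustration only: Hugging Face `transformers` is assumed here, not part of the original paper.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# num_labels=2 adds a single randomly initialized output layer on top of the encoder.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("Probing has become an important tool for analyzing representations.", return_tensors="pt")
logits = model(**inputs).logits   # (1, 2) scores from the new head, before any fine-tuning
print(logits.shape)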
A Gold Standard Dependency Corpus for English
TLDR
It is shown that training a dependency parser on a mix of newswire and web data leads to better performance on web data without hurting performance on newswire text, and that gold-standard annotations for non-canonical text can therefore be a valuable resource for parsing.