• Corpus ID: 12438177

A Framework for (Under)specifying Dependency Syntax without Overloading Annotators

Nathan Schneider, Brendan T. O'Connor, Naomi Saphra, David Bamman, Manaal Faruqui, Noah A. Smith, Chris Dyer, Jason Baldridge
We introduce a framework for lightweight dependency syntax annotation. Our formalism builds upon the typical representation for unlabeled dependencies, permitting a simple notation and annotation workflow. Moreover, the formalism encourages annotators to underspecify parts of the syntax if doing so would streamline the annotation process. We demonstrate the efficacy of this annotation on three languages and develop algorithms to evaluate and compare underspecified annotations.
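To make the idea of underspecification concrete, here is a minimal sketch in Python. It is not the paper's notation or its evaluation algorithms: the dictionary representation, the function names `specified_fraction` and `is_compatible`, and the example sentence are all assumptions made for illustration. A partial annotation records a head only for the tokens the annotator chose to commit to, and a full parse counts as compatible if it agrees with every committed head.

```python
from typing import Dict, Optional

# Hypothetical representation (an assumption, not the paper's format):
# token index -> head index (0 = root), or None where the head is left unspecified.
PartialParse = Dict[int, Optional[int]]


def specified_fraction(partial: PartialParse) -> float:
    """Share of tokens whose head the annotator committed to."""
    if not partial:
        return 0.0
    return sum(head is not None for head in partial.values()) / len(partial)


def is_compatible(partial: PartialParse, full: Dict[int, int]) -> bool:
    """True if the full parse agrees with every head the annotator specified."""
    return all(
        head is None or full.get(token) == head
        for token, head in partial.items()
    )


# Tokens: 1=the 2=old 3=man 4=left
# The annotator attaches "the" to "man" and marks "left" as root,
# leaving the heads of "old" and "man" open.
partial = {1: 3, 2: None, 3: None, 4: 0}
full_a = {1: 3, 2: 3, 3: 4, 4: 0}  # agrees with both specified heads
full_b = {1: 2, 2: 3, 3: 4, 4: 0}  # attaches "the" to "old" instead
```

Under these assumptions, `full_a` is compatible with the partial annotation and `full_b` is not, while `specified_fraction(partial)` reports that half of the tokens have a committed head.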

Parse Imputation for Dependency Annotations

This work describes a method for imputing missing dependencies from sentences that have been partially annotated using the Graph Fragment Language, such that a standard dependency parser can then be trained on all annotations.

Simplified Dependency Annotations with GFL-Web

GFL-Web, a web-based interface for syntactic dependency annotation with the lightweight FUDG/GFL formalism, is presented, showing that even novices were, with a bit of training, able to rapidly annotate the syntax of English Twitter messages.

Fill it up: Exploiting partial dependency annotations in a minimum spanning tree parser

This work adapts the unsupervised ConvexMST dependency parser to learn from partial dependencies expressed in the Graph Fragment Language, and shows that obtaining small amounts of direct supervision - here, partial dependency annotations - provides a strong balance between zero and full supervision.

A Dependency Parser for Tweets

A new dependency parser for English tweets, TweeboParser, with conventions informed by the domain; adaptations to a statistical parsing algorithm; and a new approach to exploiting out-of-domain Penn Treebank data are described.

EasyTree: A Graphical Tool for Dependency Tree Annotation

EasyTree, a dynamic graphical tool for dependency tree annotation built in JavaScript using the popular D3 data visualization library, allows annotators to construct and label trees entirely by manipulating graphics, and then export the corresponding data in JSON format.

Shift-Reduce CCG Parsing with a Dependency Model

This paper presents the first dependency model for a shift-reduce CCG parser, and develops a novel training technique using a dependency oracle, in which all derivations are hidden.

Parsing Tweets into Universal Dependencies

This work shows that delivering consistent annotation is challenging because of ambiguity in understanding and explaining tweets, and proposes a new method that distills an ensemble of 20 transition-based parsers into a single one, achieving an improvement of 2.2 in LAS over the un-ensembled baseline and outperforming parsers that are state-of-the-art on other treebanks in both accuracy and speed.
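For readers unfamiliar with the metric, LAS (labeled attachment score) is the percentage of tokens whose predicted head and dependency label both match the gold standard; UAS (unlabeled attachment score) checks the head only. The sketch below is a generic illustration, not the evaluation script used in the work above. The function name `attachment_scores` and the toy sentence are assumptions.

```python
from typing import List, Tuple

Arc = Tuple[int, str]  # (head index, dependency label) for one token


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over the tokens of one sentence."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted parses must cover the same tokens")
    if not gold:
        return 0.0, 0.0
    # UAS counts matching heads; LAS also requires the label to match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    labeled_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * head_hits / n, 100.0 * labeled_hits / n


# Toy three-token sentence: every head is right, but one label is wrong.
gold = [(2, "det"), (3, "nsubj"), (0, "root")]
pred = [(2, "det"), (3, "obj"), (0, "root")]
uas, las = attachment_scores(gold, pred)
```

In this toy case all three heads are correct but only two of the three labels are, so LAS falls below UAS.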

IMST: A Revisited Turkish Dependency Treebank

An attempt at reannotating the treebank from the ground up using the proposed schemes is described, and the consistencies of the two versions of the original treebank are compared via cross-validation using a dependency parser.

Transforming Dependencies into Phrase Structures

This work presents a new algorithm for transforming dependency parse trees into phrase-structure parse trees that is faster than traditional phrase-structure parsing and achieves near the state of the art on both benchmarks.



The Prague Dependency Treebank : Annotation Structure and Support

The contents of the Prague Dependency Treebank are described, from morphology to surface syntax to the deep (underlying) syntax layers of annotation, followed by a more detailed description of the annotation scheme.

Partial Training for a Lexicalized-Grammar Parser

A solution to the annotation bottleneck for statistical parsing is proposed that exploits the lexicalized nature of Combinatory Categorial Grammar, yielding high-precision yet incomplete and noisy data.

Extended Constituent-to-Dependency Conversion for English

A new method for converting English constituent trees in the Penn Treebank annotation style into dependency trees is described; it uses a richer set of edge labels and introduces links to handle long-distance phenomena such as wh-movement and topicalization.

The Penn Discourse Treebank 2.0 Annotation Manual

This report contains the guidelines for the annotation of discourse relations in the Penn Discourse Treebank (http://www.seas.upenn.edu/~pdtb), PDTB. Discourse relations in the PDTB are annotated in

Dependency Parsing

How different parsers and annotation schemes influence the overall NLP pipeline with regard to machine translation, as well as baseline parsing accuracy, is examined.

CoNLL-X Shared Task on Multilingual Dependency Parsing

How treebanks for 13 languages were converted into the same dependency format and how parsing performance was measured is described and general conclusions about multi-lingual parsing are drawn.

Dependency Grammar and Dependency Parsing

This paper will review the state of the art in dependency-based parsing, starting with the theoretical foundations of dependency grammar and moving on to consider both grammar-driven and data-driven methods for dependency parsing.

Supervised Grammar Induction using Training Data with Limited Constituent Information

It is shown that the most informative linguistic constituents are the higher nodes in the parse trees, typically denoting complex noun phrases and sentential clauses, and an adaptation strategy is proposed, which produces grammars that parse almost as well as grammars induced from fully labeled corpora.

Mildly Non-Projective Dependency Structures

This paper reviews and compares the different constraints theoretically, and provides an experimental evaluation using data from two treebanks, investigating how large a proportion of the structures found in the treebanks are permitted under different constraints.

An English dependency treebank à la Tesnière

It is shown how it is possible to transform a PS English treebank to a DS notation that is closer to the one proposed by Tesnière, which the paper will refer to as TDS, and how this representation can incorporate all main advantages of modern DS.