Corpus ID: 13374927

The Penn Discourse TreeBank 2.0.

@inproceedings{Prasad2008ThePD,
  title={The Penn Discourse TreeBank 2.0.},
  author={Rashmi Prasad and Nikhil Dinesh and Alan Lee and Eleni Miltsakaki and Livio Robaldo and Aravind K. Joshi and Bonnie Lynn Webber},
  booktitle={LREC},
  year={2008}
}
We present the second version of the Penn Discourse Treebank, PDTB-2.0, describing its lexically-grounded annotations of discourse relations and their two abstract object arguments over the 1 million word Wall Street Journal corpus. We describe all aspects of the annotation, including (a) the argument structure of discourse relations, (b) the sense annotation of the relations, and (c) the attribution of discourse relations and each of their arguments. We list the differences between PDTB-1.0…

Introducing the Prague Discourse Treebank 1.0
TLDR
The theoretical background is presented; the annotation was performed directly on top of syntactic trees (from the earlier Prague Dependency Treebank 2.5 project), thus benefiting from the linguistic information already present in the same data.
The Penn Discourse Treebank: An Annotated Corpus of Discourse Relations
TLDR
This chapter presents a case study of the Penn Discourse Treebank, focusing in particular on the problem of characterizing and identifying, via annotation, explicit as well as implicit signals of discourse relations, and of designing the overall annotation workflow.
Recognizing Implicit Discourse Relations in the Penn Discourse Treebank
TLDR
An implicit discourse relation classifier for the Penn Discourse Treebank is presented that considers the context of the two arguments, word pair information, and the arguments' internal constituent and dependency parses.
Mapping PDTB-style connective annotation to RST-style discourse annotation
TLDR
A procedure is described for mapping systematically from the first layer to the second with a suitable independently-annotated corpus for a corpus annotated with both types of information.
Realization of Discourse Relations by Other Means: Alternative Lexicalizations
TLDR
This paper describes how the lexicalized discourse relation annotations of the Penn Discourse Treebank led to the discovery of a wide range of additional expressions, annotated as AltLex (alternative lexicalizations) in the PDTB 2.0.
Experiments with Annotating Discourse Relations in the Hindi Discourse Relation Bank
TLDR
This work describes the initial annotation experiments, which have led to the discovery of additional connective classes and the development of a modified sense classification scheme, and proposes some insightful cross-linguistic generalizations by comparisons with the discourse relation distributions of English texts in the PDTB.
The Chinese Discourse TreeBank: a Chinese corpus annotated with discourse relations
TLDR
The paper first characterizes the syntactic and statistical distributions of Chinese discourse connectives as well as the role of Chinese punctuation marks in discourse annotation, and then describes how the annotation strategy procedure is designed based on this characterization.
Realization of Discourse Relations by Other Means: Alternative
Studies of discourse relations have not, in the past, attempted to characterize what serves as evidence for them, beyond lists of frozen expressions, or markers, drawn from a few well-defined…
PDTB-style Discourse Annotation of Chinese Text
TLDR
A discourse annotation scheme for Chinese inspired by the Penn Discourse TreeBank, which makes adaptations based on the linguistic and statistical characteristics of Chinese text and affords a broader perspective on how the generalized lexically grounded approach can flesh itself out in the context of cross-linguistic annotation of discourse relations.
Reflections on the Penn Discourse TreeBank, Comparable Corpora, and Complementary Annotation
TLDR
A comprehensive introduction to the Penn Discourse Treebank is provided to correct some wrong (or perhaps inadvertent) assumptions about the PDTB and its annotation and to explain variations seen in the annotation of comparable resources in other languages and genres to allow developers of future comparable resources to recognize whether the variations are relevant to them.

References

The Penn Discourse Treebank 2.0 Annotation Manual
This report contains the guidelines for the annotation of discourse relations in the Penn Discourse Treebank (http://www.seas.upenn.edu/~pdtb), PDTB. Discourse relations in the PDTB are annotated in…
The Penn Discourse Treebank
TLDR
A preliminary analysis of inter-annotator agreement is presented – both the level of agreement and the types of inter-annotator variation.
Annotating Discourse Connectives and Their Arguments
TLDR
This paper presents an approach to annotating a level of discourse structure that is based on identifying discourse connectives and their arguments, and provides a detailed preliminary analysis of inter-annotator agreement.
Attribution and the (Non-)Alignment of Syntactic and Discourse Arguments of Connectives
TLDR
Comparing the syntactic and discourse annotation of the Penn Discourse Treebank has revealed significant differences between syntactic structure and discourse structure, in terms of the arguments of connectives, due in large part to attribution.
The Penn Discourse TreeBank as a Resource for Natural Language Generation
TLDR
This paper describes how the Penn Discourse TreeBank (PDTB) can serve as a valuable large-scale annotated corpus resource for furthering research in NLG and for inducing models for the development of NLG systems.
Experiments on Sense Annotations and Sense Disambiguation of Discourse Connectives
Discourse connectives can be analyzed as discourse level predicates which project predicate-argument structure on a par with verbs at the sentence level. The Penn Discourse Treebank (PDTB) reflects this…
Annotating Attribution in the Penn Discourse TreeBank
TLDR
An annotation scheme for marking the attribution of abstract objects such as propositions, facts and eventualities associated with discourse relations and their arguments annotated in the Penn Discourse TreeBank is described.
The Proposition Bank: An Annotated Corpus of Semantic Roles
TLDR
An automatic system for semantic role tagging trained on the corpus is described and the effect on its performance of various types of information is discussed, including a comparison of full syntactic parsing with a flat representation and the contribution of the empty trace categories of the treebank.
Building a Large Annotated Corpus of English: The Penn Treebank
TLDR
As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.
Representing Discourse Coherence: A Corpus-Based Study
TLDR
A method for annotating discourse coherence structures that was used to manually annotate a database of 135 texts from the Wall Street Journal and the AP Newswire and found many different kinds of crossed dependencies, as well as many nodes with multiple parents.