The GUM corpus: creating multilayer resources in the classroom
  • Amir Zeldes
  • Computer Science
  • Lang. Resour. Evaluation
  • 1 September 2017
This paper presents the methodology, design principles and detailed evaluation of a new freely available multilayer corpus, collected and edited via classroom annotation using collaborative software.
  • 56
  • 9
ANNIS3: A new architecture for generic corpus query and visualization
This article is concerned with the data structures, properties of query languages, and visualization facilities required for the generic representation of richly annotated, heterogeneous linguistic…
  • 82
  • 8
RIDGES Herbology: designing a diachronic multi-layer corpus
This paper introduces a multi-layer corpus architecture with multiple tokenizations using the open source historical, diachronic corpus of German called Register in Diachronic German Science. The…
  • 11
  • 4
Universal Dependencies 2.1
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development,…
  • 31
  • 3
CityU corpus of essay drafts of English language learners: a corpus of textual revision in second language writing
Learner corpora consist of texts produced by non-native speakers. In addition to these texts, some learner corpora also contain error annotations, which can reveal common errors made by…
  • 14
  • 1
rstWeb - A Browser-based Annotation Interface for Rhetorical Structure Theory and Discourse Relations
This paper presents rstWeb, a new browser-based interface for Rhetorical Structure Theory and other discourse relation annotations. Expanding on previous tools for RST, rstWeb allows annotators to…
  • 20
  • 1
When Annotation Schemes Change, Rules Help: A Configurable Approach to Coreference Resolution beyond OntoNotes
  • Amir Zeldes, Shuo Zhang
  • Computer Science
  • CORBON@HLT-NAACL
  • 2016
This paper approaches the challenge of adapting coreference resolution to different coreference phenomena and mention-border definitions when there is no access to large training data in the desired…
  • 9
  • 1
An NLP Pipeline for Coptic
The Coptic language of Hellenistic era Egypt in the first millennium C.E. is a treasure trove of information for History, Religious Studies, Classics, Linguistics and many other Humanities…
  • 6
  • 1
[tiger2] As a standardized serialisation for ISO 24615 - SynAF
This paper presents the application of the [tiger2] format to various linguistic scenarios with the aim of making it the standard serialisation for the ISO 24615 (SynAF) standard. After outlining the main…
  • 9
  • 1
Building and Using a Richly Annotated Interlinear Diachronic Corpus: The Case of Old High German Tatian
The present paper reports on the development and evaluation of a historical corpus designed to support detailed empirical studies on the interaction of information structure and syntax in Old High…
  • 8
  • 1