#hardtoparse: POS Tagging and Parsing the Twitterverse
We evaluate the statistical dependency parser, Malt, on a new dataset of sentences taken from tweets. We use a version of Malt which is trained on gold standard phrase structure Wall Street Journal…
  • 122
  • 17
From News to Comment: Resources and Benchmarks for Parsing the Language of Web 2.0
We investigate the problem of parsing the noisy language of social media. We evaluate four Wall-Street-Journal-trained statistical parsers (Berkeley, Brown, Malt and MST) on a new dataset containing…
  • 79
  • 11
Building a wordnet for Turkish
This paper summarizes the development process of a wordnet for Turkish as part of the Balkanet project. After discussing the basic methodological issues that had to be resolved during the course of…
  • 72
  • 10
(Re)ranking Meets Morphosyntax: State-of-the-art Results from the SPMRL 2013 Shared Task
This paper describes the IMS-SZEGED-CIS contribution to the SPMRL 2013 Shared Task. We participate in both the constituency and dependency tracks, and achieve state-of-the-art for all languages. For…
  • 46
  • 10
Challenges of Computational Processing of Code-Switching
This paper addresses challenges of Natural Language Processing (NLP) on non-canonical multilingual data in which two or more languages are mixed. It refers to code-switching which has become more…
  • 41
  • 5
Introducing the IMS-Wrocław-Szeged-CIS entry at the SPMRL 2014 Shared Task: Reranking and Morpho-syntax meet Unlabeled Data
We summarize our approach taken in the SPMRL 2014 Shared Task on parsing morphologically rich languages. Our approach builds upon our contribution from last year, with a number of modifications and…
  • 19
  • 5
Lemmatization and Lexicalized Statistical Parsing of Morphologically-Rich Languages: the Case of French
This paper shows that training a lexicalized parser on a lemmatized morphologically-rich treebank such as the French Treebank slightly improves parsing results. We also show that lemmatizing a…
  • 27
  • 3
ParGramBank: The ParGram Parallel Treebank
This paper discusses the construction of a parallel treebank currently involving ten languages from six language families. The treebank is based on deep LFG (Lexical-Functional Grammar) grammars that…
  • 27
  • 2
LFG without C-structures
We explore the use of two dependency parsers, Malt and MST, in a Lexical Functional Grammar parsing pipeline. We compare this to the traditional LFG parsing pipeline which uses constituency parsers.
  • 19
  • 2
Irish Treebanking and Parsing: A Preliminary Evaluation
Language resources are essential for linguistic research and the development of NLP applications. Low-density languages, such as Irish, therefore lack significant research in this area. This paper…
  • 13
  • 2