Domain Adaptation for Parsing
TLDR
We compare two different methods in domain adaptation applied to constituent parsing: parser combination and cotraining, each used to transfer information from the source domain of news to the target domain of natural dialogs, in a setting without annotated data.
  • 47
  • 3
The IUCL+ System: Word-Level Language Identification via Extended Markov Models
TLDR
We describe the IUCL+ system for the shared task of the First Workshop on Computational Approaches to Code Switching (Solorio et al., 2014), in which participants were challenged to label each word in Twitter texts as a named entity or one of two candidate languages.
  • 16
  • 2
Adding Context Information to Part Of Speech Tagging for Dialogues
TLDR
We investigate the performance of Markov model and maximum entropy POS taggers given a small data set of spontaneous dialogues in a collaborative search task.
  • 11
  • 1
Mirroring the real world in social media: twitter, geolocation, and sentiment analysis
TLDR
In recent years, social media has been used to characterize and predict real-world events, and we seek to investigate how closely Twitter mirrors the real world.
  • 32
Fast Domain Adaptation for Part of Speech Tagging for Dialogues
TLDR
We investigate a fast method for domain adaptation, which provides additional in-domain training data from an unannotated data set by applying POS taggers with different biases to the data set and then choosing the set of sentences on which the taggers agree.
  • 16
Parallel Syntactic Annotation in CReST
TLDR
In this paper, we introduce the syntactic annotation of the CReST corpus, a corpus of natural language dialogues obtained from humans performing a cooperative, remote search task.
  • 2
Projecting Farsi POS Data To Tag Pashto
TLDR
We present our findings on projecting part-of-speech (POS) information from a well-resourced language, Farsi, to help tag a lower-resourced language, Pashto, following Feldman and Hana.