Publications
Domain Adaptation for Parsing
We compare two different methods in domain adaptation applied to constituent parsing: parser combination and cotraining, each used to transfer information from the source domain of news to the target…
The IUCL+ System: Word-Level Language Identification via Extended Markov Models
TLDR
The IUCL+ system combines character n-gram probabilities, lexical probabilities, word label transition probabilities, and existing named entity recognition tools within a Markov model framework that weights these components and assigns a label.
Word-level language identification in The Chymistry of Isaac Newton
TLDR
The task of word-based language identification in multilingual texts, in which every word must be classified by its language, is introduced. A novel method is presented that combines character n-grams with a weighting scheme that models the probability of language switches at different points in a sentence.
Shallow Semantic Analysis of Interactive Learner Sentences
TLDR
This paper collects data from a task which models some aspects of interaction, namely a picture description task (PDT), and uses a decision tree to classify sentences into syntactic types and extract the logical subject, verb, and object.
Leveraging known Semantics for Spelling Correction
TLDR
This work explores the use of spelling correction tools and language modeling to correct misspellings that often lead to errors in obtaining semantic forms, and shows that such tools can significantly reduce the number of unanalyzable cases.
Shallow Semantic Reasoning from an Incomplete Gold Standard for Learner Language
TLDR
Different models for representing and scoring non-native speaker responses to a picture are explored, including bags of dependencies and automatically determining the relevant parts of an image from a set of native speaker (NS) responses.
Annotating picture description task responses for content analysis
TLDR
By examining the decisions made in this corpus development, this work highlights the questions facing anyone working with learner language properties such as variability, acceptability, and native-likeness.
IUCL: Combining Information Sources for SemEval Task 5
TLDR
The Indiana University system for SemEval Task 5, the L2 writing assistant task, is described, incorporating phrase tables extracted from bitexts, an L2 language model, a multilingual dictionary, and dependency-based collocational models derived from large samples of target-language text.