This paper reports our ongoing project for constructing an English multiword expression (MWE) dictionary and NLP tools based on the developed dictionary. We extracted functional MWEs from the English part of Wiktionary, annotated the Penn Treebank (PTB) with MWE information, and conducted POS tagging experiments. We report how the MWE annotation is done on …
We propose a framework to model human comprehension of discourse connectives. Following the Bayesian pragmatic paradigm, we advocate that discourse connectives are interpreted based on a simulation of the production process by the speaker, who, in turn, considers the ease of interpretation for the listener when choosing connectives. Evaluation against the …
Discourse relations can be either implicit or explicitly expressed by markers, such as 'therefore' and 'but'. How a speaker makes this choice is a question that is not well understood. We propose a psycholinguistic model that predicts whether a speaker will produce an explicit marker given the discourse relation s/he wishes to express. Based on the …
Professional human translators usually do not employ the concept of word alignments, producing translations 'sense-for-sense' instead of 'word-for-word'. This suggests that unalignable words may be prevalent in the parallel text used for machine translation (MT). We analyze this phenomenon in depth for Chinese-English translation. We further propose a …