UdS-(retrain|distributional|surface): Improving POS Tagging for OOV Words in German CMC and Web Data

This paper presents our three system submissions for the POS tagging subtask of the Empirist Shared Task. Our baseline system, UdS-retrain, extends a standard training dataset with in-domain training data. UdS-distributional and UdS-surface build on this baseline with two different ways of handling OOV words: the former uses distributional information, and the latter uses a combination of surface similarity and language model information. The distributional model achieves the best performance.

