Selection Bias, Label Bias, and Bias in Ground Truth

Anders Søgaard, Barbara Plank, and Dirk Hovy
Language technology is biased toward English newswire. In POS tagging, we get 97–98 words out of 100 right on English newswire, but accuracy drops to about 8 out of 10 when the same technology is run on Twitter data. In dependency parsing, we can identify the syntactic head of 9 out of 10 words in English newswire, but only 6–7 out of 10 in tweets…