Cognate and Misspelling Features for Natural Language Identification

Garrett Nicolai, Bradley Hauer, Mohammad Salameh, Lei Yao, Grzegorz Kondrak
We apply Support Vector Machines to differentiate between 11 native languages in the 2013 Native Language Identification Shared Task. We expand a set of common language identification features to include cognate interference and spelling mistakes. Our best results are obtained with a classifier that includes both the cognate and the misspelling features, as well as word unigrams, word bigrams, character bigrams, and syntax production rules.
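As a rough illustration of the surface features named above, the sketch below counts word unigrams, word bigrams, and character bigrams for a single essay. It is not the authors' implementation: the helper name `extract_features`, the feature-name prefixes (`w1=`, `w2=`, `c2=`), and the whitespace tokenization are all assumptions made for the example, and the cognate, misspelling, and syntax production rule features are left out.

```python
from collections import Counter


def extract_features(text):
    """Count word unigrams, word bigrams, and character bigrams in one essay.

    Hypothetical helper for illustration only; the prefixes "w1=", "w2=",
    and "c2=" are assumed names, not taken from the paper.
    """
    # Naive lowercasing and whitespace tokenization (an assumption).
    tokens = text.lower().split()
    features = Counter()

    for token in tokens:
        # Word unigram.
        features["w1=" + token] += 1
        # Character bigrams inside the token.
        for i in range(len(token) - 1):
            features["c2=" + token[i:i + 2]] += 1

    # Word bigrams over adjacent tokens.
    for left, right in zip(tokens, tokens[1:]):
        features["w2=" + left + " " + right] += 1

    return features


features = extract_features("The cat sat on the mat")
```

Sparse count dictionaries of this kind could then be vectorized and passed to any SVM implementation, with one class per native language.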
