Topic Modeling for Native Language Identification

Sze-Meng Jojo Wong, Mark Dras, Mark Johnson
Native language identification (NLI) is the task of determining the native language of an author writing in a second language. Several pieces of earlier work have found that features such as function words, part-of-speech n-grams and syntactic structure are helpful in NLI, perhaps representing characteristic errors of different native language speakers. This paper looks at the idea of using Latent Dirichlet Allocation as a feature clustering technique over lexical features to see whether there…




