Automatic Language Identification in Texts: A Survey


Language identification (LI) is the problem of determining the natural language that a document or part thereof is written in. Automatic LI has been extensively researched for over fifty years. Today, LI is a key part of many text processing pipelines, as text processing techniques generally assume that the language of the input text is known. Research in…

12 Figures and Tables

