Language Recognition for Mono- and Multi-lingual Documents

  • Published 2007
In this paper we describe language recognition algorithms for mono- and multi-lingual documents that are based on mixed-order n-grams, Markov chains, maximum likelihood, and dynamic programming. We compare the monolingual algorithm to those suggested by other researchers. The comparison suggests that it significantly outperforms commonly used language recognition algorithms. We then describe the multilingual algorithm, which allows for segmenting a multilingual document into single…
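To make the monolingual idea concrete, here is a minimal sketch of maximum-likelihood language identification with a character-level Markov chain. It is not the paper's algorithm: it assumes a fixed-order bigram model with add-one smoothing instead of mixed-order n-grams, the training strings are toy data, and the names `train_bigram_model`, `log_likelihood`, and `identify` are hypothetical.

```python
import math
from collections import Counter


def train_bigram_model(text):
    """Count character bigrams and left-context characters in a training text."""
    padded = " " + text.lower()  # leading space marks the start of the text
    return {
        "bigrams": Counter(zip(padded, padded[1:])),
        "contexts": Counter(padded[:-1]),
        "alphabet": set(padded),
    }


def log_likelihood(text, model, vocab_size):
    """Log-probability of text under a first-order Markov chain (add-one smoothing)."""
    padded = " " + text.lower()
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        numerator = model["bigrams"][(prev, cur)] + 1
        denominator = model["contexts"][prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


def identify(text, models):
    """Return the language whose model assigns the text the highest likelihood."""
    # Shared vocabulary size; the +1 reserves probability mass for unseen characters.
    vocab_size = len(set().union(*(m["alphabet"] for m in models.values()))) + 1
    return max(models, key=lambda lang: log_likelihood(text, models[lang], vocab_size))


# Toy training data (an assumption for illustration, far too small for real use).
samples = {
    "en": "the quick brown fox jumps over the lazy dog and then the dog sleeps in the sun",
    "de": "der schnelle braune fuchs springt über den faulen hund und dann schläft der hund in der sonne",
}
models = {lang: train_bigram_model(text) for lang, text in samples.items()}

print(identify("the dog is in the house", models))
print(identify("der hund ist in dem haus", models))
```

A multilingual variant along the lines the abstract mentions could score each word or window under every model and use dynamic programming to choose the best sequence of language labels; that step is not shown here.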


