A Model for Overlapping Trigram Technique for Telugu Script

@inproceedings{Vardhan2007AMF,
  title={A Model for Overlapping Trigram Technique for Telugu Script},
  author={B. Vishnu Vardhan and L. Pratap Reddy and A. VinayBabu},
  year={2007}
}
N-grams are consecutive overlapping N-character sequences formed from an input stream. N-grams are used as alternatives to word-based retrieval in a number of systems. In this paper we propose a model applicable to the categorization of Telugu documents. Telugu is an official language whose script derives from the ancient Brahmi script, and it is the official language of the state of Andhra Pradesh. Brahmi-based scripts are noted for complex conjunct formations. The canonical structure is described as ((C) C) CV. The…
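
The definition of overlapping N-grams given in the abstract can be illustrated with a minimal Python sketch. It is not the model proposed in the paper: the function names `char_ngrams` and `ngram_profile` are hypothetical, and the sketch assumes that each Unicode code point counts as one character, whereas a Telugu-specific model may instead segment text into units that follow the ((C) C) CV structure.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the consecutive overlapping n-character sequences of text.

    Assumption: one Python str element (a Unicode code point) is one
    character. Consonant conjuncts and vowel signs in Telugu are therefore
    split across several positions.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Slide a window of width n one position at a time, so neighbouring
    # n-grams share n - 1 characters.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_profile(text, n=3):
    """Count how often each n-gram occurs (hypothetical helper)."""
    return Counter(char_ngrams(text, n))


if __name__ == "__main__":
    print(char_ngrams("telugu"))
    print(ngram_profile("banana").most_common(1))
```

A frequency profile of this kind is one common input to N-gram based categorization. Whether the paper builds its categorizer this way is not stated in the excerpt above.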