Web page classification using n-gram based URL features

@inproceedings{Rajalakshmi2013WebPC,
  title={Web page classification using n-gram based URL features},
  author={Ramachandran Rajalakshmi and Chandrabose Aravindan},
  booktitle={2013 Fifth International Conference on Advanced Computing (ICoAC)},
  year={2013},
  pages={15--21}
}
The exponential increase in the number of web pages on the World Wide Web poses a great challenge for information filtering and makes topic-focused crawling a time-consuming way to search for relevant information. We propose a URL-based web page classification method that needs neither the web page content nor its link structure. In the proposed approach, character n-gram based features are extracted from URLs alone and classification is done by Support Vector Machines and Maximum…
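
As a rough illustration of what character n-gram features taken from a URL alone might look like, here is a minimal Python sketch. It is not the authors' implementation: the function names, the choice of n = 3 and 4, the lowercasing step, and the example URL are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def char_ngrams(url: str, n: int = 3) -> list[str]:
    """Return all overlapping character n-grams of a lowercased URL."""
    text = url.lower()  # assumption: case is ignored
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_features(url: str, n_values=(3, 4)) -> Counter:
    """Count n-grams of several lengths; the counts act as a sparse feature vector."""
    counts = Counter()
    for n in n_values:  # assumption: the n values are illustrative only
        counts.update(char_ngrams(url, n))
    return counts


# Hypothetical URL used only to show the call.
features = ngram_features("http://www.example.edu/cs/courses")
```

Count vectors of this kind could then be passed to a classifier such as a Support Vector Machine, which is the family of models the abstract names.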

