Web spam identification through content and hyperlinks

Abstract

We present an algorithm, WITCH, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard Web spam benchmark.
DOI: 10.1145/1451983.1451994

Statistics

[Figure: Citations per Year, 2008 to 2018]

Semantic Scholar estimates that this publication has 93 citations based on the available data.

Cite this paper

@inproceedings{Abernethy2008WebSI,
  title={Web spam identification through content and hyperlinks},
  author={Jacob D. Abernethy and Olivier Chapelle and Carlos Castillo},
  booktitle={AIRWeb},
  year={2008}
}