Collective Classification of Social Network Spam

Abstract

Unsolicited or unwanted messages are a byproduct of virtually every popular social media website. Spammers have become increasingly proficient at bypassing conventional spam filters, prompting a stronger effort to develop new methods that detect spam accurately while remaining robust against users who modify their behavior to avoid detection. This paper shows the usefulness of a relational model that works in conjunction with an independent model. First, an independent model is built using features that characterize individual comments and users, capturing the cases where spam is obvious. Second, a relational model is built that takes advantage of the interconnected nature of users and their comments. By feeding the initial predictions from the independent model into the relational model, we can propagate information about spammers and spam comments and jointly infer the labels of all comments at the same time. This allows us to capture obfuscated spam comments that the independent model misses and that can only be found by looking at the relational structure of the social network. The results of our experiments demonstrate the viability of our method and show that models utilizing the underlying structure of the social network are more effective at detecting spam than ones that do not.
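
The two-stage idea described above can be illustrated with a toy sketch. This is not the relational model used in the paper, which the abstract does not specify; it is a simple score-averaging scheme chosen only to show how independent predictions might be propagated along user-comment links. The function name `propagate_spam_scores`, its parameters, and the blending rule are all hypothetical.

```python
from collections import defaultdict


def propagate_spam_scores(independent_scores, author_of, rounds=10, weight=0.5):
    """Toy propagation of spam scores over a user-comment graph.

    independent_scores: dict comment_id -> spam probability from some
        independent (per-comment) model. Assumed to be given.
    author_of: dict comment_id -> user_id.
    weight: how much a comment's score is pulled toward its author's score
        (a hypothetical knob, not a parameter from the paper).
    """
    # Group comments by the user who wrote them.
    comments_by_user = defaultdict(list)
    for comment, user in author_of.items():
        comments_by_user[user].append(comment)

    scores = dict(independent_scores)
    for _ in range(rounds):
        # A user's score is the mean of the current scores of their comments.
        user_score = {
            user: sum(scores[c] for c in comments) / len(comments)
            for user, comments in comments_by_user.items()
        }
        # Each comment's score blends its independent score with its author's.
        scores = {
            c: (1 - weight) * independent_scores[c] + weight * user_score[author_of[c]]
            for c in scores
        }
    return scores
```

In this sketch, a comment that looks benign in isolation receives a higher score when the same user has written other comments that the independent model flags confidently, which is the kind of effect the abstract attributes to the relational structure.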

Cite this paper

@inproceedings{Brophy2016CollectiveCO, title={Collective Classification of Social Network Spam}, author={Jonathan Brophy and Daniel Lowd}, year={2016} }