Fraudulent Support Telephone Number Identification Based on Co-Occurrence Information on the Web

Abstract

“Fraudulent support phones” are misleading telephone numbers, placed on Web pages or other media, that claim to provide services with which they are not associated. Most fraudulent support phone information is found on search engine result pages (SERPs), where it substantially degrades the search engine user experience. In this paper, we propose an approach to identify fraudulent support telephone numbers on the Web based on the co-occurrence relations between telephone numbers that appear on SERPs. We start from a small set of seed official support phone numbers and seed fraudulent numbers. We then construct a co-occurrence graph from the telephone numbers that appear together on Web pages. We also take page layout information into account, on the assumption that telephone numbers appearing in nearby page blocks should be regarded as more closely related. Finally, we develop a propagation algorithm that diffuses the trust scores of the seed official support phone numbers and the distrust scores of the seed fraudulent numbers over the co-occurrence graph to detect additional fraudulent numbers. Experimental results based on over 1.5 million SERPs produced by a popular Chinese commercial search engine indicate that our approach outperforms the TrustRank, Anti-TrustRank, and Good-Bad Rank algorithms, achieving an AUC value of over 0.90.
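
The pipeline outlined in the abstract (seed numbers, a layout-aware co-occurrence graph, and score propagation) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give the edge-weighting formula or the propagation rule, so the inverse block-distance weight, the personalized-PageRank-style update, the damping factor of 0.85, and every function name below are assumptions chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(pages):
    """Build a weighted, undirected co-occurrence graph.

    `pages` is a list of pages; each page is a list of
    (phone_number, block_index) pairs. The weight 1 / (1 + block distance)
    is an assumed stand-in for the paper's layout-based weighting.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for page in pages:
        for (a, block_a), (b, block_b) in combinations(page, 2):
            if a == b:
                continue  # the same number repeated on a page adds no edge
            weight = 1.0 / (1 + abs(block_a - block_b))
            graph[a][b] += weight
            graph[b][a] += weight
    return graph


def propagate(graph, seeds, damping=0.85, iterations=50):
    """Diffuse seed scores over the graph (personalized-PageRank-style).

    This update rule is an assumption, not the propagation algorithm
    proposed in the paper.
    """
    nodes = set(graph) | set(seeds)
    base = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(base)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * base[n] for n in nodes}
        for n in nodes:
            neighbours = graph.get(n, {})
            total = sum(neighbours.values())
            if total == 0:
                continue  # isolated node: nothing to pass on
            for m, w in neighbours.items():
                nxt[m] += damping * score[n] * w / total
        score = nxt
    return score


def fraud_scores(graph, official_seeds, fraudulent_seeds):
    """Return distrust minus trust per number; higher means more suspicious."""
    trust = propagate(graph, official_seeds)
    distrust = propagate(graph, fraudulent_seeds)
    nodes = set(trust) | set(distrust)
    return {n: distrust.get(n, 0.0) - trust.get(n, 0.0) for n in nodes}


# Hypothetical toy data: two pages, each listing two numbers with block indices.
pages = [
    [("400-111", 0), ("400-222", 1)],
    [("400-999", 0), ("400-888", 0)],
]
scores = fraud_scores(build_graph(pages), {"400-111"}, {"400-999"})
```

In this toy input, a number that only co-occurs with a fraudulent seed receives a positive score, and one that only co-occurs with an official seed receives a negative score. Combining the two diffusions by simple subtraction is likewise an illustrative choice.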

Cite this paper

@inproceedings{Li2014FraudulentST, title={Fraudulent Support Telephone Number Identification Based on Co-Occurrence Information on the Web}, author={Xin Li and Yiqun Liu and Min Zhang and Shaoping Ma}, booktitle={AAAI}, year={2014} }