URL2Vec: URL Modeling with Character Embeddings for Fast and Accurate Phishing Website Detection

@inproceedings{Yuan2018URL2VecUM,
  title={URL2Vec: URL Modeling with Character Embeddings for Fast and Accurate Phishing Website Detection},
  author={Huaping Yuan and Zhenguo Yang and Feng Shi and Yukun Li and Wenyin Liu},
  booktitle={2018 IEEE Intl Conf on Parallel \& Distributed Processing with Applications, Ubiquitous Computing \& Communications, Big Data \& Cloud Computing, Social Computing \& Networking, Sustainable Computing \& Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom)},
  year={2018},
  pages={265--272}
}
A deep learning-based approach to phishing detection is proposed. Specifically, websites' URLs and the characters in these URLs are mapped to documents and words, respectively, in the context of word2vec-based word embedding learning. Consequently, character embeddings can be learned from a corpus of URLs in an unsupervised manner. Furthermore, we combine the character embeddings with the structures of URLs to obtain vector representations of the URLs. In particular, a URL is partitioned into…
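The URL-as-document, character-as-word mapping described in the abstract can be illustrated with a minimal sketch. It only prepares the training pairs that a word2vec-style model would consume; it does not train embeddings. The skip-gram formulation, the window size, the helper names `url_to_tokens` and `skipgram_pairs`, and the example URLs are all assumptions made for illustration, not details taken from the paper.

```python
def url_to_tokens(url):
    """Treat a URL as a 'document' whose 'words' are its individual characters."""
    return list(url)


def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within a symmetric window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical URLs, used only to show the data flow.
urls = ["http://example.com/login", "http://example.org/"]

# Corpus of character sequences: one "document" per URL.
corpus = [url_to_tokens(u) for u in urls]

# Character vocabulary observed in the corpus.
vocab = sorted({ch for doc in corpus for ch in doc})

# Unsupervised training pairs: no phishing labels are needed at this stage.
pairs = [p for doc in corpus for p in skipgram_pairs(doc, window=2)]
```

Because the pairs are derived from the URL strings alone, this stage matches the abstract's point that character embeddings come from an unlabeled URL corpus. How the paper then combines those embeddings with URL structure is not shown here.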

Key Quantitative Results

  • Extensive experiments on two real-world datasets show the effectiveness of the proposed approach, which achieves an accuracy of 99.69%, with a false positive rate of 0.40% and a true positive rate of 99.79%, on the 1M-PD dataset.
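
For reference, the three reported quantities are standard confusion-matrix ratios. The sketch below shows how they are typically computed, assuming phishing is treated as the positive class; the function name `detection_metrics` and the counts in the usage line are made up for illustration and are not the paper's data.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute accuracy, false positive rate, and true positive rate from counts.

    tp: phishing URLs flagged as phishing
    fp: legitimate URLs flagged as phishing
    tn: legitimate URLs passed as legitimate
    fn: phishing URLs passed as legitimate
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),
        "true_positive_rate": tp / (tp + fn),
    }


# Illustrative counts only (not from the paper).
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
```

Note that accuracy depends on the class balance of the dataset, while the false positive and true positive rates are each computed within a single class, which is why all three are usually reported together.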
