CANTINA+: A Feature-Rich Machine Learning Framework for Detecting Phishing Web Sites
@article{Xiang2011CANTINAAF, title={CANTINA+: A Feature-Rich Machine Learning Framework for Detecting Phishing Web Sites}, author={Guang Xiang and Jason I. Hong and Carolyn Penstein Ros{\'e} and Lorrie Faith Cranor}, journal={ACM Trans. Inf. Syst. Secur.}, year={2011}, volume={14}, pages={21:1-21:28} }
Phishing is a plague in cyberspace. Typically, phish detection methods either use human-verified URL blacklists or exploit Web page features via machine learning techniques. However, the former is frail in terms of new phish, and the latter suffers from the scarcity of effective features and the high false positive rate (FP). To alleviate those problems, we propose a layered anti-phishing solution that aims at (1) exploiting the expressiveness of a rich set of features with machine learning to…
400 Citations
PhishMon: A Machine Learning Framework for Detecting Phishing Webpages
- Computer Science, 2018 IEEE International Conference on Intelligence and Security Informatics (ISI)
- 2018
Through extensive evaluation on a dataset consisting of 4,800 distinct phishing and 17,500 distinct benign webpages, it is shown that PhishMon can distinguish unseen phishing from legitimate webpages with a very high degree of accuracy.
An Adaptive Machine Learning Based Approach for Phishing Detection Using Hybrid Features
- Computer Science, 2019 5th International Conference on Web Research (ICWR)
- 2019
This work develops a reliable detection system that adapts to the changing environment and to new phishing websites, and does not require any third-party service.
Efficient deep learning techniques for the detection of phishing websites
- Computer Science
- 2020
Novel phishing URL detection models using Deep Neural Network, Long Short-Term Memory, and Convolutional Neural Network architectures are proposed using only 10 features from earlier work, achieving accuracies of 99.52% for DNN, 99.57% for LSTM, and 99.43% for CNN.
Machine Learning Techniques for Detection of Website Phishing: A Review for Promises and Challenges
- Computer Science, 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC)
- 2021
The review suggests that Internet users should learn about phishing to avoid cyber-attacks, and identifies deep learning-based techniques as performing better than conventional ML techniques at detecting phishing websites.
A comprehensive and efficacious architecture for detecting phishing webpages
- Computer Science, Comput. Secur.
- 2014
Phishing Website Detection Based on Multidimensional Features Driven by Deep Learning
- Computer Science, IEEE Access
- 2019
A multidimensional-feature phishing detection approach, based on a fast deep-learning detection method, that can reduce the detection time for setting a threshold; experimental results show that the detection efficiency can be improved.
Towards detection of phishing websites on client-side using machine learning based approach
- Computer Science, Telecommun. Syst.
- 2018
A novel machine-learning-based anti-phishing approach that extracts features from the client side only and detects phishing websites with relatively high accuracy, achieving a 99.39% true positive rate and 99.09% overall detection accuracy.
Boosting the phishing detection performance by semantic analysis
- Computer Science, 2017 IEEE International Conference on Big Data (Big Data)
- 2017
This work extracts a series of semantic features through word2vec to better describe phishing sites, and fuses them with other multi-scale statistical features to construct a more robust phishing detection model.
Building Robust Phishing Detection System: an Empirical Analysis
- Computer Science, Proceedings 2020 Workshop on Measurements, Attacks, and Defenses for the Web
- 2020
This work proposes a simple voting-based approach to building a robust phishing page detection system that performs close to the native model when there is no adversarial attack and is more robust against evasion attacks than the native model.
An Automated Framework for Real-time Phishing URL Detection
- Computer Science, 2020 10th Annual Computing and Communication Workshop and Conference (CCWC)
- 2020
The framework is validated on a real dataset, achieving 87% accuracy in a real-time setup, and offers a robust approach to fast, automated detection of phishing URLs.
References
SHOWING 1-10 OF 41 REFERENCES
A Hierarchical Adaptive Probabilistic Approach for Zero Hour Phish Detection
- Computer Science, ESORICS
- 2010
The key insight behind the detection algorithm is to leverage existing human-verified blacklists and apply the shingling technique, a popular near-duplicate detection algorithm used by search engines, to detect phish in a probabilistic fashion with very high accuracy.
A hybrid phish detection approach by identity discovery and keywords retrieval
- Computer Science, WWW '09
- 2009
A novel hybrid phish detection method based on information extraction (IE) and information retrieval (IR) techniques that requires no training data, no prior knowledge of phishing signatures and specific implementations, and is able to adapt quickly to constantly appearing new phishing patterns.
PhishDef: URL names say it all
- Computer Science, 2011 Proceedings IEEE INFOCOM
- 2011
This paper proposes PhishDef, a phishing detection system that uses only URL names and combines the above three elements, a highly accurate method, lightweight (thus appropriate for online and client-side deployment), proactive (based on online classification rather than blacklists), and resilient to training data inaccuracies.
A layout-similarity-based approach for detecting phishing pages
- Computer Science, 2007 Third International Conference on Security and Privacy in Communications Networks and the Workshops - SecureComm 2007
- 2007
An extension of the AntiPhish system (called DOMAntiPhish) is presented, which leverages layout similarity information to distinguish between malicious and benign web pages and significantly reduces the false alarm rate.
Learning to detect phishing emails
- Computer Science, WWW '07
- 2007
This method is applicable, with slight modification, to detection of phishing websites or the emails used to direct victims to these sites, and correctly identifies over 96% of the phishing emails while misclassifying only on the order of 0.1% of the legitimate emails.
Anomaly Based Web Phishing Page Detection
- Computer Science, 2006 22nd Annual Computer Security Applications Conference (ACSAC'06)
- 2006
The idea is to examine the anomalies in Web pages, in particular, the discrepancy between a Web site's identity and its structural features and HTTP transactions, which demands neither user expertise nor prior knowledge of the Web site.
A framework for detection and measurement of phishing attacks
- Computer Science, WORM '07
- 2007
It is found that it is often possible to tell whether or not a URL belongs to a phishing attack without requiring any knowledge of the corresponding page data.
Visual-similarity-based phishing detection
- Computer Science, SecureComm
- 2008
This paper identifies and considers three page features that play a key role in making a phishing page look similar to a legitimate one and performs an experimental evaluation using a dataset composed of 41 real-world phishing pages, along with their corresponding legitimate targets.
On the Effectiveness of Techniques to Detect Phishing Sites
- Computer Science, DIMVA
- 2007
Over a period of three weeks, the effectiveness of the blacklists maintained by Google and Microsoft was tested with 10,000 phishing URLs, and the existence of page properties that can be used to identify phishing pages was explored.
Cantina: a content-based approach to detecting phishing web sites
- Computer Science, WWW '07
- 2007
The design, implementation, and evaluation of CANTINA, a novel, content-based approach to detecting phishing web sites, based on the TF-IDF information retrieval algorithm, are presented.
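For readers unfamiliar with the TF-IDF algorithm named in the entry above, the following is a minimal sketch of TF-IDF term scoring. It is not CANTINA's actual implementation: the function name `top_tfidf_terms`, the pre-tokenized inputs, the toy corpus, and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
from collections import Counter


def top_tfidf_terms(page_tokens, corpus, k=5):
    """Return the k terms of page_tokens with the highest TF-IDF score.

    page_tokens: list of tokens from the page being analyzed (hypothetical input).
    corpus: list of token lists, used only to estimate document frequency.
    """
    n_docs = len(corpus)

    # Document frequency: number of corpus documents containing each term.
    df = Counter()
    for doc in corpus:
        df.update(set(doc))

    tf = Counter(page_tokens)
    scores = {}
    for term, count in tf.items():
        # Smoothed IDF (an assumption) so unseen terms do not divide by zero.
        idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
        scores[term] = (count / len(page_tokens)) * idf

    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy data, invented for this example only.
corpus = [
    ["login", "bank", "the"],
    ["the", "news", "weather"],
    ["the", "shop", "cart"],
]
page = ["the", "paypal", "paypal", "login", "the"]
print(top_tfidf_terms(page, corpus, k=2))
```

Terms that are frequent on the page but rare in the corpus rank highest, which is why a brand name repeated on a page tends to outrank common words.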