Using phrases as features in email classification

@article{Chang2009UsingPA,
  title={Using phrases as features in email classification},
  author={Matthew Daniel Chang and Chung Keung Poon},
  journal={Journal of Systems and Software},
  year={2009},
  volume={82},
  pages={1036--1045}
}
In this paper, we report our experience with using phrases as basic features in the email classification problem. We performed an extensive empirical evaluation on our large email collections with three text classification algorithms: a naive Bayes classifier and two k-NN classifiers, one using TF–IDF weighting and the other using resemblance. The investigation includes studies on the effect of phrase size, the size of local and global sampling, the neighbourhood size, and various…
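
To make the idea of phrase features concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a "phrase" is a contiguous run of n words and that "resemblance" is the set-overlap ratio between two phrase sets; the abstract does not spell out either definition, and the function names `word_phrases` and `resemblance` are hypothetical.

```python
def word_phrases(text, n=2):
    """Return the set of contiguous n-word phrases in text.

    Assumption: phrases are plain word n-grams over lowercased,
    whitespace-separated tokens.
    """
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def resemblance(a, b):
    """Set-overlap resemblance of two phrase sets: |A & B| / |A | B|.

    Assumption: resemblance is the Jaccard-style ratio; two empty
    sets are treated as having resemblance 0.0.
    """
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two short example messages that differ in a single word.
first = word_phrases("please review the attached project report")
second = word_phrases("please review the attached budget report")

# Shared phrases divided by all distinct phrases across both messages.
score = resemblance(first, second)
print(f"resemblance = {score:.3f}")
```

In a k-NN setting, a score like this could serve as the similarity between a new email and each labelled email, with the phrase size `n` corresponding to the phrase-size parameter the abstract mentions.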