Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model

@article{Isa2008TextDP,
title={Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model},
author={Dino Isa and Lam Hong Lee and V. P. Kallimani and Rajprasad Rajkumar},
journal={Computer and Information Science},
year={2008},
volume={1},
pages={79--90}
}

Published 2008 in Computer and Information Science

This work uses the Bayes formula to vectorize a document as a probability distribution, computed from its keywords, over the categories the document may belong to. For a predetermined set of topics (categories), the Bayes formula gives the probability that the document belongs to each one. Using this probability distribution as the vector that represents the document, the text classification algorithms based on the vector space model, such as the Support…
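The vectorization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a multinomial naive Bayes model with add-one (Laplace) smoothing, and the names `train` and `bayes_vector` and the toy training data are hypothetical. Each document is mapped to a vector with one entry per category, holding the posterior probability of that category given the document's keywords.

```python
import math
from collections import Counter


def train(labeled_docs):
    """Collect per-category keyword counts from (category, tokens) pairs.

    Assumption: a plain multinomial naive Bayes model; the paper's exact
    estimation procedure is not given in the abstract.
    """
    word_counts = {}        # category -> Counter of keyword frequencies
    doc_counts = Counter()  # category -> number of training documents
    vocab = set()
    for category, tokens in labeled_docs:
        doc_counts[category] += 1
        word_counts.setdefault(category, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def bayes_vector(tokens, word_counts, doc_counts, vocab):
    """Return (categories, vector) where vector[i] = P(categories[i] | tokens)."""
    categories = sorted(doc_counts)
    total_docs = sum(doc_counts.values())
    log_scores = []
    for c in categories:
        total_words = sum(word_counts[c].values())
        # log prior: P(c)
        score = math.log(doc_counts[c] / total_docs)
        for t in tokens:
            if t in vocab:  # ignore keywords never seen in training
                # log likelihood with add-one smoothing: P(t | c)
                score += math.log(
                    (word_counts[c][t] + 1) / (total_words + len(vocab))
                )
        log_scores.append(score)
    # Normalize in log space so the entries form a probability distribution.
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]
    norm = sum(weights)
    return categories, [w / norm for w in weights]


# Hypothetical toy data, for illustration only.
training = [
    ("sports", ["goal", "match", "team"]),
    ("finance", ["stock", "market", "bank"]),
]
word_counts, doc_counts, vocab = train(training)
cats, vec = bayes_vector(["team", "goal"], word_counts, doc_counts, vocab)
```

The resulting vector has as many dimensions as there are categories rather than as many as there are vocabulary terms, so it can be passed to any classifier that operates on the vector space model.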