Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model

@article{Isa2008TextDP,
  title={Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model},
  author={Dino Isa and Lam Hong Lee and V. P. Kallimani and Rajprasad Rajkumar},
  journal={Computer and Information Science},
  year={2008},
  volume={1},
  pages={79--90}
}
This work uses the Bayes formula to vectorize a document according to a probability distribution based on keywords reflecting the probable categories that the document may belong to. For each topic (category) in a predetermined set, the Bayes formula yields the probability that the document belongs to it. Using this probability distribution as the vector representing the document, text classification algorithms based on the vector space model, such as the Support…
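The vectorization step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the exact estimator, so a multinomial naive Bayes model with add-one (Laplace) smoothing is assumed here, and the names `train`, `posterior_vector`, `TRAINING`, `CATEGORIES`, and `MODEL`, as well as the toy data, are hypothetical.

```python
import math
from collections import Counter


def train(labeled_docs):
    """Collect per-category keyword counts from (category, tokens) pairs."""
    word_counts = {}        # category -> Counter of keyword occurrences
    doc_counts = Counter()  # category -> number of training documents
    vocab = set()
    for category, tokens in labeled_docs:
        doc_counts[category] += 1
        word_counts.setdefault(category, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def posterior_vector(tokens, model, categories):
    """Return [P(category | document)] in the order given by `categories`.

    Assumption: multinomial naive Bayes with add-one smoothing; tokens
    that never occurred in training are ignored.
    """
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = []
    for c in categories:
        total_words = sum(word_counts[c].values())
        score = math.log(doc_counts[c] / total_docs)  # log prior
        for t in tokens:
            if t in vocab:
                likelihood = (word_counts[c][t] + 1) / (total_words + len(vocab))
                score += math.log(likelihood)
        log_scores.append(score)
    # Normalize in log space so the posteriors sum to 1 without underflow.
    peak = max(log_scores)
    weights = [math.exp(s - peak) for s in log_scores]
    norm = sum(weights)
    return [w / norm for w in weights]


# Hypothetical toy corpus, one training document per category.
TRAINING = [
    ("sports", ["goal", "match", "team"]),
    ("finance", ["stock", "market", "bank"]),
]
CATEGORIES = ["sports", "finance"]
MODEL = train(TRAINING)

# The document is now a point in a space with one dimension per category.
vector = posterior_vector(["team", "goal"], MODEL, CATEGORIES)
print(vector)
```

Each document is thus reduced to a vector with one component per category, which could then be passed to any classifier that operates on the vector space model.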


