Zeynep Hilal Kilimci

It has been shown that Latent Semantic Indexing (LSI) takes advantage of implicit higher-order (or latent) structure in the association of terms and documents. Higher-order relations in LSI capture "latent semantics". Inspired by this, a novel Bayesian framework for classification named Higher Order Naïve Bayes (HONB), which can explicitly make use …
Naïve Bayes is commonly used in text categorization because it is easy to implement and has low complexity. It has two main event models for text categorization: the multivariate Bernoulli model and the multinomial model. A very large number of studies choose the multinomial model and Laplace smoothing just based on the …
It is known that latent semantic indexing (LSI) takes advantage of implicit higher-order (or latent) structure in the association of terms and documents. Higher-order relations in LSI capture “latent semantics”. These findings have inspired a novel Bayesian framework for classification named Higher-Order Naive Bayes (HONB), which was introduced previously, …
In this paper, we study n-gram models in the text classification domain. To measure the impact of n-gram models on classification performance, we run a Naïve Bayes classifier with various smoothing methods. Naïve Bayes classifiers generally use two main event models for text classification, which are Bernoulli and …
Text categorization has become an increasingly popular and important problem because of the proliferation of documents in many fields. To cope with this problem, several machine learning techniques are used for categorization, such as Naïve Bayes, support vector machines, and artificial neural networks. In this study, we …
The majority of existing text classification algorithms are based on the “bag of words” (BOW) approach, in which documents are represented as weighted occurrence frequencies of individual terms. However, semantic relations between terms are ignored in this representation. Several studies address this problem by integrating …