Dimitris Meretakis

Large Bayes (LB) is a recently introduced classifier built from frequent and interesting itemsets. LB uses itemsets to create context-specific probabilistic models of the data and estimate the conditional probability P(c_i | A) of each class c_i given a case A. In this paper we use chi-square tests to address several drawbacks of the originally proposed…
The Naïve Bayes (NB) classifier has long been considered a core methodology in text classification, mainly due to its simplicity and computational efficiency. There is an increasing need, however, for methods that can achieve higher classification accuracy while maintaining the ability to process large document collections. In this paper we examine text…
In this paper we propose a new algorithm for feature selection, called Best Terms (BT). BT is linear in the number of training-set documents and is independent of both the vocabulary size and the number of training-set categories. We evaluate BT on two benchmark document collections, Reuters-21578 and 20-Newsgroups, using two…
Instance selection and feature selection are two orthogonal methods for reducing the amount and complexity of data. Feature selection aims at the reduction of redundant features in a dataset whereas instance selection aims at the reduction of the number of instances. So far, these two methods have mostly been considered in isolation. In this paper, we…
We investigate the relationship between association and classification mining. The main issue in association mining is the discovery of interesting patterns in the data, so-called itemsets. We introduce the notion of labeled itemsets and derive the surprising result that classification techniques such as decision trees, Naïve Bayes, Bayesian networks and…
This paper deals with queries involving the retrieval of images that contain certain object configurations. Consider, for instance, that a user wants to "find all images where there exists a building adjacent to the west side of a park which is southwest and near a commercial center". This query can be formulated as a constraint satisfaction…