Dimitris Meretakis

In this paper, we propose a new feature-selection algorithm for text classification, called best terms (BT). The complexity of BT is linear with respect to the number of training-set documents and is independent of both the vocabulary size and the number of categories. We evaluate BT on two benchmark document collections, Reuters-21578 and …
Instance selection and feature selection are two orthogonal methods for reducing the amount and complexity of data. Feature selection aims to remove redundant features from a dataset, whereas instance selection aims to reduce the number of instances. So far, these two methods have mostly been considered in isolation. In this paper, we …
The Naïve Bayes (NB) classifier has long been considered a core methodology in text classification, mainly due to its simplicity and computational efficiency. There is, however, an increasing need for methods that can achieve higher classification accuracy while maintaining the ability to process large document collections. In this paper we examine text …
We investigate the relationship between association and classification mining. The main issue in association mining is the discovery of interesting patterns in the data, so-called itemsets. We introduce the notion of labeled itemsets and derive the surprising result that classification techniques such as decision trees, Naïve Bayes, Bayesian networks and …
This paper deals with queries involving the retrieval of images that contain certain object configurations. Consider, for instance, that a user wants to “find all images where there exists a building adjacent to the west side of a park which is southwest and near a commercial center”. This query can be formulated as a constraint satisfaction …