This article discusses efficiency and effectiveness issues in caching the results of queries submitted to a Web search engine (WSE). We propose SDC (Static Dynamic Cache), a new caching strategy that aims to efficiently exploit the temporal and spatial locality present in the stream of processed queries. SDC extracts from historical usage data the results of…
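The static/dynamic split named in the SDC abstract above can be illustrated with a minimal sketch. The abstract is truncated, so the details here are assumptions rather than the authors' method: the static part is assumed to be filled once with the most frequent queries of a historical log, the dynamic part is assumed to use LRU replacement, and the names `StaticDynamicCache`, `historical_queries`, and `fetch` are hypothetical.

```python
from collections import Counter, OrderedDict


class StaticDynamicCache:
    """Two-part query-result cache: a read-only static set plus an LRU dynamic set."""

    def __init__(self, historical_queries, static_size, dynamic_size, fetch):
        # Assumption: the static set holds the most frequent queries of a past log.
        top = Counter(historical_queries).most_common(static_size)
        self.fetch = fetch  # hypothetical callable that computes results for a query
        self.static = {query: fetch(query) for query, _ in top}
        self.dynamic = OrderedDict()  # insertion order doubles as recency order
        self.dynamic_size = dynamic_size
        self.hits = 0
        self.misses = 0

    def get(self, query):
        # 1. The static set is never modified after construction.
        if query in self.static:
            self.hits += 1
            return self.static[query]
        # 2. The dynamic set is managed with LRU replacement (an assumption).
        if query in self.dynamic:
            self.hits += 1
            self.dynamic.move_to_end(query)  # mark as most recently used
            return self.dynamic[query]
        # 3. Miss: compute the results and admit them to the dynamic set.
        self.misses += 1
        results = self.fetch(query)
        if self.dynamic_size > 0:
            if len(self.dynamic) >= self.dynamic_size:
                self.dynamic.popitem(last=False)  # evict the least recently used
            self.dynamic[query] = results
        return results
```

In this sketch the static set serves queries that stay popular over long periods, while the dynamic set absorbs short-term bursts of repeated queries. The ratio of `static_size` to `dynamic_size` is a tuning parameter.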
Hierarchical Text Categorization (HTC) is the task of generating (usually by means of supervised learning algorithms) text classifiers that operate on hierarchically structured classification schemes. Notwithstanding the fact that most large-sized classification schemes for text have a hierarchical structure, so far the attention of text classification…
AdaBoost.MH is a popular supervised learning algorithm for building multi-label (aka n-of-m) text classifiers. AdaBoost.MH belongs to the family of "boosting" algorithms, and works by iteratively building a committee of "decision stump" classifiers, where each such classifier is trained to especially concentrate on the document-class pairs that…
In this paper we propose TreeBoost.MH, an algorithm for multi-label Hierarchical Text Categorization (HTC) consisting of a hierarchical variant of AdaBoost.MH. TreeBoost.MH embodies several intuitions that had arisen before within HTC: e.g. the intuitions that both feature selection and the selection of negative training examples should be performed "…
Hierarchical text classification (HTC) approaches have recently attracted a lot of interest on the part of researchers in human language technology and machine learning, since they have been shown to bring about equal, if not better, classification accuracy with respect to their "flat" counterparts while allowing exponential time savings at both learning…
We present a system for image classification based on an adaptive committee of five classifiers, each specialized in classifying images based on a single MPEG-7 feature. We test four different ways to set up such a committee, and obtain important accuracy improvements with respect to a baseline in which a single classifier, working on all five features at…
The mass adoption of smartphone and tablet devices has boosted the growth of the mobile applications market. Confronted with a huge number of choices, users may encounter difficulties in locating the applications that meet their needs. Sorting applications into a user-defined classification scheme would help the app discovery process. Systems for…
In this paper we tackle the problem of image search when the query is a short textual description of the image the user is looking for. We choose to implement the actual search process as a similarity search in a visual feature space, by learning to translate a textual query into a visual representation. Searching in the visual feature space has the…
Classifying companies by industry sector is an important task in finance, since it allows investors and research analysts to analyse specific subsectors of local and global markets for investment monitoring and planning purposes. Traditionally this classification activity has been performed manually, by dedicated specialists carrying out in-depth analysis…