Massimo Melucci

Information retrieval (IR) models based on vector spaces have been investigated for a long time. Nevertheless, they have recently attracted much research interest. In parallel, context has been rediscovered as a crucial issue in information retrieval. This article presents a principled approach to modeling context and its role in ranking information objects ...
Today, managing textual resources and providing full-text search capabilities on them is a relevant issue for database management systems as well. Stemming is part of the indexing and searching processes when dealing with textual resources. In this paper we present a language-independent probabilistic model which can automatically generate stemmers for several ...
The first part of the paper briefly introduces what automatic authoring of a hypertext for information retrieval means. The most difficult part of the automatic construction of a hypertext is the creation of links connecting documents or document fragments that are semantically related. Because of this, to many researchers it seemed natural to use IR techniques ...
In this paper, we present a method based on Hidden Markov Models (HMMs) to generate statistical stemmers. Using a list of words as a training set, the method estimates the HMM parameters, which are used to calculate the most probable stem for an arbitrary word. Stemming is performed by computing the most probable path, through the HMM states, corresponding to ...
In this paper, context is modeled by vector space bases and its evolution is modeled by linear transformations from one basis to another. Each document or query can be associated with a distinct basis, which corresponds to one context. Also, algorithms are proposed to discover contexts from documents, queries, or groups of them. Linear algebra can thus be employed ...
Some methods for rank correlation in evaluation are examined and their relative advantages and disadvantages are discussed. In particular, it is suggested that different test statistics should be used to provide additional information about the experiments beyond that provided by statistical significance testing. Kendall's τ is often used for ...
The paper describes the design and implementation of TACHIR, a tool for the automatic construction of hypertexts for Information Retrieval. Through the use of an authoring methodology employing a set of well-known Information Retrieval techniques, TACHIR automatically builds up a hypertext from a document collection. The structure of the hypertext reflects a ...