The first part of the paper briefly introduces what automatic authoring of a hypertext for information retrieval means. The most difficult part of the automatic construction of a hypertext is the creation of links connecting documents or document fragments that are semantically related. Because of this, to many researchers it seemed natural to use IR techniques …
In this paper, we present a method based on Hidden Markov Models (HMMs) to generate statistical stemmers. Using a list of words as training set, the method estimates the HMM parameters which are used to calculate the most probable stem for an arbitrary word. Stemming is performed by computing the most probable path, through the HMM states, corresponding to …
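As an illustration of the "most probable path" idea, the following is a minimal sketch, not the authors' model. It assumes a hypothetical two-state HMM (`stem` and `suffix`) with hand-set probabilities rather than parameters estimated from a word list, and it uses the standard Viterbi algorithm to find the split point where the path moves from the stem state to the suffix state. The names `viterbi_stem`, `START`, `TRANS`, `EMIT` and all numeric values are assumptions made for this example.

```python
import math
import string

STATES = ("stem", "suffix")

# Hypothetical, hand-set parameters (a trained model would estimate these).
START = {"stem": 1.0, "suffix": 0.0}               # every word starts in the stem
TRANS = {
    "stem": {"stem": 0.8, "suffix": 0.2},          # may leave the stem once
    "suffix": {"stem": 0.0, "suffix": 1.0},        # never returns to the stem
}
EMIT = {
    "stem": {c: 1 / 26 for c in string.ascii_lowercase},   # uniform over letters
    "suffix": {"i": 0.2, "n": 0.2, "g": 0.2, "e": 0.15, "s": 0.15, "d": 0.1},
}


def _log(p):
    """Log-probability, mapping zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_stem(word, start=START, trans=TRANS, emit=EMIT, floor=1e-6):
    """Return the prefix of `word` emitted by 'stem' on the most probable path."""
    if not word:
        return ""
    # best[s]: log-probability of the best path ending in state s so far.
    best = {s: _log(start[s]) + _log(emit[s].get(word[0], floor)) for s in STATES}
    back = []  # back[t][s]: best predecessor of state s at position t + 1
    for ch in word[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            prev, score = max(
                ((p, best[p] + _log(trans[p][s])) for p in STATES),
                key=lambda pair: pair[1],
            )
            new_best[s] = score + _log(emit[s].get(ch, floor))
            pointers[s] = prev
        back.append(pointers)
        best = new_best
    # Trace the best path backwards from the most probable final state.
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    stem_length = sum(1 for s in path if s == "stem")
    return word[:stem_length]


if __name__ == "__main__":
    for w in ("walking", "walked", "walks"):
        print(w, "->", viterbi_stem(w))
```

Because the toy transition matrix forbids going back from `suffix` to `stem`, every path is a run of stem states followed by a run of suffix states, so the stem is always a non-empty prefix of the word.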
Some methods for rank correlation in evaluation are examined and their relative advantages and disadvantages are discussed. In particular, it is suggested that different test statistics should be used for providing additional information about the experiments other than the one provided by statistical significance testing. Kendall's τ is often used for …
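For readers unfamiliar with Kendall's τ, here is a minimal sketch of the textbook tau-a statistic, which compares two rankings of the same items by counting concordant and discordant pairs. It assumes no ties; the function name `kendall_tau_a` and the sample rankings are made up for this example and are not taken from the paper.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a between two equal-length score or rank lists (no ties assumed)."""
    if len(x) != len(y):
        raise ValueError("both rankings must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two items are required")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1      # both rankings order the pair the same way
        elif sign < 0:
            discordant += 1      # the rankings disagree on this pair
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    # Hypothetical ranks of four systems under two evaluation measures.
    print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The value ranges from -1 (one ranking is the reverse of the other) to 1 (identical rankings).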
The paper describes the design and implementation of TACHIR, a tool for the automatic construction of hypertexts for Information Retrieval. Through the use of an authoring methodology employing a set of well-known Information Retrieval techniques, TACHIR automatically builds up a hypertext from a document collection. The structure of the hypertext reflects a …
This paper presents the results of two separate studies into electronic book production: the Visual Book (Landoni, 1997), which explored the importance of the visual component of the book metaphor in the production of "good" electronic books, and the Hyper-TextBook (Crestani and Melucci, 1998a), which instead concentrated on the importance of hypertext …
In Information Retrieval (IR), stemming is used to reduce variant word forms to a common root. The assumption is that if two words have the same root, then they represent the same concept. Hence stemming permits an IR system to match query and document terms which are related to the same meaning but which can appear in different morphological variants. In this …
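The matching effect described above can be sketched with a deliberately naive suffix stripper. This is a toy written for illustration, not the method studied in the paper and not an established stemmer such as Porter's. The suffix list, the minimum stem length, and the names `naive_stem` and `matching_query_terms` are all assumptions.

```python
# Toy suffix list, checked in order; a real stemmer uses far richer rules.
SUFFIXES = ("ing", "ed", "es", "s")
MIN_STEM_LENGTH = 3


def naive_stem(term):
    """Strip the first matching suffix if enough of the word remains."""
    for suffix in SUFFIXES:
        if term.endswith(suffix) and len(term) - len(suffix) >= MIN_STEM_LENGTH:
            return term[: -len(suffix)]
    return term


def matching_query_terms(query_terms, document_terms):
    """Return the query terms whose stem also occurs among the document stems."""
    document_stems = {naive_stem(t.lower()) for t in document_terms}
    return [q for q in query_terms if naive_stem(q.lower()) in document_stems]


if __name__ == "__main__":
    # "connecting" and "connected" differ as strings but share the stem "connect".
    print(matching_query_terms(["connecting", "graphs"], ["connected", "systems"]))
```

Without stemming, an exact string comparison would find no overlap between the query and the document in this example.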