Using Linear Algebra for Intelligent Information Retrieval

Michael W. Berry, Susan T. Dumais, and Gavin W. O'Brien. SIAM Rev.

Currently, most approaches to retrieving textual materials from scientific databases depend on a lexical match between words in users’ requests and those in or assigned to documents in a database. ... 

Algorithms and Representations for Personalised Information Access

Personalised information access systems use historical feedback data, such as implicit and explicit ratings for textual documents and other items, to better locate the right or relevant information.

Information Retrieval Modeling Techniques for Web Documents

A comparative study of various Best-Match Information Retrieval Techniques for word documents is presented.

Intelligent information retrieval: some research trends

The focus is on the definition of “intelligent” systems, i.e. systems that can represent and manage the vagueness and uncertainty which is characteristic of the process of information searching and retrieval.

Topic Extraction and Bundling of Related Scientific Articles

An attempt is made to solve the problem of automatic classification of scientific articles based on common characteristics in order to speed up the development of knowledge retrieval systems and improve the efficiency of existing systems.

Automatic 3-Language Cross-Language Information Retrieval with Latent Semantic Indexing

This work uses a completely automatic retrieval method to create a 3-way English-French-German information retrieval system, with no query translation required.

Latent Semantic Indexing based Intelligent Information Retrieval System for Digital Libraries

A novel approach to enhancing the efficiency of an information retrieval system based on Latent Semantic Indexing, using an intelligent information processing technique, is presented.

Skinning the Cat: Comparing Alternative Text Mining Algorithms for Categorization

This work compares and contrasts different approaches to text mining using Enterprise Miner for Text, and shows how category labels can be useful for Search Indexing, Document Filtering, and Summarization.

Mining the web - discovering knowledge from hypertext data

This chapter discusses the infrastructure of the Web, the future of Web mining, and applications of semi-supervised learning for text and similarity and clustering.

A Semantic Approach for Mining Biological XML Databases

This paper identifies an index-based approach to mining Bio-XML data and explores the improvement achieved in the quality of query results by the application of genetic algorithms.

Document Retrieval by Projection Based Frequency Distribution

In document retrieval tasks, random projection (RP) is a useful technique for dimension reduction. It can be computed very quickly, and it does not need to be recalculated when the data changes. However, in l...
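The data-independence property mentioned in this entry can be illustrated with a minimal sketch. It assumes a Gaussian random projection, which is one common variant and not necessarily the one used in the paper; the sizes, the seed, and the function name `random_projection_matrix` are hypothetical.

```python
import numpy as np

def random_projection_matrix(n_features, k, seed=0):
    """Draw a Gaussian projection matrix; it never looks at the documents."""
    rng = np.random.default_rng(seed)
    # Scaling by 1/sqrt(k) keeps expected squared lengths unchanged.
    return rng.normal(0.0, 1.0 / np.sqrt(k), size=(n_features, k))

n_features, k = 2000, 400                  # hypothetical sizes
R = random_projection_matrix(n_features, k)

rng = np.random.default_rng(1)
docs = rng.random((5, n_features))         # stand-in document vectors
reduced = docs @ R                         # shape (5, k)

# A document added later reuses the same R; nothing is recomputed.
new_doc = rng.random(n_features)
new_reduced = new_doc @ R

# Pairwise distances are approximately preserved after projection.
original_dist = np.linalg.norm(docs[0] - docs[1])
reduced_dist = np.linalg.norm(reduced[0] - reduced[1])
ratio = reduced_dist / original_dist
```

Because `R` is drawn independently of the collection, existing projections stay valid as documents are added, in contrast to an SVD-based reduction, which depends on the data.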



LSI meets TREC: A Status Report

Describes the Latent Semantic Indexing approach, an extension of the vector retrieval method and the use of singular-value decomposition applied to the TREC collection.

Information Management Tools for Updating an SVD-Encoded Indexing Scheme

Latent Semantic Indexing (LSI) is a conceptual indexing technique that uses the SVD to estimate the underlying latent semantic structure of the word-to-document association. It dampens the effect of word choice variability by representing terms and documents using the (orthogonal) left and right singular vectors.
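As a rough illustration of the idea in this entry, the sketch below builds a rank-k representation from a truncated SVD and compares a query with documents in the latent space. The toy matrix, the choice k = 2, and the helper names `lsi_factors`, `fold_in`, and `cosine` are assumptions for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical toy term-document matrix (rows = terms, columns = documents).
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 1, 0, 0],  # car
    [0, 1, 1, 0, 0],  # automobile
    [1, 1, 1, 0, 0],  # engine
    [0, 0, 0, 1, 1],  # flower
    [0, 0, 0, 1, 1],  # petal
], dtype=float)

def lsi_factors(term_doc, k):
    """Truncated SVD: term vectors U_k, singular values s_k, document vectors V_k."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in(query, U_k, s_k):
    """Map a query term vector into the latent space: q^T U_k S_k^{-1}."""
    return (query @ U_k) / s_k

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

U_k, s_k, V_k = lsi_factors(A, k=2)
query = np.array([1, 0, 0, 0, 0], dtype=float)  # the single word "car"
q_hat = fold_in(query, U_k, s_k)

# Lexical matching counts shared terms; the latent scores use the SVD space.
lexical_overlap = query @ A
latent_scores = [cosine(q_hat, V_k[j]) for j in range(A.shape[1])]
```

In this toy collection, document 1 contains "automobile" and "engine" but not "car", so it has no lexical overlap with the query; in the two-dimensional latent space it nevertheless scores close to the documents that do contain "car", while the two flower documents score near zero.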

Improving retrieval performance by relevance feedback

Prescriptions are given for conducting text retrieval operations iteratively using relevance feedback, and evaluation data are included to demonstrate the effectiveness of the various methods.

An application of least squares fit mapping to text information retrieval

It is discovered that the knowledge about relevance among queries and documents can be used to obtain empirical connections between query terms and the canonical concepts which are used for indexing the content of documents.

Improving the retrieval of information from external sources

A statistical method called latent semantic indexing is described, which models the implicit higher-order structure in the association of words and objects and improves retrieval performance by up to 30%.

Improving text retrieval for the routing problem using latent semantic indexing

This paper applies LSI to the routing task, which operates under the assumption that a sample of relevant and non-relevant documents is available to use in constructing the query, and finds that when LSI is used in conjunction with statistical classification, there is a dramatic improvement in performance.

Indexing by Latent Semantic Analysis

A new method for automatic indexing and retrieval is described that takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.

Using latent semantic indexing for information filtering

LSI improved prediction performance over keyword matching by an average of 13% and showed a 26% improvement in precision over presenting articles in the order received; the results indicate that user preferences for articles tend to cluster based on the semantic similarities between articles.

Cross-Language Information Retrieval Using Latent Semantic Indexing

Using the proposed merge strategies, LSI is shown to be able to retrieve relevant documents from either language (Greek or English) without requiring any translation of a user's query.

Dimensions of meaning

The author analyzes the structure of the vector representations and applies them to word sense disambiguation and thesaurus induction; dimensionality reduction is performed by means of a singular value decomposition.