Indexing by Latent Semantic Analysis
- S. Deerwester, S. Dumais, T. Landauer, G. Furnas, R. Harshman
- Computer Science
- Journal of the American Society for Information…
- 1 September 1990
A new method for automatic indexing and retrieval is described. It takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) to improve the detection of relevant documents on the basis of terms found in queries.
A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge.
A new general theory of acquired similarity and knowledge representation, latent semantic analysis (LSA), is presented and used to successfully simulate such learning and several other psycholinguistic phenomena.
Using Linear Algebra for Intelligent Information Retrieval
A lexical match between words in users’ requests and those in or assigned to documents in a database helps retrieve textual materials from scientific databases.
A Bayesian Approach to Filtering Junk E-Mail
- M. Sahami, S. Dumais, D. Heckerman, E. Horvitz
- Computer Science
- AAAI Conference on Artificial Intelligence
- 1 July 1998
This work examines methods for the automated construction of filters to eliminate junk e-mail from a user’s mail stream, and shows the efficacy of such filters in a real-world usage scenario, arguing that this technology is mature enough for deployment.
Inductive learning algorithms and representations for text categorization
- S. Dumais, John C. Platt, David Heckerman, M. Sahami
- Computer Science
- International Conference on Information and…
- 1 November 1998
A comparison of the effectiveness of five different automatic learning algorithms for text categorization in terms of learning speed, real-time classification speed, and classification accuracy is presented.
Hierarchical classification of Web content
This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content using support vector machine (SVM) classifiers, which have been shown to be efficient and effective for classification, but not previously explored in the context of hierarchical classification.
Improving the retrieval of information from external sources
- S. Dumais
- Computer Science
- 1 June 1991
A statistical method called latent semantic indexing is described; it models the implicit higher-order structure in the association of words and objects and improves retrieval performance by up to 30%.
The vocabulary problem in human-system communication
It is shown how this fundamental property of language limits the success of various design methodologies for vocabulary-driven interaction, and an optimal strategy, unlimited aliasing, is derived and shown to be capable of several-fold improvements.
Modeling the impact of short- and long-term behavior on search personalization
- Paul N. Bennett, Ryen W. White, Xiaoyuan Cui
- Psychology
- Annual International ACM SIGIR Conference on…
- 12 August 2012
This first study to assess how short-term (session) behavior and long-term (historic) behavior interact, and how each may be used in isolation or in combination to optimally contribute to gains in relevance through search personalization, finds that historic behavior provides substantial benefits at the start of a search session.
Using latent semantic analysis to improve access to textual information
- S. Dumais, G. Furnas, T. Landauer, S. Deerwester, R. Harshman
- Computer Science
- International Conference on Human Factors in…
- 1 May 1988
Initial tests find this completely automatic method widely applicable and a promising way to improve users' access to many kinds of textual materials, or to objects and services for which textual descriptions are available.