Probabilistic structured query methods
TLDR
This paper reviews prior work on structured query techniques and introduces three new variants that leverage estimates of replacement probabilities. Improvements in retrieval effectiveness are demonstrated for cross-language retrieval and for retrieval based on optical character recognition when replacement probabilities are used.
  • 113
  • 18
Implicit Feedback for Recommender Systems
TLDR
We identify three types of implicit feedback and suggest two strategies for using implicit feedback to make recommendations in recommender systems.
  • 375
  • 15
Pairwise Document Similarity in Large Collections with MapReduce
TLDR
This paper presents a MapReduce algorithm for computing pairwise document similarity in large document collections that scales linearly with increasing collection size.
  • 229
  • 15
A survey of multilingual text retrieval
TLDR
This report reviews the present state of the art in the selection of texts in one language based on queries in another, a problem we refer to as multilingual text retrieval.
  • 199
  • 13
Modeling Information Content Using Observable Behavior
TLDR
This paper presents a framework for modeling the content of information objects such as documents and video programs based on observation of how users interact with those objects in the course of information seeking and use.
  • 175
  • 12
A Comparative Study of Query and Document Translation for Cross-Language Information Retrieval
TLDR
This paper explores the utility of two sources of translation knowledge for cross-language retrieval.
  • 155
  • 12
The State of the Art in Text Filtering
  • Douglas W. Oard
  • Computer Science
  • User Modeling and User-Adapted Interaction
  • 1 March 1997
TLDR
Text filtering is an information seeking process in which documents are selected from a dynamic text stream to satisfy a relatively stable and specific information need.
  • 113
  • 10
Confidentiality-preserving rank-ordered search
TLDR
We present practical techniques for proper integration of relevance scoring methods and cryptographic techniques, such as order preserving encryption, to protect data collections and indices and provide efficient and accurate search capabilities to securely rank-order documents in response to a query.
  • 182
  • 7
Building an information retrieval test collection for spontaneous conversational speech
TLDR
Test collections model use cases in ways that facilitate evaluation of information retrieval systems.
  • 70
  • 7
Automatic recognition of spontaneous speech for access to multilingual oral history archives
TLDR
This paper presents initial results from experiments with speech recognition, topic segmentation, topic categorization, and named entity detection using a large collection of recorded oral histories.
  • 140
  • 6