• Publications
An effective statistical approach to blog post opinion retrieval
Finding opinionated blog posts is still an open problem in information retrieval, as exemplified by the recent TREC blog tracks. Most of the current solutions involve the use of external resources…
  • 87
  • 10
Temporal feedback for tweet search with non-parametric density estimation
This paper investigates the temporal cluster hypothesis: in search tasks where time plays an important role, do relevant documents tend to cluster together in time? We explore this question in the…
  • 42
  • 7
A research tool for long-term and continuous analysis of fish assemblage in coral-reefs using underwater camera footage
We present a research tool that supports marine ecologists' research by allowing analysis of long-term and continuous fish monitoring video content. The analysis can be used for instance to…
  • 66
  • 4
The University of Amsterdam at WePS2
In this paper we describe our participation in the Second Web People Search workshop (WePS2) and detail our approaches. For the clustering task, our focus was on replicating the lessons learned at…
  • 19
  • 4
Supporting ground-truth annotation of image datasets using clustering
As more subject-specific image datasets (medical images, birds, etc) become available, high quality labels associated with these datasets are essential for building statistical models and method…
  • 42
  • 3
User Intent, Behaviour, and Perceived Satisfaction in Product Search
As online shopping becomes increasingly popular, users perform more product search to purchase items. Previous studies have investigated people's online shopping behaviours and ways to predict online…
  • 23
  • 3
Result diversification based on query-specific cluster ranking
Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result…
  • 59
  • 2
Using Coherence-Based Measures to Predict Query Difficulty
We investigate the potential of coherence-based scores to predict query difficulty. The coherence of a document set associated with each query word is used to capture the quality of a query topic…
  • 75
  • 2
Combining implicit and explicit topic representations for result diversification
Result diversification deals with ambiguous or multi-faceted queries by providing documents that cover as many subtopics of a query as possible. Various approaches to subtopic modeling have been…
  • 60
  • 2
Generating links to background knowledge: a case study using narrative radiology reports
Automatically annotating texts with background information has recently received much attention. We conduct a case study in automatically generating links from narrative radiology reports to…
  • 38
  • 2