It has been previously observed that optimization of the 1-call@k relevance objective (i.e., a set-based objective that is 1 if at least one document is relevant, otherwise 0) empirically correlates with diverse retrieval. In this paper, we proceed one step further and show theoretically that greedily optimizing expected 1-call@k w.r.t. a latent subtopic ...
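As a minimal sketch of the 1-call@k objective defined in the abstract above (1 if at least one of the top k documents is relevant, otherwise 0), the metric can be computed as follows. The function name `one_call_at_k` and the representation of a ranking as a list of boolean relevance judgments are assumptions made for illustration; they do not come from the paper.

```python
from typing import Sequence


def one_call_at_k(relevance: Sequence[bool], k: int) -> int:
    """Return 1 if at least one of the top-k ranked documents is relevant, else 0.

    `relevance` is a hypothetical input format: relevance[i] is True when the
    document at rank i (0-based) is judged relevant to the query.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # Only the first k positions of the ranking count toward the objective.
    return int(any(relevance[:k]))
```

Because the score saturates as soon as one relevant document appears, a second document on the same subtopic adds nothing, which gives an intuition for why optimizing this objective tends to favor diverse result sets.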
We describe Hugo -- a service initially available on iOS that solicits a structured, semantic query and returns entity-specific news articles. Retrieval is powered by a semantic annotation pipeline that includes named entity linking and automatic summarisation. Search and entity linking use an in-house knowledge base initialised with Wikipedia data and ...
We present a study of which baseline to use when testing a new retrieval technique. In contrast to past work, we show that measuring a statistically significant improvement over a weak baseline is not a good predictor of whether a similar improvement will be measured on a strong baseline. Sometimes strong baselines are made worse when a new technique is ...
We investigate the application of a light-weight approach to result list clustering for the purposes of diversifying search results. We introduce a novel post-retrieval approach, which is independent of external information or even the full-text content of retrieved documents; only the retrieval score of a document is used. Our experiments show that this ...
The amount of biomedical literature, and the popularity of health-related searches, are both growing rapidly. While most biomedical search systems offer a range of advanced features, there is limited understanding of user preferences, and how searcher expertise relates to the use and perception of different search features in this domain. Through a ...
Users of information retrieval systems employ a variety of strategies when searching for information. One factor that can directly influence how searchers go about their information finding task is the level of familiarity with a search topic. We investigate how the search behavior of domain experts changes based on their previous level of familiarity with ...
One of the most important Web-based services that established the foundations of Web 2.0 is the weblog. Weblogs are evolving into topic-based systems that can generate more revenue for companies, which is why many companies provide free weblog hosting. Weblog popularity is an important factor in gaining more revenue. Weblogs have posts and topics that are ...