Emre Velipasaoglu

Learn More
We study the problem of web search result diversification in the case where intent based relevance scores are available. A diversified search result will hopefully satisfy the information need of user-L.s who may have different intents. In this context, we first analyze the properties of an intent-based metric, ERR-IA, to measure relevance and diversity(More)
We consider the task of suggesting related queries to users after they issue their initial query to a web search engine. We propose a machine learning approach to learn the probability that a user may find a follow-up query both useful and relevant, given his initial query. Our approach is based on a machine learning model which enables us to generalize to(More)
Search engines are continuously looking into methods to alleviate users' effort in finding desired information. For this, all major search engines employ query suggestions methods to facilitate effective query formulation and reformulation. Providing high quality query suggestions is a critical task for search engines and so far most research efforts have(More)
Conducting focused but large-scale studies and experiments of user search behavior is highly desirable. Crowd-sourcing services such as the Amazon Mechanical Turk allow such studies to be conducted quickly and cheaply. They also have the potential to mitigate the problems associated with traditional experimental methods, in particular the relatively small(More)
In practical classification, there is often a mix of learnable and unlearnable classes and only a classifier above a minimum performance threshold can be deployed. This problem is exacerbated if the training set is created by active learning. The bias of actively learned training sets makes it hard to determine whether a class has been learned. We give(More)
Web pages are usually highly structured documents. In some documents, content with different functionality is laid out in blocks, some merely supporting the main discourse. In other documents, there may be several blocks of unrelated main content. Indexing a web page as if it were a linear document can cause problems because of the diverse nature of its(More)
Search engines are important resources for finding information on the Web. They are also important for publishers and advertisers to present their content to users. Thus, user satisfaction is key and must be quantified. In this tutorial, we give a practical review of web search metrics from a user satisfaction point of view. We cover metrics for relevance,(More)
Active learning efficiently hones in on the decision boundary between relevant and irrelevant documents, but in the process can miss entire clusters of relevant documents, yielding classifiers with low recall. In this paper, we propose a method to increase active learning recall by constraining sampling to a document subset rich in relevant examples.
  • 1