Corpus ID: 14212821

Living Labs for UMAP Evaluation

@inproceedings{Kelly2016LivingLF,
  title={Living Labs for UMAP Evaluation},
  author={Liadh Kelly},
  booktitle={UMAP},
  year={2016}
}
  • L. Kelly
  • Published in UMAP 2016
  • Computer Science
Generating shared task initiatives in the user-modelling, adaptation and personalization (UMAP) space is difficult, especially given individual differences, privacy concerns and the interactive nature of the space. We put forward that the living labs evaluation paradigm, i.e., observing users in their natural task environments, has the potential to overcome these difficulties and to allow for comparative evaluation in the UMAP space of research. In particular, the emerging approach to living labs…

EvalUMAP: Towards Comparative Evaluation in User Modeling, Adaptation and Personalization
TLDR
This paper presents EvalUMAP, a new concerted drive towards the establishment of shared challenges for comparative evaluation within the UMAP community.

References

Showing 1-9 of 9 references
Head First: Living Labs for Ad-hoc Search Evaluation
TLDR
This paper presents the first living labs for IR benchmarking campaign initiative for the community, taking as test cases local domain search on a university website and product search on an e-commerce site. It proposes that head queries can be used to generate result lists offline, which are then interleaved with the results of the production system for live evaluation.
Towards a Living Lab for Information Retrieval Research and Development - A Proposal for a Living Lab for Product Search Tasks
TLDR
A proposal for a living lab on product search tasks within the context of an online shop is put forward to further the discussion on living labs for IR evaluation, and one possible architecture for such an evaluation environment is proposed.
Overview of the Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab 2015
TLDR
The main goal of the LL4IR CLEF Lab is to provide a benchmarking platform for researchers to evaluate their ranking systems in a live setting with real users in their natural task environments.
Extended Overview of the Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab 2015
TLDR
This paper reports on the first Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab, which provides a benchmarking platform for researchers to evaluate their ranking systems in a live setting with real users in their natural task environments.
Benchmarking News Recommendations in a Living Lab
TLDR
It is argued that the living lab can serve as a reference point for the implementation of living labs for the evaluation of information access systems, and the experimental setup of the two benchmarking events is outlined.
Shedding light on a living lab: the CLEF NEWSREEL open recommendation platform
TLDR
In the CLEF NEWSREEL lab, participants are invited to evaluate news recommendation techniques in real time by using the Open Recommendation Platform to deliver recommendations to actual users visiting commercial news portals to satisfy their information needs.
Controlled experiments on the web: survey and practical guide
TLDR
This work provides a practical guide to conducting online experiments and shares key lessons that will help practitioners run trustworthy controlled experiments, covering statistical power, sample size, and techniques for variance reduction.
Evaluating Personal Information Retrieval
TLDR
The "personal information retrieval evaluation" (PIRE) tool is presented, which addresses this evaluation problem using a "living laboratory" approach: it allows retrieval techniques to be evaluated with real individuals' personal collections, queries and result sets in a cross-comparable, repeatable way, while importantly maintaining each individual's informational privacy.
Evaluation Challenges and Directions for Information-Seeking Support Systems
ISSSs provide an exciting opportunity to extend previous information-seeking and interactive information retrieval evaluation models and create a research community that embraces diverse methods and…