Human Similarity Judgments: Implications for the Design of Formal Evaluations

Abstract

This paper presents findings from a series of analyses of human similarity judgments from the Symbolic Melodic Similarity and Audio Music Similarity tasks of the Music Information Retrieval Evaluation Exchange (MIREX) 2006. The categorical judgment data generated by the evaluators are analyzed with regard to judgment stability, inter-grader reliability, and patterns of disagreement, both within and between the two tasks. An exploration of this space yields implications for the design of MIREX-like evaluations.

Cite this paper

@inproceedings{Jones2007HumanSJ,
  title={Human Similarity Judgments: Implications for the Design of Formal Evaluations},
  author={M. Cameron Jones and J. Stephen Downie and Andreas F. Ehmann},
  booktitle={ISMIR},
  year={2007}
}