Test collection recycling for semantic text similarity


Semantic text similarity (STS) methods are evaluated on dedicated test collections. These collections consist of text pairs that express the same meaning in different wording, and they are scarce compared with information retrieval (IR) test collections. This paper investigates whether IR test collections can be reused for STS tasks. Text pairs are derived from the relevant pairs of the IR test collections. Latent semantic analysis (LSA) and explicit semantic analysis (ESA) are applied to evaluate Glasgow's test collections, which are provided by the ACM SIGIR community. The Jaccard index measures lexical similarity. Recall measures the retrievability of the recycled test collections, with two existing test collections, the Microsoft Research Paraphrase Corpus and the Microsoft Research Video Description Corpus, serving as evaluation baselines. The evaluation yields a promising outcome: the evaluated test collections have a low Jaccard index, and their recall values fall between those of the two baselines.
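As an illustration of the lexical similarity measure mentioned above, the following minimal sketch computes the Jaccard index of a text pair as the size of the intersection of their token sets divided by the size of their union. The abstract does not state how texts are tokenized, so the lowercased whitespace tokenization, the function name `jaccard_index`, and the example pair are assumptions made for illustration only.

```python
def jaccard_index(text_a: str, text_b: str) -> float:
    """Jaccard index of two texts over their sets of word tokens.

    Assumption: tokens are lowercased, whitespace-separated words;
    the tokenization used in the paper is not given in the abstract.
    """
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        # Two empty texts share no tokens; define the index as 0.0 here.
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


# Hypothetical text pair, not taken from any of the test collections.
score = jaccard_index("a man is playing a guitar",
                      "someone plays the guitar")
print(score)
```

A low value indicates that the two texts share few words, so a pair with the same meaning and a low Jaccard index cannot be matched by word overlap alone.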

DOI: 10.1145/2428736.2428784

