Data Set Used
Word embedding, specially with its recent developments, promises a quantification of the similarity between terms. However, it is not clear to which extent this similarity value can be genuinely meaningful and useful for subsequent tasks. We explore how the similarity score obtained from the models is really indicative of term relatedness. We first observe… (More)
A recurring question in information retrieval is whether term associations can be properly integrated in traditional information retrieval models while preserving their robustness and effectiveness. In this paper, we revisit a wide spectrum of existing models (Pivoted Document Normalization, BM25, BM25 Verboseness Aware, Multi-Aspect TF, and Language… (More)
This document describes the participation of Vienna University of Technology in the TREC Clinical Decision Support Track 2014. Four different search models are investigated, as well as different strategies to index the corpus and to extract the most relevant information from the topics. Our results conclude that BM25 and Vector Space Model had similar… (More)
This paper describes the efforts of Vienna University of Technology (TUW) in the MediaEval 2014 Retrieving Diverse Social Images challenge. Our approach consisted of 3 steps: (1) a pre-filtering based on Machine Learning, (2) a re-ranking based on Word2Vec, and (3) a clustering part based on an ensemble of clusters. Our best run reached a F@20 of 0.564.
— We revisit text-based image retrieval for social media, exploring the opportunities offered by statistical semantics. We assess the performance and limitation of several complementary corpus-based semantic text similarity methods in combination with word representations. We compare results with state-of-the-art text search engines. Our deep learning-based… (More)
This paper describes the contributions of Vienna University of Technology (TUW) to the MediaEval 2015 Retrieving Diverse Social Images challenge. Our approach consists of 3 phases: (1) Precision-oriented-phase: in which we focus only on the relevance of the documents; (2) Recall-oriented-phase: in which we focus only on the diversity aspect; (3) Merging… (More)
Volatility prediction—an essential concept in financial markets—has recently been addressed using sentiment analysis methods. We investigate the sentiment of annual disclosures of companies in stock markets to forecast volatility. We specifically explore the use of recent Information Retrieval (IR) term weighting models that are effectively extended by… (More)