Fast similarity search at large scale is of great importance to many Information Retrieval (IR) applications. A promising way to accelerate similarity search is semantic hashing, which designs compact binary codes for a large number of documents so that semantically similar documents are mapped to similar codes (within a short Hamming ...
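As a rough illustration of why binary codes speed up retrieval, the sketch below compares codes by counting differing bits. It assumes the truncated phrase above refers to Hamming distance; the function names, the 8-bit codes, and the document ids are hypothetical and are not taken from the paper, which concerns how the codes are learned rather than how they are compared.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def search(query_code: int, codes: dict[str, int], radius: int) -> list[str]:
    """Return ids of documents whose code is within `radius` bits of the query, nearest first."""
    hits = [(hamming_distance(query_code, code), doc_id) for doc_id, code in codes.items()]
    return [doc_id for dist, doc_id in sorted(hits) if dist <= radius]


# Hypothetical 8-bit codes; a real system would learn them from document content.
codes = {"doc_a": 0b10110010, "doc_b": 0b10110011, "doc_c": 0b01001101}
print(search(0b10110010, codes, radius=2))  # prints ['doc_a', 'doc_b']
```

Each comparison is one XOR plus a bit count, which is what makes a linear scan over many compact codes cheap.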
Most current P2P file-sharing systems treat their users as anonymous, unrelated entities, and completely disregard any social relationships between them. However, social phenomena such as friendship and the existence of communities of users with similar tastes or interests may well be exploited in such systems in order to increase their usability ...
This paper studies document ranking under uncertainty. The problem is tackled in a general setting where the relevance predictions for individual documents are uncertain and depend on one another. Inspired by Modern Portfolio Theory, an economic theory dealing with investment in financial markets, we argue that ranking under uncertainty is not ...
Memory-based methods for collaborative filtering predict new ratings by averaging (weighted) ratings between, respectively, pairs of similar users or items. In practice, a large number of ratings from similar users or similar items are not available, due to the sparsity inherent to rating data. Consequently, prediction quality can be poor. This paper ...
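The weighted averaging described above, and the way sparsity undermines it, can be sketched as follows. This is a minimal user-based variant under assumed inputs: the similarity weights are given rather than computed, and the names `predict`, `ratings`, and `sims` are hypothetical. It is not the method the paper goes on to propose.

```python
def predict(ratings, sims, user, item):
    """Similarity-weighted average of the ratings that similar users gave `item`.

    Returns None when no similar user has rated the item (the sparsity problem).
    """
    num = den = 0.0
    for other, weight in sims.get(user, {}).items():
        rating = ratings.get(other, {}).get(item)
        if rating is None or weight <= 0:
            continue  # skip neighbours with no rating for this item
        num += weight * rating
        den += weight
    return num / den if den else None


# Hypothetical data: "cat" is the most similar user but has rated nothing.
ratings = {"ann": {"m1": 4.0}, "bob": {"m1": 2.0}, "cat": {}}
sims = {"dan": {"ann": 0.75, "bob": 0.25, "cat": 0.9}}

print(predict(ratings, sims, "dan", "m1"))  # prints 3.5
print(predict(ratings, sims, "dan", "m2"))  # prints None
```

The second call shows the failure mode the abstract points to: when no neighbour has rated the item, there is nothing to average.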
The top-k retrieval problem aims to find the optimal set of k documents from a number of relevant documents given the user's query. The key issue is to balance the relevance and diversity of the top-k search results. In this paper, we address this problem using Facility Location Analysis taken from Operations Research, where the locations of facilities are ...
Implicit acquisition of user preferences makes log-based collaborative filtering attractive in practice for producing recommendations. In this paper, we follow a formal approach from text retrieval to reformulate the problem. Based on the classic probability ranking principle, we propose a probabilistic user-item relevance model. Under this formal model, we ...
Collaborative filtering is concerned with making recommendations about items to users. Most formulations of the problem are specifically designed for predicting user ratings, assuming past data of explicit user ratings is available. However, in practice we may only have implicit evidence of user preference; and furthermore, a better view of the task is of ...
Most commercial television channels use video logos, which can be considered a form of visible watermark, as a declaration of intellectual property ownership. They are also used as a symbol of authorization to rebroadcast when original logos are used in conjunction with newer logos. An unfortunate side effect of such logos is the concomitant decrease in ...
Collaborative filtering aims to predict a user's interest in a given item based on a collection of user profiles. This article views collaborative filtering as a problem closely related to information retrieval, drawing an analogy between the concepts of users and items in recommender systems and queries and documents in text retrieval. We present a ...
Collaborative filtering requires a centralized rating database. However, within a peer-to-peer network such a centralized database is not readily available. In this paper, we propose a fully distributed collaborative filtering method that is self-organizing and operates in a distributed way. Similarity ranks between multimedia files (items) ...