  • Tamás Sarlós
  • 2006
Several results have appeared that show a significant reduction in running time for matrix multiplication, singular value decomposition, and linear (ℓ₂) regression, all based on data-dependent random sampling. Our key idea is that low-dimensional embeddings can be used to eliminate data dependence and provide more versatile, linear time, pass-efficient …
Personalized PageRank expresses link-based page quality around user-selected pages in a similar way as PageRank expresses quality over the entire web. Existing personalized PageRank algorithms can, however, serve online queries only for a restricted choice of pages. In this paper we achieve full personalization by a novel algorithm that precomputes a …
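As a point of reference for the abstract above, personalized PageRank can be illustrated with plain power iteration in which the teleport step returns to the user-selected pages rather than to a uniformly random page. This is a minimal sketch of the textbook definition, not the precomputation algorithm the paper proposes; the function name, the adjacency-list format, the damping value 0.85, and the toy graph are all assumptions made for illustration.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=100):
    """Power iteration for personalized PageRank (illustrative sketch).

    graph: dict mapping every node to a list of its out-neighbours
           (hypothetical input format; every node must appear as a key).
    seeds: set of user-selected pages that receive the teleport mass.
    """
    nodes = list(graph)
    # Teleport distribution: uniform over the selected pages, zero elsewhere.
    teleport = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) * teleport[v] for v in nodes}
        for u in nodes:
            out_links = graph[u]
            if out_links:
                share = damping * rank[u] / len(out_links)
                for v in out_links:
                    nxt[v] += share
            else:
                # Assumption: a page without out-links sends its mass
                # back to the teleport distribution.
                for v in nodes:
                    nxt[v] += damping * rank[u] * teleport[v]
        rank = nxt
    return rank


# Toy graph (made up): "d" links to "a", but nothing links to "d".
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
ppr = personalized_pagerank(toy_graph, seeds={"a"})
```

Because "d" is neither a selected page nor reachable from one, its personalized score stays at zero, whereas global PageRank would give it a small positive score.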
Least squares approximation is a technique to find an approximate solution to a system of linear equations that has no exact solution. In a typical setting, one lets n be the number of constraints and d be the number of variables, with n ≫ d. Then, existing exact methods find a solution vector in O(nd²) time. We present two randomized algorithms that …
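To make the O(nd²) baseline in the abstract above concrete, here is a small sketch of one classical exact method, the normal equations: forming the d × d matrix AᵀA costs O(nd²), and solving the resulting d × d system costs O(d³). It is not one of the paper's randomized algorithms, and in practice a QR- or SVD-based solver is usually preferred for numerical stability. The function name and the sample data are assumptions made for illustration.

```python
def solve_least_squares(A, b):
    """Minimise ||Ax - b||_2 via the normal equations (illustrative sketch).

    A: list of n rows, each a list of d floats, with n >= d.
    b: list of n floats.
    """
    n, d = len(A), len(A[0])
    # Gram matrix A^T A and right-hand side A^T b: the O(n d^2) step.
    gram = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(d)]
            for i in range(d)]
    rhs = [sum(A[k][i] * b[k] for k in range(n)) for i in range(d)]
    # Gaussian elimination with partial pivoting on the augmented d x (d+1) system.
    M = [gram[i] + [rhs[i]] for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda i: abs(M[i][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("A does not have full column rank")
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, d):
            factor = M[row][col] / M[col][col]
            for j in range(col, d + 1):
                M[row][j] -= factor * M[col][j]
    # Back substitution.
    x = [0.0] * d
    for i in reversed(range(d)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, d))
        x[i] = (M[i][d] - tail) / M[i][i]
    return x


# Made-up data: fit y = c0 + c1 * t to four points that are not collinear,
# so the system of four equations in two unknowns has no exact solution.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 4.0, 7.0]
x = solve_least_squares(A, b)
```

For this data the normal equations are 4c0 + 6c1 = 15 and 6c0 + 14c1 = 32, which give c0 = 0.9 and c1 = 1.9.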
Spammers intend to increase the PageRank of certain spam pages by creating a large number of links pointing to them. We propose a novel method based on the concept of personalized PageRank that detects pages with an undeserved high PageRank value without the need of any kind of white or blacklists or other means of human intervention. We assume that spammed …
Personalized PageRank expresses link-based page quality around user-selected pages. The only previous personalized PageRank algorithm that can serve online queries for an unrestricted choice of pages on large graphs is our Monte Carlo algorithm [WAW 2004]. In this paper we achieve unrestricted personalization by combining rounding and randomized sketching …
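The general idea behind Monte Carlo estimation of personalized PageRank can be sketched as follows: start many random walks at the selected page, stop each walk with a fixed probability at every step, and use the empirical distribution of the stopping points as the estimate. This is only a simplified illustration of that idea, not a reproduction of the cited [WAW 2004] algorithm or of the rounding and sketching method in this paper; the function name, the parameters, the fixed random seed, and the toy graph are assumptions.

```python
import random
from collections import Counter


def monte_carlo_ppr(graph, source, damping=0.85, walks=20000, rng=None):
    """Estimate personalized PageRank of `source` from random-walk endpoints.

    graph: dict mapping every node to a list of its out-neighbours
           (hypothetical input format).
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    endpoints = Counter()
    for _ in range(walks):
        node = source
        # Continue with probability `damping`, stop with probability 1 - damping.
        while rng.random() < damping:
            out_links = graph[node]
            # Assumption: a page without out-links restarts the walk at the source.
            node = rng.choice(out_links) if out_links else source
        endpoints[node] += 1
    return {v: count / walks for v, count in endpoints.items()}


# Toy graph (made up): "d" links to "a", but nothing links to "d".
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
estimate = monte_carlo_ppr(toy_graph, source="a")
```

The estimate is a probability distribution over the pages reachable from the source, and its accuracy improves as the number of walks grows.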
In this paper, we consider the problem of devising blocking schemes for entity matching. There is a lot of work on blocking techniques for supporting various kinds of predicates, e.g., exact matches, fuzzy string-similarity matches, and spatial matches. However, given a complex entity matching function in the form of a Boolean expression over several such …