We study the problem of enhancing Entity Resolution (ER) with the help of crowdsourcing. ER is the problem of clustering records that refer to the same real-world entity, and it can be an extremely difficult process for computer algorithms alone. For example, figuring out which images refer to the same person can be a hard task for computers, but an easy one for ...
We present new algorithms for Personalized PageRank estimation and Personalized PageRank search. First, for the problem of estimating Personalized PageRank (PPR) from a source distribution to a target node, we present a new bidirectional estimator with simple yet strong guarantees on correctness and performance, and 3x to 8x speedup over existing estimators ...
We propose a new algorithm, FAST-PPR, for computing personalized PageRank: given start node s and target node t in a directed graph, and given a threshold δ, it computes the Personalized PageRank π_s(t) from s to t, guaranteeing that the relative error is small as long as π_s(t) > δ. ...
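To make the quantity π_s(t) and the threshold δ concrete, here is a minimal baseline sketch that computes Personalized PageRank by plain power iteration on a small graph. This is not FAST-PPR, whose details are not given in the abstract; the function name `exact_ppr`, the teleport probability `alpha = 0.2`, the toy graph, and the rule that dangling nodes return to the source are all assumptions made for illustration.

```python
def exact_ppr(graph, source, alpha=0.2, iterations=200):
    """Baseline PPR from `source` by power iteration (illustrative, not FAST-PPR).

    `graph` maps every node to its list of out-neighbors; every neighbor
    must also appear as a key. `alpha` is the assumed teleport probability.
    """
    nodes = list(graph)
    pi = {v: 0.0 for v in nodes}
    pi[source] = 1.0
    for _ in range(iterations):
        nxt = {v: 0.0 for v in nodes}
        nxt[source] += alpha  # teleport mass always returns to the source
        for u in nodes:
            share = (1.0 - alpha) * pi[u]
            out = graph[u]
            if not out:
                # Assumption: a dangling node sends its mass back to the source.
                nxt[source] += share
                continue
            for v in out:
                nxt[v] += share / len(out)
        pi = nxt
    return pi


# Hypothetical toy graph and threshold, for illustration only.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
delta = 0.1
pi_a = exact_ppr(toy_graph, "a")
significant = {t for t, score in pi_a.items() if score > delta}
```

On a graph this small the exact computation is cheap; the guarantee described above matters on large graphs, where only pairs with π_s(t) > δ need to be estimated accurately.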
Latency is a critical factor when using a crowdsourcing platform to solve a problem like entity resolution or sorting. In practice, most frameworks attempt to reduce latency by heuristically splitting a budget of questions into rounds, so that after each round the answers are analyzed and new questions are selected. We focus on one of the most extensively ...
Personalized PageRank uses random walks to determine the importance or authority of nodes in a graph from the point of view of a given source node. Much past work has considered how to compute personalized PageRank from a given source node to other nodes. In this work we consider the problem of computing personalized PageRanks to a given target node from ...
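The random-walk view described above can be sketched as a simple Monte Carlo estimator: start many walks at the source, stop each one with a fixed probability at every step, and count where the walks end. This is a generic textbook-style sketch rather than the method of any paper listed here; the function name `monte_carlo_ppr`, the stopping probability `alpha = 0.2`, the number of walks, and the handling of dangling nodes are assumptions.

```python
import random


def monte_carlo_ppr(graph, source, alpha=0.2, num_walks=10000, seed=0):
    """Estimate PPR from `source` using walks that stop with probability `alpha`.

    `graph` maps each node to its list of out-neighbors. The returned dict
    gives, for each node, the fraction of walks that ended there.
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(num_walks):
        node = source
        # Continue the walk with probability 1 - alpha at each step.
        while rng.random() >= alpha:
            neighbors = graph.get(node, [])
            if not neighbors:
                # Assumption: a walk stuck at a dangling node restarts at the source.
                node = source
                continue
            node = rng.choice(neighbors)
        counts[node] = counts.get(node, 0) + 1
    return {v: c / num_walks for v, c in counts.items()}


# Hypothetical toy graph, for illustration only.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
estimate = monte_carlo_ppr(toy_graph, "a")
```

Walks from a single source yield estimates for every target at once, which suits the source-to-all direction; estimating scores to one fixed target from many sources would require separate walks from each source under this approach.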
Anonymous blacklisting schemes allow online service providers to prevent future anonymous access by abusive users while preserving the privacy of all anonymous users (both abusive and non-abusive). The first scheme proposed for this purpose was Nymble, an extremely efficient scheme based only on symmetric primitives; however, Nymble relies on trusted third ...
We present new, more efficient algorithms for estimating random walk scores such as Personalized PageRank from a given source node to one or several target nodes. These scores are useful for personalized search and recommendations on networks including social networks, user-item networks, and the web. Past work has proposed using Monte Carlo or using linear ...
We present a new algorithm for estimating the Personalized PageRank (PPR) between a source and target node on undirected graphs, with sublinear running-time guarantees over the worst-case choice of source and target nodes. Our work builds on a recent line of work on bidirectional estimators for PPR, which obtained sublinear running-time guarantees but in ...