Fast and Exact Top-k Search for Random Walk with Restart

@article{Fujiwara2012FastAE,
  title={Fast and Exact Top-k Search for Random Walk with Restart},
  author={Yasuhiro Fujiwara and Makoto Nakatsuji and Makoto Onizuka and Masaru Kitsuregawa},
  journal={ArXiv},
  year={2012},
  volume={abs/1201.6566}
}
Graphs are fundamental data structures and have been employed for centuries to model real-world systems and phenomena. Random walk with restart (RWR) provides a good proximity score between two nodes in a graph, and it has been successfully used in many applications such as automatic image captioning, recommender systems, and link prediction. The goal of this work is to find nodes that have top-k highest proximities for a given node. Previous approaches to this problem find nodes efficiently at… 
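
To make the top-k RWR query concrete, the following is a minimal baseline sketch in Python, not the method of this paper (the abstract is truncated before the method is described). It computes RWR proximities by plain power iteration and then picks the k largest. The adjacency-dictionary representation, the function names `rwr_scores` and `top_k`, the restart probability of 0.15, and the handling of dangling nodes are all assumptions made for illustration.

```python
import heapq


def rwr_scores(adj, query, restart=0.15, tol=1e-12, max_iter=1000):
    """Approximate RWR proximities from `query` by power iteration.

    adj maps each node to a list of its out-neighbors (hypothetical format).
    Iterates p <- (1 - restart) * W p + restart * e_query, where W spreads
    a node's probability uniformly over its out-neighbors.
    """
    nodes = list(adj)
    p = {u: 0.0 for u in nodes}
    p[query] = 1.0
    for _ in range(max_iter):
        nxt = {u: 0.0 for u in nodes}
        for u in nodes:
            out = adj[u]
            if not out:
                # Assumption: a walker at a dangling node returns to the query.
                nxt[query] += (1 - restart) * p[u]
                continue
            share = (1 - restart) * p[u] / len(out)
            for v in out:
                nxt[v] += share
        # Restart step: total probability is 1, so `restart` mass goes home.
        nxt[query] += restart
        diff = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if diff < tol:
            break
    return p


def top_k(scores, k):
    """Return the k (node, score) pairs with the highest proximity."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a"], "d": ["c"]}
    for node, score in top_k(rwr_scores(graph, "a"), 3):
        print(node, round(score, 4))
```

This baseline touches every edge in every iteration, which is the cost that top-k search methods for RWR aim to avoid.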

Fast Inbound Top-K Query for Random Walk with Restart

TLDR
This paper proposes two algorithms, Squeeze and Ripple, both of which can accurately answer the Ink query in a fast and incremental manner and are orders of magnitude faster than the state-of-the-art method.

Index-Free Approach with Theoretical Guarantee for Efficient Random Walk with Restart Query

TLDR
This paper proposes an index-free algorithm, the Residue-Accumulated approach (ResAcc), which efficiently returns answers with a theoretical guarantee and is up to 6 orders of magnitude more accurate than the best-known algorithm in practice for the same execution time, a substantial improvement.

TPA: Two Phase Approximation for Random Walk with Restart

TLDR
This paper proposes TPA, a fast, scalable, and highly accurate method for computing approximate RWR on large graphs and shows that TPA requires 1140× less time with 20× less memory space than other state-of-the-art methods for the preprocessing phase.

Reverse Top-k Search using Random Walk with Restart

TLDR
This work proposes an indexing technique, paired with an on-line reverse top-k search algorithm, that is efficient and has manageable storage requirements even when applied on very large graphs.

Finding Top‐k Answers in Node Proximity Search Using Distribution State Transition Graph

TLDR
This paper presents a novel method to find top‐k answers in a node proximity search based on the well‐known measure, Personalized PageRank (PPR), and introduces a distribution state transition graph (DSTG) to depict iterative steps for solving the PPR equation.

TPA: Fast, Scalable, and Accurate Method for Approximate Random Walk with Restart on Billion Scale Graphs

TLDR
This paper proposes TPA, a fast, scalable, and highly accurate method for computing approximate RWR on large graphs and shows that it requires up to 3.5× less time with up to 40× less memory space than other state-of-the-art methods for the preprocessing phase and computes up to 30× faster than existing methods while maintaining high accuracy.

Efficient Processing Node Proximity via Random Walk with Restart

TLDR
This paper proposes hybrid techniques to efficiently compute Random Walk with Restart using a novel divide-and-conquer paradigm, aiming to convert the large LU decomposition into small triangular matrix operations recursively on several partitioned subgraphs.

IRWR: incremental random walk with restart

TLDR
The main contribution of this paper is to devise an efficient and fast incremental algorithm of RWR for edge updates that can incrementally compute any node proximity in $O(1)$ time for each edge update without loss of exactness.

TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs

TLDR
TopPPR is proposed, an algorithm for top-k PPR queries that ensures at least ρ precision with probability at least 1 - 1/n, where ρ ∈ (0, 1] is a user-specified parameter and n is the number of nodes in G.

BEAR: Block Elimination Approach for Random Walk with Restart on Large Graphs

TLDR
BEAR is proposed, a fast, scalable, and accurate method for computing RWR on large graphs that significantly outperforms other state-of-the-art methods in terms of preprocessing and query speed, space efficiency, and accuracy.
...

References

SHOWING 1-10 OF 27 REFERENCES

Fast Random Walk with Restart and Its Applications

TLDR
The heart of the approach is to exploit two important properties shared by many real graphs: linear correlations and block-wise, community-like structure. It exploits the linearity by using low-rank matrix approximation, and the community structure by graph partitioning, followed by the Sherman-Morrison lemma for matrix inversion.

Quick Detection of Top-k Personalized PageRank Lists

TLDR
This work proposes Monte Carlo methods for quick detection of top-k Personalized PageRank (PPR) lists, demonstrates the effectiveness of these methods on the Web and Wikipedia graphs, and provides a performance evaluation and stopping criteria.
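
As a rough illustration of the general Monte Carlo idea, and not the cited paper's actual algorithm or stopping criteria, the sketch below estimates PPR by simulating walks from the query node that stop with a fixed probability at each step and by counting where they end. The function name `monte_carlo_ppr`, the adjacency-dictionary format, the parameter defaults, and the treatment of dangling nodes are assumptions made for this example.

```python
import random
from collections import Counter


def monte_carlo_ppr(adj, query, restart=0.15, num_walks=20000, seed=0):
    """Estimate PPR from `query` as the empirical distribution of walk endpoints.

    Each walk starts at `query`; at every step it stops with probability
    `restart`, otherwise it moves to a uniformly random out-neighbor.
    The walk length is therefore geometric, which matches the series form
    of PPR: restart * sum_k (1 - restart)^k * W^k e_query.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    ends = Counter()
    for _ in range(num_walks):
        node = query
        while rng.random() >= restart:
            out = adj[node]
            # Assumption: a walker at a dangling node jumps back to the query.
            node = rng.choice(out) if out else query
        ends[node] += 1
    return {u: ends[u] / num_walks for u in adj}


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a"], "d": ["c"]}
    estimate = monte_carlo_ppr(graph, "a")
    ranked = sorted(estimate.items(), key=lambda item: item[1], reverse=True)
    print(ranked[:2])
```

Because only the ranking of the highest-scoring nodes matters for a top-k list, far fewer walks are typically needed than for accurate scores on every node.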

Neighborhood based fast graph search in large networks

TLDR
Under this new measure, it is proved that subgraph similarity search is NP hard, while graph similarity match is polynomial, and an information propagation model is found that is able to convert a large network into a set of multidimensional vectors, where sophisticated indexing and similarity search algorithms are available.

Fast algorithms for topk personalized pagerank queries

TLDR
This work proposes a framework to answer top-k graph conductance queries, and extends the system to handle hard predicates, leading to a 4X speedup and overall, the system executes queries 200-1600X faster than whole-graph PageRank.

Accuracy estimate and optimization techniques for SimRank computation

TLDR
This paper presents a technique to estimate the accuracy of computing SimRank iteratively and presents optimization techniques that improve the computational complexity of the iterative algorithm from $O(n^4)$ in the worst case to $\min(O(nl), O(n^3/\log_2 n))$, with n denoting the number of objects and l denoting the number of object-to-object relationships.

Center-piece subgraphs: problem definition and fast solutions

TLDR
Wall-clock timing results on the DBLP dataset show that the proposed approximation achieves good accuracy for about a 6:1 speedup, and experiments confirm that the method naturally deals with multi-source queries and that the resulting subgraphs agree with intuition.

Optimization and evaluation of shortest path queries

TLDR
This work investigates the problem of how to evaluate efficiently a collection of shortest path queries on massive graphs that are too big to fit in the main memory and introduces two pruning algorithms.

On social networks and collaborative recommendation

TLDR
This work created a collaborative recommendation system that effectively adapts to the personal information needs of each user, and adopts the generic framework of Random Walk with Restarts to provide a more natural and efficient way to represent social networks.

Multilevel k-way Partitioning Scheme for Irregular Graphs

In this paper, we present and study a class of graph partitioning algorithms that reduces the size of the graph by collapsing vertices and edges, we find a k-way partitioning of the smaller graph, and

PathSim: Meta Path-Based Top-K Similarity Search in Heterogeneous Information Networks

TLDR
Under the meta path framework, a novel similarity measure called PathSim is defined that is able to find peer objects in the network (e.g., find authors in the similar field and with similar reputation), which turns out to be more meaningful in many scenarios compared with random-walk based similarity measures.