Massive graph triangulation

@inproceedings{Hu2013MassiveGT,
  title={Massive graph triangulation},
  author={Xiaocheng Hu and Yufei Tao and Chin-Wan Chung},
  booktitle={SIGMOD '13},
  year={2013}
}
This paper studies I/O-efficient algorithms for the classic triangle listing problem, whose solution is a basic operator in many other graph problems. Specifically, given an undirected graph G, the objective of triangle listing is to find all the cliques involving 3 vertices in G. The problem has been well studied in internal memory, but remains an urgent and difficult challenge when G does not fit in memory, which forces any algorithm to perform frequent I/O accesses. Although…
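
To make the problem statement concrete, here is a minimal in-memory sketch of triangle listing in Python. It is not the I/O-efficient algorithm the paper develops; it is a standard degree-ordered approach that assumes the whole graph fits in memory. The function name `list_triangles` and the edge-list input format are illustrative assumptions.

```python
from collections import defaultdict


def list_triangles(edges):
    """Return every triangle (3-vertex clique) of an undirected graph.

    `edges` is an iterable of (u, v) pairs with comparable vertex ids.
    In-memory illustration only; NOT the paper's external-memory method.
    """
    # Build undirected adjacency sets, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Rank vertices by (degree, id) so that ties are broken consistently.
    rank = {v: (len(adj[v]), v) for v in adj}

    # Orient each edge from its lower-ranked to its higher-ranked endpoint.
    out = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}

    # A triangle is reported exactly once, from its lowest-ranked vertex u
    # and middle-ranked vertex v, when the highest-ranked vertex w appears
    # in both out[u] and out[v].
    triangles = []
    for u in out:
        for v in out[u]:
            for w in out[u] & out[v]:
                triangles.append(tuple(sorted((u, v, w))))
    return sorted(triangles)


if __name__ == "__main__":
    sample = [(1, 2), (2, 3), (1, 3), (3, 4)]
    print(list_triangles(sample))
```

Once G no longer fits in memory, the set intersections above turn into random disk accesses, which is the difficulty the abstract refers to.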

I/O-Efficient Algorithms on Triangle Listing and Counting
TLDR
A new algorithm is developed that is provably I/O and CPU efficient at the same time, without making any assumption on the input G at all, and outperforms the existing competitors by a factor of over an order of magnitude in the authors' extensive experimentation.
OPT: a new framework for overlapped and parallel triangulation in large-scale graphs
TLDR
This paper proposes an overlapped and parallel disk-based triangulation framework for billion-scale graphs, OPT, which achieves the ideal cost by (1) full overlap of the CPU and I/O operations and (2) full parallelism of multi-core CPU and FlashSSD I/O.
PDTL: Parallel and Distributed Triangle Listing for Massive Graphs
TLDR
This paper presents the first distributed triangle listing algorithm with provable CPU, I/O, memory, and network bounds, and highlights the importance of I/O in a distributed environment.
Improving I/O Complexity of Triangle Enumeration
TLDR
An investigation into the properties of Pagh and PCF is undertaken, which leads to a novel framework called Trigon that surpasses the I/O performance of both previous techniques in all graphs and under all RAM conditions.
Fast In-Memory Triangle Listing for Large Real-World Graphs
TLDR
This paper proposes a fast and precise in-memory solution for the triangle listing problem, and shows how the theoretical lower bound can be achieved by sorting the nodes in the graph by their degree and applying pruning.
Parallel subgraph listing in a large-scale graph
TLDR
This paper proposes a novel parallel subgraph listing framework, named PSgL, which relies entirely on graph traversal and avoids explicit join operations. The authors prove that distributing partial subgraph instances for workload balance is NP-hard, and design a set of heuristic strategies for it.
iTri: Index-based triangle listing in massive graphs
An efficient exact algorithm for triangle listing in large graphs
TLDR
This paper presents a new efficient exact algorithm for listing triangles in a large graph on a compressed copy of the input graph that lists the triangles without decompressing the graph.
General-Purpose Join Algorithms for Listing Triangles in Large Graphs
TLDR
This work presents "boxing": a novel, yet conceptually simple, approach for feeding input data to Leapfrog Triejoin (LFTJ), showing that this approach is I/O efficient, being worst-case optimal (in a certain sense).
On Efficient External-Memory Triangle Listing
TLDR
A novel external-memory approach is developed, which is called Pruned Companion Files (PCF), that supports operation of all 18 triangle-search techniques, while significantly reducing I/O compared to the common methods in this area.

References

Triangle listing in massive networks
TLDR
This work proposes an I/O-efficient algorithm for triangle listing that is exact, scalable and outperforms the state-of-the-art in-memory and local triangle estimation algorithms.
On Triangulation-based Dense Neighborhood Graphs Discovery
TLDR
A new definition of a dense subgraph pattern, the DN-graph, which considers both the size of the substructure and the minimum level of interactions between any pair of the vertices, and can cope with a semi-streaming environment where the graph edges cannot fit into main memory.
Triangle listing in massive networks and its applications
TLDR
This work proposes an I/O-efficient algorithm for triangle listing that is scalable and outperforms the state-of-the-art local triangle estimation algorithm and avoids random disk access.
An efficient MapReduce algorithm for counting triangles in a very large graph
TLDR
A new algorithm based on graph partitioning with a novel idea of triangle classification to count the number of triangles in a graph is proposed that substantially reduces the duplication by classifying triangles into three types and processing each triangle differently according to its type.
Arboricity and Subgraph Listing Algorithms
TLDR
A new simple strategy for edge-searching of a graph, which is useful for various subgraph listing problems, is introduced, and an upper bound on the arboricity $a(G)$ of a graph $G$ is established: $a(G) \leq \lceil (2m + n)^{1/2} \rceil$, where $n$ is the number of vertices and $m$ the number of edges in $G$.
Finding maximal cliques in massive networks
TLDR
A general framework enables maximal clique enumeration to be processed recursively in small subgraphs of the input graph, thus allowing in-memory computation of maximal cliques without the costly random disk access.
Finding, Counting and Listing All Triangles in Large Graphs, an Experimental Study
TLDR
This work gives a surprisingly simple enhancement of a well known algorithm that performs best, and makes triangle listing and counting in huge networks feasible.
The h-Index of a Graph and its Application to Dynamic Subgraph Statistics
TLDR
A data structure that maintains the number of triangles in a dynamic undirected graph, subject to insertions and deletions of edges and of degree-zero vertices, which has applications in social network analysis using the exponential random graph model (ERGM).