Pangolin: An Efficient and Flexible Graph Mining System on CPU and GPU

Xuhao Chen, Roshan Dathathri, G. Gill, Keshav Pingali
There is growing interest in graph mining algorithms such as motif counting. Generic graph mining systems have been developed to provide unified interfaces for programming these algorithms. However, existing systems take minutes or even hours to mine even simple patterns in moderate-sized graphs, which significantly limits their real-world usability. We present Pangolin, a high-performance and flexible in-memory graph mining framework targeting both shared-memory CPUs and GPUs. Pangolin is the… 
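As a concrete instance of the motif counting mentioned in the abstract, the smallest non-trivial motif is the triangle. The following is a minimal sketch in plain Python, not Pangolin's implementation or API; the function name `count_triangles` and the edge-list input format are assumptions made for illustration. It orients each edge from the lower to the higher vertex ID so that every triangle is counted exactly once.

```python
def count_triangles(edges):
    """Count triangles in an undirected graph given as a list of (u, v) pairs.

    Hypothetical helper for illustration only; vertex IDs must be comparable.
    """
    # For every vertex, keep only the neighbors with a larger ID, so that a
    # triangle u < v < w is discovered once, starting from its smallest vertex.
    higher = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        a, b = (u, v) if u < v else (v, u)
        higher.setdefault(a, set()).add(b)
        higher.setdefault(b, set())

    total = 0
    for u, nbrs in higher.items():
        for v in nbrs:
            # Any higher-ID neighbor shared by u and v closes a triangle.
            total += len(nbrs & higher[v])
    return total


# A complete graph on four vertices contains four triangles.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
triangles_in_k4 = count_triangles(k4)
```

The set intersection in the inner loop is the dominant cost, which is why systems in this space focus on how such intersections are scheduled and parallelized.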

Efficient Strategies for Graph Pattern Mining Algorithms on GPUs

This work proposes novel strategies for designing and implementing subgraph enumeration efficiently on GPUs: a depth-first-style search that maximizes memory performance while exposing enough parallelism for the GPU to exploit, and a warp-centric design that minimizes execution divergence and improves utilization of the GPU's computing capabilities.

A GPU-based Graph Pattern Mining System

  • Lin Hu, Lei Zou
  • Computer Science
    Proceedings of the 31st ACM International Conference on Information & Knowledge Management
  • 2022
This work proposes GAMMA, a graph pattern mining framework on GPU that shows strong scalability and performance advantages over state-of-the-art graph mining systems in experiments, and presents several optimizations for processing large graphs.

Efficient and Scalable Graph Pattern Mining on GPUs

G2Miner is described, the first GPM framework that runs efficiently on multiple GPUs. It provides a code generator that automatically generates pattern-aware CUDA code to simplify programming, and proposes a customized scheduling policy to balance workload among multiple GPUs.

DIMMining: pruning-efficient and parallel graph mining on near-memory-computing

This work proposes a DIMM-based near-memory-computing architecture that eliminates the large-volume data transfer between computation and memory, and designs a Systolic Merge Array to further explore parallelism on discontinuous vertices from the architecture perspective.

Sandslash: a two-level framework for efficient graph pattern mining

Sandslash is presented, an in-memory graph pattern mining (GPM) framework that uses a novel programming interface to support productive, expressive, and efficient GPM on large graphs. The work demonstrates that applications written using the Sandslash high-level or low-level API outperform those in the state-of-the-art GPM systems AutoMine, Pangolin, and Peregrine.

Khuzdul: Efficient and Scalable Distributed Graph Pattern Mining Engine

This paper proposes Khuzdul, a distributed execution engine with a well-defined abstraction that can be integrated with existing single-machine graph pattern mining (GPM) systems to provide

Efficient Mining of Frequent Subgraphs with Two-Vertex Exploration

This work proposes a novel two-vertex exploration strategy to accelerate the mining process and achieves significant speedups against the state-of-the-art graph pattern mining systems and supports larger pattern mining tasks that none of the existing systems can handle.

GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra

GraphMineSuite (GMS) is proposed, the first benchmarking suite for graph mining that facilitates evaluating and constructing high-performance graph mining algorithms. It is supported by a broad concurrency analysis for portability of performance insights and by a novel performance metric for assessing the throughput of graph mining algorithms.

FlexMiner: A Pattern-Aware Accelerator for Graph Pattern Mining

FlexMiner is presented, a software/hardware co-designed GPM accelerator that improves the efficiency without compromising the generality or productivity of state-of-the-art software GPM frameworks.

Mint: An Accelerator For Mining Temporal Motifs

A task-centric programming model that enables decoupled, asynchronous execution of temporal motif mining is proposed, along with a novel optimization called search index memoization that significantly reduces memory traffic.



G-Miner: an efficient task-oriented graph mining system

G-Miner is proposed, a distributed system with a new architecture designed for general graph mining that adopts a unified programming framework for implementing a wide range of graph mining algorithms and designs a novel task pipeline to streamline task processing for better CPU, network and I/O utilization.

EvoGraph: On-the-Fly Efficient Mining of Evolving Graphs on GPU

EvoGraph is a highly efficient and scalable GPU-based dynamic graph analytics framework that incrementally processes graphs on-the-fly using fixed-sized batches of updates and achieves over 232x speedup compared to the competing frameworks such as STINGER.

Parallel Graph Mining with GPUs

This paper proposes a novel approach for parallel graph mining on GPUs, which have emerged as a relatively cheap but powerful architecture for general-purpose computing. Because the GPU thread model differs from that of CPUs, parallelizing graph mining algorithms on GPUs is a challenging task.

AutoMine: harmonizing high-level abstraction and high performance for graph mining

This paper builds AutoMine, a single-machine system that provides both high-level interfaces and high performance for large-scale graph mining applications, and extensively evaluates it against 3 graph mining systems on 8 real-world graphs.

Kaleido: An Efficient Out-of-core Graph Mining System on A Single Machine

Kaleido is presented, an efficient single machine, out-of-core graph mining system which treats disks as an extension of memory and adopts a succinct data structure for the intermediate data.

Network Motif Discovery: A GPU Approach

This work presents a GPU-based solution that is up to two orders of magnitude faster than the best CPU-based approach and around 20 times more cost-effective, taking into account the monetary costs of the CPUs and GPUs used.

Falcon: A Graph Manipulation Language for Heterogeneous Systems

A domain-specific language (DSL) is proposed, Falcon, for implementing graph algorithms that abstracts the hardware, provides constructs to write explicitly parallel programs at a higher level, and can work with general algorithms that may change the graph structure.

Ligra: a lightweight graph processing framework for shared memory

This paper presents a lightweight graph processing framework that is specific for shared-memory parallel/multicore machines, which makes graph traversal algorithms easy to write and significantly more efficient than previously reported results using graph frameworks on machines with many more cores.

ScaleMine: Scalable Parallel Frequent Subgraph Mining in a Single Large Graph

ScaleMine is proposed, a novel parallel frequent subgraph mining system for a single large graph that scales to 8,192 cores on a Cray XC40, supports graphs with one billion edges (10× larger than competitors), and is at least an order of magnitude faster than existing solutions.

Arabesque: a system for distributed graph mining

Arabesque is presented, the first distributed data processing platform for implementing graph mining algorithms. It automates the process of exploring a very large number of subgraphs and defines a high-level filter-process computational model that simplifies the development of scalable graph mining algorithms.
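
The filter-process idea named in the Arabesque summary can be illustrated with a small single-machine sketch. This is not Arabesque's API: the names `explore`, `keep`, `process`, and `is_clique` are hypothetical, and the sketch assumes the filter is anti-monotone (if a vertex set fails it, every superset fails too), which is what makes pruning before extension safe.

```python
def explore(adj, max_size, keep, process):
    """Explore connected vertex sets level by level.

    adj:      dict mapping each vertex to the set of its neighbors
    keep:     user filter; sets that fail it are dropped and never extended
    process:  user callback invoked on every set that passes the filter
    """
    # Level 1: every single vertex is a candidate.
    frontier = {frozenset([v]) for v in adj}
    for size in range(1, max_size + 1):
        # Filter step: prune candidates the user is not interested in.
        frontier = {emb for emb in frontier if keep(emb)}
        # Process step: hand each surviving candidate to the user.
        for emb in frontier:
            process(emb)
        if size == max_size:
            break
        # Extension step: grow each survivor by one adjacent vertex.
        # Using frozensets removes duplicates reached along different paths.
        frontier = {
            emb | {w}
            for emb in frontier
            for v in emb
            for w in adj[v]
            if w not in emb
        }


def is_clique(adj):
    """Build a filter that accepts only fully connected vertex sets."""
    def keep(emb):
        return all(u in adj[v] for u in emb for v in emb if u != v)
    return keep


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
found = []
explore(adj, 3, is_clique(adj), found.append)
cliques_of_size_3 = [emb for emb in found if len(emb) == 3]
```

Here the user supplies only the two callbacks, while the exploration loop stays generic; swapping `is_clique` for another anti-monotone predicate changes the mined pattern without touching `explore`.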