Analysis and Optimization of GNN-Based Recommender Systems on Persistent Memory
@inproceedings{Hu2022AnalysisAO,
  title  = {Analysis and Optimization of GNN-Based Recommender Systems on Persistent Memory},
  author = {Yuwei Hu and Jiajie Li and Zhongming Yu and Zhiru Zhang},
  year   = {2022}
}
Graph neural networks (GNNs), which have emerged as an effective method for handling machine learning tasks on graphs, bring a new approach to building recommender systems, where the task of recommendation can be formulated as the link prediction problem on user-item bipartite graphs. Training GNN-based recommender systems (GNNRecSys) on large graphs incurs a large memory footprint, easily exceeding the DRAM capacity on a typical server. Existing solutions resort to distributed subgraph…
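To make the link-prediction framing concrete, here is a minimal, self-contained PyTorch sketch (the toy graph, single mean-aggregation layer, and dot-product scorer are illustrative assumptions, not the paper's implementation): users and items form the two sides of a bipartite graph, one round of message passing mixes their embeddings, and a candidate user-item edge is scored for recommendation.

```python
import torch

# Toy user-item bipartite graph: an edge (u, i) means user u interacted with item i.
num_users, num_items, dim = 4, 5, 8
edge_u = torch.tensor([0, 0, 1, 2, 3])   # user endpoints of observed interactions
edge_i = torch.tensor([1, 2, 2, 0, 4])   # item endpoints

user_h = torch.randn(num_users, dim)     # learnable user embeddings in a real model
item_h = torch.randn(num_items, dim)     # learnable item embeddings

def mean_aggregate(dst_size, dst_index, src_feats):
    """Mean-pool source features onto destination nodes (one direction of a GNN layer)."""
    out = torch.zeros(dst_size, src_feats.shape[1])
    out.index_add_(0, dst_index, src_feats)
    deg = torch.zeros(dst_size).index_add_(0, dst_index, torch.ones(dst_index.numel()))
    return out / deg.clamp(min=1).unsqueeze(1)

# One round of message passing across the bipartite graph.
item_agg = mean_aggregate(num_items, edge_i, user_h[edge_u])
user_agg = mean_aggregate(num_users, edge_u, item_h[edge_i])

# Link prediction: score a candidate (user, item) pair with a dot product;
# training would rank observed edges above sampled negative edges.
score = (user_agg[0] * item_agg[3]).sum()
```

In the paper's setting the graph is large enough that these embeddings and intermediate aggregates no longer fit in DRAM, which is what motivates placing them on persistent memory.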
References
DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks
- Computer Science, SC21: International Conference for High Performance Computing, Networking, Storage and Analysis
- 2021
DistGNN is presented, which optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters via an efficient shared memory implementation, communication reduction using a minimum vertex-cut graph partitioning algorithm, and communication avoidance using a family of delayed-update algorithms.
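As a loose illustration of the delayed-update idea this summary mentions (a toy sketch under my own assumptions, not DistGNN's code): contributions arriving over cut edges from remote partitions can be cached and refreshed only every few epochs, trading staleness for reduced communication.

```python
import numpy as np

REFRESH_INTERVAL = 4                      # assumed staleness bound, for illustration
local_feats = np.random.rand(100, 16)     # node features owned by this partition
remote_cache = np.zeros((100, 16))        # cached aggregates arriving over cut edges

def local_aggregate(feats):
    # Stand-in for message passing over edges internal to the partition.
    return feats + feats.mean(axis=0, keepdims=True)

def fetch_remote_contribution():
    # Stand-in for the expensive communication across the vertex cut.
    return np.random.rand(100, 16)

for epoch in range(12):
    if epoch % REFRESH_INTERVAL == 0:
        remote_cache = fetch_remote_contribution()       # pay the communication cost
    hidden = local_aggregate(local_feats) + remote_cache # use possibly stale values
```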
GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs
- Computer Science, OSDI
- 2021
GNNAdvisor is proposed, an adaptive and efficient runtime system to accelerate various GNN workloads on GPU platforms and incorporates a lightweight analytical model for an effective design parameter search.
FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems
- Computer Science, SC20: International Conference for High Performance Computing, Networking, Storage and Analysis
- 2020
FeatGraph incorporates optimizations for graph traversal into its sparse templates and allows users to specify optimizations for UDFs with a feature dimension schedule (FDS); it speeds up end-to-end GNN training and inference by up to 32× on CPU and 7× on GPU.
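The kind of knob a feature dimension schedule exposes can be illustrated with a tiled SpMM, the kernel that dominates GNN aggregation (a rough NumPy/SciPy sketch; the tile width and shapes are assumptions, not FeatGraph's templates):

```python
import numpy as np
import scipy.sparse as sp

A = sp.random(1000, 1000, density=0.01, format="csr")   # sparse graph adjacency
X = np.random.rand(1000, 128)                            # dense node features
TILE = 32                                                # assumed feature tile width

Y = np.zeros((1000, 128))
for start in range(0, 128, TILE):
    # Each pass touches only a narrow slice of X, improving cache reuse
    # compared to multiplying against all 128 feature columns at once.
    Y[:, start:start + TILE] = A @ X[:, start:start + TILE]
```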
Graph Convolutional Neural Networks for Web-Scale Recommender Systems
- Computer Science, KDD
- 2018
A novel method based on highly efficient random walks to structure the convolutions and a novel training strategy that relies on harder-and-harder training examples to improve robustness and convergence of the model are developed.
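A rough sketch of the random-walk idea (the function name, walk lengths, and visit-count heuristic are assumptions, not the paper's implementation): rather than convolving over every graph neighbor, run short random walks from a node and keep the most frequently visited nodes as its importance-based neighborhood.

```python
import random
from collections import Counter

def walk_based_neighbors(adj, node, num_walks=50, walk_len=3, top_k=10):
    """Pick the top-k most-visited nodes over short random walks from `node`."""
    visits = Counter()
    for _ in range(num_walks):
        cur = node
        for _ in range(walk_len):
            if not adj[cur]:
                break
            cur = random.choice(adj[cur])
            visits[cur] += 1
    visits.pop(node, None)   # the starting node is not its own neighbor
    return [n for n, _ in visits.most_common(top_k)]

# Tiny adjacency list; node 0's convolution would then aggregate only over these
# walk-selected neighbors instead of its full (potentially huge) neighborhood.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
print(walk_based_neighbors(adj, 0))
```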
P3: Distributed Deep Graph Learning at Scale
- Computer Science, OSDI
- 2021
This paper presents P3, a system that scales GNN model training to large real-world graphs in a distributed setting; it proposes a new approach for distributed GNN training that effectively eliminates high communication and partitioning overheads and couples it with a new pipelined push-pull parallelism-based execution strategy for fast model training.
DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs
- Computer Science, 2020 IEEE/ACM 10th Workshop on Irregular Applications: Architectures and Algorithms (IA3)
- 2020
The results show that DistDGL achieves linear speedup without compromising model accuracy and requires only 13 seconds to complete a training epoch for a graph with 100 million nodes and 3 billion edges on a cluster with 16 machines.
AliGraph: A Comprehensive Graph Neural Network Platform
- Computer Science, Proc. VLDB Endow.
- 2019
This paper presents a comprehensive graph neural network system, namely AliGraph, which consists of distributed graph storage, optimized sampling operators and runtime to efficiently support not only existing popular GNNs but also a series of in-house developed ones for different scenarios.
Graph Neural Networks in Recommender Systems: A Survey
- Computer Science, ACM Comput. Surv.
- 2023
This article provides a taxonomy of GNN-based recommendation models according to the types of information used and the recommendation tasks, and systematically analyzes the challenges of applying GNNs to different types of data.
Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads
- Computer Science, OSDI
- 2021
Dorylus is a distributed system for training GNNs that can take advantage of serverless computing to increase scalability at a low cost and is up to 3.8x faster and 10.7x cheaper compared to existing sampling-based systems.
Software-hardware co-design for fast and scalable training of deep learning recommendation models
- Computer Science, ISCA
- 2022
This paper presents Neo, a software-hardware co-designed system for high-performance distributed training of large-scale DLRMs that employs a novel 4D parallelism strategy combining table-wise, row-wise, column-wise, and data parallelism for training massive embedding operators in DLRMs.
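A toy sketch of the three embedding-sharding axes this summary names (purely illustrative NumPy with made-up table names and sizes; data parallelism would additionally replicate the dense part of the model):

```python
import numpy as np

num_devices = 4
tables = {name: np.random.rand(rows, 64)               # 64-dim embeddings (assumed)
          for name, rows in [("user", 10_000), ("item", 50_000), ("ad", 2_000)]}

# Table-wise: whole tables are assigned round-robin to different devices.
table_wise = {name: idx % num_devices for idx, name in enumerate(tables)}

# Row-wise: one large table's rows (embedding entries) are split across devices.
row_shards = np.array_split(tables["item"], num_devices, axis=0)

# Column-wise: one table's embedding dimension is split across devices.
col_shards = np.array_split(tables["item"], num_devices, axis=1)
```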