Understanding and Resolving Performance Degradation in Deep Graph Convolutional Networks

@article{Zhou2021UnderstandingAR,
  title={Understanding and Resolving Performance Degradation in Deep Graph Convolutional Networks},
  author={Kuangqi Zhou and Yanfei Dong and Kaixin Wang and Wee Sun Lee and Bryan Hooi and Huan Xu and Jiashi Feng},
  journal={Proceedings of the 30th ACM International Conference on Information \& Knowledge Management},
  year={2021}
}
A Graph Convolutional Network (GCN) stacks several layers, each of which performs a PROPagation operation (PROP) and a TRANsformation operation (TRAN) to learn node representations over graph-structured data. Though powerful, GCNs tend to suffer a performance drop when the model gets deep. Previous works focus on PROP to study and mitigate this issue, but the role of TRAN is barely investigated. In this work, we study the performance degradation of GCNs by experimentally examining how…
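
To make the two operations concrete, here is a minimal NumPy sketch of one GCN layer split into PROP and TRAN (function and variable names are illustrative, not the paper's):

import numpy as np

def gcn_layer(A_hat, H, W):
    # A_hat: symmetrically normalized adjacency with self-loops,
    #        D_tilde^{-1/2} (A + I) D_tilde^{-1/2}, shape (n, n)
    # H:     node representations from the previous layer, shape (n, d_in)
    # W:     learnable weight matrix, shape (d_in, d_out)
    H_prop = A_hat @ H            # PROP: aggregate information from each node's neighborhood
    H_tran = H_prop @ W           # TRAN: linear transformation of the aggregated features ...
    return np.maximum(H_tran, 0)  # ... followed by a ReLU nonlinearity

Stacking many such layers repeats both operations; the abstract's question is which of the two is responsible for the degradation as depth grows.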

Citations

Adaptive Aggregation-Transformation Decoupled Graph Convolutional Network for Semi-Supervised Learning

An Adaptive Aggregation-Transformation Decoupled Graph Convolutional Network (AATD-GCN) is proposed, which divides the model into two depths, D_NA and D_FT, and proposes improved approaches for NA and FT, respectively.

Old can be Gold: Better Gradient Flow can Make Vanilla-GCNs Great Again

This paper argues that blindly adopting the Glorot initialization for GCNs is not optimal, derives a topology-aware isometric initialization scheme for vanilla-GCNs based on the principles of isometry, and proposes gradient-guided dynamic rewiring of vanilla-GCNs with skip connections.

Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study

This work presents the first fair and reproducible benchmark dedicated to assessing the tricks for training deep GNNs and demonstrates that an organic combination of initial connection, identity mapping, and group and batch normalization attains new state-of-the-art results for deep GNNs on large datasets.

Residual Network and Embedding Usage: New Tricks of Node Classification with Graph Convolutional Networks

Two novel tricks, named GCN_res Framework and Embedding Usage, are proposed, leveraging residual networks and pre-trained embeddings to improve the baseline's test accuracy on different datasets.

Dirichlet Energy Constrained Learning for Deep Graph Neural Networks

A novel deep GNN framework, Energetic Graph Neural Networks (EGNN), is designed, which provides lower and upper constraints on the Dirichlet energy at each layer to avoid over-smoothing.
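
For reference, a common form of the Dirichlet energy of node embeddings X ∈ R^{n×d} used in this line of work (the exact normalization in EGNN may differ slightly) is, with augmented normalized Laplacian \tilde{\Delta} = I - \tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2} and d_i the degree of node i before adding self-loops,

E(X) = \mathrm{tr}\big(X^{\top}\tilde{\Delta}X\big) = \frac{1}{2}\sum_{i,j}\tilde{A}_{ij}\,\Big\lVert \tfrac{x_i}{\sqrt{1+d_i}} - \tfrac{x_j}{\sqrt{1+d_j}} \Big\rVert_2^2 ,

which shrinks toward zero as embeddings over-smooth; bounding E(X^{(\ell)}) from below (and above) at each layer is what keeps the representations from collapsing.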

Evaluating Deep Graph Neural Networks

The first systematic experimental evaluation is conducted to expose the fundamental limitations of shallow architectures, and Deep Graph Multi-Layer Perceptron (DGMLP) is presented, a powerful approach (a paradigm in its own right) that helps guide deep GNN designs.

Two Sides of the Same Coin: Heterophily and Oversmoothing in Graph Convolutional Neural Networks

This work takes a new unified perspective to understand the performance degradation of GCNs at the node level and shows the effectiveness of two strategies: degree correction, which learns to adjust degree coefficients, and signed messages, which may be useful (under conditions) by learning to optionally negate the messages.

Tuning the Geometry of Graph Neural Networks

It is shown that this aggregation operator is in fact tunable, and explicit regimes are identified in which certain choices of operators, and therefore embedding geometries, are more appropriate.

Graph Masked Autoencoder

Graph Masked Autoencoders (GMAE), a self-supervised model for learning graph representations, is proposed, in which vanilla graph transformers are used as the encoder and the decoder; its asymmetric encoder-decoder design makes GMAE memory-efficient compared with conventional transformers.

References

SHOWING 1-10 OF 33 REFERENCES

DropEdge: Towards Deep Graph Convolutional Networks on Node Classification

DropEdge is a general technique that can be equipped on many backbone models (e.g., GCN, ResGCN, GraphSAGE, and JKNet) and consistently improves performance on a variety of both shallow and deep GCNs.
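
A minimal sketch of the edge-sampling step (names are illustrative; the method also re-normalizes the adjacency after dropping, which is omitted here):

import numpy as np

def drop_edge(edge_index, drop_rate):
    # edge_index: integer array of shape (2, num_edges) listing (source, target) pairs
    # drop_rate:  fraction of edges removed at each training epoch
    num_edges = edge_index.shape[1]
    keep = np.random.rand(num_edges) >= drop_rate  # keep each edge independently with prob 1 - drop_rate
    return edge_index[:, keep]

At inference time the full graph is used; the random sparsification acts only during training, as a data augmenter and message-passing reducer.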

DeeperGCN: All You Need to Train Deeper GCNs

Extensive experiments on Open Graph Benchmark show DeeperGCN significantly boosts performance over the state-of-the-art on the large-scale graph learning tasks of node property prediction and graph property prediction.

Simplifying Graph Convolutional Networks

This paper successively removes nonlinearities and collapses weight matrices between consecutive layers, then theoretically analyzes the resulting linear model and shows that it corresponds to a fixed low-pass filter followed by a linear classifier.
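
Concretely, removing the nonlinearities between K propagation steps collapses the whole model into a single linear map (notation mine, following standard GCN conventions):

\hat{Y} = \mathrm{softmax}\!\big(S^{K} X \Theta\big), \qquad S = \tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}, \quad \tilde{A} = A + I ,

so S^{K} acts as a fixed low-pass filter that can be applied to the features X once as preprocessing, and \Theta is just the weight matrix of a linear (logistic-regression) classifier.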

Simple and Deep Graph Convolutional Networks

GCNII is proposed, an extension of the vanilla GCN model with two simple yet effective techniques, initial residual and identity mapping, which effectively relieve the problem of over-smoothing.
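
Up to notation, the GCNII layer update combining the two techniques is

H^{(\ell+1)} = \sigma\Big( \big( (1-\alpha_\ell)\,\tilde{P} H^{(\ell)} + \alpha_\ell H^{(0)} \big)\big( (1-\beta_\ell) I + \beta_\ell W^{(\ell)} \big) \Big), \qquad \tilde{P} = \tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2},

where the \alpha_\ell H^{(0)} term is the initial residual (a skip connection back to the first-layer representation) and (1-\beta_\ell) I + \beta_\ell W^{(\ell)} is the identity mapping, with \beta_\ell decaying as the layer index grows.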

Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning

It is shown that the graph convolution of the GCN model is actually a special form of Laplacian smoothing, which is the key reason why GCNs work, but it also brings potential concerns of over-smoothing with many convolutional layers.
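The connection can be stated in one line: with self-loop adjacency \tilde{A} = A + I and degree matrix \tilde{D}, the GCN propagation step

Y = \tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2} X = \big(I - \tilde{D}^{-1/2}\tilde{L}\tilde{D}^{-1/2}\big) X, \qquad \tilde{L} = \tilde{D} - \tilde{A},

is a symmetric Laplacian smoothing of the features X, so repeated application drives node representations within a connected component toward indistinguishable values, i.e., over-smoothing.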

PairNorm: Tackling Oversmoothing in GNNs

PairNorm is a novel normalization layer that is based on a careful analysis of the graph convolution operator, which prevents all node embeddings from becoming too similar and significantly boosts performance for a new problem setting that benefits from deeper GNNs.
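
A minimal NumPy sketch of the normalization (the scale hyperparameter s and the variant shown are my reading of the method; variable names are mine):

import numpy as np

def pairnorm(X, s=1.0, eps=1e-12):
    # X: node embeddings of shape (n, d)
    Xc = X - X.mean(axis=0, keepdims=True)                   # center: remove the per-feature mean over nodes
    scale = s / np.sqrt((Xc ** 2).sum(axis=1).mean() + eps)  # rescale so the mean squared row norm is s**2
    return scale * Xc

Centering plus rescaling keeps the total pairwise squared distance between node embeddings roughly constant across layers, which is what prevents them from collapsing onto each other.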

DeepGCNs: Can GCNs Go As Deep As CNNs?

This work presents new ways to successfully train very deep GCNs by borrowing concepts from CNNs, specifically residual/dense connections and dilated convolutions, adapting them to GCN architectures, and building a very deep 56-layer GCN.

On Asymptotic Behaviors of Graph CNNs from Dynamical Systems Perspective

The theory enables us to relate the expressive power of GCNs with the topological information of the underlying graphs inherent in the graph spectra and provides a principled guideline for weight normalization of graph NNs.

Geom-GCN: Geometric Graph Convolutional Networks

The proposed aggregation scheme is permutation-invariant and consists of three modules, node embedding, structural neighborhood, and bi-level aggregation, and an implementation of the scheme in graph convolutional networks, termed Geom-GCN, to perform transductive learning on graphs.

GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning

This work proposes a unified approach in which a fully-connected network is trained jointly with the graph neural network via parameter sharing, interpolation-based regularization, and self-predicted-targets.