# Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods

@inproceedings{Lim2021LargeSL, title={Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods}, author={Derek Lim and Felix Hohne and Xiuyu Li and Sijia Huang and Vaishnavi Gupta and Omkar Bhalerao and Ser-Nam Lim}, booktitle={NeurIPS}, year={2021} }

Many widely used datasets for graph machine learning tasks have generally been homophilous, where nodes with similar labels connect to each other. Recently, new Graph Neural Networks (GNNs) have been developed that move beyond the homophily regime; however, their evaluation has often been conducted on small graphs with limited application domains. We collect and introduce diverse non-homophilous datasets from a variety of application areas that have up to 384x more nodes and 1398x more edges…
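The homophily the abstract refers to is often quantified by the edge homophily ratio: the fraction of edges whose endpoints share a label. A minimal sketch of that measure (the function name and toy graph are illustrative, not from the paper):

```python
import numpy as np

def edge_homophily(edge_index, labels):
    """Fraction of edges whose two endpoints have the same label.

    edge_index: (2, E) array of (source, destination) node indices
    labels:     (N,) array of integer node labels
    """
    src, dst = edge_index
    return float(np.mean(labels[src] == labels[dst]))

# Toy path graph 0-1-2-3 with labels [0, 0, 1, 1]:
# edges (0,1) and (2,3) are same-label, edge (1,2) is not.
edges = np.array([[0, 1, 2],
                  [1, 2, 3]])
labels = np.array([0, 0, 1, 1])
print(edge_homophily(edges, labels))  # 0.666...
```

A ratio near 1 indicates a homophilous graph; the datasets introduced by this paper sit well below that.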


## 6 Citations

Graph Neural Networks for Graphs with Heterophily: A Survey

- Computer Science, ArXiv
- 2022

Proposes a systematic taxonomy that organizes existing heterophilic GNN models, together with a general summary and detailed analysis, to facilitate robust and fair evaluation of these graph neural networks.

Unsupervised Heterophilous Network Embedding via r-Ego Network Discrimination

- Computer Science, ArXiv
- 2022

Presents the first empirical study of how the homophily ratio affects existing unsupervised network embedding (NE) methods, reveals their limitations, and develops a SELf-supErvised Network Embedding (Selene) framework for learning useful node representations on both homophilous and heterophilous networks.

Augmentation-Free Graph Contrastive Learning

- Computer Science, ArXiv
- 2022

Proposes AF-GCL, a novel, theoretically principled, augmentation-free graph contrastive learning (GCL) method that constructs its self-supervision signal from features aggregated by a graph neural network rather than from augmentations, and is therefore less sensitive to the graph's homophily degree.

Efficient Representation Learning of Subgraphs by Subgraph-To-Node Translation

- Computer Science, ArXiv
- 2022

This work proposes Subgraph-To-Node (S2N) translation, a novel formulation for efficiently learning representations of subgraphs, and demonstrates that models with S2N translation are more efficient than state-of-the-art models without a substantial decrease in performance.

Finding Global Homophily in Graph Neural Networks When Meeting Heterophily

- Computer Science, ArXiv
- 2022

We investigate graph neural networks on graphs with heterophily. Some existing methods amplify a node's neighborhood with multi-hop neighbors to include more nodes with homophily. However, it is a…

Two Sides of the Same Coin: Heterophily and Oversmoothing in Graph Convolutional Neural Networks

- Computer Science, ArXiv
- 2021

This work theoretically characterizes the connections between heterophily and oversmoothing, and designs a model that addresses discrepancies in features and degrees between neighbors by incorporating signed messages and learned degree corrections.

## References

Showing 1–10 of 91 references

Inductive Representation Learning on Large Graphs

- Computer Science, NIPS
- 2017

GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.
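GraphSAGE's core idea is to build a node's embedding from its own features plus an aggregation of its neighbors' features. A minimal sketch of one layer with the mean aggregator (the function name, weight shapes, and toy inputs are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def sage_mean_layer(X, neighbors, W_self, W_neigh):
    """One GraphSAGE-style layer with mean aggregation (sketch).

    X:         (N, F) node feature matrix
    neighbors: list where neighbors[v] holds v's neighbor indices
    Each node combines its own features with the mean of its neighbors'.
    """
    N, F = X.shape
    out = np.zeros((N, W_self.shape[1]))
    for v in range(N):
        nbrs = neighbors[v]
        agg = X[nbrs].mean(axis=0) if nbrs else np.zeros(F)
        out[v] = X[v] @ W_self + agg @ W_neigh
    return np.maximum(out, 0.0)  # ReLU nonlinearity

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
neighbors = [[1], [0, 2], [1, 3], [2]]  # a 4-node path graph
H = sage_mean_layer(X, neighbors, rng.normal(size=(3, 2)), rng.normal(size=(3, 2)))
print(H.shape)  # (4, 2)
```

Because the aggregation depends only on local neighborhoods, the same learned weights apply to nodes unseen at training time, which is what makes the framework inductive.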

Pitfalls of Graph Neural Network Evaluation

- Computer Science, ArXiv
- 2018

This paper performs a thorough empirical evaluation of four prominent GNN models and suggests that simpler GNN architectures are able to outperform the more sophisticated ones if the hyperparameters and the training procedure are tuned fairly for all models.

Representation Learning on Graphs with Jumping Knowledge Networks

- Computer Science, ICML
- 2018

This work explores an architecture -- jumping knowledge (JK) networks -- that flexibly leverages, for each node, different neighborhood ranges to enable better structure-aware representation in graphs.

GraphSAINT: Graph Sampling Based Inductive Learning Method

- Computer Science, ICLR
- 2020

Proposes GraphSAINT, a graph-sampling-based inductive learning method that improves training efficiency in a fundamentally different way by decoupling the sampling process from the forward and backward propagation of training; GraphSAINT is further extended with other graph samplers and GCN variants.

How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision

- Computer Science, ICLR
- 2021

Proposes SuperGAT, a self-supervised graph attention network improved for noisy graphs, along with a design recipe that generalizes across 15 datasets; models designed by this recipe show improved performance over baselines.

Predict then Propagate: Graph Neural Networks meet Personalized PageRank

- Computer Science, ICLR
- 2019

This paper uses the relationship between graph convolutional networks (GCN) and PageRank to derive an improved propagation scheme based on personalized PageRank, and constructs a simple model, personalized propagation of neural predictions (PPNP), and its fast approximation, APPNP.
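The APPNP propagation step can be written as a short power iteration: predictions from a base network are repeatedly diffused over the normalized adjacency while a teleport term pulls them back toward the original predictions. A minimal sketch under the usual symmetric normalization (function names and the toy graph are illustrative):

```python
import numpy as np

def sym_norm_adj(A):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A+I) D^-1/2."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def appnp_propagate(A_hat, H, alpha=0.1, K=10):
    """APPNP-style approximate personalized-PageRank propagation.

    Starting from Z = H (the base model's predictions), iterate
    Z <- (1 - alpha) * A_hat @ Z + alpha * H for K steps.
    alpha is the teleport (restart) probability.
    """
    Z = H.copy()
    for _ in range(K):
        Z = (1 - alpha) * (A_hat @ Z) + alpha * H
    return Z

# Tiny 4-node path graph with 2-class base predictions
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 2.0]])
Z = appnp_propagate(sym_norm_adj(A), H, alpha=0.1, K=10)
print(Z.shape)  # (4, 2)
```

Setting alpha=1 recovers the base predictions unchanged, while small alpha diffuses information over many hops without adding any trainable parameters to the propagation.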

Combining Label Propagation and Simple Models Out-performs Graph Neural Networks

- Computer Science, ICLR
- 2021

This work shows that for many standard transductive node classification benchmarks, it can exceed or match the performance of state-of-the-art GNNs by combining shallow models that ignore the graph structure with two simple post-processing steps that exploit correlation in the label structure.
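One of the two post-processing steps described above ("smooth") is essentially label propagation applied to the shallow model's predictions. A minimal sketch, assuming a row-normalized adjacency and an initial matrix G that mixes soft predictions with true labels on training nodes (names and the toy graph are illustrative, not the paper's exact formulation):

```python
import numpy as np

def smooth_step(A_hat, G, alpha=0.8, K=20):
    """Label-propagation-style smoothing of predictions.

    Starting from Z = G, iterate Z <- (1 - alpha) * G + alpha * A_hat @ Z,
    diffusing predictions along edges while anchoring them to G.
    """
    Z = G.copy()
    for _ in range(K):
        Z = (1 - alpha) * G + alpha * (A_hat @ Z)
    return Z

A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
A_hat = A / A.sum(axis=1, keepdims=True)  # row-normalized adjacency
G = np.array([[1.0, 0.0],   # train node with true label 0
              [0.6, 0.4],   # base-model soft predictions
              [0.5, 0.5],
              [0.0, 1.0]])  # train node with true label 1
Z = smooth_step(A_hat, G)
print(Z.shape)  # (4, 2)
```

Because A_hat is row-stochastic and the rows of G sum to one, each iteration preserves the property that every row of Z remains a valid probability distribution.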

Decoupled Smoothing on Graphs

- Computer Science, WWW
- 2019

This work presents an approach to graph smoothing that decouples the notions of "identity" and "preference," capturing an alternative social phenomenon of monophily whereby individuals are similar to "the company they're kept in," as observed in recent empirical work.

Graph Convolutional Neural Networks for Web-Scale Recommender Systems

- Computer Science, KDD
- 2018

A novel method based on highly efficient random walks to structure the convolutions and a novel training strategy that relies on harder-and-harder training examples to improve robustness and convergence of the model are developed.

Semi-Supervised Learning with Heterophily

- Computer Science, ArXiv
- 2014

We derive a family of linear inference algorithms that generalize existing graph-based label propagation algorithms by allowing them to propagate generalized assumptions about "attraction" or…