# N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules

@inproceedings{Liu2019NGramGS, title={N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules}, author={Shengchao Liu and Mehmet F. Demirel and Yingyu Liang}, booktitle={NeurIPS}, year={2019} }

Machine learning techniques have recently been adopted in various applications in medicine, biology, chemistry, and material engineering. An important task is to predict the properties of molecules, which serves as the main subroutine in many downstream applications such as virtual screening and drug design. Despite the increasing interest, the key challenge is to construct proper representations of molecules for learning algorithms. This paper introduces the N-gram graph, a simple unsupervised…

## 50 Citations

MolCLR: Molecular Contrastive Learning of Representations via Graph Neural Networks

- Computer Science
- ArXiv
- 2021

This work presents MolCLR: Molecular Contrastive Learning of Representations via Graph Neural Networks (GNNs), a self-supervised learning framework for large unlabeled molecule datasets and proposes three novel molecule graph augmentations: atom masking, bond deletion, and subgraph removal.

Do Large Scale Molecular Language Representations Capture Important Structural Information?

- Computer Science
- ArXiv
- 2021

Experiments show that the learned molecular representation, MOLFORMER, performs competitively, when compared to existing graph-based and fingerprint-based supervised learning baselines, on the challenging tasks of predicting properties of QM8 and QM9 molecules.

Molecular contrastive learning of representations via graph neural networks

- Computer Science
- Nature Machine Intelligence
- 2022

Experiments show that the MolCLR framework significantly improves the performance of graph-neural-network encoders on various molecular property benchmarks including both classification and regression tasks and achieves state of the art on several challenging benchmarks after fine-tuning.

ChemRL-GEM: Geometry Enhanced Molecular Representation Learning for Property Prediction

- Computer Science
- ArXiv
- 2021

A novel Geometry Enhanced Molecular representation learning method (GEM) for Chemical Representation Learning (ChemRL) that combines geometry-based GNN architecture with several novel geometry-level self-supervised learning strategies to learn spatial knowledge by utilizing the local and global molecular 3D structures.

FraGAT: a fragment-oriented multi-scale graph attention model for molecular property prediction

- Chemistry
- Bioinform.
- 2021

A definition of molecule graph fragments that may be or contain functional groups, which are relevant to molecular properties, is proposed; a fragment-oriented multi-scale graph attention network for molecular property prediction, called FraGAT, is then developed.

Deep Graph Learning: Foundations, Advances and Applications

- Computer Science
- KDD
- 2020

This tutorial aims to provide a comprehensive introduction to deep graph learning and introduces the applications of DGL towards various domains, including but not limited to drug discovery, computer vision, medical image analysis, social network analysis, natural language processing and recommendation.

Learning Attributed Graph Representations with Communicative Message Passing Transformer

- Computer Science
- IJCAI
- 2021

A Communicative Message Passing Transformer (CoMPT) neural network is proposed to improve the molecular graph representation by reinforcing message interactions between nodes and edges based on the Transformer architecture to leverage the graph connectivity inductive bias and reduce the message enrichment explosion.

Multi-View Graph Neural Networks for Molecular Property Prediction

- Computer Science
- 2020

This work presents Multi-View Graph Neural Network (MV-GNN), a multi-view message passing architecture to enable more accurate predictions of molecular properties, and boosts the expressive power of MV-GNN by proposing a cross-dependent message passing scheme that enhances information communication between the two views.

Geometry-enhanced molecular representation learning for property prediction

- Computer Science
- Nature Machine Intelligence
- 2022

This work proposes a novel geometry-enhanced molecular representation learning method (GEM), which has a specially designed geometry-based graph neural network architecture as well as several dedicated geometry-level self-supervised learning strategies to learn the molecular geometry knowledge.

Communicative Representation Learning on Attributed Molecular Graphs

- Computer Science
- IJCAI
- 2020

A Communicative Message Passing Neural Network (CMPNN) is proposed to improve the molecular embedding by strengthening the message interactions between nodes and edges through a communicative kernel.

## References

Representation Learning on Graphs: Methods and Applications

- Computer Science
- IEEE Data Eng. Bull.
- 2017

A conceptual review of key advancements in representation learning on graphs, including matrix factorization-based methods, random-walk based algorithms, and graph neural networks, is provided.

How Powerful are Graph Neural Networks?

- Computer Science
- ICLR
- 2019

This work characterizes the discriminative power of popular GNN variants, such as Graph Convolutional Networks and GraphSAGE, shows that they cannot learn to distinguish certain simple graph structures, and develops a simple architecture that is provably the most expressive among the class of GNNs.

Gated Graph Sequence Neural Networks

- Computer Science
- ICLR
- 2016

This work studies feature learning techniques for graph-structured inputs and achieves state-of-the-art performance on a problem from program verification, in which subgraphs need to be matched to abstract data structures.

Hierarchical Graph Representation Learning with Differentiable Pooling

- Computer Science
- NeurIPS
- 2018

DiffPool is proposed, a differentiable graph pooling module that can generate hierarchical representations of graphs and can be combined with various graph neural network architectures in an end-to-end fashion.

A Comprehensive Survey on Graph Neural Networks

- Computer Science
- IEEE Transactions on Neural Networks and Learning Systems
- 2019

This article provides a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields and proposes a new taxonomy to divide the state-of-the-art GNNs into four categories, namely, recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs.

node2vec: Scalable Feature Learning for Networks

- Computer Science
- KDD
- 2016

In node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks, a flexible notion of a node's network neighborhood is defined and a biased random walk procedure is designed, which efficiently explores diverse neighborhoods.

Weisfeiler-Lehman Graph Kernels

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2011

A family of efficient kernels for large graphs with discrete node labels is proposed, based on the Weisfeiler-Lehman test of isomorphism on graphs; these kernels outperform state-of-the-art graph kernels on several graph classification benchmark data sets in terms of accuracy and runtime.

Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules

- Computer Science
- ACS Central Science
- 2018

We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration…

Molecular graph convolutions: moving beyond fingerprints

- Computer Science
- Journal of Computer-Aided Molecular Design
- 2016

Molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules, are described; they represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.