# Estimation of Graphlet Statistics

@article{Rossi2017EstimationOG, title={Estimation of Graphlet Statistics}, author={Ryan A. Rossi and R. Zhou and Nesreen Ahmed}, journal={ArXiv}, year={2017}, volume={abs/1701.01772} }

Graphlets are induced subgraphs of a large network and are important for understanding and modeling complex networks. Despite their practical importance, their use has been largely limited to applications and domains with relatively small graphs. Most previous work has focused on exact algorithms; however, it is often too expensive to compute graphlets exactly in massive networks with billions of edges, and an approximate count is sufficient for many applications. In this work…

## 19 Citations

### E-CLoG: Counting edge-centric local graphlets

- Computer Science
- 2017 IEEE International Conference on Big Data (Big Data)
- 2017

Local graphlet counts around an edge are much better features for link prediction than well-known topological features; the experiments show that they improve the AUC by 10% to 45% when predicting future links in three real-life social and collaboration networks.
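To make "local graphlet counts around an edge" concrete, here is a minimal sketch of the simplest case, the 3-node graphlets that contain a given edge. It is not the E-CLoG algorithm; the function name `edge_local_counts` and the adjacency-dictionary input are assumptions made for illustration only.

```python
def edge_local_counts(adj, u, v):
    """Count the 3-node induced graphlets that contain edge (u, v).

    adj is a hypothetical input format: a dict mapping each node to the
    set of its neighbours in an undirected simple graph.
    """
    if v not in adj[u] or u not in adj[v]:
        raise ValueError("(u, v) is not an edge of the graph")

    # A triangle on (u, v) is closed by any common neighbour of u and v.
    triangles = len(adj[u] & adj[v])

    # An open wedge (2-path) on (u, v) uses a third node adjacent to
    # exactly one endpoint: (deg(u) - 1 - T) + (deg(v) - 1 - T).
    open_wedges = (len(adj[u]) - 1) + (len(adj[v]) - 1) - 2 * triangles

    return {"triangles": triangles, "open_wedges": open_wedges}


# Small example: a triangle 0-1-2 with a pendant node 3 attached to node 1.
example_adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
counts_01 = edge_local_counts(example_adj, 0, 1)
counts_13 = edge_local_counts(example_adj, 1, 3)
```

Counts like these could be stacked into a per-edge feature vector; larger graphlets need more bookkeeping than this sketch shows.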

### SNOD: a fast sampling method of exploring node orbit degrees for large graphs

- Computer Science
- Knowledge and Information Systems
- 2018

A novel sampling method, SNOD, is proposed to efficiently estimate node orbit degrees for large-scale graphs and to quantify the error of the estimates; the method is shown to be several orders of magnitude faster than state-of-the-art enumeration methods at accurately estimating node orbit degrees for graphs with millions of edges.

### Higher-order Spectral Clustering for Heterogeneous Graphs

- Computer Science
- ArXiv
- 2018

This work develops a general principled framework for higher-order clustering in heterogeneous networks using typed-graphlets as a basis and provides mathematical guarantees on the optimality of the higher-order clustering obtained.

### Higher-Order Clustering for Heterogeneous Networks via Typed Motifs

- Computer Science
- 2018

This work develops a general principled framework for higher-order clustering in heterogeneous networks using typed-graphlets as a basis and provides mathematical guarantees on the optimality of the higher-order clustering obtained.

### A Survey on Subgraph Counting

- Computer Science
- ACM Comput. Surv.
- 2021

This survey aims to provide a comprehensive overview of the existing methods for subgraph counting, identifying and describing the main conceptual approaches, giving insight on their advantages and limitations, and providing pointers to existing implementations.

### Graph Classification using Structural Attention

- Computer Science
- KDD
- 2018

This work presents a novel RNN model, called the Graph Attention Model (GAM), that processes only a portion of the graph by adaptively selecting a sequence of "informative" nodes, and shows that the proposed method is competitive against various well-known methods in graph classification.

### Deep Graph Attention Model

- Computer Science
- ArXiv
- 2017

This work presents a novel RNN model, called the Graph Attention Model (GAM), that processes only a portion of the graph by adaptively selecting a sequence of "interesting" nodes, and is equipped with an external memory component which allows it to integrate information gathered from different parts of the graph.

### A Framework for Generalizing Graph-based Representation Learning Methods

- Computer Science, Mathematics
- ArXiv
- 2017

This work introduces the notion of attributed random walks which serves as a basis for generalizing existing methods such as DeepWalk, node2vec, and many others that leverage random walks and enables these methods to be more widely applicable for both transductive and inductive learning as well as for use on graphs with attributes.

### Local Algorithms for Hierarchical Dense Subgraph Discovery

- Computer Science
- Proc. VLDB Endow.
- 2018

This work presents a framework of local algorithms for obtaining the core, truss, and nucleus decompositions; the algorithms are parallel, offer high scalability, and enable approximations that trade off time against quality.

## References

### Efficient Graphlet Counting for Large Networks

- Computer Science
- 2015 IEEE International Conference on Data Mining
- 2015

This paper proposes a fast, efficient, and parallel algorithm for counting graphlets of size k = {3, 4} nodes that takes only a fraction of the time required by current methods and is on average 460x faster.

### Graft: An Efficient Graphlet Counting Method for Large Graph Analysis

- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2014

This work uses the graphlet frequency distribution (GFD) as an analysis tool for understanding the variance of local topological structure in a network, and shows that it can help in comparing and characterizing real-life networks.

### Efficient graphlet kernels for large graph comparison

- Computer Science
- AISTATS
- 2009

In this article, two theoretically grounded speedup schemes are introduced, one based on sampling and the second specifically designed for bounded degree graphs, to efficiently compare large graphs that cannot be tackled by existing graph kernels.

### Graph sample and hold: a framework for big-graph analytics

- Computer Science, Mathematics
- KDD
- 2014

This work proposes Graph Sample and Hold (gSH), a generic stream sampling framework for big-graph analytics that samples from massive graphs sequentially in a single pass, one edge at a time, while maintaining a small state in memory.

### GUISE: a uniform sampler for constructing frequency histogram of graphlets

- Computer Science
- Knowledge and Information Systems
- 2013

This paper proposes Guise, which uses a Markov Chain Monte Carlo sampling method to construct the approximate GFD of a large network, and shows that Guise obtains the GFD with a very low error rate within a few minutes, whereas the exhaustive counting-based approach takes several days.

### Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts

- Computer Science
- WWW
- 2015

A sampling algorithm that provably and accurately approximates the frequencies of all 4-vertex pattern subgraphs is provided, based on a novel technique of 3-path sampling and a special pruning scheme to decrease the variance in estimates.

### Graphlet-based measures are suitable for biological network comparison

- Computer Science
- Bioinform.
- 2013

It is demonstrated that it is the model networks themselves that are 'unstable' at low edge density and that graphlet-based measures correctly reflect this instability, and that PPI networks of many species are well-fit by several models not previously tested.

### DOULION: counting triangles in massive graphs with a coin

- Computer Science
- KDD
- 2009

A practical method from which all triangle counting algorithms can potentially benefit is proposed; it achieves high accuracy, typically more than 99%, and gives significant speedups of up to ≈ 130 times.
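The "coin" in the title refers to edge sparsification: keep each edge independently with probability p, count triangles in the smaller graph with any exact counter, and scale the result by 1/p³, since a triangle survives only if all three of its edges do. The following is a minimal sketch of that idea, not the authors' implementation; the function names, the edge-list input, and the simple exact counter are assumptions chosen for illustration.

```python
import random


def count_triangles(edges):
    """Exact triangle count for an undirected simple graph given as an edge list."""
    adj = {}
    unique_edges = set()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
        unique_edges.add((u, v) if u < v else (v, u))
    # Each triangle is seen once per edge, so divide the sum by 3.
    return sum(len(adj[u] & adj[v]) for u, v in unique_edges) // 3


def sparsified_triangle_estimate(edges, p, rng=None):
    """Estimate the triangle count from a coin-flip sample of the edges.

    p is the probability of keeping each edge; rng is an optional
    random.Random instance so that runs can be reproduced.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    rng = rng or random.Random()
    kept = [e for e in edges if rng.random() < p]
    # A triangle survives with probability p**3, so rescale by 1 / p**3.
    return count_triangles(kept) / p**3


# Complete graph on 8 nodes: C(8, 3) = 56 triangles.
k8_edges = [(i, j) for i in range(8) for j in range(i + 1, 8)]
exact = count_triangles(k8_edges)
estimate = sparsified_triangle_estimate(k8_edges, p=0.5, rng=random.Random(0))
```

A single estimate can be far from the true count on a graph this small; the estimator is unbiased, so its average over many independent runs approaches the exact value. Smaller p gives a bigger speedup at the cost of higher variance.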

### Graph Kernels

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2007

A unified framework to study graph kernels is presented, and a kernel that is close to the optimal assignment kernel of Frohlich et al. (2006) yet provably positive semi-definite is provided.