Corpus ID: 1344273

Estimation of Graphlet Statistics

@article{Rossi2017EstimationOG,
  title={Estimation of Graphlet Statistics},
  author={Ryan A. Rossi and R. Zhou and Nesreen Ahmed},
  journal={ArXiv},
  year={2017},
  volume={abs/1701.01772}
}
Graphlets are induced subgraphs of a large network and are important for understanding and modeling complex networks. Despite their practical importance, graphlets have been severely limited to applications and domains with relatively small graphs. Most previous work has focused on exact algorithms; however, it is often too expensive to compute graphlets exactly in massive networks with billions of edges, and an approximate count is sufficient for many applications. In this work…
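To make the exact-versus-approximate trade-off concrete, here is a minimal sketch of estimating the count of the simplest graphlet, the triangle, from a uniform sample of edges. It is not the estimator from the paper, whose details are truncated above; the function names (`build_adjacency`, `exact_triangles`, `estimate_triangles`) and the uniform edge-sampling scheme are assumptions made for illustration. The input is assumed to be a list of unique undirected edges with no self-loops.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Map each node to the set of its neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def exact_triangles(edges):
    """Exact count: every triangle is seen once per edge, hence the division by 3."""
    adj = build_adjacency(edges)
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def estimate_triangles(edges, sample_size, seed=0):
    """Estimate the triangle count from a uniform sample of edges.

    Illustrative assumption, not the paper's method: count triangles only
    around the sampled edges, then scale by len(edges) / sample_size.
    """
    adj = build_adjacency(edges)
    rng = random.Random(seed)
    sampled = rng.sample(edges, sample_size)
    per_edge = sum(len(adj[u] & adj[v]) for u, v in sampled)
    return (len(edges) / sample_size) * per_edge / 3


# Complete graph on 5 nodes: 10 edges and C(5, 3) = 10 triangles.
k5_edges = list(combinations(range(5), 2))
exact = exact_triangles(k5_edges)
approx = estimate_triangles(k5_edges, sample_size=4)
```

Because every edge of a complete graph lies in the same number of triangles, the estimate matches the exact count here for any sample. On real networks the estimate varies with the sample, and the sample size controls the trade-off between cost and accuracy.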

E-CLoG: Counting edge-centric local graphlets

Local graphlet counts around an edge are much better features for link prediction than well-known topological features; the experiments show that they yield an improvement of 10% to 45% in the AUC value for predicting future links in three real-life social and collaboration networks.

SNOD: a fast sampling method of exploring node orbit degrees for large graphs

A novel sampling method, SNOD, is proposed to efficiently estimate node orbit degrees for large-scale graphs and to quantify the error of the estimates; the method is shown to be several orders of magnitude faster than state-of-the-art enumeration methods while accurately estimating node orbit degrees for graphs with millions of edges.

Higher-order Spectral Clustering for Heterogeneous Graphs

This work develops a general principled framework for higher-order clustering in heterogeneous networks using typed-graphlets as a basis and provides mathematical guarantees on the optimality of the higher-order clustering obtained.

Higher-Order Clustering for Heterogeneous Networks via Typed Motifs

This work develops a general principled framework for higher-order clustering in heterogeneous networks using typed-graphlets as a basis and provides mathematical guarantees on the optimality of the higher-order clustering obtained.

A Survey on Subgraph Counting

This survey aims to provide a comprehensive overview of the existing methods for subgraph counting, identifying and describing the main conceptual approaches, giving insight on their advantages and limitations, and providing pointers to existing implementations.

Graph Classification using Structural Attention

This work presents a novel RNN model, called the Graph Attention Model (GAM), that processes only a portion of the graph by adaptively selecting a sequence of "informative" nodes, and shows that the proposed method is competitive against various well-known methods in graph classification.

Deep Graph Attention Model

This work presents a novel RNN model, called the Graph Attention Model (GAM), that processes only a portion of the graph by adaptively selecting a sequence of "interesting" nodes, and is equipped with an external memory component which allows it to integrate information gathered from different parts of the graph.

Feature selection and learning for graphlet kernel

A Framework for Generalizing Graph-based Representation Learning Methods

This work introduces the notion of attributed random walks which serves as a basis for generalizing existing methods such as DeepWalk, node2vec, and many others that leverage random walks and enables these methods to be more widely applicable for both transductive and inductive learning as well as for use on graphs with attributes.

Local Algorithms for Hierarchical Dense Subgraph Discovery

A framework of local algorithms is presented for obtaining the core, truss, and nucleus decompositions; the algorithms are parallel, highly scalable, and enable approximations that trade off time against quality.

References

Efficient Graphlet Counting for Large Networks

This paper proposes a fast, efficient, and parallel algorithm for counting graphlets of size k ∈ {3,4} nodes that takes only a fraction of the time required by current methods and is on average 460x faster.

RAGE - A rapid graphlet enumerator for large networks

Graft: An Efficient Graphlet Counting Method for Large Graph Analysis

This work uses graphlet frequency distribution (GFD) as an analysis tool for understanding the variance of local topological structure in a network; GFD is shown to help in comparing and characterizing real-life networks.

Efficient graphlet kernels for large graph comparison

In this article, two theoretically grounded speedup schemes are introduced, one based on sampling and the second specifically designed for bounded degree graphs, to efficiently compare large graphs that cannot be tackled by existing graph kernels.

Graph sample and hold: a framework for big-graph analytics

Graph Sample and Hold (gSH), a generic stream sampling framework for big-graph analytics, is proposed; it samples from massive graphs sequentially in a single pass, one edge at a time, while maintaining a small state in memory.

GUISE: a uniform sampler for constructing frequency histogram of graphlets

This paper proposes Guise, which uses a Markov Chain Monte Carlo sampling method for constructing the approximate GFD of a large network, and shows that Guise obtains the GFD with a very low error rate within a few minutes, whereas the exhaustive counting-based approach takes several days.

Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts

A sampling algorithm that provably and accurately approximates the frequencies of all 4-vertex pattern subgraphs is provided, based on a novel technique of 3-path sampling and a special pruning scheme to decrease the variance in estimates.

Graphlet-based measures are suitable for biological network comparison

It is demonstrated that it is the model networks themselves that are 'unstable' at low edge density and that graphlet-based measures correctly reflect this instability, and that PPI networks of many species are well-fit by several models not previously tested.

DOULION: counting triangles in massive graphs with a coin

A practical method from which all triangle counting algorithms can potentially benefit is proposed; it achieves high accuracy, typically more than 99%, and gives significant speedups of up to ≈ 130 times.

Graph Kernels

A unified framework to study graph kernels is presented, and a kernel that is close to the optimal assignment kernel of Frohlich et al. (2006) yet provably positive semi-definite is provided.