# Efficiently Counting Vertex Orbits of All 5-vertex Subgraphs, by EVOKE

@article{Pashanasangi2020EfficientlyCV,
title={Efficiently Counting Vertex Orbits of All 5-vertex Subgraphs, by EVOKE},
journal={Proceedings of the 13th International Conference on Web Search and Data Mining},
year={2020}
}
• Published 24 November 2019
• Computer Science
• Proceedings of the 13th International Conference on Web Search and Data Mining
Subgraph counting is a fundamental task in network analysis. Typically, algorithmic work is on total counting, where we wish to count the total frequency of a (small) pattern subgraph in a large input data set. But many applications require local counts (also called vertex orbit counts) wherein, for every vertex v of the input graph, one needs the count of the pattern subgraph involving v. This provides a rich set of vertex features that can be used in machine learning tasks, especially…
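To make the distinction between total and local counts concrete, here is a minimal sketch for the simplest pattern, the triangle. It is not the EVOKE algorithm; the function name `local_triangle_counts` and the adjacency-dict input format are assumptions chosen for illustration.

```python
from itertools import combinations


def local_triangle_counts(adj):
    """Return {v: number of triangles containing v}.

    `adj` is assumed to map each vertex of an undirected simple graph
    to the set of its neighbours (hypothetical input format).
    """
    counts = {v: 0 for v in adj}
    for u in adj:
        # Every adjacent pair of neighbours of u closes a triangle through u.
        for v, w in combinations(adj[u], 2):
            if w in adj[v]:
                counts[u] += 1
    return counts


# Triangle {0, 1, 2} with a pendant vertex 3 attached to vertex 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
local = local_triangle_counts(adj)
# Each triangle is seen once from each of its three vertices.
total = sum(local.values()) // 3
```

The per-vertex dictionary is the local count; the total count is recovered by summing and dividing by three, since every triangle contributes to three vertices.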

## Figures and Tables from this paper

Distributed subgraph counting
• Computer Science, Mathematics
VLDB 2020
• 2020
This work develops a new general approach to count any k pattern graphs with any orbits selected by homomorphism counting, which can be solved by relational algebra using joins, group-by and aggregation.
Distributed Subgraph Counting: A General Approach
• Computer Science
Proc. VLDB Endow.
• 2020
This work develops a new general approach to count any k pattern graphs with any orbits selected by homomorphism counting, which can be solved by relational algebra using joins, group-by and aggregation.
BFS based distributed algorithm for parallel local directed sub-graph enumeration
• Computer Science
• 2022
VDMC (Vertex specific Distributed Motif Counting), a fully distributed algorithm that optimally counts all connected directed 3- and 4-vertex subgraphs (subgraph motifs) associated with each vertex of a graph, is proposed, and its efficacy is linear in the number of counted motifs.
Lightning Fast and Space Efficient k-clique Counting
• Computer Science
WWW
• 2022
This work develops two novel dynamic programming based k-color set sampling techniques to efficiently estimate the k-clique counts, where a k-color set contains k nodes with k different colors, which are extremely efficient and accurate.
Faster and Generalized Temporal Triangle Counting, via Degeneracy Ordering
• Computer Science, Mathematics
KDD
• 2021
A new algorithm, DOTTT (Degeneracy Oriented Temporal Triangle Totaler), that exactly counts all directed variants of (δ1,3, δ2,3), and it is proved that DOTTT runs in O(mκ log m) time, where m is the number of (temporal) edges and κ is the graph degeneracy (max core number).
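The degeneracy κ in the bound above can be computed by repeatedly removing a minimum-degree vertex. The sketch below is a simple quadratic-time illustration of that definition, not the DOTTT algorithm; the function name `degeneracy` and the adjacency-dict input are assumptions.

```python
def degeneracy(adj):
    """Return (kappa, order) for an undirected simple graph.

    `adj` is assumed to map each vertex to the set of its neighbours.
    `order` is a degeneracy ordering: each vertex has at most `kappa`
    neighbours that appear later in the ordering.
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    kappa = 0
    order = []
    while remaining:
        # Peel off a vertex of minimum remaining degree.
        v = min(remaining, key=lambda x: len(remaining[x]))
        kappa = max(kappa, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        order.append(v)
    return kappa, order
```

A bucket-based implementation brings this down to linear time; the version here favours readability.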
MaNIACS: Approximate Mining of Frequent Subgraph Patterns through Sampling
• Computer Science
KDD
• 2021
MaNIACS leverages properties of the MNI-frequency to aggressively prune the pattern search space, and thus to reduce the time spent in exploring subspaces containing no frequent patterns, and returns high-quality collections of frequent patterns in large graphs up to two orders of magnitude faster than the exact algorithm.
odeN: Simultaneous Approximation of Multiple Motif Counts in Large Temporal Networks
• Computer Science
CIKM
• 2021
odeN is a sampling-based algorithm that provides an accurate approximation of all the counts of the motifs in temporal networks in a fraction of the time needed by state-of-the-art methods, and also reports more accurate approximations than such methods.
Counting five-node subgraphs
This work proves the main result that induced subgraph counts follow as linear combinations of non-induced counts, using short and purely combinatorial arguments that can be adapted to derive count formulae for larger subgraphs.
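The same principle is easy to check by hand for 3-vertex patterns, where the number of induced wedges (2-paths) equals the number of non-induced wedges minus three times the number of triangles. The sketch below covers only this small case, as an illustration rather than the five-node formulae of the paper; the function name `three_vertex_counts` is an assumption.

```python
from itertools import combinations
from math import comb


def three_vertex_counts(adj):
    """Return (non_induced_wedges, triangles, induced_wedges).

    `adj` is assumed to map each vertex of an undirected simple graph
    to the set of its neighbours.
    """
    # A non-induced wedge is any pair of edges sharing a centre vertex.
    non_induced_wedges = sum(comb(len(nbrs), 2) for nbrs in adj.values())
    closed = 0
    for u in adj:
        for v, w in combinations(adj[u], 2):
            if w in adj[v]:
                closed += 1  # wedge centred at u whose endpoints are adjacent
    triangles = closed // 3  # each triangle is seen from its 3 vertices
    # Each triangle contains exactly 3 non-induced wedges.
    induced_wedges = non_induced_wedges - 3 * triangles
    return non_induced_wedges, triangles, induced_wedges
```

For a triangle {0, 1, 2} with a pendant vertex 3 attached to vertex 2, the two induced wedges are 0-2-3 and 1-2-3.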
Near-Linear Time Homomorphism Counting in Bounded Degeneracy Graphs: The Barrier of Long Induced Cycles
• Mathematics
SODA
• 2021
The following are proved: if the largest induced cycle in H has length at most $5$, then there is an $O(m \log m)$ algorithm for counting H-homomorphisms in bounded degeneracy graphs, and there is a constant $\gamma > 0$ such that there is no $o(m^{1+\gamma})$ time algorithm.
Motif-based spectral clustering of weighted directed networks
• Computer Science
Appl. Netw. Sci.
• 2020
It is concluded that motif-based spectral clustering is a valuable tool for analysis of directed and bipartite weighted networks, which is also scalable and easy to implement.

## References

ESCAPE: Efficiently Counting All 5-Vertex Subgraphs
• Computer Science
WWW
• 2017
It is proved that it suffices to enumerate only four specific subgraphs (three of them have fewer than 5 vertices) to exactly count all 5-vertex patterns. The result is the first practical algorithm for 5-vertex pattern counting that runs at this scale, and it is able to compute counts of graphs with tens of millions of edges in minutes on a commodity machine.
Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts
• Computer Science
WWW
• 2015
A sampling algorithm that provably and accurately approximates the frequencies of all 4-vertex pattern subgraphs is provided, based on a novel technique of 3-path sampling and a special pruning scheme to decrease the variance in estimates.
Scalable Subgraph Counting: The Methods Behind The Madness
• Computer Science
WWW
• 2019
This tutorial will also cover methods for subgraph analysis on “big data” computational models such as the streaming model and models of parallel and distributed computation.
Efficient semi-streaming algorithms for local triangle counting in massive graphs
• Computer Science, Mathematics
KDD
• 2008
This is the first paper that addresses the problem of local triangle counting with a focus on the efficiency issues arising in massive graphs and proposes two approximation algorithms, which are based on the idea of min-wise independent permutations.
An Algorithm to Automatically Generate the Combinatorial Orbit Counting Equations
• Mathematics
PloS one
• 2016
Two new techniques are presented that automatically generate the equations needed to count graphlets with 4, 5 and 6 vertices, and that can be used to count larger graphlets than previously possible.
Efficient Graphlet Counting for Large Networks
• Computer Science
2015 IEEE International Conference on Data Mining
• 2015
This paper proposes a fast, efficient, and parallel algorithm for counting graphlets of size k = {3,4} nodes that takes only a fraction of the time required by current methods and is on average 460x faster.
DOULION: counting triangles in massive graphs with a coin
• Computer Science
KDD
• 2009
A practical method, from which all triangle counting algorithms can potentially benefit, is proposed; it works with high accuracy, typically more than 99%, and gives significant speedups of up to ≈ 130 times.
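The "coin" in the DOULION title refers to edge sparsification: keep each edge independently with probability p, count triangles in the sparsified graph, and scale the result by 1/p³. The sketch below follows that idea under simplifying assumptions; the function names, the edge-list input, and the naive exact counter are illustrative choices, not the authors' implementation.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Exact triangle count of an undirected simple graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    closed = 0
    for u in adj:
        for v, w in combinations(adj[u], 2):
            if w in adj[v]:
                closed += 1
    return closed // 3  # each triangle is seen from its 3 vertices


def doulion_estimate(edges, p, seed=None):
    """Estimate the triangle count from a graph sparsified with keep-probability p."""
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < p]
    # A triangle survives only if all 3 of its edges are kept (probability p**3),
    # so dividing by p**3 gives an unbiased estimate.
    return count_triangles(kept) / p**3
```

Any exact triangle counter can replace `count_triangles`, which is why the technique composes with other triangle counting algorithms.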
Efficient Triangle Counting in Large Graphs via Degree-Based Vertex Partitioning
• Computer Science, Mathematics
Internet Math.
• 2010
This paper presents an efficient triangle-counting approximation algorithm that can be adapted to the semistreaming model with space usage and a constant number of passes over the graph stream, and applies its methods to various networks with several millions of edges and gets excellent results.
MOSS-5: A Fast Method of Approximating Counts of 5-Node Graphlets in Large Graphs
• Computer Science
IEEE Transactions on Knowledge and Data Engineering
• 2018
This work not only provides a fast sampling method and unbiased estimators of graphlet counts, but also derives simple yet exact formulas for the variances of the estimators, which are of great value in practice: the variances can be used to bound the estimates’ errors and determine the smallest necessary sampling budget for a desired accuracy.