• Corpus ID: 28498423

Co-clustering for directed graphs: the Stochastic co-Blockmodel and spectral algorithm Di-Sim

@article{Rohe2012CoclusteringFD,
  title={Co-clustering for directed graphs: the Stochastic co-Blockmodel and spectral algorithm Di-Sim},
  author={Karl Rohe and Tai Qin and Bin Yu},
  journal={arXiv: Machine Learning},
  year={2012}
}
Directed graphs have asymmetric connections, yet the current graph clustering methodologies cannot identify the potentially global structure of these asymmetries. We give a spectral algorithm called di-sim that builds on a dual measure of similarity that corresponds to how a node (i) sends and (ii) receives edges. Using di-sim, we analyze the global asymmetries in the networks of Enron emails, political blogs, and the C. elegans neural connectome. In each example, a small subset of nodes have…
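
The idea of clustering nodes separately by how they send and how they receive edges can be illustrated with a short sketch. This is only a minimal illustration in the spirit of the abstract, not the authors' implementation: it assumes a degree-regularized adjacency matrix, takes its singular value decomposition, and runs a simple k-means on the row-normalized left (sending) and right (receiving) singular vectors. The names `co_cluster`, `simple_kmeans`, and the default choice of `tau` as the average degree are assumptions made for this example.

```python
import numpy as np


def simple_kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        # Squared distance from every row to its nearest chosen center.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(X.shape[0], dtype=int)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def normalize_rows(X):
    """Scale each row to unit length, leaving all-zero rows unchanged."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms > 0, norms, 1.0)


def co_cluster(A, k, tau=None):
    """Return (sending labels, receiving labels) for adjacency matrix A.

    A[i, j] = 1 means an edge from node i to node j.
    Assumes the graph has at least one edge.
    """
    A = np.asarray(A, dtype=float)
    out_deg = A.sum(axis=1)
    in_deg = A.sum(axis=0)
    if tau is None:
        tau = A.sum() / A.shape[0]  # assumed default: average degree
    # Degree-regularized matrix: scale rows by out-degree, columns by in-degree.
    L = A / np.sqrt(out_deg + tau)[:, None] / np.sqrt(in_deg + tau)[None, :]
    U, _, Vt = np.linalg.svd(L, full_matrices=False)
    send = simple_kmeans(normalize_rows(U[:, :k]), k)
    recv = simple_kmeans(normalize_rows(Vt[:k].T), k)
    return send, recv


# Toy graph in which sending and receiving groups differ:
# nodes 0-2 send to nodes {0, 2, 4}; nodes 3-5 send to nodes {1, 3, 5}.
send_group = np.array([0, 0, 0, 1, 1, 1])
recv_group = np.array([0, 1, 0, 1, 0, 1])
A = (send_group[:, None] == recv_group[None, :]).astype(float)
send_labels, recv_labels = co_cluster(A, k=2)
```

In this toy graph the sending partition groups nodes {0, 1, 2} against {3, 4, 5}, while the receiving partition groups {0, 2, 4} against {1, 3, 5}, which is the kind of asymmetry a single symmetric clustering would not expose.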

Randomized spectral co-clustering for large-scale directed networks

TLDR
The approximation error rates and misclustering error rates of the two proposed randomized spectral co-clustering algorithms are established, which indicate better bounds than the state-of-the-art results in the co-clustering literature.

Clustering and Community Detection in Directed Networks: A Survey

Co-modularity and Co-community Detection in Large Networks

TLDR
The existing non-parametric understanding of co-clustering is generalised in this paper, by introducing an anisotropic graphon class for realisations of bipartite networks and obtaining a quantitative measure to determine the number of groups to be used when fitting co-communities.

Tensor Spectral Clustering for Partitioning Higher-order Network Structures

TLDR
This work proposes a Tensor Spectral Clustering algorithm that allows for modeling higher-order network structures in a graph partitioning framework and demonstrates that the TSC algorithm produces large partitions that cut fewer directed 3-cycles than standard spectral clustering algorithms.

Community Detection in Directed Networks and its Application to Analysis of Social Networks

TLDR
It is shown that incorporating the direction of links reveals new perspectives on communities with regard to two different roles, source and terminal, that a node may play in a community, in comparison to the existing community detection algorithms.

Community Detection in Networks with Node Covariates

TLDR
This dissertation proposes a model-based approach which allows for matched communities in the bipartite setting, in addition to node covariates with information about the matching, and proposes a unified affinity matrix (USim) to leverage the node covariate information that can be used in unipartite networks (directed and undirected).

Co-clustering separately exchangeable network data

TLDR
Stochastic blockmodels are established in addressing the co-clustering problem of partitioning a binary array into subsets, and it is shown for large sample sizes that the detection of co-clusters in such data indicates with high probability the existence of co-clusters of equal size and asymptotically equivalent connectivity in the underlying generative process.

The blessing of transitivity in sparse and stochastic networks

TLDR
The first statistical results that demonstrate how small transitive clusters are more amenable to statistical estimation are provided, suggesting a theoretical explanation for the robust empirical performance of local clustering algorithms.

Inferring structure in bipartite networks using the latent blockmodel and exact ICL

TLDR
The approach is based on use of the exact integrated complete likelihood for the latent blockmodel which allows one to infer the number of clusters as well as cluster memberships using a greedy search and gives a model-based clustering of the node sets.

The Forward-Backward Embedding of Directed Graphs

TLDR
It is shown that, after proper normalization of the singular vectors, the distances between vectors in the embedding space are proportional to the mean commute times between the corresponding nodes by a forward-backward random walk in the graph, which follows the edges alternately in forward and backward directions.

References

Spectral clustering and the high-dimensional stochastic blockmodel

TLDR
The asymptotic results in this paper are the first clustering results that allow the number of clusters in the model to grow with the number of nodes, hence the name high-dimensional.

Impact of regularization on spectral clustering

  • Antony Joseph, Bin Yu
  • Computer Science
    2014 Information Theory and Applications Workshop (ITA)
  • 2014
TLDR
This work attempts to understand the regularized form of spectral clustering and proposes a data-driven technique, DK-est (standing for estimated Davis-Kahan bounds), for choosing the regularization parameter, which is shown to perform very well for simulated and real data sets.

Spectral Clustering of Graphs with General Degrees in the Extended Planted Partition Model

TLDR
A spectral clustering algorithm for similarity graphs drawn from a simple random graph model, where nodes are allowed to have varying degrees, is examined, and performance guarantees show that it outputs the correct partition under a wide range of parameter values.

Spectral redemption in clustering sparse networks

TLDR
A way of encoding sparse data using a “nonbacktracking” matrix is introduced, and it is shown that the corresponding spectral algorithm performs optimally for some popular generative models, including the stochastic block model.

Stochastic Blockmodels for Directed Graphs

TLDR
An iterative scaling algorithm is presented for fitting the model parameters by maximum likelihood, and blockmodels that are simple extensions of the p1 model are proposed specifically for such data.

A Consistent Adjacency Spectral Embedding for Stochastic Blockmodel Graphs

TLDR
It is proved that this method to estimate block membership of nodes in a random graph generated by a stochastic blockmodel is consistent for assigning nodes to blocks, as only a negligible number of nodes will be misassigned.

Community Detection in Networks using Graph Distance

TLDR
This work proposes an algorithm based on the graph distance of vertices in the network that identifies communities for block models and can be extended to degree-corrected block models and to block models in which the number of communities grows with the number of vertices.

Co-clustering separately exchangeable network data

TLDR
Stochastic blockmodels are established in addressing the co-clustering problem of partitioning a binary array into subsets, and it is shown for large sample sizes that the detection of co-clusters in such data indicates with high probability the existence of co-clusters of equal size and asymptotically equivalent connectivity in the underlying generative process.

Regularized Spectral Clustering under the Degree-Corrected Stochastic Blockmodel

TLDR
The paper characterizes and justifies several of the variations of the spectral clustering algorithm in terms of the Degree-Corrected Stochastic Blockmodel and the Extended Planted Partition model, two statistical models that allow for highly heterogeneous degrees.

Spectral analysis of random graphs with skewed degree distributions

TLDR
This work extends spectral methods to random graphs with skewed degree distributions through a degree based normalization closely connected to the normalized Laplacian, and proves that after applying the transformation, spectral analysis succeeds in recovering the latent structure with high probability.
...