Hierarchical block structures and high-resolution model selection in large networks

@article{Peixoto2013HierarchicalBS,
  title={Hierarchical block structures and high-resolution model selection in large networks},
  author={Tiago P. Peixoto},
  journal={ArXiv},
  year={2013},
  volume={abs/1310.4377}
}
Discovering and characterizing the large-scale topological features in empirical networks are crucial steps in understanding how complex systems function. However, most existing methods used to obtain the modular structure of networks suffer from serious problems, such as being oblivious to the statistical evidence supporting the discovered patterns, which results in the inability to separate actual structure from noise. In addition to this, one also observes a resolution limit on the size of… 

Consistency of community structure in complex networks

It is argued that traditional community detection does in fact give a significant amount of insight into network structure, and an information-theoretic method for discovering the building blocks in specific networks is proposed.

Hierarchical community structure in networks

This work enumerates the challenges involved in detecting hierarchies and, by studying the spectral properties of hierarchical structure, presents an efficient and principled method for detecting them.

Inferring the mesoscale structure of layered, edge-valued, and time-varying networks.

  • Tiago P. Peixoto
  • Computer Science
  • Physical Review E: Statistical, Nonlinear, and Soft Matter Physics
  • 2015
A robust and principled method is proposed that constructs generative models of modular network structure incorporating layered, attributed, and time-varying properties, together with a nonparametric Bayesian methodology that infers the parameters from data and selects the most appropriate model according to statistical evidence.

Consistencies and inconsistencies between model selection and link prediction in networks.

It is shown that, in general, the predictive performance is higher when the authors average over collections of models that are individually less plausible than when they consider only the single most plausible model.

Hypergraph reconstruction from network data

This work introduces a Bayesian framework to infer higher-order interactions hidden in network data; it is based on the principle of parsimony and includes higher-order structures only when there is sufficient statistical evidence for them.

Scalable detection of statistically significant communities and hierarchies, using message passing for modularity

  • Pan Zhang, C. Moore
  • Computer Science
  • Proceedings of the National Academy of Sciences
  • 2014
By applying the proposed algorithm recursively, subdividing communities until no statistically significant subcommunities can be found, it is shown that the algorithm can detect hierarchical structure in real-world networks more efficiently than previous methods.

Multiresolution Network Models

This work proposes a class of network models that represent network structure on multiple scales and facilitate comparison across graphs with different numbers of individuals, and shows that the model class is projective, highlighting an ongoing discussion in the social network modeling literature on the dependence of inference paradigms on the size of the observed graph.

On the consistency between model selection and link prediction in networks

It is shown that, in general, the predictive performance is higher when the authors average over collections of models that are individually less plausible than when they consider only the single most plausible model.

Modular hierarchical and power-law small-world networks bear structural optima for minimal first passage times and cover time

This work revisits a modular hierarchical network model that interpolates, using a single parameter, between two known network topologies: from strong hierarchical modularity to an Erdős-Rényi random connectivity structure, and finds an optimal structure for which the pair-averaged first passage time (FPT) and mean cover time of a discrete-time random walk are minimal.

Ordered community detection in directed networks

A method to infer community structure in directed networks where the groups are ordered in a latent one-dimensional hierarchy that determines the preferred edge direction is developed, based on a modification of the stochastic block model.

References

Hierarchical structure and the prediction of missing links in networks

This work presents a general technique for inferring hierarchical structure from network data and shows that the existence of hierarchy can simultaneously explain and quantitatively reproduce many commonly observed topological properties of networks.

The Interplay between Microscopic and Mesoscopic Structures in Complex Networks

It is shown how multiscale generative probabilistic exponential random graph models combined with efficient, distributive message-passing inference techniques can be used to achieve this separation of scales, leading to improved detection accuracy of latent classes as demonstrated on benchmark problems.

Resolution limit in community detection

It is found that modularity optimization may fail to identify modules smaller than a scale which depends on the total size of the network and on the degree of interconnectedness of the modules, even in cases where modules are unambiguously defined.
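
The effect described above can be checked numerically on a small toy graph. The sketch below is only an illustration under stated assumptions, not code from the cited paper: it builds a ring of 30 cliques of 5 nodes each, joined by single edges (the graph size and the helper names `ring_of_cliques` and `modularity` are choices made here), and compares the modularity of the "one community per clique" partition with that of a partition that merges adjacent cliques in pairs.

```python
from itertools import combinations


def ring_of_cliques(n_cliques, clique_size):
    """Edge list of n_cliques complete graphs joined in a ring by single edges."""
    edges = []
    for c in range(n_cliques):
        first = c * clique_size
        nodes = range(first, first + clique_size)
        edges.extend(combinations(nodes, 2))  # all edges inside the clique
        # one bridge from the last node of this clique to the first node of the next
        next_first = ((c + 1) % n_cliques) * clique_size
        edges.append((first + clique_size - 1, next_first))
    return edges


def modularity(edges, community_of):
    """Modularity Q = sum over communities c of [ l_c / E - (d_c / 2E)^2 ],
    where l_c is the number of internal edges and d_c the total degree of c."""
    n_edges = len(edges)
    internal, degree = {}, {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / n_edges - (d / (2 * n_edges)) ** 2
        for c, d in degree.items()
    )


N_CLIQUES, CLIQUE_SIZE = 30, 5  # illustrative sizes, chosen for this sketch
edges = ring_of_cliques(N_CLIQUES, CLIQUE_SIZE)
n_nodes = N_CLIQUES * CLIQUE_SIZE

# Partition A: every clique is its own community.
one_per_clique = [v // CLIQUE_SIZE for v in range(n_nodes)]
# Partition B: adjacent cliques are merged in pairs.
merged_pairs = [v // (2 * CLIQUE_SIZE) for v in range(n_nodes)]

q_single = modularity(edges, one_per_clique)
q_pairs = modularity(edges, merged_pairs)
print(f"Q (one community per clique): {q_single:.4f}")
print(f"Q (adjacent cliques merged):  {q_pairs:.4f}")
```

With these sizes the merged partition scores higher (about 0.8879 against 0.8758), even though each clique is an unambiguous module, so maximizing modularity would favor merging them.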

Model selection for degree-corrected block models

The first principled and tractable approach to model selection between standard and degree-corrected block models is presented, based on new large-graph asymptotics for the distribution of log-likelihood ratios under the stochastic block model, finding substantial departures from classical results for sparse graphs.

Parsimonious module inference in large networks.

It is shown that the maximum number of detectable blocks scales as √N, where N is the number of nodes in the network, for a fixed average degree ⟨k⟩, and that the simplicity of the minimum description length approach yields an efficient multilevel Monte Carlo inference algorithm.

Uncovering the overlapping community structure of complex networks in nature and society

After defining a set of new characteristic quantities for the statistics of communities, this work applies an efficient technique for exploring overlapping communities on a large scale and finds that overlaps are significant, and the distributions introduced reveal universal features of networks.

Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters

This paper employs approximation algorithms for the graph-partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities, and defines the network community profile plot, which characterizes the "best" possible community—according to the conductance measure—over a wide range of size scales.

Structural Inference of Hierarchies in Networks

This work gives a precise definition of hierarchical structure, gives a generic model for generating arbitrary hierarchical structure in a random graph, and describes a statistically principled way to learn the set of hierarchical features that most plausibly explain a particular real-world network.

Multi-resolution modularity methods and their limitations in community detection

A set of multi-resolution modularity methods derived from modularity using self-loop assignment schemes is studied, and it is shown that all of these methods encounter a limitation that is independent of the network size: when the distribution of community sizes is very broad, increasing the resolution parameter breaks up large communities before small communities are revealed.

Multifractal network generator

The present work provides a tool for researchers from a variety of fields enabling them to create a versatile model of their network data by going to the infinite limit of the singular measure and the size of the corresponding graph simultaneously.