author={David J. Aldous and Nathan Ross},
  journal={Probability in the Engineering and Informational Sciences},
  pages={145 - 168}
  • D. Aldous, Nathan Ross
  • Published 2 January 2013
  • Computer Science
  • Probability in the Engineering and Informational Sciences
Consider the setting of sparse graphs on N vertices, where the vertices have distinct “names”, which are strings of length O(log N) from a fixed finite alphabet. For many natural probability models, the entropy grows as c N log N for some model-dependent rate constant c. The mathematical content of this paper is the (often easy) calculation of c for a variety of models, in particular for various standard random graph models adapted to this setting. Our broader purpose is to publicize this… 
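As a rough illustration of the c N log N growth described above, the following sketch computes the edge-set entropy of a sparse Erdős–Rényi graph G(N, α/N) and divides it by N log N. The choice of model, the parameter name `alpha`, and the function names are assumptions made for this example; the sketch also treats vertices as simply labeled 1..N and ignores the entropy of the vertex names themselves. Under that simplification the ratio tends to α/2, but slowly, because the correction is of order 1/log N.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    # log1p(-p) stays accurate when p is tiny, as it is for sparse graphs.
    return -p * math.log(p) - (1.0 - p) * math.log1p(-p)


def er_entropy(n: int, alpha: float) -> float:
    """Entropy of the edge set of G(n, alpha/n) on n labeled vertices.

    Edges are independent, so the entropy is (number of vertex pairs) * h(p).
    """
    pairs = n * (n - 1) / 2
    return pairs * binary_entropy(alpha / n)


def rate_ratio(n: int, alpha: float) -> float:
    """Entropy divided by n log n; approaches alpha / 2 as n grows."""
    return er_entropy(n, alpha) / (n * math.log(n))


if __name__ == "__main__":
    # Hypothetical parameter choice: mean degree alpha = 2, so alpha / 2 = 1.
    for n in (10**3, 10**6, 10**9):
        print(n, rate_ratio(n, 2.0))
```

The ratio does not depend on the base of the logarithm, since the same base appears in both the entropy and the N log N normalisation.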
Recovery of vertex orderings in dynamic graphs
This work gives a rigorous formulation of the problem of recovering the arrival order of nodes in a statistical learning framework and ties its feasibility to several sets of permutations associated with the symmetries of the random graph model and graphs generated by it.
Structural Information and Compression of Scale-Free Graphs
This work gives algorithmically efficient, asymptotically optimal algorithms for compression of both unlabeled and labeled preferential attachment graphs and completes the characterization of the number of symmetries for a broad range of parameters of the model.
Compression and Symmetry of Small-World Graphs and Structures
The degree distribution of this model is established and used to prove the model's asymmetry in an appropriate range of parameters; the relevant entropy and structural entropy of these random graphs are also derived, in connection with graph compression.
A Universal Low Complexity Compression Algorithm for Sparse Marked Graphs
This work introduces a low-complexity lossless compression algorithm for sparse marked graphs, i.e. graphical data indexed by sparse graphs, which is capable of universally achieving the optimal compression rate in a precisely defined sense.
Entropy of some general plane trees
It turns out that extending from binary trees to general trees is mathematically quite challenging and leads to new recurrences that find ample applications in the information theory of structures.
Universal lossless compression of graphical data
A recently developed notion of entropy for such processes, due to Bordenave and Caputo, is generalized to the case of marked graphs, and it is argued that this is an appropriate way to evaluate the efficiency of a compression scheme.
A Universal Lossless Compression Method applicable to Sparse Graphs and heavy-tailed Sparse Graphs
This paper introduces a universal lossless compression method which is simultaneously applicable to both classes of graphs, employing the local weak convergence framework for sparse graphs and the sparse graphon framework for heavy-tailed sparse graphs.
Universal Lossless Compression of Graphical Data
The lossless compression scheme proposed in this paper is proved to be universally optimal in a precise technical sense and is also capable of performing local data queries in the compressed form.
On Lossy Compression of Directed Graphs
It is shown that given a more natural distortion measure, fitting the data structure of a directed graph, the method of types cannot be applied, and a lower and upper bound on the rate-distortion problem of lossy compression is provided.
Lossless Compression of Binary Trees With Correlated Vertex Names
This paper considers binary plane trees, and their non-plane version, with statistically correlated vertex names, and finds that in this natural setting both the entropy analysis and optimal compression are analytically tractable.


On compressing social networks
This work proposes simple combinatorial formulations that encapsulate efficient compressibility of graphs and shows that some of the problems are NP-hard yet admit effective heuristics, some of which can exploit properties of social networks such as link reciprocity.
The t-Improper Chromatic Number of Random Graphs
The t-improper chromatic number χt(G) is the smallest number of colours needed in a colouring of the vertices in which each colour class induces a subgraph of maximum degree at most t.
Processes on Unimodular Random Networks
We investigate unimodular random networks. Our motivations include their characterization via reversibility of an associated random walk and their similarities to unimodular quasi-transitive graphs.
Complex Graphs and Networks
Graph theory in the information age; Old and new concentration inequalities; A generative model: the preferential attachment scheme; Duplication models for biological networks; Random graphs with given
A history of graph entropy measures
Compression of Graphical Structures: Fundamental Limits, Algorithms, and Experiments
This paper proposes a two-stage compression algorithm that asymptotically achieves the structural entropy up to the $n \log\log n$ term (i.e., the first two leading terms). The structural entropy of $G(n,p)$ is $\binom{n}{2}h(p) - \log n! + o(1) = \binom{n}{2}h(p) - n\log n + O(n)$, and the algorithm is the first provably (asymptotically) optimal graph compressor for Erdős–Rényi graph models.
The webgraph framework I: compression techniques
This paper presents the compression techniques used in WebGraph, which are centred around referentiation and intervalisation (which in turn are dual to each other).
Universal compression of memoryless sources over unknown alphabets
It is shown that the patterns of i.i.d. strings over all, including infinite and even unknown, alphabets, can be compressed with diminishing redundancy, both in block and sequentially, and that the compression can be performed in linear time.
Probability Estimation in the Rare-Events Regime
This work addresses the problem of estimating the probability of an observed string that is drawn i.i.d. from an unknown distribution and introduces a novel sequence probability estimator that is consistent.
Pattern matching and lossy data compression on random fields
This work considers the problem of lossy data compression for data arranged on two-dimensional arrays, or more generally on higher dimensional arrays, and proves that the compression rate achieved is no worse than R(D/2) bits per symbol, where R(D) is the rate-distortion function.