# A Brief History of Generative Models for Power Law and Lognormal Distributions

@article{Mitzenmacher2003ABH, title={A Brief History of Generative Models for Power Law and Lognormal Distributions}, author={Michael Mitzenmacher}, journal={Internet Mathematics}, year={2003}, volume={1}, pages={226 - 251} }

Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a lognormal distribution. In trying to learn enough about these distributions to settle the question, I found a rich and long history, spanning many fields. Indeed, several recently proposed models from the computer science community have antecedents in work from decades ago. Here, I briefly survey some of this history, focusing on underlying generative models…

## 1,699 Citations

On the Power Laws of Language: Word Frequency Distributions

- Computer Science
- SIGIR
- 2017

A simple generative model is proposed to capture the word frequency distribution of languages and is shown to match the observations both analytically and empirically.

Swarm simulations of the power law distribution models

- Computer Science
- 2003

This work bases its simulations on existing models in which incremental growth and preferential attachment are the key ingredients for the emergence of power laws, expands those models to include new variables, and proposes a new model without the incremental growth requirement.

How rare are power-law networks really?

- Computer Science
- Proceedings of the Royal Society A
- 2020

This paper modifies the well-known Kolmogorov–Smirnov test to achieve even sensitivity along the tail, considering the dependence between the empirical degrees under the null distribution, while guaranteeing sufficient power of the test.

Learning and Interpreting Complex Distributions in Empirical Data

- Computer Science
- KDD
- 2018

This paper showcases a four-parameter dynamic model together with inference and simulation algorithms, which is able to fit and generate a family of distributions, ranging from Gaussian, Exponential, Power Law, Stretched Exponential (Weibull), to their complex variants with multi-scale complexities.

Competition and fragmentation: a simple model generating lognormal-like distributions

- Physics
- 2009

The current distribution of language size in terms of speaker population is generally described using a lognormal distribution. Analyzing the original real data we show how the double-Pareto…

Short-ranged memory model with preferential growth.

- Computer Science
- Physical review. E
- 2018

A variant of the Yule-Simon model for preferential growth is introduced that incorporates a finite kernel to model the effects of bounded memory, and the properties of the model are characterized by combining analytical arguments with extensive numerical simulations.

Power laws, Pareto distributions and Zipf's law

- Physics
- 2005

Some of the empirical evidence for the existence of power-law forms and the theories proposed to explain them are reviewed.

Probability Distributions in Complex Systems

- Physics
- Encyclopedia of Complexity and Systems Science
- 2009

This essay enlarges the description of distributions by proposing that “kings”, i.e., events even beyond the extrapolation of the power-law tail, may reveal information that is complementary to, and perhaps sometimes even more important than, the power-law distribution.

Power-Law Distributions in Empirical Data

- Mathematics
- SIAM Rev.
- 2009

This work proposes a principled statistical framework for discerning and quantifying power-law behavior in empirical data by combining maximum-likelihood fitting methods with goodness-of-fit tests based on the Kolmogorov-Smirnov (KS) statistic and likelihood ratios.

## References

The Double Pareto-Lognormal Distribution—A New Parametric Model for Size Distributions

- Mathematics
- WWW 2001
- 2001

A family of probability densities, which has proved useful in modelling the size distributions of various phenomena, including incomes and earnings, human settlement sizes, oil-field volumes…

On 1/f noise and other distributions with long tails.

- Mathematics
- Proceedings of the National Academy of Sciences of the United States of America
- 1982

A simple amplification model is introduced to characterize the transition from a log-normal distribution to an inverse-power Pareto tail.

Some Further Notes on a Class of Skew Distribution Functions

- Mathematics
- Inf. Control.
- 1960

Population fluctuations, power laws and mixtures of lognormal distributions

- Environmental Science
- 2001

A number of investigators have invoked a cascading local interaction model to account for power-law-distributed fluctuations in ecological variables. Invoking such a model requires that species be…

Maximum entropy formalism, fractals, scaling phenomena, and 1/f noise: A tale of tails

- Mathematics
- 1983

In this report on examples of distribution functions with long tails we (a) show that the derivation of distributions with inverse power tails from a maximum entropy formalism would be a consequence…

From gene families and genera to incomes and internet file sizes: why power laws are so common in nature.

- Economics
- Physical review. E, Statistical, nonlinear, and soft matter physics
- 2002

If stochastic processes with exponential growth in expectation are killed (or observed) randomly, the distribution of the killed or observed state exhibits power-law behavior in one or both tails.

On the tails of web file size distributions

- Computer Science
- 2001

It is argued that the data usually available for classifying a distribution are insufficient to classify the tail, and that it is sufficient to focus on mechanisms leading to power-law-like “waists” of the distributions.

Informetric distributions, part I: Unified overview

- Mathematics
- 1990

This article is the first of a two‐part series on the informetric distributions, a family of regularities found to describe a wide range of phenomena both within and outside of the information…

ON A CLASS OF SKEW DISTRIBUTION FUNCTIONS

- Mathematics
- 1955

It is the purpose of this paper to analyse a class of distribution functions that appears in a wide range of empirical data, particularly data describing sociological, biological and economic…

Informetric distributions, part I: Unified overview

- Mathematics
- J. Am. Soc. Inf. Sci.
- 1990

The basic forms these regularities take are introduced, a model is proposed that makes plausible the possibility that, in spite of marked differences in their appearance, these distributions are variants of a single distribution.