# Head/Tail Breaks: A New Classification Scheme for Data with a Heavy-Tailed Distribution

@article{Jiang2012HeadTailBA, title={Head/Tail Breaks: A New Classification Scheme for Data with a Heavy-Tailed Distribution}, author={Bin Jiang}, journal={The Professional Geographer}, year={2012}, volume={65}, pages={482 - 494} }

This article introduces a new classification scheme—head/tail breaks—to find groupings or hierarchy for data with a heavy-tailed distribution. The heavy-tailed distributions are heavily right skewed, with a minority of large values in the head and a majority of small values in the tail, commonly characterized by a power law, a lognormal, or an exponential function. For example, a country's population is often distributed in such a heavy-tailed manner, with a minority of people (e.g., 20 percent…
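The truncated abstract above does not spell out the procedure, so the following is only a minimal sketch of head/tail breaks as it is commonly described: split the data at its arithmetic mean, keep the values above the mean (the head), and repeat on the head while it remains a minority. The function name `head_tail_breaks`, the `head_limit` parameter, and its 40 percent default are assumptions made for illustration, not details taken from this page.

```python
from statistics import mean


def head_tail_breaks(values, head_limit=0.4):
    """Return the class breaks (successive means) for head/tail breaks.

    head_limit is an assumed threshold: splitting continues only while
    the head (values above the mean) is at most this share of the data.
    """
    breaks = []
    data = list(values)
    while len(data) > 1:
        m = mean(data)
        head = [v for v in data if v > m]
        if not head:
            break  # all remaining values are equal, so nothing to split
        breaks.append(m)
        if len(head) / len(data) > head_limit:
            break  # the head is no longer a clear minority
        data = head  # recurse into the head only
    return breaks


# Toy data: many small values and a few large ones.
sample = [1, 1, 1, 1, 1, 1, 1, 2, 4, 16]
sample_breaks = head_tail_breaks(sample)
n_classes = len(sample_breaks) + 1
```

Under these assumptions the number of classes is one more than the number of breaks, so it is determined by the data rather than chosen in advance.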

## 287 Citations

### Scaling of Geographic Space as a Universal Rule for Map Generalization

- Mathematics
- 2013

Map generalization is a process of producing maps at different levels of detail by retaining essential properties of the underlying geographic space. In this article, we explore how the map…

### Head/Tail Breaks for Visualization of City Structure and Dynamics

- Environmental Science
- 2016

The things surrounding us vary dramatically, which implies that there are far more small things than large ones, e.g., far more small cities than large ones in the world. This dramatic variation is…

### A Head/Tail Breaks-Based Method for Efficiently Estimating the Absolute Boltzmann Entropy of Numerical Raster Data

- Computer Science
- ISPRS Int. J. Geo Inf.
- 2020

The condition of head/tail breaks was relaxed to classify data with a heavy-tailed distribution. The average of the data values in each class was taken as its representative value and substituted into a linear function to obtain the full expression of the relationship between classification level and Boltzmann entropy.

### Characterizing the Heterogeneity of the OpenStreetMap Data and Community

- Computer Science
- ISPRS Int. J. Geo Inf.
- 2015

The heterogeneity of the entire OSM database and historical archive is characterized in the context of big data, finding that there are far more small elements than large ones, far more inactive users than active ones, and far more lightly edited elements than heavily edited ones.

### A multi-scale representation model of polyline based on head/tail breaks

- Computer Science
- Int. J. Geogr. Inf. Sci.
- 2020

A model that quantifies the multiscale representation of a polyline is introduced, built on iterative head/tail breaks, Shannon's information theory, and the radical law. It is applied to multiscale polyline representation by quantifying the scale of each simplified polyline.

### A Comparison Study on Natural and Head/tail Breaks Involving Digital Elevation Models

- Business
- 2013

The most widely used classification method for statistical mapping is Jenks’s natural breaks. However, it has been found that natural breaks is not good at classifying data which have scaling…

### Equal Area Breaks: A Classification Scheme for Data to Obtain an Evenly-colored Choropleth Map

- Computer Science
- ArXiv
- 2020

An efficient algorithm is introduced for computing the choropleth map classification scheme known as equal area breaks, or geographical quantiles. The scheme is compared with the quantiles and Jenks natural breaks algorithms, and a user study finds it superior from a visual standpoint.

### Wholeness as a hierarchical graph to capture the nature of space

- Art
- Int. J. Geogr. Inf. Sci.
- 2015

This paper defines wholeness as a hierarchical graph, in which individual centers are represented as the nodes and their relationships as the directed links, and suggests that the hierarchical levels, or the ht-index of the PR scores induced by the head/tail breaks, can characterize the degree of wholeness for the whole.

### SECTOR: A Neural Model for Coherent Topic Segmentation and Classification

- Computer Science
- TACL
- 2019

This work presents SECTOR, a model to support machine reading systems by segmenting documents into coherent sections and assigning topic labels to each section, and reports a highest score of 71.6% F1 for the segmentation and classification of 30 topics from the English city domain.

## References


### Power-Law Distributions in Empirical Data

- Mathematics
- SIAM Rev.
- 2009

This work proposes a principled statistical framework for discerning and quantifying power-law behavior in empirical data by combining maximum-likelihood fitting methods with goodness-of-fit tests based on the Kolmogorov-Smirnov (KS) statistic and likelihood ratios.

### The selection of class intervals

- Environmental Science
- 1977

The selection of class intervals, which can strongly affect the visual impression given by a map, is currently a totally anarchic branch of cartography. While practising cartographers have barely…

### NESTED-MEANS MAP CLASSES FOR STATISTICAL MAPS

- Mathematics
- 1970

ABSTRACT A general, objective method is presented for the calculation of class intervals for statistical maps. The arithmetic mean divides a numerical array into two classes and the means of each of…

### A Universal Rule for the Distribution of Sizes

- Mathematics
- 1999

Human artifacts, ranging from small objects all the way up to large buildings and cities, display a variety and range of subdivisions. Repeating structural and design elements of the same size will…

### Evaluation of Methods for Classifying Epidemiological Data on Choropleth Maps in Series

- Environmental Science
- 2002

Our research goal was to determine which choropleth classification methods are most suitable for epidemiological rate maps. We compared seven methods using responses by fifty-six subjects in a…

### Scaling of geographic space from the perspective of city and field blocks and using volunteered geographic information

- Geography
- Int. J. Geogr. Inf. Sci.
- 2012

An analogy is drawn between a country (or a city, or geographic space in general) and a complex organism such as the human body or the human brain, to further elaborate on the power of this block perspective in reflecting the structure or patterns of geographic space.

### Street hierarchies: a minority of streets account for a majority of traffic flow

- Computer Science
- Int. J. Geogr. Inf. Sci.
- 2009

This study provides new evidence as to how a city is (self‐)organized, contributing to the understanding of cities and their evolution using increasingly available mobility geographic information.

### Population-Density Maps of the United States: Techniques and Patterns

- Geology
- 1943

The construction of an isarithmic map of population density involves a number of problems. In the solution of these problems several techniques were applied, the most important of which was the use…

### On Grouping for Maximum Homogeneity

- Education
- 1958

Abstract Given a set of arbitrary numbers, what is a practical procedure for grouping them so that the variance within groups is minimized? An answer to this question, including a description of an…

### Self-organized natural roads for predicting traffic flow: a sensitivity study

- Computer Science
- 2008

A tipping point was found from segment-based to road-based network topology in terms of the correlation between ranking metrics and traffic. Surprisingly, this correlation improves significantly if a selfish rather than utopian strategy is adopted in forming the self-organized natural roads.