Hierarchical clustering schemes

S. C. Johnson
Techniques for partitioning objects into optimally homogeneous groups on the basis of empirical measures of similarity among those objects have received increasing attention in several different fields. This paper develops a useful correspondence between any hierarchical system of such clusters and a particular type of distance measure. The correspondence gives rise to two methods of clustering that are computationally rapid and invariant under monotonic transformations of the data. In an…
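The abstract names neither the two methods nor the distance measure. As a minimal sketch, assume the methods are the minimum (single-link) and maximum (complete-link) agglomerative rules, and that the distance between two objects is the level at which they first join in the tree (an ultrametric). The function names and the toy matrix below are illustrative and are not taken from the paper.

```python
from itertools import combinations


def agglomerate(d, linkage=min):
    """Agglomerative clustering on a symmetric dissimilarity matrix d.

    linkage=min is the single-link rule, linkage=max the complete-link rule.
    Returns the merges as (cluster_a, cluster_b, level) tuples, in order.
    """
    clusters = [frozenset([i]) for i in range(len(d))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest linkage value.
        for a, b in combinations(clusters, 2):
            level = linkage(d[i][j] for i in a for j in b)
            if best is None or level < best[2]:
                best = (a, b, level)
        a, b, level = best
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append(best)
    return merges


def tree_distance(merges, n):
    """Distance induced by the tree: the level at which i and j first join."""
    u = [[0] * n for _ in range(n)]
    for a, b, level in merges:
        for i in a:
            for j in b:
                u[i][j] = u[j][i] = level
    return u


def is_ultrametric(u):
    """Check u[i][k] <= max(u[i][j], u[j][k]) for every triple."""
    n = len(u)
    return all(u[i][k] <= max(u[i][j], u[j][k])
               for i in range(n) for j in range(n) for k in range(n))


def shape(merges):
    """The sequence of merged pairs, ignoring the merge levels."""
    return [(a, b) for a, b, _ in merges]


# Toy dissimilarities (illustrative values only, chosen without ties).
d = [[0, 1, 4, 6],
     [1, 0, 3, 7],
     [4, 3, 0, 2],
     [6, 7, 2, 0]]
# A strictly increasing transformation of the same data.
cubed = [[x ** 3 for x in row] for row in d]

single = agglomerate(d, min)
complete = agglomerate(d, max)
```

Under these assumptions, `shape(agglomerate(cubed, min))` equals `shape(single)`: cubing the entries changes the merge levels but not the order of merges, because both rules depend only on the rank order of the dissimilarities. The same holds for the complete-link rule.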

Direct Clustering of a Data Matrix
Clustering algorithms are now in widespread use for sorting heterogeneous data into homogeneous blocks. If the data consist of a number of variables taking values over a number of cases,…
A hybrid clustering algorithm
This technical note explores how the application of sequential clustering algorithms could be generalized to partition datasets in which there is no natural sequential ordering of the objects.
Monotone invariant clustering procedures
A major justification for the hierarchical clustering methods proposed by Johnson is based upon their invariance with respect to monotone increasing transformations of the original similarity…
Some extensions of Johnson's hierarchical clustering algorithms
Considerable attention has been given in the psychological literature to techniques of data reduction that partition a set of objects into optimally homogeneous groups. This paper is an attempt to…
Overview on techniques in cluster analysis.
This chapter describes a number of methods and algorithms for cluster analysis in a stepwise framework for unsupervised, semisupervised, and supervised classification of patterns into groups.
Information Theoretic Hierarchical Clustering
Two bottom-up hierarchical approaches that exploit an information theoretic proximity measure to explore the nonlinear boundaries between clusters and extract data structures further than the second order statistics are introduced.
The Dynamic Clusters Method and Optimization in Non-Hierarchical Clustering
  • E. Diday
  • Computer Science
  • Optimization Techniques
  • 1973
The main aim of this paper is a synthetic study of optimality properties in spaces formed by partitions of a finite set; it takes as its model a family of particularly efficient techniques of the "cluster centers" type.
Learning the Threshold in Hierarchical Agglomerative Clustering
This paper shows one such solution for complete-link hierarchical agglomerative clustering, using the F-measure and a small subset of labeled examples, and shows promise for semi-supervised learning.
Consensus of Clusterings Based on High-Order Dissimilarities
A DID-based algorithm builds upon an initial data partition, with different initializations producing different data partitions. A validation criterion based on DID is presented to select the best final partition; it consists of estimating graph probabilities for each cluster based on the DID.
Hierarchical Clustering for Large Data Sets
A new method for speeding up hierarchical clustering with cluster seeding is introduced, and this method is compared with a traditional agglomerative hierarchical, average link clustering algorithm using several internal and external cluster validation indices.


Hierarchical Grouping to Optimize an Objective Function
A procedure for forming hierarchical groups of mutually exclusive subsets, each of which has members that are maximally similar with respect to specified characteristics, is suggested for…
The application of computers to taxonomy.
  • P. Sneath
  • Biology, Medicine
  • Journal of general microbiology
  • 1957
SUMMARY: A method is described for handling large quantities of taxonomic data by an electronic computer so as to yield the outline of a classification based on equally weighted features. This…
The analysis of proximities: Multidimensional scaling with an unknown distance function. I.
A computer program is described that is designed to reconstruct the metric configuration of a set of points in Euclidean space on the basis of essentially nonmetric information about that…
Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis
Multidimensional scaling is the problem of representing n objects geometrically by n points, so that the interpoint distances correspond in some sense to experimental dissimilarities between objects.
Hierarchical Linkage Analysis for the Isolation of Types
A method of analysis which would enable investigators to analyze large matrices into hierarchical types, as illustrated in Chart 1, is needed.
The analysis of proximities: Multidimensional scaling with an unknown distance function. II
The first in the present series of two papers described a computer program for multidimensional scaling on the basis of essentially nonmetric data. This second paper reports the results of two kinds…
An Analysis of Perceptual Confusions Among Some English Consonants
Sixteen English consonants were spoken over voice communication systems with frequency distortion and with random masking noise. The listeners were forced to guess at every sound and a count was made…
Principles of Numerical Taxonomy
  • 1963
Sørensen, T. A method of establishing groups of equal amplitude in plant sociology based on…
  • 1963