Agglomerative Information Bottleneck
@inproceedings{Slonim1999AgglomerativeIB, title={Agglomerative Information Bottleneck}, author={Noam Slonim and Naftali Tishby}, booktitle={NIPS}, year={1999} }
We introduce a novel distributional clustering algorithm that maximizes the mutual information per cluster between data and given categories. [...] The algorithm is compared with the top-down soft version of the information bottleneck method, and a relationship between the hard and soft results is established. We demonstrate the algorithm on the 20 Newsgroups data set: for a subset of two newsgroups we achieve compression by three orders of magnitude while losing only 10% of the original mutual information.
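To make the greedy procedure described above concrete, here is a minimal Python sketch of a hard, bottom-up agglomerative information bottleneck pass: start with each value of X in its own cluster and repeatedly merge the pair of clusters whose merge loses the least mutual information with Y. The names (agglomerative_ib, js_divergence, p_xy, n_clusters) are illustrative, and the code favors clarity over the efficiency of the original algorithm.

```python
import numpy as np

def js_divergence(p1, p2, pi1, pi2):
    """Weighted Jensen-Shannon divergence JS_{pi1,pi2}(p1, p2)."""
    p_bar = pi1 * p1 + pi2 * p2
    def kl(a, b):
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))
    return pi1 * kl(p1, p_bar) + pi2 * kl(p2, p_bar)

def agglomerative_ib(p_xy, n_clusters):
    """Greedy bottom-up merging sketch.

    p_xy: joint distribution over (x, y) as an array of shape (n_x, n_y).
    Returns n_clusters groups of x-indices.
    """
    p_x = p_xy.sum(axis=1)                       # marginal p(x)
    p_y_given_x = p_xy / p_x[:, None]            # rows hold p(y | x)

    clusters = [[i] for i in range(len(p_x))]    # start with singleton clusters
    weights = [float(w) for w in p_x]            # p(z) for each cluster z
    conds = [row.copy() for row in p_y_given_x]  # p(y | z) for each cluster z

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                w = weights[i] + weights[j]
                # Information lost by merging clusters i and j:
                # (p(z_i) + p(z_j)) * JS divergence between their p(y | z) rows.
                cost = w * js_divergence(conds[i], conds[j],
                                         weights[i] / w, weights[j] / w)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        w = weights[i] + weights[j]
        conds[i] = (weights[i] * conds[i] + weights[j] * conds[j]) / w
        weights[i] = w
        clusters[i] += clusters[j]
        del clusters[j], weights[j], conds[j]
    return clusters
```

For example, agglomerative_ib(p_xy, 10) would compress documents (X) into 10 clusters while retaining as much mutual information about the categories (Y) as the greedy merges allow.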
453 Citations
Agglomerative Multivariate Information Bottleneck
- Computer Science · NIPS
- 2001
This paper presents a new family of simple agglomerative algorithms for constructing systems of inter-related clusters, analyzes the behavior of these algorithms, and applies them to several real-life datasets.
The Density-Based Agglomerative Information Bottleneck
- Computer Science · PRICAI
- 2008
The concept of density-based chains is adopted to evaluate the information loss among the neighbors of an element, rather than the information loss between pairs of elements, alleviating the sub-optimality problem in aIB while keeping the useful hierarchical clustering tree structure.
The Density Connectivity Information Bottleneck
- Computer Science · 2008 The 9th International Conference for Young Computer Scientists
- 2008
The DCIB algorithm, a density connectivity information bottleneck algorithm, is proposed; it applies the information bottleneck method to quantify the relative information during the clustering procedure, preserving more relative information and achieving higher precision than the aIB algorithm.
Information Bottleneck Co-clustering
- Computer Science
- 2010
An agglomerative Information Bottleneck Co-clustering approach is presented that automatically captures the relation between the numbers of clusters and leverages an annealing-style strategy to bypass local optima.
Multi-way distributional clustering via pairwise interactions
- Computer Science · ICML
- 2005
An extensive empirical study of two-way, three-way and four-way applications of the MDC scheme using six real-world datasets including the 20 News-groups and the Enron email collection shows that the algorithms consistently and significantly outperform previous state-of-the-art information theoretic clustering algorithms.
Information Theoretic Clustering Using Minimum Spanning Trees
- Computer Science · DAGM/OAGM Symposium
- 2012
In this work we propose a new information-theoretic clustering algorithm that infers cluster memberships by direct optimization of a non-parametric mutual information estimate between data…
Finding the Optimal Cardinality Value for Information Bottleneck Method
- Computer Science, Business · ADMA
- 2006
Empirical results in the document clustering scenario indicate that the proposed method works well for determining the optimal parameter value for the information bottleneck method.
Determine the Optimal Parameter for Information Bottleneck Method
- Computer Science · PRICAI
- 2006
Empirical results in the document clustering scenario indicate that the proposed method works well for determining the optimal parameter value for the information bottleneck method.
Data Clustering by Markovian Relaxation and the Information Bottleneck Method
- Computer Science · NIPS
- 2000
This method combines a pairwise-based approach with a vector-quantization method that provides a meaningful interpretation of the resulting clusters; it can cluster data with no geometric or other bias and makes no assumptions about the underlying distribution.
Multivariate Information Bottleneck
- Computer Science · Neural Computation
- 2006
A general principled framework for multivariate extensions of the information bottleneck method is introduced that provides insights about bottleneck variations and enables us to characterize the solutions of these variations.
References
Pairwise Data Clustering by Deterministic Annealing
- Computer Science · IEEE Trans. Pattern Anal. Mach. Intell.
- 1997
A deterministic annealing approach to pairwise clustering is described which shares the robustness properties of maximum entropy inference; the resulting Gibbs probability distributions are estimated by a mean-field approximation.
Distributional Clustering of English Words
- Computer Science · ACL
- 1993
Deterministic annealing is used to find lowest-distortion sets of clusters: as the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data.
Agnostic Classification of Markovian Sequences
- Computer Science · NIPS
- 1997
A method is introduced for the classification of discrete sequences whenever they can be compressed, and its application to hierarchical clustering of languages and to estimating similarities between protein sequences is illustrated.
Divergence measures based on the Shannon entropy
- Computer Science · IEEE Trans. Inf. Theory
- 1991
A novel class of information-theoretic divergence measures based on the Shannon entropy is introduced; these measures do not require the probability distributions involved to satisfy the condition of absolute continuity, and bounds involving them are established.
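As a brief aside, the (generalized) Jensen-Shannon divergence introduced in this reference is the quantity behind the merge cost in the agglomerative IB sketch above. For prior weights $\Pi = (\pi_1, \pi_2)$ with $\pi_1 + \pi_2 = 1$,

$$\mathrm{JS}_{\Pi}(p_1, p_2) \;=\; H(\pi_1 p_1 + \pi_2 p_2) - \pi_1 H(p_1) - \pi_2 H(p_2) \;=\; \pi_1 D_{\mathrm{KL}}(p_1 \,\|\, \bar{p}) + \pi_2 D_{\mathrm{KL}}(p_2 \,\|\, \bar{p}),$$

where $H$ is the Shannon entropy and $\bar{p} = \pi_1 p_1 + \pi_2 p_2$.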
Class-Based n-gram Models of Natural Language
- Computer Science · CL
- 1992
This work addresses the problem of predicting a word from previous words in a sample of text and discusses n-gram models based on classes of words, finding that these models are able to extract classes that have the flavor of either syntactically based or semantically based groupings, depending on the nature of the underlying statistics.
Learning from Dyadic Data
- Computer Science · NIPS
- 1998
This paper proposes an annealed version of the standard EM algorithm for model fitting which is empirically evaluated on a variety of data sets from different domains.
Elements of Information Theory
- Computer Science
- 1991
The authors examine the role of entropy, inequalities, and randomness in the design and construction of codes in a rapidly changing environment.
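Since mutual information is the quantity the whole page is organized around, it may help to recall its standard definition from this text:

$$ I(X;Y) \;=\; \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}, $$

and the agglomerative procedure above compresses X into clusters Z while keeping I(Z;Y) as close as possible to I(X;Y).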
Deterministic annealing for clustering, compression, classification, regression, and related optimization problems
- Computer Science · Proc. IEEE
- 1998
The deterministic annealing approach to clustering and its extensions has demonstrated substantial performance improvement over standard supervised and unsupervised learning methods in a variety of…
The information bottleneck method: Extracting relevant information from concurrent data (unpublished manuscript)
- NEC Research Institute TR
- 1998