Metrics for Clustering Comparison in Bioinformatics

Giovanni Rossi
Developing from a concern in bioinformatics, this work analyses alternative metrics between partitions. From both theoretical and applicative perspectives, a useful and interesting distance between any two partitions is HD, which counts the number of atoms finer than either one but not both. While faithfully reproducing the traditional Hamming distance between subsets, HD is very sensible and computable through scalar products between Boolean vectors. It properly deals with… 
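The abstract's description of HD can be illustrated with a small sketch. It assumes, as is standard for the partition lattice but not spelled out in the text above, that an atom is a partition with exactly one two-element block, so each atom corresponds to an unordered pair of elements, and an atom is finer than a partition exactly when that pair shares a block. The function names `atom_vector`, `dot` and `hd` are hypothetical, chosen for illustration only.

```python
from itertools import combinations

def atom_vector(partition, elements):
    """Boolean vector indexed by unordered pairs of elements.

    Entry is 1 when both elements of the pair lie in the same block,
    i.e. (under the assumption above) when the atom is finer than the partition.
    """
    block_of = {x: i for i, block in enumerate(partition) for x in block}
    return [int(block_of[a] == block_of[b]) for a, b in combinations(elements, 2)]

def dot(u, v):
    """Scalar product of two equal-length 0/1 vectors."""
    return sum(a * b for a, b in zip(u, v))

def hd(p, q, elements):
    """Number of atoms finer than exactly one of p, q, via scalar products."""
    u, v = atom_vector(p, elements), atom_vector(q, elements)
    return dot(u, u) + dot(v, v) - 2 * dot(u, v)

elements = [1, 2, 3, 4]
P = [{1, 2}, {3, 4}]
Q = [{1, 2, 3}, {4}]
print(hd(P, Q, elements))  # 3: pairs {3,4}, {1,3}, {2,3} are co-clustered in only one partition
```

Under this reading, the computation is the ordinary Hamming distance between the two Boolean pair-indicator vectors, which is why three scalar products suffice.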

Partition distances
Alternative novel measures of the distance between any two partitions of an n-set are proposed and compared, together with a main existing one, namely partition-distance D(·, ·). The comparison
Comparing clusterings---an information based distance
This paper considers the problem of finding a consensus partition between the set of these partitions, called central partition, and defines a new graph where the nodes are the strong patterns, by using the concept of strong patterns.
Comparing partitions
The problem of comparing two different partitions of a finite set of objects reappears continually in the clustering literature. We begin by reviewing a well-known measure of partition correspondence
Partition-distance: A problem and class of perfect graphs arising in clustering
  • D. Gusfield
  • Computer Science, Mathematics
    Inf. Process. Lett.
  • 2002
Extremes in the Complexity of Computing Metric Distances Between Partitions
  • W. H. Day, R. S. Wells
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 1984
On Metric Generators of Graphs
Results on the detection of false coins are used to approximate the metric dimension (minimum size of a generator for the metric space defined by the distances) of some particular graphs for which the problem was known and open and the existence of connected joins in graphs can be solved in polynomial time.
Partition-distance via the assignment problem
An algorithm is presented that very efficiently reduces the partition-distance calculation to the classic assignment problem of weighted bipartite graphs that has known polynomial-time solutions.
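A minimal sketch of the reduction mentioned above, under stated assumptions. It takes the partition-distance to be the minimum number of elements that must be deleted so that the two induced partitions coincide, which equals n minus the weight of a best one-to-one matching of blocks, where a matched pair of blocks is weighted by the size of its intersection. To stay dependency-free, the assignment problem is solved here by brute force over permutations; this is a stand-in for a polynomial-time assignment solver and is only practical for a handful of blocks. The function name `partition_distance` is hypothetical.

```python
from itertools import permutations

def partition_distance(p, q):
    """n minus the maximum total overlap over one-to-one block matchings."""
    blocks_p = [set(b) for b in p]
    blocks_q = [set(b) for b in q]
    n = sum(len(b) for b in blocks_p)
    # Pad the shorter list with empty blocks so the assignment is square.
    k = max(len(blocks_p), len(blocks_q))
    blocks_p += [set() for _ in range(k - len(blocks_p))]
    blocks_q += [set() for _ in range(k - len(blocks_q))]
    # Brute-force assignment: try every matching of p-blocks to q-blocks.
    best = max(
        sum(len(blocks_p[i] & blocks_q[j]) for i, j in enumerate(perm))
        for perm in permutations(range(k))
    )
    return n - best

P = [{1, 2}, {3, 4}]
Q = [{1, 2, 3}, {4}]
print(partition_distance(P, Q))  # 1: deleting element 3 makes both partitions {1,2} | {4}
```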
An Introduction to Logical Entropy and its Relation to Shannon Entropy
  • D. Ellerman
  • Computer Science
    Int. J. Semantic Comput.
  • 2013
The logical basis for information theory is the newly developed logic of partitions that is dual to the usual Boolean logic of subsets. The key concept is a "distinction" of a partition, an ordered
Encyclopedia of Distances
This book begins with several metrics in classical geometry, then proceeds to applications of distance in fields like algebra and probability, eventually working through applied mathematics, computer science, physics and chemistry, social science, and even art and religion.