• Corpus ID: 221081667

Correlation Clustering with Asymmetric Classification Errors

@article{Jafarov2020CorrelationCW,
  title={Correlation Clustering with Asymmetric Classification Errors},
  author={Jafar Jafarov and Sanchit Kalhan and Konstantin Makarychev and Yury Makarychev},
  journal={ArXiv},
  year={2020},
  volume={abs/2108.05696}
}
In the Correlation Clustering problem, we are given a weighted graph G with its edges labelled as “similar” or “dissimilar” by a binary classifier. The goal is to produce a clustering that minimizes the weight of “disagreements”: the sum of the weights of “similar” edges across clusters and “dissimilar” edges within clusters. We study the correlation clustering problem under the following assumption: every “similar” edge e has weight w_e ∈ [αw, w] and every “dissimilar” edge e has weight w_e ≥ αw…
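The objective and the weight assumption described in the abstract can be illustrated with a minimal Python sketch. The edge representation, the function names `disagreement_weight` and `satisfies_weight_assumption`, and the toy graph are all hypothetical choices made for illustration. They are not taken from the paper, and the sketch only evaluates a given clustering rather than implementing any approximation algorithm.

```python
from typing import Dict, Hashable, List, Tuple

# Hypothetical edge representation (an assumption of this sketch):
# (endpoint u, endpoint v, label, weight), with label "similar" or "dissimilar".
Edge = Tuple[Hashable, Hashable, str, float]


def disagreement_weight(edges: List[Edge], cluster_of: Dict[Hashable, int]) -> float:
    """Sum the weights of edges that disagree with the clustering.

    A "similar" edge disagrees when its endpoints are in different clusters;
    a "dissimilar" edge disagrees when its endpoints share a cluster.
    """
    total = 0.0
    for u, v, label, weight in edges:
        same_cluster = cluster_of[u] == cluster_of[v]
        if label == "similar" and not same_cluster:
            total += weight
        elif label == "dissimilar" and same_cluster:
            total += weight
    return total


def satisfies_weight_assumption(edges: List[Edge], alpha: float, w: float) -> bool:
    """Check the assumption stated in the abstract:
    similar edges have weight in [alpha * w, w],
    dissimilar edges have weight at least alpha * w.
    """
    for _, _, label, weight in edges:
        if label == "similar" and not (alpha * w <= weight <= w):
            return False
        if label == "dissimilar" and weight < alpha * w:
            return False
    return True


# Toy instance (made up for illustration).
edges = [
    ("a", "b", "similar", 1.0),
    ("b", "c", "similar", 0.5),
    ("a", "c", "dissimilar", 2.0),
]

one_cluster = {"a": 0, "b": 0, "c": 0}
split_c = {"a": 0, "b": 0, "c": 1}

print(disagreement_weight(edges, one_cluster))  # the dissimilar edge a-c is violated
print(disagreement_weight(edges, split_c))      # the similar edge b-c is violated
print(satisfies_weight_assumption(edges, alpha=0.5, w=1.0))
```

In this toy instance, placing all three vertices in one cluster pays for the heavy “dissimilar” edge, while separating c pays only for the lighter “similar” edge b-c, so the second clustering has the smaller disagreement weight.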

Local Correlation Clustering with Asymmetric Classification Errors

TLDR
An O((1/α)^(1/2 − 1/(2p)) · log(1/α)) approximation algorithm is given for Correlation Clustering and an almost matching convex programming integrality gap is shown.

Robust Correlation Clustering with Asymmetric Noise

TLDR
It is demonstrated that l2-norm-diag recovers nodes with sufficiently strong cluster membership in graph instances generated by the NFM, thereby making progress towards establishing the provable robustness of the proposed algorithm.

Four Algorithms for Correlation Clustering: A Survey (master's dissertation, The University of Chicago, Division of the Physical Sciences)

  • Computer Science
  • 2020
TLDR
This exposition focuses on the case when G is complete and unweighted, and explores four approximation algorithms for the Correlation Clustering problem under this assumption.

Correlation Clustering with Sherali-Adams

TLDR
This paper shows that there exists a (1.994 + ε)-approximation algorithm based on O(1/ε²) rounds of the Sherali-Adams hierarchy and reaches an approximation ratio of 2 + ε for Correlation Clustering.

Faster Deterministic Approximation Algorithms for Correlation Clustering and Cluster Deletion

TLDR
This paper proves new relationships between correlation clustering problems and edge labeling problems related to the principle of strong triadic closure, and develops faster techniques that are purely combinatorial, based on computing maximal matchings in certain auxiliary graphs and hypergraphs.

Correlation Clustering via Strong Triadic Closure Labeling: Fast Approximation Algorithms and Practical Lower Bounds

TLDR
This work presents faster approximation algorithms that avoid linear programming relaxations for two well-studied special cases, cluster editing and cluster deletion, by drawing new connections to edge labeling problems related to the principle of strong triadic closure.

Differentially Private Correlation Clustering

TLDR
An algorithm is proposed that achieves subquadratic additive error compared to the optimal cost and a lower bound is given showing that any pure differentially private algorithm for correlation clustering requires additive error of Ω(n).

References

Clustering with qualitative information

TLDR
This work considers the problem of clustering a collection of elements based on pairwise judgments of similarity and dissimilarity, and gives a factor 4 approximation for minimization on complete graphs, and a factor O(log n) approximation for general graphs.

Correlation clustering with noisy input

TLDR
This work uses the natural semi-definite programming relaxation followed by an interesting rounding phase, and uses SDP duality and spectral properties of random matrices to analyse correlation clustering, a type of clustering that takes basic similarity/dissimilarity information as input.

Correlation clustering in general weighted graphs

Correlation Clustering with Noisy Partial Information

TLDR
A semi-random model for the Correlation Clustering problem on arbitrary graphs G is proposed and two approximation algorithms for Correlation Clustering instances from this model are given.

Parallel Correlation Clustering on Big Graphs

TLDR
C4 and ClusterWild!, two algorithms for parallel correlation clustering that provably run in a polylogarithmic number of rounds and achieve nearly linear speedups, are presented.

Breaking the Small Cluster Barrier of Graph Clustering

TLDR
It is proved that small clusters, under certain mild assumptions, do not hinder recovery of large ones, and an iterative algorithm is given that recovers almost all clusters via a "peeling strategy": recover large clusters first, which leads to a reduced problem, and repeat this procedure.

Aggregating inconsistent information: Ranking and clustering

TLDR
This work almost settles a long-standing conjecture of Bang-Jensen and Thomassen and shows that unless NP⊆BPP, there is no polynomial time algorithm for the problem of minimum feedback arc set in tournaments.

Layered label propagation: a multiresolution coordinate-free ordering for compressing social networks

TLDR
Experiments performed show that combining the order produced by the proposed algorithm with the WebGraph compression framework provides a major increase in compression with respect to all currently known techniques, both on web graphs and on social networks.

Bounding and Comparing Methods for Correlation Clustering Beyond ILP

TLDR
This work uses semi-definite programming (SDP) to provide a tighter bound on the NP-hard problem of partitioning a dataset given pairwise affinities between all points, showing that simple algorithms are already close to optimality.

Near Optimal LP Rounding Algorithm for Correlation Clustering on Complete and Complete k-partite Graphs

TLDR
These results improve a long line of work on approximation algorithms for correlation clustering in complete graphs, previously culminating in a ratio of 2.5 by Ailon, Charikar and Newman.