Maria-Florina Balcan

This paper provides new algorithms for distributed clustering for two popular center-based objectives, k-median and k-means. These algorithms have provable guarantees and improve communication complexity over existing approaches. Following a classic approach in clustering by [13], we reduce the problem of finding a clustering with low cost to the problem of ...
The typical algorithmic problem in viral marketing aims to identify a set of influential users in a social network, who, when convinced to adopt a product, shall influence other users in the network and trigger a large cascade of adoptions. However, the host (the owner of an online social platform) often faces more constraints than a single product, endless ...
Table 1: Learnable/not learnable in polynomial time; k is assumed to be constant.

Concept class        Consistency model          Mistake Bound Model      PAC model
Disjunctions         Yes                        Yes                      Yes
Conjunctions         Yes                        Yes                      Yes
k-CNF                Yes                        Yes                      Yes
k-term DNF           No (NP ≠ RP assumption)    Yes                      Yes
k-Decision Lists     Yes                        Yes                      Yes
Linear separators    Yes                        Yes                      Yes
Blum’91              Yes                        No (crypto assumption)   Yes
poly size circuits   ...
This thesis explores the power of interactivity in unsupervised machine learning problems. Interactive algorithms employ feedback-driven measurements to mitigate the cost of data acquisition and consequently enable statistical analysis in otherwise intractable settings. Unsupervised learning methods are fundamental tools across a variety of domains, and ...
We introduce a new approach for designing computationally efficient learning algorithms that are tolerant to noise, and demonstrate its effectiveness by designing algorithms with improved noise tolerance guarantees for learning linear separators. We consider both the malicious noise model of Valiant [Valiant 1985; Kearns and Li 1988] and the adversarial ...
Computational social science is an emerging academic research area at the intersection of computer science, statistics, and the social sciences, in which quantitative methods and computational tools are used to identify and answer social science questions. The field is driven by new sources of data from the Internet, sensor networks, government databases, ...