Clustering Data Streams
TLDR
This work gives constant-factor approximation algorithms for the k-median problem in the data stream model of computation in a single pass, and shows negative results implying that these algorithms cannot be improved in a certain sense.
Streaming-data algorithms for high-quality clustering
TLDR
This work describes a streaming algorithm that effectively clusters large data streams and provides empirical evidence of the algorithm's performance on synthetic and real data streams.
Clustering Data Streams: Theory and Practice
TLDR
This work describes a streaming algorithm that effectively clusters large data streams and provides empirical evidence of the algorithm's performance on synthetic and real data streams.
Releasing search queries and clicks privately
TLDR
This paper demonstrates, via a collection of experiments on a real search log, that a non-negligible fraction of queries and clicks can indeed be safely published. It then selects an application, keyword generation, and shows that the keyword suggestions generated from the perturbed data resemble those generated from the original data.
Robust Random Cut Forest Based Anomaly Detection on Streams
TLDR
A robust random cut data structure that can be used as a sketch or synopsis of the input stream is investigated, and it is shown how the sketch can be efficiently updated in a dynamic data stream.
Privacy via the Johnson-Lindenstrauss Transform
TLDR
This work shows that distance computation with privacy is an achievable goal by projecting each user's representation into a random, lower-dimensional space via a sparse Johnson-Lindenstrauss transform and then adding Gaussian noise to each entry of the lower-dimensional representation.
Simulatable auditing
TLDR
It is demonstrated that sum queries can be audited in a simulatable fashion under the classical definition of privacy, where a breach occurs if a sensitive value is fully compromised, and a probabilistic notion of (partial) compromise is introduced.
Secure computation of the kth-ranked element
TLDR
The multi-party protocol can be used in the two-party case and can be made secure against a malicious adversary, and can hide the sizes of the original datasets.
When Random Sampling Preserves Privacy
TLDR
This work quantitatively examines the relationship between the number of rare values in a table and the privacy in a released random sample, and establishes a direct connection between sample size that is safe to release and privacy.
Secure Computation of the kth-Ranked Element
TLDR
The multi-party protocol can be used in the two-party case and can be made secure against a malicious adversary, and can hide the sizes of the original datasets.