Collecting and Analyzing Multidimensional Data with Local Differential Privacy

@inproceedings{Wang2019CollectingAA,
  title={Collecting and Analyzing Multidimensional Data with Local Differential Privacy},
  author={Ning Wang and Xiaokui Xiao and Yin Yang and Jun Zhao and Siu Cheung Hui and Hyejin Shin and Junbum Shin and Ge Yu},
  booktitle={2019 IEEE 35th International Conference on Data Engineering (ICDE)},
  year={2019},
  pages={638--649}
}
  • N. Wang, Xiaokui Xiao, +5 authors Ge Yu
  • Published 8 April 2019
  • Computer Science, Geology
  • 2019 IEEE 35th International Conference on Data Engineering (ICDE)
Local differential privacy (LDP) is a recently proposed privacy standard for collecting and analyzing data, which has been used, e.g., in the Chrome browser, iOS and macOS. In LDP, each user perturbs her information locally, and only sends the randomized version to an aggregator who performs analyses, which protects both the users and the aggregator against private information leaks. Although LDP has attracted much research attention in recent years, the majority of existing work focuses on…
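The local perturbation step described in the abstract can be illustrated with classic binary randomized response. This is a minimal sketch for intuition only, assuming each user holds a single private bit; it is not the mechanism proposed in the paper, and the function names `perturb_bit`, `estimate_frequency` and `simulate` are hypothetical.

```python
import math
import random
from typing import Sequence


def perturb_bit(bit: int, epsilon: float, rng: random.Random) -> int:
    """User side: keep the true bit with probability e^eps / (e^eps + 1), else flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_frequency(reports: Sequence[int], epsilon: float) -> float:
    """Aggregator side: unbiased estimate of the fraction of users whose true bit is 1.

    E[observed] = (1 - p) + f * (2p - 1), so we invert that relation for f.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


def simulate(true_bits: Sequence[int], epsilon: float, seed: int = 0) -> float:
    """Perturb every user's bit locally, then estimate the true frequency of 1s."""
    rng = random.Random(seed)
    reports = [perturb_bit(b, epsilon, rng) for b in true_bits]
    return estimate_frequency(reports, epsilon)
```

The aggregator only ever sees the randomized reports, yet the estimate concentrates around the true frequency as the number of users grows; a smaller `epsilon` gives stronger privacy at the cost of a noisier estimate.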
Local Differential Privacy for data collection and analysis
TLDR
This paper designs two (ε, δ)-LDP algorithms for mean estimations on multi-dimensional numeric data, which can ensure higher accuracy than the optimal Gaussian mechanism, and investigates different local protocols for frequency estimations in collecting and analyzing users’ data.
Random Sampling Plus Fake Data: Multidimensional Frequency Estimates With Local Differential Privacy
TLDR
It is argued that aggregators (who are also seen as attackers) are aware of the sampled attribute and its LDP value, which is protected by a "less strict" e^ε probability bound (rather than e^(ε/d)). The proposed solution, Random Sampling plus Fake Data (RS+FD), creates uncertainty over the sampled attribute by generating fake data for each non-sampled attribute; RS+FD further benefits from amplification by sampling.
Locally Differentially Private Data Collection and Analysis
TLDR
Novel algorithms for collecting multi-dimensional numeric data that can ensure higher accuracy than the optimal Gaussian mechanism while guaranteeing strong privacy for each user are proposed with high utility in data analytics and machine learning.
Local Differential Privacy and Its Applications: A Comprehensive Survey
TLDR
This survey provides a comprehensive and structured overview of local differential privacy, summarises and analyzes state-of-the-art research in LDP, and compares a range of methods in the context of answering a variety of queries and training different machine learning models.
BiSample: Bidirectional Sampling for Handling Missing Data with Local Differential Privacy
TLDR
This paper proposes BiSample, a bidirectional sampling technique for value perturbation in the framework of LDP, and combines the BiSample mechanism with users' privacy preferences for missing data perturbations.
Improving the Utility of Locally Differentially Private Protocols for Longitudinal and Multidimensional Frequency Estimates
TLDR
This paper proposes a new solution named Adaptive LDP for LOngitudinal and Multidimensional FREquency Estimates (ALLOMFREE), which randomly samples a single attribute to send with the whole privacy budget and adaptively selects the optimal protocol, i.e., either L-GRR or L-OSUE.
Task-aware Privacy Preservation for Multi-dimensional Data
TLDR
The key idea is to use an encoder-decoder framework to learn (and anonymize) a task-relevant latent representation of user data, which gives an analytical near-optimal solution for a linear setting with mean-squared error (MSE) task loss.
Efficient Discrete Distribution Estimation Schemes Under Local Differential Privacy
TLDR
A family of new efficient discrete distribution estimation schemes under LDP is proposed; the schemes reduce the communication cost to less than \(O(\log(2+e^{\epsilon}))\) and obtain almost the same expected estimation loss as the Ye-Barg mechanisms under the \(\ell_2^2\) and \(\ell_1\) metrics.
Locally Differentially Private Frequency Estimation with Consistency
TLDR
It is shown that adding post-processing steps to FO protocols by exploiting the knowledge that all individual frequencies should be non-negative and they sum up to one can lead to significantly better accuracy for a wide range of tasks, including frequencies of individual values, frequencies of the most frequent values, and frequencies of subsets of values.
Conditional Analysis for Key-Value Data with Local Differential Privacy
TLDR
This paper develops a set of new perturbation mechanisms for key-value data collection and analysis under the strong model of local differential privacy, and proposes the conditional frequency estimation method for key analysis and the conditional mean estimation for value analysis in key-value data.

References

SHOWING 1-10 OF 48 REFERENCES
Heavy Hitter Estimation over Set-Valued Data with Local Differential Privacy
TLDR
The main idea is to first gather a candidate set of heavy hitters using a portion of the privacy budget, and focus the remaining budget on refining the candidate set in a second phase, which is much more efficient budget-wise than obtaining the heavy hitters directly from the whole dataset.
Locally Differentially Private Heavy Hitter Identification
TLDR
In the proposed LDP protocol, which the authors call the Prefix Extending Method (PEM), users are divided into groups, with each user in a group reporting a prefix of her value. Experiments show that under the same privacy guarantee and computational cost, PEM has better utility on both synthetic and real-world datasets than existing solutions.
Locally Differentially Private Frequent Itemset Mining
TLDR
This paper formally defines padding-and-sampling-based frequency oracles (PSFO), identifies the privacy amplification property in PSFO, and proposes SVIM, a protocol for finding frequent items in the set-valued LDP setting, which significantly improves over existing methods.
PrivTrie: Effective Frequent Term Discovery under Local Differential Privacy
  • Ning Wang, Xiaokui Xiao, +4 authors Ge Yu
  • Computer Science
  • 2018 IEEE 34th International Conference on Data Engineering (ICDE)
  • 2018
TLDR
The proposed PrivTrie directly collects frequent terms from users by iteratively constructing a trie under LDP with a novel adaptive approach that conserves privacy budget by building internal nodes of the trie with the lowest level of accuracy necessary.
Toward Distribution Estimation under Local Differential Privacy with Small Samples
TLDR
This paper focuses on the EM (Expectation-Maximization) reconstruction method, which is a state-of-the-art statistical inference method, and proposes a method to correct its estimation error, and proves that the proposed method reduces the MSE (Mean Square Error) under some assumptions.
Private spatial data aggregation in the local setting
TLDR
A new privacy model called personalized local differential privacy (PLDP) is proposed that achieves desirable utility while still providing rigorous privacy guarantees, and an efficient personalized count estimation protocol is designed as a building block for achieving PLDP.
Locally Differentially Private Protocols for Frequency Estimation
TLDR
This paper introduces a framework that generalizes several LDP protocols proposed in the literature and yields a simple and fast aggregation algorithm, whose accuracy can be precisely analyzed, resulting in two new protocols that provide better utility than protocols previously proposed.
PrivateClean: Data Cleaning and Differential Privacy
TLDR
PrivateClean explores the link between data cleaning and differential privacy in a framework that includes a technique for creating private datasets of numerical and discrete-valued attributes, a formalism for privacy-preserving data cleaning, and techniques for answering sum, count, and avg queries after cleaning.
Extremal Mechanisms for Local Differential Privacy
TLDR
It is shown that for all information theoretic utility functions studied in this paper, maximizing utility is equivalent to solving a linear program, the outcome of which is the optimal staircase mechanism, which is universally optimal in the high and low privacy regimes.
LoPub: High-Dimensional Crowdsourced Data Publication With Local Differential Privacy
TLDR
A locally differentially private high-dimensional data publication algorithm (LoPub) is developed by taking advantage of the distribution estimation techniques, and correlations among multiple attributes are identified to reduce the dimensionality of crowdsourced data, thus speeding up the distribution learning process and achieving high data utility.