Composition attacks and auxiliary information in data privacy

Srivatsava Ranjit Ganta, Shiva Prasad Kasiviswanathan, and Adam D. Smith
Privacy is an increasingly important aspect of data publishing. Reasoning about privacy, however, is fraught with pitfalls. One of the most significant is the auxiliary information (also called external knowledge, background knowledge, or side information) that an adversary gleans from other channels such as the web, public records, or domain knowledge. This paper explores how one can reason about privacy in the face of rich, realistic sources of auxiliary information. Specifically, we… 

Personal privacy vs population privacy: learning to attack anonymization

It is demonstrated that even under Differential Privacy, such classifiers can be used to infer "private" attributes accurately in realistic data, and it is observed that the accuracy of inference of private attributes for differentially private data and l-diverse data can be quite similar.

Privacy in Databases

It is proved that absolute disclosure prevention is impossible, which means that a person who gains access to a database can always breach the privacy of an individual.

Individual Privacy vs Population Privacy: Learning to Attack Anonymization

It is demonstrated that even under Differential Privacy, such classifiers can be used to accurately infer "private" attributes in realistic data and the accuracy of inference of private attributes for Differentially Private data and l-diverse data can be quite similar.

Certain Investigations on Approaches for Protecting Graph Privacy in Data Anonymization

This paper systematically analyzes the pure structure anonymization mechanisms and models proposed in the literature and makes a detailed study of the k-degree-l-diversity anonymity model, which also takes into consideration the structural information and sensitive labels of individuals.

A Note on Data Privacy: Past, Present, and Future

The problem of data privacy perhaps goes back to 2000 when Rakesh Agrawal and Ramakrishnan Srikant published their seminal paper on privacy-preserving data mining [1]. The most influential data

No free lunch in data privacy

This paper argues that privacy of an individual is preserved when it is possible to limit the inference of an attacker about the participation of the individual in the data generating process, different from limiting the inference about the presence of a tuple.

GUPT: privacy preserving data analysis made easy

The design and evaluation of a new system, GUPT, that guarantees differential privacy to programs not developed with privacy in mind, makes no trust assumptions about the analysis program, and is secure against all known classes of side-channel attacks.

Differentially private data release for data mining

This paper proposes the first anonymization algorithm for the non-interactive setting based on the generalization technique, which first probabilistically generalizes the raw data and then adds noise to guarantee ε-differential privacy.



L-diversity: privacy beyond k-anonymity

This paper shows with two simple attacks that a k-anonymized dataset has some subtle but severe privacy problems, and proposes a novel and powerful privacy definition called l-diversity, which is practical and can be implemented efficiently.

Minimality Attack in Privacy Preserving Data Publishing

This paper introduces a model called m-confidentiality which deals with minimality attacks, and proposes a feasible solution that can prevent such attacks with very little overhead and information loss.

Privacy Skyline: Privacy with Multidimensional Adversarial Knowledge

A novel multidimensional approach to quantifying an adversary's external knowledge is proposed, which allows the publishing organization to investigate privacy threats and enforce privacy requirements in the presence of various types and amounts of external knowledge.

Limiting privacy breaches in privacy preserving data mining

This paper presents a new formulation of privacy breaches, together with a methodology, "amplification", for limiting them, instantiates this methodology for the problem of mining association rules, and modifies the algorithm from [9] to limit privacy breaches without knowledge of the data distribution.

Anonymity for continuous data publishing

This paper systematically characterizes the correspondence attacks and proposes an efficient anonymization algorithm to thwart the attacks in the model of continuous data publishing.

Privacy Preserving Data Mining

This work considers a scenario in which two parties owning confidential databases wish to run a data mining algorithm on the union of their databases, without revealing any unnecessary information, and proposes a protocol that is considerably more efficient than generic solutions and demands both very few rounds of communication and reasonable bandwidth.

Worst-Case Background Knowledge for Privacy-Preserving Data Publishing

A language that can express any background knowledge about the data is proposed and a polynomial time algorithm is provided to measure the amount of disclosure of sensitive information in the worst case, given that the attacker has at most k pieces of information in this language.

Maintaining K-Anonymity against Incremental Updates

This paper investigates the problem of maintaining k-anonymity against incremental updates, proposes the monotonic incremental anonymization property, and presents a new approach that uses the steadily accumulating data to reduce information loss.

Secure Anonymization for Incremental Datasets

This paper analyzes various inference channels that may exist in multiple anonymized datasets and discusses how to avoid such inferences, and presents an approach to securely anonymizing a continuously growing dataset in an efficient manner while assuring high data quality.

Personalized privacy preservation

The authors' technique performs the minimum generalization needed to satisfy everybody's requirements and thus retains the largest amount of information from the microdata; the paper also establishes the superiority of the proposed solutions.