# The hardness and approximation algorithms for l-diversity

@inproceedings{Xiao2010TheHA, title={The hardness and approximation algorithms for l-diversity}, author={Xiaokui Xiao and Ke Yi and Yufei Tao}, booktitle={EDBT '10}, year={2010} }

The existing solutions to privacy preserving publication can be classified into the theoretical and heuristic categories. The former guarantees provably low information loss, whereas the latter incurs gigantic loss in the worst case, but is shown empirically to perform well on many real inputs. While numerous heuristic algorithms have been developed to satisfy advanced privacy principles such as l-diversity, t-closeness, etc., the theoretical category is currently limited to k-anonymity which…
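As a rough illustration of the two principles named above, here is a minimal sketch. It assumes the usual k-anonymity condition (every group of records sharing the same quasi-identifier values has at least k records) and the simple "distinct" reading of l-diversity (every such group contains at least l distinct sensitive values). The paper itself may use a different definition, for example a frequency-based one. The function names, column names, and toy table are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def group_by_qi(rows, qi_cols):
    """Group records by their quasi-identifier (QI) values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in qi_cols)
        groups[key].append(row)
    return groups


def is_k_anonymous(rows, qi_cols, k):
    """True if every QI group contains at least k records."""
    return all(len(g) >= k for g in group_by_qi(rows, qi_cols).values())


def is_distinct_l_diverse(rows, qi_cols, sensitive_col, l):
    """True if every QI group has at least l distinct sensitive values.

    Assumption: this is the 'distinct' variant of l-diversity only.
    """
    return all(
        len({r[sensitive_col] for r in g}) >= l
        for g in group_by_qi(rows, qi_cols).values()
    )


# Hypothetical, already-generalized toy table.
table = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cancer"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
]

k_ok = is_k_anonymous(table, ["age", "zip"], 2)
l_ok = is_distinct_l_diverse(table, ["age", "zip"], "disease", 2)
```

In this toy table both groups have two records, so the 2-anonymity check passes, but the second group contains only one sensitive value, so the distinct 2-diversity check fails. That gap is the kind of weakness that motivates moving beyond k-anonymity.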

## 60 Citations

The l-Diversity problem: Tractability and approximability

- Computer Science
- Theor. Comput. Sci.
- 2013

On the Complexity of the l-diversity Problem

- Computer Science, Mathematics
- MFCS
- 2011

This paper investigates the approximation and parameterized complexity of l-diversity, where the attributes are divided into sensitive attributes and quasi-identifier attributes.

On the Complexity of t-Closeness Anonymization and Related Problems

- Computer Science, Mathematics
- DASFAA
- 2013

It is proved that for every constant $t$ such that $0\leq t<1$, it is NP-hard to find an optimal $t$-closeness generalization of a given table.

Randomized addition of sensitive attributes for l-diversity

- Computer Science
- 2014 11th International Conference on Security and Cryptography (SECRYPT)
- 2014

This paper proposes a new technique for l-diversity that keeps QIDs unchanged and randomizes the sensitive attributes of each individual, so that data users can analyze the data based on the QIDs they focus on; the technique does not depend on the eligibility requirement.

An Algorithm for l-diversity based on Randomized Addition of Sensitive Values

- Computer Science
- 2015

This study proposes a new technique for l-diversity, which keeps QIDs unchanged so that data users can analyze it based on QIDs they focus on, and proves that the proposed method can result in a better tradeoff between privacy and utility of the anonymized database.

A generalization model for multi-record privacy preservation

- Computer Science
- J. Ambient Intell. Humaniz. Comput.
- 2020

A bidirectional personalized generalization model is proposed as a new solution to satisfy higher privacy requirements and make it suitable for multi-record publishing datasets, and a new hierarchical generalization strategy is proposed for personal privacy preservation of sensitive attributes.

The effect of homogeneity on the computational complexity of combinatorial data anonymization

- Computer Science, Mathematics
- Data Mining and Knowledge Discovery
- 2012

The fixed-parameter tractability result implies that k-Anonymity can be solved in linear time when $t_{in}$ is a constant, and the computational hardness results extend to p-Sensitivity and the usage of domain generalization hierarchies.

An enhanced l-diversity privacy preservation

- Computer Science
- 2013 10th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)
- 2013

A (k, l, θ)-diversity model based on clustering is proposed to minimize the information loss as well as assure data quality, and extensive experimental evaluation shows that the techniques clearly outperform the existing approaches in terms of execution time and data utility.

Data Anonymization Based on Natural Equivalent Class

- Computer Science
- 2019 IEEE 23rd International Conference on Computer Supported Cooperative Work in Design (CSCWD)
- 2019

This paper proposes a novel clustering-based anonymization algorithm, which tries to cluster records without separating any natural equivalent class, and proves that the natural equivalent class can effectively reduce the computational complexity of clustering algorithms as well as information loss.

(l1, ..., lq)-diversity for Anonymizing Sensitive Quasi-Identifiers

- Computer Science
- 2015 IEEE Trustcom/BigDataSE/ISPA
- 2015

This paper proposes a novel privacy definition (l1, ..., lq)-diversity and a method that can treat sensitive QIDs, which is composed of an anonymization algorithm and a reconstruction algorithm.

## References

SHOWING 1-10 OF 69 REFERENCES

On optimal anonymization for l+-diversity

- Computer Science
- 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)
- 2010

A pruning based algorithm for finding an optimal solution to an extended form of the l-diversity problem that can be instantiated with any reasonable cost metric and improves the data utility.

On the complexity of optimal K-anonymity

- Computer Science
- PODS '04
- 2004

It is proved that two general versions of optimal k-anonymization of relations are NP-hard, including the suppression version which amounts to choosing a minimum number of entries to delete from the relation.

Approximate algorithms for K-anonymity

- Computer Science
- SIGMOD '07
- 2007

This paper proposes several approximation algorithms that guarantee an O(log k)-approximation ratio and perform significantly better than the traditional algorithms, and also provides O(β log k)-approximate algorithms which gracefully adjust their running time according to the tolerance β (≥ 1) of the approximation ratios.

Fast Data Anonymization with Low Information Loss

- Computer Science
- VLDB
- 2007

This paper focuses on one-dimensional (i.e., single attribute) quasi-identifiers, studies the properties of optimal solutions for k-anonymity and l-diversity, and develops efficient heuristics to solve the one-dimensional problems in linear time based on meaningful information loss metrics.

On k-Anonymity and the Curse of Dimensionality

- Computer Science
- VLDB
- 2005

It is shown that the curse of high dimensionality also applies to the problem of privacy preserving data mining, and when a data set contains a large number of attributes which are open to inference attacks, it becomes difficult to anonymize the data without an unacceptably high amount of information loss.

Aggregate Query Answering on Anonymized Tables

- Computer Science
- 2007 IEEE 23rd International Conference on Data Engineering
- 2007

A general framework of permutation-based anonymization to support accurate answering of aggregate queries is presented, and it is shown that, for the same grouping, permutation-based techniques can always answer aggregate queries more accurately than generalization-based approaches.

t-Closeness: Privacy Beyond k-Anonymity and l-Diversity

- Computer Science
- 2007 IEEE 23rd International Conference on Data Engineering
- 2007

The k-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain…

L-diversity: privacy beyond k-anonymity

- Computer Science
- 22nd International Conference on Data Engineering (ICDE'06)
- 2006

This paper shows with two simple attacks that a k-anonymized dataset has some subtle, but severe privacy problems, and proposes a novel and powerful privacy definition called l-diversity, which is practical and can be implemented efficiently.

Data privacy through optimal k-anonymization

- Computer Science
- 21st International Conference on Data Engineering (ICDE'05)
- 2005

This paper proposes and evaluates an optimization algorithm for the powerful de-identification procedure known as k-anonymization, and presents a new approach to exploring the space of possible anonymizations that tames the combinatorics of the problem, and develops data-management strategies to reduce reliance on expensive operations such as sorting.

Utility-based anonymization using local recoding

- Computer Science
- KDD '06
- 2006

This paper proposes a simple framework to specify the utility of attributes and develops two simple yet efficient heuristic local recoding methods for utility-based anonymization, which outperform the state-of-the-art multidimensional global recoding methods in both discernability and query answering accuracy.