Tools for privacy preserving distributed data mining
TLDR
This paper presents components of a toolkit that can be combined for specific privacy-preserving data mining applications, and shows how they can be used to solve several privacy-preserving data mining problems.
Privacy-preserving distributed mining of association rules on horizontally partitioned data
TLDR
This work addresses secure mining of association rules over horizontally partitioned data by incorporating cryptographic techniques to minimize the information shared, while adding little overhead to the mining task.
Privacy-preserving k-means clustering over vertically partitioned data
TLDR
This work presents a method for k-means clustering when different sites contain different attributes for a common set of entities, where each site learns the cluster of each entity, but learns nothing about the attributes at other sites.
Privacy-preserving Naïve Bayes classification
TLDR
This paper brings privacy preservation to Naïve Bayes classification, presenting protocols to develop a Naïve Bayes classifier on both vertically and horizontally partitioned data.
Secure set intersection cardinality with application to association rule mining
TLDR
This paper presents an efficient protocol for securely determining the size of set intersection, and shows how this can be used to generate association rules where multiple parties have different (and private) information about the same set of individuals.
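As a rough illustration of how two parties might learn only the size of a set intersection, the sketch below uses commutative encryption (modular exponentiation with private exponents). It is a simplified two-party simulation in one process, not the paper's protocol: the modulus `P`, the helper names, and the hashing step are all assumptions chosen for readability, and the parameters are not secure for real use.

```python
import hashlib
import math
import secrets

# Assumed public modulus: the Mersenne prime 2**127 - 1 (illustrative, not a secure choice).
P = 2**127 - 1


def hash_to_group(item: str) -> int:
    """Map an item to an integer in [1, P-1] (hypothetical helper)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_key() -> int:
    """Pick a private exponent coprime to P-1 so exponentiation is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def encrypt(values, key):
    """Commutative 'encryption': (x**a)**b == (x**b)**a (mod P)."""
    return [pow(v, key, P) for v in values]


def intersection_size(items_a, items_b) -> int:
    """Simulate both parties locally and return |A ∩ B|."""
    key_a, key_b = new_key(), new_key()
    # Each party hashes and encrypts its own items with its own key.
    once_a = encrypt([hash_to_group(x) for x in set(items_a)], key_a)
    once_b = encrypt([hash_to_group(x) for x in set(items_b)], key_b)
    # The parties swap ciphertexts (sorted to hide the original order)
    # and each applies its own key to the other's values.
    twice_a = set(encrypt(sorted(once_a), key_b))
    twice_b = set(encrypt(sorted(once_b), key_a))
    # Doubly encrypted values match exactly when the underlying items match.
    return len(twice_a & twice_b)
```

For example, `intersection_size(["a", "b", "c"], ["b", "c", "d"])` counts the two shared items without either side comparing plaintext values directly.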
Privacy-preserving data integration and sharing
TLDR
This paper lays out a privacy framework for data integration, set in the context of existing accomplishments in data integration, and addresses challenges and opportunities for the data mining community.
How Much Is Enough? Choosing ε for Differential Privacy
TLDR
The probability of identifying any particular individual as being in the database is considered, and the challenge of setting the proper value of ε, given the goal of protecting individuals in the database with some fixed probability, is demonstrated.
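To make the role of ε concrete, the sketch below shows the standard Laplace mechanism for a counting query, where the noise scale is sensitivity / ε. It does not reproduce the paper's procedure for choosing ε; the function names and the default sensitivity of 1 are assumptions for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(true_count: float, epsilon: float,
                  sensitivity: float = 1.0, rng: random.Random = None) -> float:
    """Return a noisy count; smaller epsilon means more noise (hypothetical helper)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

With ε = 0.1 the expected absolute error of a count is about 10, while with ε = 1.0 it is about 1, which is the privacy-versus-accuracy trade-off that any choice of ε has to settle.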
Using unknowns to prevent discovery of association rules
TLDR
This work introduces a method for selectively removing individual values from a database to prevent the discovery of a set of rules, while preserving the data for other applications.
Semantic Integration in Heterogeneous Databases Using Neural Networks
TLDR
This work presents a procedure that uses a classifier to categorize attributes according to their field specifications and data values, then trains a neural network to recognize similar attributes, and presents a technique to match equivalent data elements.
...
...