Privacy preserving schema and data matching

@inproceedings{Scannapieco2007PrivacyPS,
  title={Privacy preserving schema and data matching},
  author={Monica Scannapieco and Ilya Figotin and Elisa Bertino and Ahmed K. Elmagarmid},
  booktitle={SIGMOD '07},
  year={2007}
}
In many business scenarios, record matching is performed across different data sources with the aim of identifying common information shared among these sources. However, such a need is often in contrast with privacy requirements concerning the data stored by the sources. In this paper, we propose a protocol for record matching that preserves privacy both at the data level and at the schema level. Specifically, if two sources need to identify their common data, by running the protocol they can…
A survey of privacy preserving data integration
  • V. Shelake, N. Shekokar
  • Computer Science
  • 2017 International Conference on Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT)
  • 2017
TLDR
This survey presents the challenges, a review of existing work, and research directions for privacy preserving data integration, noting that some existing approaches compromise accuracy while maintaining the privacy of integrated data.
Efficient privacy-aware record integration
TLDR
A novel model for practical PRL is introduced, which affords controlled and limited information leakage, avoids false matches resulting from data transformation, and enables efficiency and privacy.
A Hybrid Approach to Private Record Matching
TLDR
This work proposes a hybrid technique that operates over sanitized data to filter out, in a privacy-preserving manner, pairs of records that do not satisfy the matching condition, and provides a formal definition of privacy.
Blind attribute pairing for privacy-preserving record linkage
TLDR
A novel privacy-preserving approach for attribute pairing to aid PPRL applications and demonstrates that this approach improves considerably the efficiency and effectiveness in comparison to a state-of-the-art baseline.
Scalable Blocking for Privacy Preserving Record Linkage
TLDR
This work proposes Multi-Sampling Transitive Closure for Encrypted Fields (MS-TCEF), a novel privacy preserving blocking technique based on the use of reference sets that effectively prunes records based on redundant assignments to blocks, providing better fault-tolerance and maintaining result quality while scaling linearly with respect to the dataset size.
Privacy Preserving Record Linkage via grams Projections
TLDR
This paper develops an embedding strategy based on frequent variable-length grams mined in a private way from the original data and introduces a personalized threshold for matching individual records in the embedded space, which achieves better linkage accuracy than the existing global threshold approach.
An Integrated Approach For Efficient Privacy Preserving Distributed Data
Privacy and security, particularly maintaining confidentiality of data, have become challenging issues with advances in information and communication technology. The ability to communicate and share…
An Efficient Two-Party Protocol for Approximate Matching in Private Record Linkage
TLDR
A novel two-party protocol for PPRL that addresses scalability, security and quality/accuracy, and allows quality approximate matching while still preserving the privacy of the databases that are matched; the protocol can be useful for many real-world applications requiring PPRL.
Tree Based Scalable Indexing for Multi-Party Privacy-Preserving Record Linkage
TLDR
Experiments conducted with datasets of sizes up to one million records show that the proposed protocol is scalable with both the size of the datasets and the number of parties, while providing better blocking quality and privacy than a phonetic-based indexing approach.
A taxonomy of privacy-preserving record linkage techniques
TLDR
This paper presents an overview of techniques that allow the linking of databases between organizations while at the same time preserving the privacy of these data, and presents a taxonomy of PPRL techniques to characterize these techniques along 15 dimensions.

References

SHOWING 1-10 OF 37 REFERENCES
Privacy Preserving Query Processing Using Third Parties
TLDR
A new query processing technique using third parties in a peer-to-peer system that is able to answer queries without revealing any useful information to the data sources or to the third parties is proposed.
Tools for privacy preserving distributed data mining
TLDR
This paper presents some components of a toolkit that can be combined for specific privacy-preserving data mining applications, and shows how they can be used to solve several privacy-preserving data mining problems.
Blocking-aware private record linkage
TLDR
The proposed blocking-aware private record linkage can perform large-scale record linkage without compromising privacy and has the potential to improve the performance of record linkage significantly while being secure.
Efficient Private Matching and Set Intersection
TLDR
This work considers the problem of computing the intersection of private datasets of two parties, where the datasets contain lists of elements taken from a large domain, and presents protocols, based on the use of homomorphic encryption and balanced hashing, for both semi-honest and malicious environments.
A Secure Protocol for Computing String Distance Metrics
TLDR
This paper proposes a stochastic scalar product protocol that is provably consistent, and is also as secure as an underlying set-intersection cryptographic protocol, and uses it to perform secure computation of some standard distance metrics like TFIDF, SoftTFIDF and the Euclidean Distance Metric.
Information sharing across private databases
TLDR
This work formalizes the notion of minimal information sharing across private databases, and develops protocols for intersection, equijoin, intersection size, and equijoin size.
Sovereign Joins
TLDR
This work presents a secure network service for sovereign information sharing whose only trusted component is an off-the-shelf secure coprocessor, and specifies criteria for proving the security of a join algorithm and provides provably safe algorithms.
Distributed Privacy Preserving Information Sharing
TLDR
A measure of privacy leakage for information sharing systems is defined and protocols that can effectively and efficiently protect privacy against different kinds of malicious adversaries are proposed.
A knowledge-based approach for duplicate elimination in data cleaning
TLDR
An experimental study with two real-world datasets shows that the generic knowledge-based framework for effective data cleaning can accurately identify duplicates and anomalies with high recall and precision, thus effectively resolving the recall–precision dilemma.
Secure and private sequence comparisons
We give an efficient protocol for sequence comparisons of the edit-distance kind, such that neither party reveals anything about their private sequence to the other party (other than what can be…