Corpus ID: 221081368

InstaHide: Instance-hiding Schemes for Private Distributed Learning

@inproceedings{Huang2020InstaHideIS,
  title={InstaHide: Instance-hiding Schemes for Private Distributed Learning},
  author={Yangsibo Huang and Zhao Song and K. Li and Sanjeev Arora},
  booktitle={ICML},
  year={2020}
}
How can multiple distributed entities collaboratively train a shared deep net on their private data while preserving privacy? This paper introduces InstaHide, a simple encryption of training images, which can be plugged into existing distributed deep learning pipelines. The encryption is efficient, and applying it during training has a minor effect on test accuracy. InstaHide encrypts each training image with a "one-time secret key" which consists of mixing a number of randomly chosen images and… 
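A minimal sketch of the encryption step described in the abstract, assuming images are float arrays, the k mixing coefficients (the per-image one-time key) are drawn from a Dirichlet distribution, and the mixed-in images come from a generic pool (the paper distinguishes inside-dataset and cross-dataset variants); the random pixel-wise sign flip comes from the full paper and is not spelled out in the truncated abstract above. The function name is illustrative.

```python
import numpy as np

def instahide_encrypt(private_img, pool, k=4, rng=None):
    """Mix `private_img` with k-1 images from `pool` using random coefficients
    summing to 1, then apply a random pixel-wise sign flip (both are discarded
    after use, acting as a one-time key)."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.dirichlet(np.ones(k))                        # mixing coefficients
    others = [pool[i] for i in rng.choice(len(pool), k - 1, replace=False)]
    mixed = lam[0] * private_img + sum(l * im for l, im in zip(lam[1:], others))
    sign = rng.choice([-1.0, 1.0], size=mixed.shape)       # random sign mask
    return sign * mixed

# Example: encrypt one 32x32 RGB image against a pool of 100 images.
pool = [np.random.uniform(-1, 1, (32, 32, 3)) for _ in range(100)]
encrypted = instahide_encrypt(np.random.uniform(-1, 1, (32, 32, 3)), pool)
```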

Perfect Subset Privacy for Data Sharing and Learning

This work addresses the entire privacy-utility tradeoff, accounting for the privatization of any subset of the dataset, and provides a coding scheme for doing so; it also introduces a necessary algebraic condition for applying unaltered learning algorithms to encrypted data, termed signal preservation, and presents an additional scheme that guarantees it.

Image Obfuscation for Privacy-Preserving Machine Learning

Two existing image quality metrics are shown to be well suited to measuring the level of privacy, in agreement with both human subjects and AI-based recognition, and can therefore be used to quantify the privacy resulting from obfuscation.

A Fusion-Denoising Attack on InstaHide with Data Augmentation

An attack for recovering private images from the outputs of InstaHide, even when data augmentation is present, is provided: it first identifies encrypted images that are likely to correspond to the same private image, and then employs a fusion-denoising network to restore the private image from the encrypted ones.

On the Importance of Encrypting Deep Features

This study analyzes model inversion attacks under only two assumptions: feature vectors of user data are known, and a black-box API for inference is provided. It presents a simple yet effective method termed ShuffleBits, in which the binary sequence of each floating-point number is shuffled.
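As a rough illustration of the bit-shuffling idea, the sketch below reinterprets every float32 feature value as 32 bits and permutes those bits with a secret permutation; the exact ShuffleBits construction may differ, and `shuffle_bits` is a hypothetical helper name.

```python
import numpy as np

def shuffle_bits(values, perm):
    """Permute the 32 bits of each float32 entry of `values` according to `perm`."""
    raw = np.ascontiguousarray(values, dtype=np.float32).view(np.uint32)
    bits = (raw[..., None] >> np.arange(32, dtype=np.uint32)) & 1        # unpack bits
    packed = (bits[..., perm] << np.arange(32, dtype=np.uint32)).sum(axis=-1, dtype=np.uint32)
    return packed.view(np.float32)

rng = np.random.default_rng(0)
perm = rng.permutation(32)                                # the secret key
feat = rng.standard_normal(8).astype(np.float32)
obfuscated = shuffle_bits(feat, perm)                     # looks like noise without the key
restored = shuffle_bits(obfuscated, np.argsort(perm))     # the key holder inverts it
assert np.array_equal(restored.view(np.uint32), feat.view(np.uint32))
```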

Privacy Safe Representation Learning via Frequency Filtering Encoder

This work introduces a novel ARL method enhanced through low-pass filtering, limiting the amount of information that can be encoded in the frequency domain; it withstands reconstruction attacks while outperforming previous state-of-the-art methods on the privacy-utility trade-off.
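The core operation, low-pass filtering in the frequency domain, can be sketched as follows; the cutoff convention and the hard (non-smooth) mask are illustrative assumptions rather than the paper's exact encoder design.

```python
import numpy as np

def low_pass(image, cutoff=0.25):
    """Zero out spatial frequencies above `cutoff` (fraction of the Nyquist
    frequency) for an (H, W, C) image, channel by channel."""
    h, w = image.shape[:2]
    fy = np.fft.fftfreq(h)[:, None]                  # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]                  # horizontal frequencies
    keep = np.sqrt(fx ** 2 + fy ** 2) <= cutoff * 0.5
    spectrum = np.fft.fft2(image, axes=(0, 1))
    return np.real(np.fft.ifft2(spectrum * keep[..., None], axes=(0, 1)))

blurred = low_pass(np.random.rand(64, 64, 3), cutoff=0.25)   # information-reduced view
```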

DAUnTLeSS: Data Augmentation and Uniform Transformation for Learning with Scalability and Security

A novel security analysis framework, termed probably approximately correct (PAC) inference resistance, is proposed to bridge the information loss in data processing with prior knowledge, and the advantages of this new random-transform approach are shown with respect to the underlying privacy guarantees, computational efficiency, and utility for fully-connected neural networks.

Federated Learning without Revealing the Decision Boundaries

A method that encrypts the images and hides a decryption module inside the model, so that the entity in charge of federated learning neither sees the training images nor learns the location of the model's decision boundaries.

Defense against Privacy Leakage in Federated Learning

This paper presents a straightforward yet effective defense strategy that obfuscates the gradients of sensitive data with those of concealing data via a gradient projection technique, which offers the highest level of protection while preserving FL performance.

DarKnight: An Accelerated Framework for Privacy and Integrity Preserving Deep Learning Using Trusted Hardware

DarKnight is presented, a framework for large DNN training that protects input privacy and computation integrity on cloud servers; it relies on cooperative execution between trusted execution environments (TEEs) and accelerators, and uses a customized data encoding strategy based on matrix masking to create input obfuscation within a TEE.
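A hedged sketch of matrix masking for one linear operation: the TEE blinds a batch of inputs with a random invertible matrix, the untrusted accelerator only ever sees blinded data, and the TEE removes the mask from the result. DarKnight's actual encoding (with noise terms and integrity verification) is more elaborate; this only shows why the mask commutes with linear computation.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 784))      # linear-layer weights
X = rng.standard_normal((784, 8))        # batch of 8 private inputs as columns

A = rng.standard_normal((8, 8))          # TEE-side random mask, invertible w.h.p.
X_blind = X @ A                          # only the blinded batch leaves the TEE

Y_blind = W @ X_blind                    # heavy work on the untrusted accelerator

Y = Y_blind @ np.linalg.inv(A)           # TEE unblinds the result
assert np.allclose(Y, W @ X)
```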

DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations

This work shows that strong data augmentations, such as mixup and random additive noise, nullify poison attacks while enduring only a small accuracy trade-off, and proposes a training method, DP-InstaHide, which combines the mixup regularizer with additive noise.
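A minimal sketch of the augmentation such a defense trains on, assuming a Dirichlet mixup of k examples plus additive Laplacian noise; the noise distribution and scale here are illustrative choices, not the paper's exact parameters.

```python
import numpy as np

def mixup_with_noise(images, labels, k=4, noise_scale=0.05, rng=None):
    """Return one training example built from `images` (n, H, W, C) and
    one-hot `labels` (n, num_classes): mix k random examples, then add noise."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(images), k, replace=False)
    lam = rng.dirichlet(np.ones(k))                        # random mixing weights
    x = np.tensordot(lam, images[idx], axes=1)             # mixed image
    y = np.tensordot(lam, labels[idx], axes=1)             # matching mixed label
    x = x + rng.laplace(scale=noise_scale, size=x.shape)   # additive noise
    return x, y
```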
...

References

Privacy-Preserving Deep Learning via Additively Homomorphic Encryption

This work revisits the previous work by Shokri and Shmatikov (ACM CCS 2015) and builds an enhanced system with the following properties: no information is leaked to the server, and accuracy is kept intact compared with that of an ordinary deep learning system trained over the combined dataset.
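To make the additive homomorphism concrete, the toy Paillier-style sketch below lets a server add two users' encrypted gradient entries without ever decrypting them; the small Mersenne-prime modulus and integer-scaled gradients are simplifications for illustration only.

```python
import math, random

p, q = (1 << 31) - 1, (1 << 61) - 1         # Mersenne primes; far too small for real use
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                        # valid because the generator is n + 1

def encrypt(m):
    r = random.randrange(1, n)              # fresh randomness per ciphertext
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

# Each user encrypts a fixed-point gradient entry; the server multiplies
# ciphertexts, which corresponds to adding the underlying plaintexts.
g1, g2 = 1234, 4321                         # e.g. gradient values scaled by 1e4
aggregated = encrypt(g1) * encrypt(g2) % n2
assert decrypt(aggregated) == g1 + g2
```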

Multi-key privacy-preserving deep learning in cloud computing

ML Confidential: Machine Learning on Encrypted Data

A new class of machine learning algorithms is proposed in which the algorithm's predictions can be expressed as polynomials of bounded degree, along with confidential algorithms for binary classification based on polynomial approximations to least-squares solutions obtained by a small number of gradient descent steps.

SecureML: A System for Scalable Privacy-Preserving Machine Learning

This paper presents new and efficient protocols for privacy preserving machine learning for linear regression, logistic regression and neural network training using the stochastic gradient descent method, and implements the first privacy preserving system for training neural networks.

On hiding information from an oracle

The framework defined in this paper enables us to prove precise statements about what an encrypted instance hides and what it leaks, in an information-theoretic sense, about some natural problems in NP ∩ coNP.

Privacy-Preserving Secret Shared Computations Using MapReduce

Presents algorithms for data outsourcing based on Shamir's secret-sharing scheme and for executing privacy-preserving SQL queries such as count, selection (including range selection), projection, and join, using MapReduce as the underlying programming model.
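A brief sketch of the underlying primitive, Shamir's t-out-of-n secret sharing over a prime field: the secret is the constant term of a random polynomial of degree t-1, and any t shares recover it by Lagrange interpolation at zero. Field size and function names are illustrative.

```python
import random

PRIME = (1 << 31) - 1                      # small prime field, demo only

def share(secret, n=5, t=3):
    """Split `secret` into n shares, any t of which reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange-interpolate the shares at x = 0 to recover the secret."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * -xj % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = share(123456789)
assert reconstruct(shares[:3]) == 123456789   # any 3 of the 5 shares suffice
```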

Secret-Sharing Schemes: A Survey

A. Beimel. IWCC, 2011.
This survey describes the most important constructions of secret-sharing schemes, explains the connections between secret-sharing schemes and monotone formulae and monotone span programs, and presents the known lower bounds on the share size.

Our Data, Ourselves: Privacy Via Distributed Noise Generation

This work provides efficient distributed protocols for generating shares of random noise, secure against malicious participants, and introduces a technique for distributing shares of many unbiased coins with fewer executions of verifiable secret sharing than would be needed using previous approaches.

Practical Secure Aggregation for Federated Learning on User-Held Data

This work considers training a deep neural network in the Federated Learning model, using distributed stochastic gradient descent across user-held training data on mobile devices, wherein Secure Aggregation protects each user's model gradient.
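The cancellation trick at the heart of pairwise-masked secure aggregation can be sketched as follows, assuming every pair of users shares a random mask that one adds and the other subtracts; key agreement, dropout recovery, and the secret sharing of masks are omitted.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
num_users, dim = 4, 6
gradients = rng.standard_normal((num_users, dim))    # each user's private update

masked = gradients.copy()
for u, v in itertools.combinations(range(num_users), 2):
    pair_mask = rng.standard_normal(dim)             # secret shared by users u and v
    masked[u] += pair_mask                           # u adds the mask
    masked[v] -= pair_mask                           # v subtracts it

# The server only ever sees `masked`, yet the aggregate is exact.
assert np.allclose(masked.sum(axis=0), gradients.sum(axis=0))
```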

Privacy-preserving deep learning

R. Shokri, Vitaly Shmatikov. 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2015.
This paper presents a practical system that enables multiple parties to jointly learn an accurate neural-network model for a given objective without sharing their input datasets, and exploits the fact that the optimization algorithms used in modern deep learning, namely, those based on stochastic gradient descent, can be parallelized and executed asynchronously.
...