SoK: Privacy-Preserving Collaborative Tree-based Model Learning

Sylvain Chatel, Apostolos Pyrgelis, Juan Ramón Troncoso-Pastoriza, Jean-Pierre Hubaux
Proceedings on Privacy Enhancing Technologies, pp. 182-203
Abstract: Tree-based models are among the most efficient machine learning techniques for data mining nowadays due to their accuracy, interpretability, and simplicity. The recent orthogonal needs for more data and privacy protection call for collaborative privacy-preserving solutions. In this work, we survey the literature on distributed and privacy-preserving training of tree-based models and we systematize its knowledge based on four axes: the learning algorithm, the collaborative model, the…
3 Citations

SoK: Secure Aggregation based on cryptographic schemes for Federated Learning

This work provides a formal definition of the problem, suggests a systematic categorization of existing solutions, and proposes an improved definition of secure aggregation that better fits federated learning.

Report: State of the Art Solutions for Privacy Preserving Machine Learning in the Medical Context

This report evaluates which cryptographic mechanisms can be used in the scenario of its Figure 1, where one party holds a large amount of data in the clear and wants to learn from it, while a research institute wants to gain new information from patient data.

XORBoost: Tree Boosting in the Multiparty Computation Setting

A novel protocol, XORBoost, is presented for both training gradient boosted tree models and using these models for inference in the multiparty computation (MPC) setting; it is agnostic to the underlying MPC framework or implementation.



Boosted and Differentially Private Ensembles of Decision Trees

This paper starts with the proof that the privacy vs boosting picture for DT involves a notable and general technical tradeoff: the sensitivity tends to increase with the boosting rate of the loss, for any proper loss, and introduces objective calibration as a method to adaptively tune the tradeoff during DT induction.

MP-SPDZ: A Versatile Framework for Multi-Party Computation

Marcel Keller. IACR Cryptol. ePrint Arch., 2020.
The variety of protocols implemented and the design choices made in the development of MP-SPDZ are outlined as well as the capabilities of the programming interface.

Revocable Federated Learning: A Benchmark of Federated Forest

In RevFRF, a suite of homomorphic-encryption-based secure protocols is designed for federated RF construction, prediction, and revocation. The protocols are shown to securely and efficiently implement collaborative training of an RF and to ensure that the memories of a revoked participant in the trained RF are securely removed.

Boosting Privately: Privacy-Preserving Federated Extreme Boosting for Mobile Crowdsensing

A secret-sharing-based federated extreme boosting learning framework (FedXGB) is proposed to achieve privacy-preserving model training for mobile crowdsensing. It is secure in the honest-but-curious model and attains accuracy and a convergence rate close to those of the original model at low runtime.

Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning

It is shown that a distributed, federated, or decentralized deep learning approach is fundamentally broken and does not protect the training sets of honest participants.

Encrypted statistical machine learning: new privacy preserving methods

Two new statistical machine learning methods designed to learn on fully homomorphic encrypted (FHE) data are presented, involving a new cryptographic stochastic fraction estimator and a semi-parametric model for the class decision boundary, and it is shown how they can be used to learn and predict from encrypted data.

Practical Secure Decision Tree Learning in a Teletreatment Application

A range of practical cryptographic protocols is presented for secure decision tree learning, a primary problem in privacy-preserving data mining. The protocols focus on particular variants of the well-known ID3 algorithm, allowing a high level of security and performance at the same time.

Building decision tree classifier on private data

This paper presents a protocol that allows Alice and Bob to build a decision tree classifier on their private data without having to compromise their privacy. The protocol is built upon a useful building block, the scalar product protocol.

Public-Key Cryptosystems Based on Composite Degree Residuosity Classes

A new trapdoor mechanism is proposed and three encryption schemes are derived: a trapdoor permutation and two homomorphic probabilistic encryption schemes computationally comparable to RSA, which are provably secure under appropriate assumptions in the standard model.

Towards Privacy-Preserving Collaborative Gradient Boosted Decision Trees

This paper explores the collaborative training of gradient boosted decision trees while hiding each party’s data from all other parties by extending the popular XGBoost framework to two different modes.