Practical Federated Gradient Boosting Decision Trees

@inproceedings{Li2020PracticalFG,
  title={Practical Federated Gradient Boosting Decision Trees},
  author={Q. Li and Zeyi Wen and Bingsheng He},
  booktitle={AAAI},
  year={2020}
}
Gradient Boosting Decision Trees (GBDTs) have become very successful in recent years, with many awards in machine learning and data mining competitions. There have been several recent studies on how to train GBDTs in the federated learning setting. In this paper, we focus on horizontal federated learning, where data samples with the same features are distributed among multiple parties. However, existing studies are not efficient or effective enough for practical use. They suffer either from the… 
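To make the horizontal setting concrete: a common pattern in federated GBDT training is that each party computes a gradient histogram over the shared feature space from its local samples, and only these histograms (never raw data) are aggregated to choose a split. The sketch below is illustrative only — function names, the toy gain score, and the bin edges are assumptions, not the paper's actual protocol.

```python
# Hypothetical sketch of one split-finding round in horizontal federated
# GBDT: parties share the feature space but hold disjoint samples.
import numpy as np

def local_histogram(feature, grad, bin_edges):
    """Sum gradients per feature bin for one party's local data."""
    bins = np.digitize(feature, bin_edges)   # bin index per local sample
    hist = np.zeros(len(bin_edges) + 1)
    np.add.at(hist, bins, grad)              # accumulate gradients per bin
    return hist

def best_split(histograms):
    """Aggregate party histograms; score each cut by |left gradient sum| (toy gain)."""
    total = np.sum(histograms, axis=0)       # server-side aggregation
    left = np.cumsum(total)[:-1]             # gradient sum left of each cut point
    return int(np.argmax(np.abs(left)))

edges = np.array([0.25, 0.5, 0.75])
parties = [
    local_histogram(np.array([0.1, 0.6]), np.array([1.0, -2.0]), edges),
    local_histogram(np.array([0.3, 0.9]), np.array([0.5, 0.5]), edges),
]
print(best_split(parties))  # index of the best cut point
```

Real systems score splits with the second-order XGBoost gain and protect the histograms in transit; this sketch only shows the data flow.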


Large-scale Secure XGB for Vertical Federated Learning
TLDR
This paper aims to build large-scale secure XGB in the vertical federated learning setting, guarantees data privacy from three aspects, and proposes secure permutation protocols to improve training efficiency and scale the framework to large datasets.
Adaptive Histogram-Based Gradient Boosted Trees for Federated Learning
TLDR
The Party-Adaptive XGBoost (PAX) is proposed, a novel gradient boosting implementation that uses a party-adaptive histogram aggregation method, without the need for data encryption, making gradient boosted trees practical in enterprise federated learning.
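The idea of party-adaptive histogram aggregation can be sketched as follows: each party may bin its data with its own edges, so the server re-bins every party's (bin-center, weight) pairs onto one global grid before merging. The function names, grids, and weights below are illustrative assumptions, not PAX's actual method.

```python
# Hypothetical sketch of merging per-party histograms with differing
# local binning onto a shared global grid.
import numpy as np

def rebin(centers, weights, global_edges):
    """Project one party's (center, weight) histogram onto the global bins."""
    idx = np.clip(np.digitize(centers, global_edges) - 1, 0, len(global_edges) - 2)
    out = np.zeros(len(global_edges) - 1)
    np.add.at(out, idx, weights)             # accumulate weights per global bin
    return out

global_edges = np.linspace(0.0, 1.0, 5)      # 4 shared bins on [0, 1]
party_a = rebin(np.array([0.1, 0.4, 0.9]), np.array([3.0, 1.0, 2.0]), global_edges)
party_b = rebin(np.array([0.2, 0.7]), np.array([5.0, 4.0]), global_edges)
merged = party_a + party_b                   # plaintext aggregation, as in PAX
print(merged)
```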
VF2Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning
TLDR
This paper introduces VF2Boost, a novel and efficient vertical federated GBDT system that is 12.8-18.9 times faster than existing vertical federated implementations and supports much larger datasets.
Fed-EINI: An Efficient and Interpretable Inference Framework for Decision Tree Ensembles in Vertical Federated Learning
TLDR
This paper protects data privacy while allowing the disclosure of feature meanings by concealing decision paths, and adapts a communication-efficient secure computation method for inference outputs, improving the interpretability of the model while ensuring efficiency and accuracy.
A Hybrid-Domain Framework for Secure Gradient Tree Boosting
TLDR
A novel framework is proposed for two parties to build secure XGB over vertically partitioned data by associating the homomorphic encryption domain with the secret sharing domain through two-way transformation primitives.
SecureBoost: A Lossless Federated Learning Framework
TLDR
The SecureBoost framework is shown to be as accurate as non-federated gradient tree-boosting algorithms that require centralized data, and it is highly scalable and practical for industrial applications such as credit risk analysis.
Federated Learning Versus Classical Machine Learning: A Convergence Comparison
TLDR
This paper performs a convergence comparison between classical machine learning and federated learning on two publicly available datasets, MNIST (logistic regression) and CIFAR-10 (image classification), and demonstrates that federated learning achieves higher convergence within limited communication rounds while maintaining participants' anonymity.
Towards End-to-End Secure and Efficient Federated Learning for XGBoost
TLDR
This paper proposes CryptoBoost, a federated XGBoost system based on multi-party homomorphic encryption techniques, and proposes a set of new secure computation algorithms and protocols for CryptoBoost which achieve improved performance and communication efficiency compared with existing approaches.
Scalable Multi-Party Privacy-Preserving Gradient Tree Boosting over Vertically Partitioned Dataset with Outsourced Computations
TLDR
This work proposes SSXGB, a scalable and secure multi-party gradient tree boosting framework for vertically partitioned datasets with partially outsourced computations, which employs an additive homomorphic encryption (HE) scheme for security.
Federated Learning on Non-IID Data Silos: An Experimental Study
TLDR
It is found that non-IID does bring significant challenges in learning accuracy of FL algorithms, and none of the existing state-of-the-art FL algorithms outperforms others in all cases.

References

Showing 1-10 of 44 references
SecureBoost: A Lossless Federated Learning Framework
TLDR
The SecureBoost framework is shown to be as accurate as non-federated gradient tree-boosting algorithms that require centralized data, and it is highly scalable and practical for industrial applications such as credit risk analysis.
InPrivate Digging: Enabling Tree-based Distributed Data Mining with Differential Privacy
TLDR
This paper designs and implements a privacy-preserving system for gradient boosting decision tree (GBDT), where different regression trees trained by multiple data owners can be securely aggregated into an ensemble and demonstrates that the system can provide a strong privacy protection for individual data owners while maintaining the prediction accuracy of the original trained model.
Boosting Privately: Privacy-Preserving Federated Extreme Boosting for Mobile Crowdsensing
TLDR
A secret-sharing-based federated extreme boosting learning framework (FedXGB) is proposed to achieve privacy-preserving model training for mobile crowdsensing; it is secure in the honest-but-curious model and attains accuracy and convergence rates comparable to the original model with low runtime.
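The primitive behind secret-sharing-based aggregation such as FedXGB's is additive secret sharing over a finite field: each value is split into random shares that sum to it modulo a prime, so servers can add shares component-wise and only the final sum is ever revealed. The toy below illustrates the primitive only — it is not the paper's actual protocol, and the modulus and names are assumptions.

```python
# Toy additive secret sharing over a prime field.
import random

P = 2**31 - 1  # a Mersenne prime used as the field modulus

def share(secret, n_parties):
    """Split an integer into n additive shares that sum to it mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Two parties share their local gradient sums; the servers add the
# shares component-wise, and only the aggregate 42 + 17 is revealed.
g1, g2 = 42, 17
s1, s2 = share(g1, 3), share(g2, 3)
agg = [(a + b) % P for a, b in zip(s1, s2)]
print(reconstruct(agg))
```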
Secure Federated Transfer Learning
TLDR
A new technique and framework, known as federated transfer learning (FTL), to improve statistical models under a data federation, which requires minimal modifications to the existing model structure and provides the same level of accuracy as the non-privacy-preserving approach.
A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection
TLDR
A comprehensive review on federated learning systems is conducted and a thorough categorization is provided according to six different aspects, including data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation and motivation of federation.
A Secure Federated Transfer Learning Framework
TLDR
This work introduces a new technique and framework, known as federated transfer learning (FTL), to improve statistical modeling under a data federation, which allows knowledge to be shared without compromising user privacy and enables complementary knowledge to be transferred across domains in a data federation.
LightGBM: A Highly Efficient Gradient Boosting Decision Tree
TLDR
It is proved that, since data instances with larger gradients play a more important role in the computation of information gain, GOSS can obtain an accurate estimate of the information gain with a much smaller data size; the resulting implementation is called LightGBM.
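The GOSS idea described above can be sketched directly: keep the top-a fraction of samples by |gradient|, randomly sample a b fraction of the rest, and up-weight those sampled small-gradient instances by (1 - a) / b so the information-gain estimate stays approximately unbiased. Function and parameter names below are illustrative, not LightGBM's API.

```python
# Sketch of Gradient-based One-Side Sampling (GOSS).
import numpy as np

def goss(grad, a=0.2, b=0.1, rng=None):
    """Return sampled indices and weights for one boosting iteration."""
    rng = rng or np.random.default_rng(0)
    n = len(grad)
    order = np.argsort(-np.abs(grad))        # largest |gradient| first
    top_k, rest_k = int(a * n), int(b * n)
    top = order[:top_k]                      # always kept
    rest = rng.choice(order[top_k:], rest_k, replace=False)  # sampled tail
    idx = np.concatenate([top, rest])
    weights = np.ones(len(idx))
    weights[top_k:] = (1 - a) / b            # re-weight sampled small gradients
    return idx, weights

grad = np.linspace(-1, 1, 100)
idx, w = goss(grad)
print(len(idx), w.max())                     # 30 samples kept out of 100
```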
Agnostic Federated Learning
TLDR
This work proposes a new framework of agnostic federated learning, where the centralized model is optimized for any target distribution formed by a mixture of the client distributions, and shows that this framework naturally yields a notion of fairness.
Federated Learning of Deep Networks using Model Averaging
TLDR
This work presents a practical method for the federated learning of deep networks that proves robust to the unbalanced and non-IID data distributions that naturally arise, and allows high-quality models to be trained in relatively few rounds of communication.
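The model-averaging step at the heart of this method (later widely known as FedAvg) is a data-size-weighted average of client parameter vectors each communication round. A minimal sketch, with toy clients and illustrative names:

```python
# Minimal sketch of federated model averaging.
import numpy as np

def fed_avg(client_params, client_sizes):
    """Weighted average of client parameter vectors by local sample count."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_params)
    return (stacked * (sizes / sizes.sum())[:, None]).sum(axis=0)

clients = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
print(fed_avg(clients, [1, 3]))  # pulled toward the larger client's parameters
```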
CryptoML: Secure outsourcing of big data machine learning applications
TLDR
This work proposes a novel interactive delegation protocol based on the provably secure Shamir's secret sharing, and suggests the dominant components of delegation performance cost, and creates a matrix sketching technique that aims at minimizing the cost by data pre-processing.