Corpus ID: 244117859

A Simple and Fast Baseline for Tuning Large XGBoost Models

@article{Kapoor2021ASA,
  title={A Simple and Fast Baseline for Tuning Large XGBoost Models},
  author={Sanyam Kapoor and Valerio Perrone},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.06924}
}
XGBoost, a scalable tree boosting algorithm, has proven effective for many prediction tasks of practical interest, especially on tabular datasets. Hyperparameter tuning can further improve its predictive performance, but unlike neural networks, XGBoost performs full-batch training, so fitting many models on large datasets can be time-consuming. Owing to the discovery that (i) there is a strong linear relation between dataset size and training time, (ii) XGBoost models satisfy the ranking hypothesis, and (iii) lower… 
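The subset-tuning idea behind the abstract can be illustrated with a short sketch. This is not the authors' implementation; the search space, subsample fraction, and helper names are illustrative assumptions. Because training time grows roughly linearly with dataset size and configuration rankings tend to transfer from subsets to the full data, one can run the hyperparameter search on a small random subsample and retrain only the winning configuration on the full dataset.

```python
# Minimal sketch (assumed, not the authors' code): random-search XGBoost
# hyperparameters on a small subsample, then retrain the best configuration
# on the full dataset.
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split


def tune_on_subset(X, y, n_trials=20, subset_frac=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Training time scales roughly linearly with dataset size, so each trial
    # on the subset costs only about `subset_frac` of a full-data fit.
    X_sub, _, y_sub, _ = train_test_split(X, y, train_size=subset_frac, random_state=seed)
    X_tr, X_val, y_tr, y_val = train_test_split(X_sub, y_sub, test_size=0.2, random_state=seed)

    best_score, best_params = float("inf"), None
    for _ in range(n_trials):
        # Illustrative search space; the paper's exact space may differ.
        params = {
            "max_depth": int(rng.integers(2, 11)),
            "learning_rate": float(10.0 ** rng.uniform(-3, 0)),
            "subsample": float(rng.uniform(0.5, 1.0)),
            "objective": "reg:squarederror",
        }
        model = xgb.XGBRegressor(n_estimators=200, **params)
        model.fit(X_tr, y_tr, eval_set=[(X_val, y_val)], verbose=False)
        score = min(model.evals_result()["validation_0"]["rmse"])
        if score < best_score:
            best_score, best_params = score, params

    # Ranking hypothesis: the ordering of configurations on the subset is
    # assumed to carry over, so only the winner is retrained on all the data.
    return xgb.XGBRegressor(n_estimators=200, **best_params).fit(X, y)
```

Random search stands in here for whatever tuner is used in practice; the point of the sketch is that the expensive full-data fit happens only once, for the selected configuration.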

References

Showing 1-10 of 33 references

Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets

A generative model of the validation error as a function of training set size is proposed; it is learned during the optimization process and allows preliminary configurations to be explored on small subsets by extrapolating their performance to the full dataset.

XGBoost: A Scalable Tree Boosting System

This paper proposes a novel sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, and provides insights on cache access patterns, data compression, and sharding to build a scalable tree boosting system called XGBoost.

Tabular Data: Deep Learning is Not All You Need

Regularization is all you Need: Simple Neural Nets can Excel on Tabular Data

This paper proposes regularizing plain Multilayer Perceptron (MLP) networks by searching for the optimal combination/cocktail of 13 regularization techniques for each dataset, using a joint optimization over which regularizers to apply and their subsidiary hyperparameters.

Multi-Task Bayesian Optimization

This paper proposes an adaptation of a recently developed acquisition function, entropy search, to the cost-sensitive, multi-task setting and demonstrates the utility of this new acquisition function by leveraging a small dataset to explore hyperparameter settings for a large dataset.

Amazon SageMaker Automatic Model Tuning: Scalable Gradient-Free Optimization

Amazon SageMaker Automatic Model Tuning (AMT), a fully managed system for gradient-free optimization at scale, is presented; it finds the best version of a machine learning model by repeatedly evaluating it with different hyperparameter configurations.

Small Data, Big Decisions: Model Selection in the Small-Data Regime

This paper empirically studies generalization performance as the size of the training set varies over multiple orders of magnitude, finding that training on smaller subsets of the data can lead to more reliable model-selection decisions while incurring smaller computational costs.

Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization

A novel algorithm, Hyperband, is introduced that formulates hyperparameter optimization as a pure-exploration, non-stochastic, infinite-armed bandit problem in which a predefined resource, such as iterations, data samples, or features, is allocated to randomly sampled configurations.
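As a rough illustration of the resource-allocation idea Hyperband builds on, here is a minimal successive-halving sketch; `sample_config` and `evaluate` are hypothetical callables, and the real Hyperband algorithm runs several such brackets with different trade-offs between the number of configurations and the starting budget.

```python
def successive_halving(sample_config, evaluate, n_configs=27, min_budget=1, eta=3):
    """One successive-halving bracket: evaluate many random configurations on a
    small budget, keep the best 1/eta of them, and repeat with eta times the budget.

    sample_config() -> a random hyperparameter configuration (hypothetical helper)
    evaluate(config, budget) -> validation loss after training with the given
        budget (e.g. boosting rounds or data samples); lower is better.
    """
    configs = [sample_config() for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        # Score every surviving configuration at the current budget.
        scores = sorted((evaluate(c, budget), i) for i, c in enumerate(configs))
        # Keep the top 1/eta configurations and grow the budget by a factor of eta.
        keep = max(1, len(configs) // eta)
        configs = [configs[i] for _, i in scores[:keep]]
        budget *= eta
    return configs[0]
```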

BOHB: Robust and Efficient Hyperparameter Optimization at Scale

This work proposes a new practical state-of-the-art hyperparameter optimization method, which consistently outperforms both Bayesian optimization and Hyperband on a wide range of problem types, including high-dimensional toy functions, support vector machines, feed-forward neural networks, Bayesian neural networks, deep reinforcement learning, and convolutional neural networks.

SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training

SAINT consistently improves performance over previous deep learning methods, and it even performs competitively with gradient boosting methods, including XGBoost, CatBoost, and LightGBM, on average over 30 benchmark datasets in regression, binary classification, and multi-class classification tasks.