- Published 2012 in Journal of Machine Learning Research

Many machine learning algorithms have hyper-parameters: flags, values, and other configuration information that guides the algorithm. Sometimes this configuration applies to the space of functions that the learning algorithm searches (e.g. the number of nearest neighbours to use in KNN). Sometimes this configuration applies to the way in which the search is conducted (e.g. the step size in stochastic gradient descent). For better or for worse, it is common practice to judge a learning algorithm by its best-case-scenario performance. Researchers are expected to maximize the performance of their algorithm by optimizing over hyper-parameter values, e.g. by cross-validating using data withheld from the training set. Despite decades of research into global optimization (e.g. [8, 4, 9, 10]) and the publication of several hyper-parameter optimization algorithms (e.g. [7, 1, 3]), it would seem that most machine learning researchers still prefer to carry out this optimization by hand or by grid search (e.g. [6, 5, 2]). Here, we argue that in theory and in experiments grid search (i.e. lattice-based brute-force search) should almost never be used. Instead, quasi-random or even pseudo-random experiment designs (random experiments) should be preferred. Random experiments are just as easily parallelized as grid search, just as simple to design, and more reliable. Looking forward, we would like to investigate sequential hyper-parameter optimization algorithms, and we hope that random search will serve as a credible baseline.

Does random search work better? We did an experiment (Fig. 1) similar to [5] using random search instead of grid search. We op [text truncated; Fig. 1 axis residue removed: x-axis "# trials" from 1 to 32, y-axis 0.0 to 1.0]
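The intuition behind the argument above can be sketched in a few lines of code. This is a hypothetical toy illustration, not the paper's experiment: the objective function, the two-parameter search space, and the trial budgets are all invented here. It shows the key failure mode of a lattice: when only one of two hyper-parameters actually matters, an n×n grid tests only n distinct values of the important parameter, while the same budget of random trials tests a distinct value on every trial.

```python
import random

def objective(important, unimportant):
    # Toy objective (assumption for illustration): the score depends only
    # on `important`, peaking at important = 0.7; `unimportant` is ignored.
    return -(important - 0.7) ** 2

def grid_search(n):
    # n x n lattice over [0, 1]^2: n*n trials, but only n distinct
    # values of the important parameter are ever evaluated.
    values = [i / (n - 1) for i in range(n)]
    return max(objective(a, b) for a in values for b in values)

def random_search(n_trials, seed=0):
    # Same trial budget, but every trial draws a fresh value of the
    # important parameter uniformly at random.
    rng = random.Random(seed)
    return max(objective(rng.random(), rng.random())
               for _ in range(n_trials))

best_grid = grid_search(4)      # 16 trials, only 4 distinct important values
best_rand = random_search(16)   # 16 trials, 16 distinct important values
```

With the grid, the best achievable score is pinned to the nearest lattice point (here 2/3), no matter how the budget is spent across the unimportant axis; random search's resolution on the important axis keeps improving with every added trial.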


@article{Bergstra2012RandomSF,
title={Random Search for Hyper-Parameter Optimization},
author={James Bergstra and Yoshua Bengio},
journal={Journal of Machine Learning Research},
year={2012},
volume={13},
pages={281-305}
}