How to tune the RBF SVM hyperparameters?: An empirical evaluation of 18 search algorithms

@article{Wainer2021HowTT,
  title={How to tune the RBF SVM hyperparameters?: An empirical evaluation of 18 search algorithms},
  author={Jacques Wainer and Pablo Fonseca},
  journal={Artif. Intell. Rev.},
  year={2021},
  volume={54},
  pages={4771--4797}
}
SVM with an RBF kernel is usually one of the best classification algorithms for most data sets, but it is important to tune the two hyperparameters $C$ and $\gamma$ to the data itself. In general, the selection of the hyperparameters is a non-convex optimization problem, and thus many algorithms have been proposed to solve it, among them: grid search, random search, Bayesian optimization, simulated annealing, particle swarm optimization, Nelder-Mead, and others. There have also been proposals to… 
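As a concrete illustration of the tuning problem the abstract describes, here is a minimal sketch of searching $C$ and $\gamma$ for an RBF SVM by random search over log-uniform ranges, using scikit-learn and SciPy. The data set, ranges, and budget are illustrative assumptions, not taken from the paper:

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for a real data set
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Random search over log-uniform ranges for C and gamma,
# one of the 18 search strategies the paper compares
search = RandomizedSearchCV(
    SVC(kernel="rbf"),
    param_distributions={"C": loguniform(1e-2, 1e3),
                         "gamma": loguniform(1e-4, 1e1)},
    n_iter=20, cv=3, random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Any of the other search algorithms named above (Bayesian optimization, simulated annealing, etc.) would slot into the same pattern: propose $(C, \gamma)$ pairs, score each by cross-validation, and keep the best.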

References

Showing 1-10 of 73 references
Effectiveness of Random Search in SVM hyper-parameter tuning
The experimental results show that the predictive performance of models using Random Search is equivalent to those obtained using meta-heuristics and Grid Search, but with a lower computational cost.
Empirical Evaluation of Resampling Procedures for Optimising SVM Hyperparameters
An extensive empirical evaluation of resampling procedures for SVM hyperparameter selection concludes that a 2-fold procedure is appropriate to select the hyperparameters of an SVM for data sets of 1000 or more datapoints, while a 3-fold procedure is appropriate for smaller data sets.
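The 2-fold versus 3-fold recommendation is easy to try out. A hedged sketch with scikit-learn follows; the synthetic data set and the small grid are illustrative assumptions, not the paper's experimental setup:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=10, random_state=1)
grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1]}

# Compare the hyperparameters selected by 2-fold and 3-fold resampling
for k in (2, 3):
    search = GridSearchCV(SVC(kernel="rbf"), grid, cv=k).fit(X, y)
    print(k, search.best_params_)
```

With 1000 points, the 2-fold search evaluates each candidate on half the data twice, which is the cheaper procedure the evaluation found adequate at this sample size.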
Practical Bayesian Optimization of Machine Learning Algorithms
This work describes new algorithms that take into account the variable cost of learning algorithm experiments and that can leverage the presence of multiple cores for parallel experimentation and shows that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms.
Model selection for support vector machines via uniform design
A stochastic optimization approach for parameter tuning of support vector machines
  • F. Imbault, K. Lebart
  • Computer Science
    Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.
  • 2004
This paper investigates the use of global minimization techniques, namely genetic algorithms and simulated annealing, compares them to the standard tuning frameworks, and finds that they provide a more reliable tuning method.
Algorithms for Hyper-Parameter Optimization
This work contributes novel techniques for making response surface models P(y|x) in which many elements of hyper-parameter assignment (x) are known to be irrelevant given particular values of other elements.
Analysis of the Distance Between Two Classes for Tuning SVM Hyperparameters
This paper proposes a novel method for tuning the hyperparameters by maximizing the distance between two classes (DBTC) in the feature space, and develops a gradient-based algorithm to search for the optimal kernel parameter.
Random Search for Hyper-Parameter Optimization
This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid, and that random search is a natural baseline against which to judge progress in the development of adaptive (sequential) hyper-parameter optimization algorithms.
Do we need hundreds of classifiers to solve real world classification problems?
The random forest is clearly the best family of classifiers (3 out of 5 best classifiers are RF), followed by SVM (4 classifiers in the top-10), neural networks and boosting ensembles (5 and 3 members in the top-20, respectively).