Publications
Decoupled Weight Decay Regularization
TLDR
This work proposes a simple modification to recover the original formulation of weight decay regularization by decoupling the weight decay from the optimization steps taken w.r.t. the loss function, and provides empirical evidence that this modification substantially improves Adam's generalization performance.
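As a rough illustration of the decoupling (a minimal NumPy sketch; the function and argument names are mine and the paper's schedule multiplier is omitted): with L2 regularization the decay term is folded into the gradient and thus rescaled by Adam's adaptive denominator, whereas the decoupled variant applies the decay directly to the weights after the Adam step.

```python
import numpy as np

def adam_step_l2(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                 eps=1e-8, wd=1e-2):
    # L2 regularization: the decay enters the gradient, so it gets
    # rescaled by the adaptive denominator like any other gradient term.
    g = grad + wd * w
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat, v_hat = m / (1 - beta1 ** t), v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, wd=1e-2):
    # Decoupled weight decay: the moment estimates see only the loss
    # gradient; the decay is applied directly to the weights afterwards.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat, v_hat = m / (1 - beta1 ** t), v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps) - lr * wd * w
    return w, m, v
```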
Sequential Model-Based Optimization for General Algorithm Configuration
TLDR
This paper extends the paradigm of explicit regression models for the first time to general algorithm configuration problems, allowing many categorical parameters and optimization for sets of instances, and yields state-of-the-art performance; a sketch of the underlying loop follows.
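To make the idea concrete, here is a generic sequential model-based optimization loop over a continuous box. This is only a simplified stand-in: SMAC itself additionally handles categorical parameters, sets of instances, and intensification, and the random-forest surrogate plus expected-improvement criterion below are my own minimal choices.

```python
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import RandomForestRegressor

def smbo(objective, bounds, n_init=5, n_iter=20, n_candidates=1000, seed=0):
    """Minimize `objective` over a box by repeatedly fitting a surrogate
    model to past evaluations and evaluating the most promising candidate."""
    rng = np.random.default_rng(seed)
    lows, highs = [b[0] for b in bounds], [b[1] for b in bounds]
    X = rng.uniform(lows, highs, size=(n_init, len(bounds)))
    y = np.array([objective(x) for x in X])
    for _ in range(n_iter):
        model = RandomForestRegressor(n_estimators=50).fit(X, y)
        cand = rng.uniform(lows, highs, size=(n_candidates, len(bounds)))
        # Mean / spread across trees as a crude predictive distribution.
        preds = np.stack([tree.predict(cand) for tree in model.estimators_])
        mu, sigma = preds.mean(axis=0), preds.std(axis=0) + 1e-9
        best = y.min()
        z = (best - mu) / sigma
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
        x_next = cand[np.argmax(ei)]
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))
    return X[np.argmin(y)], y.min()
```

For instance, smbo(lambda x: float((x ** 2).sum()), bounds=[(-5.0, 5.0)] * 3) should steadily drive the best observed value toward zero.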
SGDR: Stochastic Gradient Descent with Warm Restarts
TLDR
This paper proposes a simple warm restart technique for stochastic gradient descent to improve its anytime performance when training deep neural networks, and empirically studies its performance on the CIFAR-10 and CIFAR-100 datasets.
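The schedule behind the restarts is a cosine decay that is periodically reset; a small self-contained sketch (the constants and the function name are illustrative, not recommended settings):

```python
import math

def sgdr_lr(epoch, eta_min=0.0, eta_max=0.1, t_0=10, t_mult=2):
    # Find the current restart cycle: cycle lengths are t_0, t_0*t_mult, ...
    t_i, t_cur = t_0, epoch
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    # Cosine decay from eta_max down to eta_min within the current cycle;
    # at the next (warm) restart the rate jumps back up to eta_max.
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))
```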
Efficient and Robust Automated Machine Learning
TLDR
This work introduces a robust new AutoML system based on scikit-learn, which improves on existing AutoML methods by automatically taking into account past performance on similar datasets, and by constructing ensembles from the models evaluated during the optimization.
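Illustrative usage of the system's scikit-learn-style interface (a sketch: the toy dataset, split, and time budgets are arbitrary choices of mine, and only the common time-budget constructor arguments are shown):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from autosklearn.classification import AutoSklearnClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

automl = AutoSklearnClassifier(
    time_left_for_this_task=300,   # total optimization budget in seconds
    per_run_time_limit=30,         # cap on each individual model fit
)
automl.fit(X_train, y_train)       # pipeline search, warm-started via meta-learning
print(automl.score(X_test, y_test))  # predictions come from an ensemble of evaluated models
```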
Deep learning with convolutional neural networks for EEG decoding and visualization
TLDR
This study shows how to design and train convolutional neural networks to decode task-related information from the raw EEG without handcrafted features and highlights the potential of deep ConvNets combined with advanced visualization techniques for EEG-based brain mapping.
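For a sense of what decoding raw EEG without handcrafted features looks like, here is a rough PyTorch sketch in the spirit of a shallow ConvNet: a temporal convolution, a spatial convolution across electrodes, squaring, mean pooling over time, a log, and a linear classifier. Layer sizes and the class are my own illustration of the recipe, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ShallowEEGNet(nn.Module):
    def __init__(self, n_channels=22, n_classes=4):
        super().__init__()
        self.temporal = nn.Conv2d(1, 40, kernel_size=(1, 25))             # filter over time
        self.spatial = nn.Conv2d(40, 40, kernel_size=(n_channels, 1),
                                 bias=False)                              # mix electrodes
        self.bn = nn.BatchNorm2d(40)
        self.pool = nn.AvgPool2d(kernel_size=(1, 75), stride=(1, 15))
        self.classify = nn.Sequential(nn.Flatten(), nn.LazyLinear(n_classes))

    def forward(self, x):                      # x: (batch, 1, n_channels, n_times)
        x = self.bn(self.spatial(self.temporal(x)))
        x = self.pool(x * x)                   # squared "power", mean-pooled over time
        x = torch.log(torch.clamp(x, min=1e-6))
        return self.classify(x)

logits = ShallowEEGNet()(torch.randn(8, 1, 22, 1000))  # 8 trials, 22 electrodes, 1000 samples
```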
Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms
TLDR
This work considers the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that attacks these issues separately, and shows classification performance often much better than that obtained with standard algorithm selection and hyperparameter optimization methods.
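The combined algorithm selection and hyperparameter optimization (CASH) problem behind this can be stated as a single minimization over candidate algorithms and their hyperparameter spaces; roughly (notation mine, k-fold cross-validation assumed):

```latex
A^{*}_{\lambda^{*}} \in \operatorname*{arg\,min}_{A^{(j)} \in \mathcal{A},\; \lambda \in \Lambda^{(j)}}
\frac{1}{k} \sum_{i=1}^{k}
\mathcal{L}\!\left(A^{(j)}_{\lambda},\, \mathcal{D}^{(i)}_{\mathrm{train}},\, \mathcal{D}^{(i)}_{\mathrm{valid}}\right)
```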
Neural Architecture Search: A Survey
TLDR
An overview of existing work in this field of research is provided and neural architecture search methods are categorized according to three dimensions: search space, search strategy, and performance estimation strategy.
ParamILS: An Automatic Algorithm Configuration Framework
TLDR
An automatic framework for the algorithm configuration problem is described, providing methods for optimizing a target algorithm's performance on a given class of problem instances by varying a set of ordinal and/or categorical parameters; a sketch of the core search follows.
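A hedged sketch of the iterated-local-search idea over a purely discrete configuration space (function and constant names are mine; the framework in the paper additionally deals with evaluating configurations across many problem instances and with cutting off poor runs early):

```python
import random

def ils_configurator(cost, param_space, default, n_iters=100,
                     perturb_steps=3, restart_prob=0.01, seed=0):
    """Iterated local search over a discrete parameter space.
    `cost` maps a configuration dict to a number to be minimized."""
    rng = random.Random(seed)

    def neighbours(cfg):
        # One-exchange neighbourhood: change a single parameter value.
        for p, values in param_space.items():
            for v in values:
                if v != cfg[p]:
                    yield {**cfg, p: v}

    def local_search(cfg):
        # First-improvement local search until no neighbour is better.
        improved = True
        while improved:
            improved = False
            for nb in neighbours(cfg):
                if cost(nb) < cost(cfg):
                    cfg, improved = nb, True
                    break
        return cfg

    incumbent = local_search(dict(default))
    for _ in range(n_iters):
        if rng.random() < restart_prob:                 # occasional random restart
            cand = {p: rng.choice(vs) for p, vs in param_space.items()}
        else:                                           # perturb the incumbent
            cand = dict(incumbent)
            for _ in range(perturb_steps):
                p = rng.choice(list(param_space))
                cand[p] = rng.choice(param_space[p])
        cand = local_search(cand)
        if cost(cand) < cost(incumbent):                # keep improving configurations
            incumbent = cand
    return incumbent
```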
BOHB: Robust and Efficient Hyperparameter Optimization at Scale
TLDR
This work proposes a new practical state-of-the-art hyperparameter optimization method, which consistently outperforms both Bayesian optimization and Hyperband on a wide range of problem types, including high-dimensional toy functions, support vector machines, feed-forward neural networks, Bayesian neural networks, deep reinforcement learning, and convolutional neural networks; a rough sketch of the control flow follows.
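A very rough sketch of that control flow (the sampling helpers here are hypothetical placeholders; the method's actual model is a density estimator fit per budget): Hyperband brackets decide how many configurations to try at which budget, successive halving keeps the best fraction at each rung, and new configurations come from the model once enough observations exist, otherwise from random sampling.

```python
import math

def bohb_style_sketch(evaluate, sample_random, sample_from_model, observations,
                      max_budget=81, eta=3):
    """observations is a list of (config, budget, loss) tuples; the helpers
    evaluate(config, budget), sample_random(), sample_from_model(observations)
    are assumed to be provided by the caller."""
    s_max = int(math.log(max_budget, eta))
    for s in reversed(range(s_max + 1)):                       # Hyperband brackets
        n = int(math.ceil((s_max + 1) / (s + 1) * eta ** s))   # configs in this bracket
        budget = max_budget * eta ** (-s)                      # starting budget
        configs = [sample_from_model(observations) if len(observations) > 10
                   else sample_random() for _ in range(n)]
        for i in range(s + 1):                                 # successive halving rungs
            b = budget * eta ** i
            losses = [evaluate(c, b) for c in configs]
            observations.extend(zip(configs, [b] * len(configs), losses))
            keep = max(1, len(configs) // eta)                 # keep the best 1/eta
            configs = [c for _, c in sorted(zip(losses, configs),
                                            key=lambda t: t[0])[:keep]]
    return min(observations, key=lambda o: o[2])               # best (config, budget, loss)
```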
Fixing Weight Decay Regularization in Adam
TLDR
This work decouples the optimal choice of weight decay factor from the setting of the learning rate for both standard SGD and Adam and substantially improves Adam's generalization performance, allowing it to compete with SGD with momentum on image classification datasets.