Corpus ID: 236213878

Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning

@inproceedings{Feurer2020AutoSklearn2H,
  title={Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning},
  author={Matthias Feurer and Katharina Eggensperger and Stefan Falkner and Marius Thomas Lindauer and Frank Hutter},
  year={2020}
}
Automated Machine Learning (AutoML) supports practitioners and researchers with the tedious task of designing machine learning pipelines and has recently achieved substantial success. In this paper we introduce new AutoML approaches motivated by our winning submission to the second ChaLearn AutoML challenge. We develop PoSH Auto-sklearn, which enables AutoML systems to work well on large datasets under rigid time limits using a new, simple and meta-feature-free meta-learning technique and… 
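As a concrete illustration of the hands-free workflow sketched in the abstract, below is a minimal usage example of the experimental Auto-sklearn 2.0 entry point shipped with the `auto-sklearn` package; the 60-second budget and the digits dataset are placeholder choices, not a recommended setup.

```python
# Minimal sketch of the hands-free Auto-sklearn 2.0 workflow.
# Assumes the `auto-sklearn` package (pip install auto-sklearn);
# the 60-second budget is a toy value for illustration only.
from autosklearn.experimental.askl2 import AutoSklearn2Classifier
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Auto-sklearn 2.0 picks pipelines, budgets, and the model-selection
# strategy automatically; the user only supplies an overall time limit.
automl = AutoSklearn2Classifier(time_left_for_this_task=60)
automl.fit(X_train, y_train)
print(accuracy_score(y_test, automl.predict(X_test)))
```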
SubStrat: A Subset-Based Strategy for Faster AutoML
TLDR
SubStrat is presented, an AutoML optimization strategy that tackles the data size rather than the configuration space; it reduces AutoML running times by 79% with less than a 2% average loss in the accuracy of the resulting ML pipeline.
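The strategy reads naturally as code: run the expensive configuration search on a small sample of the rows, then refit the winner on the full data. In the sketch below, plain random sampling and a grid search stand in for SubStrat's dedicated subset-selection algorithm and for a full AutoML system.

```python
# Hedged sketch of a subset-based AutoML strategy: search on a sample,
# refit on the full data. Random sampling is a stand-in for SubStrat's
# dedicated subset-selection algorithm.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_digits(return_X_y=True)
rng = np.random.default_rng(0)
idx = rng.choice(len(X), size=len(X) // 10, replace=False)  # ~10% subset

# The expensive configuration search runs only on the subset.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [50, 200], "max_depth": [5, None]},
    cv=3,
)
search.fit(X[idx], y[idx])

# The chosen configuration is then refit once on the full dataset.
final_model = search.best_estimator_.fit(X, y)
```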
Towards Green Automated Machine Learning: Status Quo and Future Directions
TLDR
This paper identifies four categories of actions the community may take towards more sustainable research on AutoML, i.e. Green AutoML: the design of AutoML systems, benchmarking, transparency, and research incentives.
BERT-Sort: A Zero-shot MLM Semantic Encoder on Ordinal Features for AutoML
TLDR
BERT-Sort is introduced, a novel approach that semantically encodes ordinal categorical values via zero-shot Masked Language Models (MLM) and applies it to AutoML for tabular data; it significantly improves the semantic encoding of ordinal values, achieving a 27% improvement over existing approaches.
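A loose sketch of the underlying mechanic: score each ordinal value by how strongly a masked language model associates it with positive versus negative anchor words, then sort by that score. The prompt template and the anchors below are illustrative choices, not BERT-Sort's exact scoring scheme.

```python
# Loose sketch of ordering ordinal values with a zero-shot masked LM.
# The template and the anchor words "bad"/"good" are illustrative,
# not BERT-Sort's exact scoring scheme. Requires `transformers`.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def mlm_score(value: str) -> float:
    """Positive-minus-negative anchor probability for one value."""
    prompt = f"on a scale from bad to good, {value} is [MASK]."
    scores = {r["token_str"]: r["score"]
              for r in fill(prompt, targets=["bad", "good"])}
    return scores.get("good", 0.0) - scores.get("bad", 0.0)

values = ["poor", "fair", "good", "excellent"]
print(sorted(values, key=mlm_score))  # ideally recovers the ordinal order
```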
A meta-feature selection method based on the Auto-sklearn framework
  • N. I. Kulin, S.B. Muravyov
  • Computer Science
    Scientific and Technical Journal of Information Technologies, Mechanics and Optics
  • 2021
TLDR
The Auto-sklearn framework is discussed as one of the best solutions for the automated selection and tuning of machine learning algorithms, and a new method that optimizes the meta-database via BIRCH clustering is proposed to speed up the search for the best machine learning algorithm on classification tasks.
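The BIRCH step is easy to picture: cluster the meta-database of per-dataset meta-feature vectors once, then match a new dataset against cluster representatives instead of every stored dataset. The vectors and cluster counts below are synthetic placeholders.

```python
# Sketch: BIRCH-cluster a meta-database of dataset meta-feature vectors
# so a new task is matched against a cluster rather than all datasets.
# Meta-features here are synthetic placeholders.
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)
meta_db = rng.normal(size=(500, 8))      # 500 datasets x 8 meta-features

birch = Birch(n_clusters=10, threshold=0.5).fit(meta_db)

new_task = rng.normal(size=(1, 8))       # meta-features of a new dataset
cluster = birch.predict(new_task)[0]

# Only datasets in the matched cluster are scanned for warm-start
# configurations, shrinking the lookup from 500 candidates to a handful.
candidates = np.flatnonzero(birch.labels_ == cluster)
print(f"cluster {cluster}: {len(candidates)} candidate datasets")
```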
ARLO: A Framework for Automated Reinforcement Learning
TLDR
This work proposes a general and flexible framework, namely ARLO: Automated Reinforcement Learning Optimizer, to construct automated pipelines for AutoRL, and provides a Python implementation of such pipelines, released as an open-source library.
LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
TLDR
The proposed Language-Interfaced Fine-Tuning (LIFT) makes no changes to the model architecture or loss function; it relies solely on the natural language interface, enabling "no-code machine learning with LMs," and performs relatively well across a wide range of low-dimensional classification and regression tasks.
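The interface itself is simple to picture: each feature row is serialized into a natural-language prompt and the label into a completion, and an off-the-shelf LM is fine-tuned on those pairs. A minimal, hypothetical serializer (the template wording is ours, not LIFT's exact format):

```python
# Hypothetical sketch of a language interface for tabular data: the
# template wording is illustrative, not LIFT's exact prompt format.
def row_to_prompt(row: dict, target_name: str) -> str:
    feats = ", ".join(f"{k} is {v}" for k, v in row.items())
    return f"Given that {feats}, what is {target_name}?"

row = {"sepal length": 5.1, "sepal width": 3.5}
print(row_to_prompt(row, "the species"))
# -> Given that sepal length is 5.1, sepal width is 3.5, what is the species?
# Fine-tuning an LM on such (prompt, label) pairs needs no architecture change.
```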
Consolidated learning - a domain-specific model-free optimization strategy with examples for XGBoost and MIMIC-IV
TLDR
A new formulation of the tuning problem, called consolidated learning, is proposed; it is better suited to the practical challenges faced by model developers, in which a large number of predictive models are created on similar data sets and the interest lies in the total optimization time rather than in tuning for a single task.
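A toy rendering of the consolidated-learning idea, assuming the paper's setting of many similar tasks: score a shared portfolio of configurations on each dataset and reuse the best-on-average configuration instead of re-tuning every task. The portfolio, the synthetic datasets, and the use of scikit-learn's gradient boosting in place of XGBoost are all stand-ins.

```python
# Toy sketch of consolidated learning: amortize tuning across similar
# tasks by scoring a shared configuration portfolio on each dataset.
# Synthetic datasets; sklearn's GBM stands in for XGBoost.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

portfolio = [{"n_estimators": 50, "max_depth": 2},
             {"n_estimators": 100, "max_depth": 3},
             {"n_estimators": 200, "max_depth": 4}]
datasets = [make_classification(n_samples=300, random_state=s)
            for s in range(3)]

mean_scores = [
    np.mean([cross_val_score(GradientBoostingClassifier(**cfg), X, y,
                             cv=3).mean() for X, y in datasets])
    for cfg in portfolio
]
best = portfolio[int(np.argmax(mean_scores))]
print("portfolio winner:", best)  # reused on future similar tasks
```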
MLOps - Definitions, Tools and Challenges
TLDR
A concentrated overview of the Machine Learning Operations (MLOps) area, identifying MLOps not only as the answer to bringing ML models into production but also as a possible tool for building efficient, robust, and accurate machine learning models.
Towards AutoQML: A Cloud-Based Automated Circuit Architecture Search Framework
TLDR
This work takes the first steps towards Automated Quantum Machine Learning (AutoQML) by proposing a concrete description of the problem, and developing a classical-quantum hybrid cloud architecture that allows for parallelized hyperparameter exploration and model training.
OpenML Benchmarking Suites
TLDR
The use of curated, comprehensive suites of machine learning tasks to standardize the setup, execution, and reporting of benchmarks is advocated, supported by software tools that help to create and leverage these benchmarking suites.
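For context, fetching such a suite is a one-liner with the `openml` Python package; the sketch below pulls the curated OpenML-CC18 classification suite (network access and the package are assumed).

```python
# Sketch: retrieve a curated benchmarking suite with the openml package
# (pip install openml; requires network access).
import openml

suite = openml.study.get_suite("OpenML-CC18")   # curated classification suite
print(len(suite.tasks), "tasks in the suite")

task = openml.tasks.get_task(suite.tasks[0])    # standardized task definition
X, y = task.get_X_and_y()                       # data behind the task
```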
...
...

References

Showing 1-10 of 141 references
Towards Automatically-Tuned Deep Neural Networks
TLDR
Two versions of Auto-Net are presented, which provide automatically-tuned deep neural networks without any human intervention; empirical results show that ensembling Auto-Net 1.0 with Auto-sklearn can perform better than either approach alone, and that Auto-Net 2.0 can perform even better still.
Auto-sklearn: Efficient and Robust Automated Machine Learning
TLDR
A robust new AutoML system based on the Python machine learning package scikit-learn, which improves on existing AutoML methods by automatically taking into account past performance on similar datasets, and by constructing ensembles from the models evaluated during the optimization.
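Both ingredients named here surface directly in the package's classic interface: warm-starting via meta-learning and post-hoc ensembling over evaluated models. A minimal sketch with toy budgets (parameter names as in the `auto-sklearn` releases we know of):

```python
# Sketch of the classic Auto-sklearn interface; toy time budgets.
from autosklearn.classification import AutoSklearnClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

automl = AutoSklearnClassifier(
    time_left_for_this_task=120,                 # overall budget (seconds)
    per_run_time_limit=30,                       # cap per pipeline evaluation
    initial_configurations_via_metalearning=25,  # meta-learned warm starts
)
automl.fit(X_train, y_train)
print(automl.score(X_test, y_test))  # predictions come from the ensemble
```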
Design of the 2015 ChaLearn AutoML challenge
TLDR
The AutoML contest for IJCNN 2015 challenges participants to solve classification and regression problems without any human intervention, and will push the state of the art in fully automatic machine learning on a wide range of real-world problems.
Analysis of the AutoML Challenge Series 2015-2018
TLDR
This chapter analyzes the results of a machine learning competition of progressive difficulty, which was followed by a one-round AutoML challenge (PAKDD 2018), and provides details about the datasets, which were not revealed to the participants.
Practical Automated Machine Learning for the AutoML Challenge 2018
TLDR
The winning entry to the AutoML challenge 2018 is described, dubbed PoSH Auto-sklearn, which combines an automatically preselected portfolio, ensemble building and Bayesian optimization with successive halving.
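The "SH" half of the recipe, successive halving, fits in a few lines: evaluate many configurations at a small budget, keep the top fraction, and grow the budget until one survives. The `evaluate` function below is a placeholder objective, not a real training run.

```python
# Minimal successive-halving sketch: cull configurations while growing
# the per-configuration budget. `evaluate` is a placeholder objective.
import random

def evaluate(config: float, budget: int) -> float:
    # Stand-in score: noisier at small budgets, as in early training.
    return config + random.gauss(0, 1.0 / budget)

configs = [random.random() for _ in range(27)]  # e.g. a 27-entry portfolio
budget, eta = 1, 3

while len(configs) > 1:
    ranked = sorted(configs, key=lambda c: evaluate(c, budget), reverse=True)
    configs = ranked[: max(1, len(configs) // eta)]  # keep the top 1/eta
    budget *= eta                                    # grow the budget

print("surviving configuration:", configs[0])
```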
Efficient and Robust Automated Machine Learning
TLDR
This work introduces a robust new AutoML system based on scikit-learn, which improves on existing AutoML methods by automatically taking into account past performance on similar datasets, and by constructing ensembles from the models evaluated during the optimization.
DeepLine: AutoML Tool for Pipelines Generation using Deep Reinforcement Learning and Hierarchical Actions Filtering
TLDR
This study presents DeepLine, a reinforcement learning-based approach for automatic pipeline generation that utilizes an efficient representation of the search space together with a novel method for operating in environments with large and dynamic action spaces.
Automatic Frankensteining: Creating Complex Ensembles Autonomously
TLDR
This work presents an approach that automates the process of creating a top-performing ensemble of several layers, different algorithms, and hyperparameter configurations, called Frankenstein ensembles, and shows that it outperforms existing approaches on the majority of datasets within the same training time.
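The layered-ensemble idea maps onto scikit-learn's stacking primitive; a two-layer sketch with arbitrary base learners (the models chosen here are illustrative, not the paper's automatically discovered configuration):

```python
# Two-layer stacked ensemble in scikit-learn; the base learners and
# meta-learner are illustrative, not an automatically found ensemble.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

layer_one = [("rf", RandomForestClassifier(random_state=0)),
             ("svc", SVC(probability=True, random_state=0))]
stack = StackingClassifier(estimators=layer_one,
                           final_estimator=LogisticRegression(max_iter=1000))

print(cross_val_score(stack, X, y, cv=3).mean())
```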
Auto-PyTorch: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL
TLDR
This paper introduces Auto-PyTorch, which combines multi-fidelity optimization with portfolio construction for warmstarting and ensembling of deep neural networks (DNNs) and common baselines for tabular data to enable fully automated deep learning (AutoDL).
Autostacker: a compositional evolutionary learning system
TLDR
An automatic machine learning modeling architecture called Autostacker is introduced, which combines an innovative hierarchical stacking architecture and an evolutionary algorithm to perform efficient parameter search without the need for prior domain knowledge about the data or feature preprocessing.
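The evolutionary half can be caricatured as a (1+1)-style loop: mutate a configuration and keep the mutant only if it scores better. The single-hyperparameter search space below is a toy stand-in for Autostacker's space of whole stacked pipelines.

```python
# Toy (1+1) evolutionary search over one hyperparameter, standing in
# for Autostacker's evolutionary search over stacked pipelines.
import random
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)

def fitness(depth: int) -> float:
    model = RandomForestClassifier(max_depth=depth, n_estimators=50,
                                   random_state=0)
    return cross_val_score(model, X, y, cv=3).mean()

depth, score = 3, fitness(3)
for _ in range(10):
    mutant = max(1, depth + random.choice([-2, -1, 1, 2]))  # mutate
    mutant_score = fitness(mutant)
    if mutant_score > score:                                # select
        depth, score = mutant, mutant_score

print(f"best max_depth={depth}, cv accuracy={score:.3f}")
```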
...
...