GAMA: a General Automated Machine learning Assistant

P. J. A. Gijsbers and Joaquin Vanschoren
The General Automated Machine learning Assistant (GAMA) is a modular AutoML system developed to empower users to track and control how AutoML algorithms search for optimal machine learning pipelines, and to facilitate AutoML research itself. In contrast to current, often black-box systems, GAMA allows users to plug in different AutoML and post-processing techniques, logs and visualizes the search process, and supports easy benchmarking. It currently features three AutoML search algorithms, two…
1 Citation
Online AutoML: An adaptive AutoML framework for online learning
This work designs an adaptive Online Automated Machine Learning system that combines the inherent adaptation capabilities of online learners with the fast automated pipeline (re)optimization capabilities of AutoML, and evaluates asynchronous genetic programming and asynchronous successive halving as methods for continually optimizing these pipelines.


Ensemble selection from libraries of models
A method is presented for constructing ensembles from libraries of thousands of models, using forward stepwise selection to optimize a performance metric such as accuracy, cross entropy, mean precision, or ROC area.
An Open Source AutoML Benchmark
This work introduces an open, ongoing, and extensible benchmark framework that follows best practices and avoids common mistakes, and uses it to conduct a thorough comparison of 4 AutoML systems across 39 datasets and to analyze the results.
GAMA: Genetic Automated Machine learning Assistant
GAMA is an AutoML package for end-users and AutoML researchers. It uses genetic programming to efficiently generate optimized machine learning pipelines given specific input data and resource
A System for Massively Parallel Hyperparameter Tuning
This work introduces a simple and robust hyperparameter optimization algorithm called ASHA, which exploits parallelism and aggressive early stopping to tackle large-scale hyperparameter optimization problems, and shows that ASHA outperforms existing state-of-the-art hyperparameter optimization methods.
ML-Plan: Automated machine learning via hierarchical planning
An extensive series of experiments comparing ML-Plan to the state-of-the-art frameworks Auto-WEKA, auto-sklearn, and TPOT shows that ML-Plan is highly competitive and often outperforms existing approaches to AutoML.
Massively Parallel Hyperparameter Tuning
This work introduces the large-scale regime for parallel hyperparameter tuning, where one needs to evaluate orders of magnitude more configurations than available parallel workers in a small multiple of the wall-clock time needed to train a single model.
Automating Biomedical Data Science Through Tree-Based Pipeline Optimization
This work implements a Tree-based Pipeline Optimization Tool (TPOT) and shows that TPOT can build machine learning pipelines that achieve competitive classification accuracy and discover novel pipeline operators—such as synthetic feature constructors—that significantly improve classification accuracy on these data sets.
Efficient and Robust Automated Machine Learning
This work introduces a robust new AutoML system based on scikit-learn, which improves on existing AutoML methods by automatically taking into account past performance on similar datasets, and by constructing ensembles from the models evaluated during the optimization.
A fast and elitist multiobjective genetic algorithm: NSGA-II
This paper proposes a non-dominated sorting-based multi-objective evolutionary algorithm (MOEA), called NSGA-II (Non-dominated Sorting Genetic Algorithm II), which alleviates three difficulties of earlier algorithms of this kind, and modifies the definition of dominance in order to solve constrained multi-objective problems efficiently.
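
The non-dominated sorting mentioned in the NSGA-II entry can be illustrated with a minimal Python sketch. It is not the paper's fast sorting procedure and leaves out the rest of NSGA-II (such as diversity preservation); it simply peels off successive Pareto fronts, assuming every objective is minimized. The function names and the example scores (error rate, pipeline length) are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Group point indices into successive Pareto fronts (naive peeling)."""
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        # A point belongs to the current front if no remaining point dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical candidates scored as (error rate, pipeline length), both minimized.
scores = [(0.10, 5), (0.20, 2), (0.15, 6), (0.30, 1), (0.25, 4)]
fronts = non_dominated_sort(scores)
print(fronts)  # [[0, 1, 3], [2, 4]]
```

Candidates 0, 1, and 3 form the first front because each is best on some trade-off between the two objectives; candidates 2 and 4 are each dominated by a first-front member and fall into the second front.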