Corpus ID: 207870445

RankML: a Meta Learning-Based Approach for Pre-Ranking Machine Learning Pipelines

Doron Laadan, Roman Vainshtein, Yarden Curiel, Gilad Katz, Lior Rokach
The explosion of digital data has created multiple opportunities for organizations and individuals to leverage machine learning (ML) to transform the way they operate. However, the shortage of experts in the field of machine learning - data scientists - is often a setback to the use of ML. In an attempt to alleviate this shortage, multiple approaches for the automation of machine learning have been proposed in recent years. While these approaches are effective, they often require a great deal… 


MetaTPOT: Enhancing A Tree-based Pipeline Optimization Tool Using Meta-Learning

MetaTPOT is proposed, an enhanced variant of TPOT that uses a meta learning-based approach to predict the performance of TPOT's pipeline candidates and leverages domain knowledge, in the form of pipeline pre-ranking, to improve TPOT's speed and performance.

Autoencoder-kNN meta-model based data characterization approach for an automated selection of AI algorithms

A new Autoencoder-kNN (AeKNN) based meta-model with built-in latent feature extraction is proposed; the results show that AeKNN offers considerable performance improvements over classical kNN as well as traditional meta-models.

A Meta Learning-Based Approach for Zero-Shot Co-Training

This work proposes Co-training using Meta-learning (CoMet), a novel approach that addresses many of the shortcomings of existing co-training methods and employs a meta-learning approach that enables it to leverage insights from previously-evaluated datasets and apply these insights to other datasets.

Towards the Automation of Industrial Data Science: A Meta-learning based Approach

A meta-learning based approach is proposed that may serve as an effective decision support system for the AutoML process and may help to better control such data evolution.

Zero-Shot AutoML with Pretrained Models

This work learns a zero-shot surrogate model which, at test time, selects the right deep learning pipeline for a new dataset D given only trivial meta-features describing D, such as image resolution or the number of classes.



Probabilistic Matrix Factorization for Automated Machine Learning

This paper uses probabilistic matrix factorization techniques and acquisition functions from Bayesian optimization to identify high-performing pipelines across a wide range of datasets, significantly outperforming the current state-of-the-art.

AutoGRD: Model Recommendation Through Graphical Dataset Representation

AutoGRD first represents datasets as graphs and then extracts their latent representation that is used to train a ranking meta-model capable of accurately recommending top-performing algorithms for previously unseen datasets, which outperforms state-of-the-art meta-learning and Bayesian methods.

Ranking Learning Algorithms: Using IBL and Meta-Learning on Accuracy and Time Results

A meta-learning method that uses a k-Nearest Neighbor algorithm to identify the datasets that are most similar to the one at hand and leads to significantly better rankings than the baseline ranking method.

TPOT: A Tree-based Pipeline Optimization Tool for Automating Machine Learning

This chapter presents TPOT v0.3, an open source genetic programming-based AutoML system that optimizes a series of feature preprocessors and machine learning models with the goal of maximizing classification accuracy on a supervised classification task.

A Hybrid Approach for Automatic Model Recommendation

This work presents AutoDi, a novel and resource-efficient approach for model selection that combines two sources of information: metafeatures extracted from the data itself and word-embedding features extracted from a large corpus of academic publications that enables AutoDi to select top-performing algorithms both for widely and rarely used datasets.

Efficient and Robust Automated Machine Learning

This work introduces a robust new AutoML system based on scikit-learn, which improves on existing AutoML methods by automatically taking into account past performance on similar datasets, and by constructing ensembles from the models evaluated during the optimization.

Autostacker: a compositional evolutionary learning system

An automatic machine learning modeling architecture called Autostacker is introduced that combines an innovative hierarchical stacking architecture and an evolutionary algorithm to perform efficient parameter search without the need for prior domain knowledge about the data or feature preprocessing.

Learning to rank: from pairwise approach to listwise approach

This work proposes that learning to rank adopt the listwise approach, in which lists of objects are used as 'instances' in learning, and introduces two probability models, referred to as permutation probability and top k probability, to define a listwise loss function for learning.
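
As a rough illustration of the two probability models named in this summary, the sketch below assumes the commonly cited Plackett-Luce form with an exponential score transform and uses the k = 1 case of the top k probability. The summary itself gives no formulas, so the function names, the choice of `exp`, and the cross-entropy loss are assumptions rather than the paper's exact definitions.

```python
import math
from itertools import permutations


def permutation_probability(scores, perm):
    """Probability of the ordering `perm` (item indices, best first).

    Assumed Plackett-Luce form: at each position, the chosen item's
    exp(score) is divided by the sum over all items not yet placed.
    """
    prob = 1.0
    remaining = list(perm)
    for idx in perm:
        denom = sum(math.exp(scores[k]) for k in remaining)
        prob *= math.exp(scores[idx]) / denom
        remaining.remove(idx)
    return prob


def top_one_probabilities(scores):
    """Top-1 probability of each item, which reduces to a softmax over scores."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(true_scores, predicted_scores):
    """Cross entropy between the top-1 distributions of two score lists."""
    p_true = top_one_probabilities(true_scores)
    p_pred = top_one_probabilities(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))
```

Under these assumptions, the permutation probabilities over all orderings of a list sum to one, and the top-1 probability of an item equals the total probability of the permutations that place it first, which is why the loss can be computed without enumerating permutations.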

Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms

This work considers the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that attacks these issues separately, and shows classification performance that is often much better than that of standard selection and hyperparameter optimization methods.

Practical Automated Machine Learning for the AutoML Challenge 2018

The winning entry to the AutoML challenge 2018 is described, dubbed PoSH Auto-sklearn, which combines an automatically preselected portfolio, ensemble building and Bayesian optimization with successive halving.