On the predictive power of meta-features in OpenML

@article{Bilalli2017OnTP,
  title={On the predictive power of meta-features in OpenML},
  author={Besim Bilalli and A. Abell{\'o} and Tom{\`a}s Aluja-Banet},
  journal={International Journal of Applied Mathematics and Computer Science},
  year={2017},
  volume={27},
  pages={697--712}
}
Abstract The demand for performing data analysis is steadily rising. As a consequence, people of different profiles (i.e., non-experienced users) have started to analyze their data. However, this is challenging for them. A key step that poses difficulties and determines the success of the analysis is data mining (the model/algorithm selection problem). Meta-learning is a technique used for assisting non-expert users in this step. The effectiveness of meta-learning is, however, largely dependent on…

Characterizing classification datasets: a study of meta-features for meta-learning.

This paper presents MFE, a new tool for extracting meta-features from datasets and identifying more subtle reproducibility issues in the literature, proposing guidelines for data characterization that strengthen reproducible empirical research in meta-learning.

A Decision Support Framework for AutoML Systems: A Meta-Learning Approach

In general, the process of building a high-quality machine learning model is an iterative, complex and time-consuming process that involves exploring the performance of various machine learning

Towards Explainable Meta-learning

This paper proposes techniques developed for eXplainable Artificial Intelligence (XAI) to examine and extract knowledge from black-box surrogate models and is the first paper that shows how post-hoc explainability can be used to improve the meta-learning.

Towards meta-learning for multi-target regression problems

A meta-learning system to recommend the best predictive method for a given multi-target regression problem with a balanced accuracy superior to 70% using a Random Forest meta-model, which statistically outperformed the meta-learning baselines.

Towards Meta-Learning for Multi-Target Regression Problems

Results showed that induced meta-models were able to recommend the best method for different base level datasets with a balanced accuracy superior to 70% using a Random Forest meta-model, which statistically outperformed the meta-learning baselines.

PRESISTANT: Data Pre-processing Assistant

Leveraging ideas from meta-learning, PRESISTANT is capable of assisting the user by recommending pre-processing operators that ultimately improve the classification performance and proposes candidate transformations to improve the result of the analysis.

PRESISTANT: Learning based assistant for data pre-processing

When DevOps Meets Meta-Learning: A Portfolio to Rule them all

A set of criteria to be respected is presented, and a pipeline-based meta-model is proposed to support a DevOps approach in the context of Machine Learning Pipelines and to ensure continuity between Dev and Ops.

Discretization of numerical meta-features into categorical: analysis of educational and business data sets

The objective of this study is to discretize numerical meta-features into categorical values, and a survey of significant discretization methods is provided.

References

Meta-data: Characterization of Input Features for Meta-learning

This paper focuses on the characterization of meta-data, through an analysis of meta-features that can capture the properties of specific tasks to be solved at base level, which represents a first step toward the development of a meta-learning system capable of suggesting the proper bias for base-learning different specific task domains.

Intelligent assistance for data pre-processing

Improved Dataset Characterisation for Meta-learning

New measures, based on the induced decision tree, to characterise datasets for meta-learning in order to select appropriate learning algorithms, and their effectiveness is illustrated through extensive experiments.

A survey of intelligent assistants for data analysis

The types of help IDAs can provide to users and the kinds of (background) knowledge they leverage to provide this help are explicated.

Towards Intelligent Data Analysis: The Metadata Challenge

This paper presents a comprehensive classification of all the metadata required to provide user support in KDD and presents the implementation of a metadata repository for storing and managing this metadata and explains its benefits in a real Big Data analytics project.

Metalearning - Applications to Data Mining

This book discusses several approaches to obtaining knowledge concerning the performance of machine learning and data mining algorithms and shows how this knowledge can be reused to select, combine, compose and adapt both algorithms and models to yield faster, more effective solutions to data mining problems.

Experiments in Meta-level Learning with ILP

The aim of meta-level learning is to relate the performance of different machine learning algorithms to the characteristics of the dataset; this relation is induced on the basis of empirical data about the performance on the different datasets.

Feature Selection for Meta-learning

This work discovered that the best set of discriminating attributes is different for every pair of inducers, and applied a feature selection method on the meta-learning problems, to get the best set of attributes for each problem.

Estimating the Predictive Accuracy of a Classifier

It is shown that it is possible to estimate classifier performance with a high degree of confidence and gain knowledge about the classifier through the regression models generated and exploit the results of the models to predict the ranking of the inducers.

Characterizing the Applicability of Classification Algorithms Using Meta-Level Learning

It is shown that machine learning methods themselves can be used in organizing this knowledge and that the method is viable and useful.