AutoGRD: Model Recommendation Through Graphical Dataset Representation

@article{CohenShapira2019AutoGRDMR,
  title={AutoGRD: Model Recommendation Through Graphical Dataset Representation},
  author={Noy Cohen-Shapira and Lior Rokach and Bracha Shapira and Gilad Katz and Roman Vainshtein},
  journal={Proceedings of the 28th ACM International Conference on Information and Knowledge Management},
  year={2019}
}
The widespread use of machine learning algorithms and the high level of expertise required to utilize them have fuelled the demand for solutions that can be used by non-experts. One of the main challenges non-experts face in applying machine learning to new problems is algorithm selection - the identification of the algorithm(s) that will deliver top performance for a given dataset, task, and evaluation measure. We present AutoGRD, a novel meta-learning approach for algorithm recommendation… 
RankML: a Meta Learning-Based Approach for Pre-Ranking Machine Learning Pipelines
TLDR
This study proposes RankML, a meta-learning based approach for predicting the performance of whole machine learning pipelines, and shows that it achieves results that are equal to those of state-of-the-art, computationally heavy approaches.
LEGION: Visually compare modeling techniques for regression
People construct machine learning (ML) models for various use cases in varied domains such as healthcare, finance, and public policy. In doing so they aim to improve a model’s performance by…
Assassin: an Automatic claSSificAtion system baSed on algorithm SelectIoN
TLDR
This work develops Assassin, a system that can automatically extract experiences from previous tasks and train a meta-classifier to implement algorithm recommendations by embedding meta-learning techniques and reinforced policy.
DJEnsemble: a Cost-Based Selection and Allocation of a Disjoint Ensemble of Spatio-temporal Models
TLDR
DJEnsemble is presented, a cost-based strategy for the automatic selection and allocation of a disjoint ensemble of black-box predictors to answer predictive spatio-temporal queries.
Correlation-driven framework based on graph convolutional network for clinical disease classification
TLDR
A novel sample-connection-driven framework named RFG-GCN is proposed. It employs a random forest based graph generation algorithm (RFG) to convert structured data into graph data that captures the correlation between samples, and the results show that classification performance is significantly improved compared with other methods.
Towards the Automation of Industrial Data Science: A Meta-learning based Approach
TLDR
A meta-learning based approach is proposed that may serve as an effective decision support system for the AutoML process and may help to better control such data evolution.

References

A Hybrid Approach for Automatic Model Recommendation
TLDR
This work presents AutoDi, a novel and resource-efficient approach for model selection that combines two sources of information: metafeatures extracted from the data itself and word-embedding features extracted from a large corpus of academic publications. This combination enables AutoDi to select top-performing algorithms for both widely and rarely used datasets.
Ranking Learning Algorithms: Using IBL and Meta-Learning on Accuracy and Time Results
TLDR
A meta-learning method that uses a k-Nearest Neighbor algorithm to identify the datasets that are most similar to the one at hand and leads to significantly better rankings than the baseline ranking method.
Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms
TLDR
This work considers the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that attacks these issues separately and shows classification performance often much better than using standard selection and hyperparameter optimization methods.
Initializing Bayesian Hyperparameter Optimization via Meta-Learning
TLDR
This paper mimics a strategy human domain experts use: speed up optimization by starting from promising configurations that performed well on similar datasets, and substantially improves the state of the art for the more complex combined algorithm selection and hyperparameter optimization problem.
Improved Dataset Characterisation for Meta-learning
TLDR
New measures, based on the induced decision tree, are proposed to characterise datasets for meta-learning in order to select appropriate learning algorithms; their effectiveness is illustrated through extensive experiments.
Learning to rank: from pairwise approach to listwise approach
TLDR
It is proposed that learning to rank should adopt the listwise approach, in which lists of objects are used as 'instances' in learning; two probability models, referred to as permutation probability and top k probability, are introduced to define a listwise loss function for learning.
ExploreKit: Automatic Feature Generation and Selection
TLDR
This work presents ExploreKit, a framework for automated feature generation that uses a novel machine learning-based feature selection approach to predict the usefulness of new candidate features and can achieve classification-error reduction of 20% overall.
Automatic classifier selection for non-experts
TLDR
This paper empirically evaluates five different categories of state-of-the-art meta-features for their suitability in predicting classification accuracies of several widely used classifiers and develops the first open source meta-learning system that is capable of accurately predicting accuracies of target classifiers.
DeepWalk: online learning of social representations
TLDR
DeepWalk is an online learning algorithm that builds useful incremental results and is trivially parallelizable, properties that make it suitable for a broad class of real world applications such as network classification and anomaly detection.
node2vec: Scalable Feature Learning for Networks
TLDR
In node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks, a flexible notion of a node's network neighborhood is defined and a biased random walk procedure is designed, which efficiently explores diverse neighborhoods.