Jan N. van Rijn

Learn More
Many sciences have made significant breakthroughs by adopting online tools that help organize, structure and mine information that is too detailed to be printed in journals. In this paper, we introduce OpenML, a place for machine learning researchers to share and organize data in fine detail, so that they can work more effectively, be more visible, and(More)
Identifying the best machine learning algorithm for a given problem continues to be an active area of research. In this paper we present a new method which exploits both meta-level information acquired in past experiments and active testing, an algorithm selection strategy. Active testing attempts to iteratively identify an algorithm whose performance will(More)
OpenML is an online platform where scientists can automatically log and share machine learning data sets, code, and experiments, organize them online, and build directly on the work of others. It helps to automate many tedious aspects of research, is readily integrated into several machine learning tools, and offers easy-to-use APIs. It also enables(More)
We present OpenML, a novel open science platform that provides easy access to machine learning data, software and results to encourage further study and application. It organizes all submitted results online so they can be easily found and reused, and features a web API which is being integrated in popular machine learning tools such asWeka, KNIME,(More)
We explore the possibilities of meta-learning on data streams, in particular algorithm selection. In a first experiment we calculate the characteristics of a small sample of a data stream, and try to predict which classifier performs best on the entire stream. This yields promising results and interesting patterns. In a second experiment, we build a(More)
We present a RapidMiner extension for OpenML, an open science platform for sharing machine learning datasets, algorithms and experiments. In order to share machine learning experiments as easily as possible, it is being integrated into various popular data mining and machine learning tools, including RapidMiner. Through this plugin, data can be downloaded,(More)
Ensembles of classifiers are among the best performing classifiers available in many data mining applications. However, most ensembles developed specifically for the dynamic data stream setting rely on only one type of base-level classifier, most often Hoeffding Trees. In this paper, we study the use of heterogeneous ensembles, comprised of fundamentally(More)