PMLB: a large benchmark suite for machine learning evaluation and comparison
@article{Olson2017PMLBAL,
  title={PMLB: a large benchmark suite for machine learning evaluation and comparison},
  author={R. S. Olson and W. L. Cava and P. Orzechowski and R. Urbanowicz and J. Moore},
  journal={BioData Mining},
  year={2017},
  volume={10}
}
Background: The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists…
Supplemental Code
PMLB: A large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms.
133 Citations
Benchmark AFLOW Data Sets for Machine Learning
- Computer Science
- Integrating Materials and Manufacturing Innovation
- 2020
- 3
ShinyLearner: A containerized benchmarking tool for machine-learning classification of tabular data
- Computer Science, Medicine
- GigaScience
- 2020
- 1
ShinyLearner: A containerized benchmarking tool for machine-learning classification of tabular data
- Computer Science, Biology
- 2019
TPOT: A Tree-based Pipeline Optimization Tool for Automating Machine Learning
- Computer Science
- AutoML@ICML
- 2016
- 175
Scaling tree-based automated machine learning to biomedical big data with a dataset selector
- Biology
- 2018
- 1
Efficient and Robust Model Benchmarks with Item Response Theory and Adaptive Testing
- Computer Science
- Int. J. Interact. Multim. Artif. Intell.
- 2021
Where are we now?: a large benchmark study of recent symbolic regression methods
- Computer Science
- GECCO
- 2018
- 39
Evolutionary dataset optimisation: learning algorithm quality through evolution
- Computer Science
- Applied Intelligence
- 2019
- 3
References
A Comprehensive Dataset for Evaluating Approaches of Various Meta-learning Tasks
- Computer Science
- ICPRAM
- 2012
- 19
ExSTraCS 2.0: description and evaluation of a scalable learning classifier system
- Computer Science, Medicine
- Evol. Intell.
- 2015
- 58
A balanced accuracy function for epistasis modeling in imbalanced datasets using multifactor dimensionality reduction
- Computer Science, Medicine
- Genetic epidemiology
- 2007
- 315