PyRelationAL: A Library for Active Learning Research and Development

Authors: Paul Scherer, Thomas Gaudelet, Alison Pouplin, S SurajM, Jyothish Soman, Lindsay Edwards, Jake P. Taylor-King

In constrained real-world scenarios where it is challenging or costly to generate data, disciplined methods for acquiring informative new data points are of fundamental importance for the efficient training of machine learning (ML) models. Active learning (AL) is a subfield of ML focused on the development of methods to iteratively and economically acquire data by strategically querying the new data points that are most useful for a particular task. Here, we introduce PyRelationAL, an open… 
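
The iterative querying described in the abstract can be illustrated with a minimal pool-based loop. This is a sketch under stated assumptions, not PyRelationAL's API: the names `fit_threshold`, `active_learning_loop`, and `oracle` are hypothetical, the model is a toy one-dimensional threshold classifier, and the query rule is plain uncertainty sampling (label the pool point closest to the current decision boundary).

```python
from typing import Callable, List, Sequence, Tuple

Labelled = List[Tuple[float, int]]


def fit_threshold(labelled: Labelled) -> float:
    """Toy model: boundary halfway between the largest negative and smallest positive."""
    largest_negative = max(x for x, y in labelled if y == 0)
    smallest_positive = min(x for x, y in labelled if y == 1)
    return (largest_negative + smallest_positive) / 2.0


def active_learning_loop(
    pool: Sequence[float],
    oracle: Callable[[float], int],
    seed: Labelled,
    budget: int,
) -> Tuple[float, Labelled]:
    """Repeatedly refit, query the most uncertain pool point, and label it."""
    unlabelled = list(pool)
    labelled = list(seed)
    for _ in range(budget):
        if not unlabelled:
            break
        threshold = fit_threshold(labelled)
        # Uncertainty sampling: the point nearest the boundary is least certain.
        query = min(unlabelled, key=lambda x: abs(x - threshold))
        unlabelled.remove(query)
        labelled.append((query, oracle(query)))  # the costly labelling step
    return fit_threshold(labelled), labelled


# Hypothetical setup: the oracle hides a true boundary at 0.62.
oracle = lambda x: int(x >= 0.62)
pool = [i / 100 for i in range(1, 100)]
seed = [(0.0, 0), (1.0, 1)]  # one labelled example per class
estimate, labelled = active_learning_loop(pool, oracle, seed, budget=8)
```

With eight queries out of 99 candidates, the estimated boundary falls between the two pool points that straddle the true one, whereas labelling points in arbitrary order would typically need far more labels to get as close.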



ALiPy: Active Learning in Python

ALiPy, a Python toolbox for active learning, provides a module-based implementation of the active learning framework that allows users to conveniently evaluate, compare, and analyze the performance of active learning methods.

A Comparative Survey: Benchmarking for Pool-based Active Learning

This paper surveys and compares AL strategies from both recently proposed and classic, highly cited methods, proposes a benchmark for pool-based AL methods across a variety of datasets and quantitative metrics, and draws insights from the comparative empirical results.

Bayesian active learning for production, a systematic study and a reusable library

This work presents a systematic study of how the most common issues of real-world datasets affect the deep active learning process: model convergence, annotation error, and dataset imbalance.

Less is more: sampling chemical space with active learning

This work presents a fully automated approach for generating datasets intended for training universal ML potentials. It is based on active learning (AL) via Query by Committee (QBC), which uses the disagreement between an ensemble of ML potentials to infer the reliability of the ensemble's prediction.
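
The Query by Committee idea summarized above can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation: the function names are hypothetical, the committee members are toy linear functions rather than ML potentials, and disagreement is assumed to be the population standard deviation of the members' predictions.

```python
import statistics
from typing import Callable, List, Sequence

Model = Callable[[float], float]


def committee_disagreement(committee: Sequence[Model], x: float) -> float:
    """Spread of the committee's predictions at x (assumed disagreement measure)."""
    predictions = [model(x) for model in committee]
    return statistics.pstdev(predictions)


def select_queries(committee: Sequence[Model], pool: Sequence[float], k: int) -> List[float]:
    """Return the k pool points on which the committee disagrees most."""
    ranked = sorted(
        pool,
        key=lambda x: committee_disagreement(committee, x),
        reverse=True,
    )
    return ranked[:k]


# Toy committee: the members agree at x = 0 and diverge as |x| grows.
committee = [lambda x: 1.0 * x, lambda x: 1.1 * x, lambda x: 0.9 * x]
pool = [0.0, 0.5, 2.0, -3.0]
queries = select_queries(committee, pool, k=2)
```

High disagreement marks inputs where the ensemble's prediction is least reliable, so those inputs are the ones sent for labelling. In this toy case the two points farthest from zero are selected.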

A benchmark and comparison of active learning for logistic regression

Pool-Based Sequential Active Learning for Regression

Dongrui Wu · Computer Science · IEEE Transactions on Neural Networks and Learning Systems · 2019

This paper proposes a new active learning for regression (ALR) approach using passive sampling, which considers both representativeness and diversity in the initialization and in subsequent iterations, and can be integrated with existing ALR approaches in the literature to further improve performance.

modAL: A modular active learning framework for Python

modAL is a modular active learning framework for Python that aims to make active learning research and practice simpler. Its distinguishing features are (i) clear and modular object-oriented design (ii)

Incorporating Diversity in Active Learning with Support Vector Machines

This work presents a new approach designed specifically for constructing batches. It incorporates a diversity measure with low computational requirements, making it feasible for large-scale problems with several thousand examples.
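
One way to combine uncertainty with a cheap diversity measure when building a batch is a greedy selection like the one below. The scoring rule, the `trade_off` parameter, and all function names are assumptions made for illustration and are not taken from the paper: each step picks the candidate that minimizes a weighted sum of its absolute margin (distance to the decision boundary) and its largest absolute cosine similarity to the points already in the batch.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_diverse_batch(
    candidates: Sequence[Vector],
    margins: Sequence[float],
    batch_size: int,
    trade_off: float = 0.5,
) -> List[int]:
    """Greedily pick indices that are both uncertain and dissimilar to the batch so far."""
    chosen: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(chosen) < batch_size:

        def score(i: int) -> float:
            # Largest similarity to any already chosen point (0.0 for the first pick).
            similarity = max(
                (abs(cosine(candidates[i], candidates[j])) for j in chosen),
                default=0.0,
            )
            return trade_off * abs(margins[i]) + (1.0 - trade_off) * similarity

        best = min(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical data: points 0 and 1 are nearly parallel, point 2 is orthogonal to point 0.
candidates = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]
margins = [0.1, 0.12, 0.3]  # smaller absolute margin = more uncertain
batch = select_diverse_batch(candidates, margins, batch_size=2)
uncertainty_only = select_diverse_batch(candidates, margins, batch_size=2, trade_off=1.0)
```

With `trade_off=1.0` the rule reduces to picking the two smallest margins, which selects the near-duplicate pair. The default weighting instead swaps the second pick for the orthogonal point, at the cost of one cosine per candidate and chosen point.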

JCLAL: A Java Framework for Active Learning

JCLAL is a Java Class Library for Active Learning which has an architecture that follows strong principles of object-oriented design, and it allows the developers to adapt, modify and extend the framework according to their needs.

Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development

This work introduces Therapeutics Data Commons (TDC), the first unifying platform to systematically access and evaluate machine learning across the entire range of therapeutics. The authors envision that TDC can facilitate algorithmic and scientific advances and considerably accelerate machine learning model development, validation, and transition into biomedical and clinical implementation.