• Corpus ID: 22323514

modAL: A modular active learning framework for Python

Tivadar Danka, Péter Horváth
modAL is a modular active learning framework for Python that aims to make active learning research and practice simpler. Its distinguishing features are (i) a clear, modular, object-oriented design and (ii) full compatibility with scikit-learn models and workflows. These features make fast prototyping and easy extensibility possible, aiding the development of both real-life active learning pipelines and novel algorithms. modAL is fully open source and hosted on GitHub. To assure code… 
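The abstract describes the design only at a high level. As a rough illustration of what a modular, scikit-learn-style pool-based loop can look like, here is a minimal sketch in plain Python. It does not use modAL's actual API: `CentroidClassifier`, `least_confidence`, and `active_learning_loop` are hypothetical names chosen for this example, and the toy estimator only mimics the `fit` / `predict_proba` convention so that the estimator and the query strategy can be swapped independently.

```python
import math


class CentroidClassifier:
    """Toy estimator (hypothetical) with a scikit-learn-style interface."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = {}
        for c in self.classes_:
            rows = [x for x, label in zip(X, y) if label == c]
            # Per-feature mean of all rows labelled c.
            self.centroids_[c] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def _distance(self, a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    def predict_proba(self, X):
        probs = []
        for x in X:
            # Softmax over negative distances to each class centroid.
            scores = [math.exp(-self._distance(x, self.centroids_[c]))
                      for c in self.classes_]
            total = sum(scores)
            probs.append([s / total for s in scores])
        return probs

    def predict(self, X):
        return [self.classes_[p.index(max(p))] for p in self.predict_proba(X)]


def least_confidence(estimator, X_pool):
    """Query strategy: index of the pool item with the lowest top-class probability."""
    confidences = [max(p) for p in estimator.predict_proba(X_pool)]
    return confidences.index(min(confidences))


def active_learning_loop(estimator, query_strategy, X_train, y_train,
                         X_pool, oracle, n_queries):
    """Fit, query the most informative pool item, ask the oracle, refit."""
    X_train, y_train, X_pool = list(X_train), list(y_train), list(X_pool)
    estimator.fit(X_train, y_train)
    for _ in range(n_queries):
        if not X_pool:
            break
        idx = query_strategy(estimator, X_pool)
        x = X_pool.pop(idx)          # move the queried item out of the pool
        X_train.append(x)
        y_train.append(oracle(x))    # the oracle stands in for a human labeller
        estimator.fit(X_train, y_train)
    return estimator, X_train, y_train


# Illustrative 1-D data: the (assumed) true boundary is at x = 0.
model, X_lab, y_lab = active_learning_loop(
    CentroidClassifier(), least_confidence,
    X_train=[[-2.0], [2.0]], y_train=[0, 1],
    X_pool=[[-1.5], [-0.1], [0.2], [1.7]],
    oracle=lambda x: int(x[0] > 0), n_queries=2,
)
print(X_lab, y_lab)
```

Because the loop only relies on `fit` and `predict_proba`, any estimator following that convention could be dropped in, and the query strategy is just a function that returns a pool index.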
Small-text: Active Learning for Text Classification in Python
We present small-text, a simple modular active learning library, which offers pool-based active learning for text classification in Python. It comes with various pre-implemented state-of-the-art…
AstronomicAL: An interactive dashboard for visualisation, integration and classification of data using Active Learning
AstronomicAL is a human-in-the-loop interactive labelling and training dashboard that allows users to create reliable datasets and robust classifiers using active learning. It is also a tool for experimenting with both domain-specific classifications and more general-purpose machine learning strategies.
Addressing practical challenges in Active Learning via a hybrid query strategy
This work presents a hybrid query strategy-based AL framework that addresses three practical challenges simultaneously: cold start, oracle uncertainty, and performance evaluation of the active learner in the absence of ground truth.
Practical considerations for active machine learning in drug discovery.
  D. Reker · Computer Science, Medicine · Drug Discovery Today: Technologies · 2019
This review recapitulates key findings from previous active learning studies to highlight the challenges and opportunities of applying adaptive machine learning to drug discovery and provides insights for scientists planning to implement active learning workflows in their discovery pipelines.
Practical Galaxy Morphology Tools from Deep Supervised Representation Learning
This work shows that deep learning models trained to answer every Galaxy Zoo DECaLS question learn meaningful semantic representations of galaxies that are useful for new tasks on which the models were never trained, and exploits these representations to outperform existing approaches at several practical tasks crucial for investigating large galaxy samples.
QuAX: Mining the Web for High-utility FAQ
This work proposes QuAX, a framework for extracting high-utility (i.e., general and self-contained) domain-specific FAQ lists from the Web. It receives a set of keywords from a user and works in a pipelined fashion to find relevant web pages and extract general and self-contained question-answer pairs.
Active Learning Query Strategies for Classification, Regression, and Clustering: A Survey
This survey reviews AL query strategies for classification, regression, and clustering under the pool-based AL scenario and presents a comparative analysis of these strategies.
ZeroER: Entity Resolution using Zero Labeled Examples
This paper investigates an important problem that vexes practitioners: is it possible to design an effective algorithm for entity resolution (ER) that requires zero labeled examples yet achieves performance comparable to supervised approaches? It presents an approach dubbed ZeroER.
Active Learning for Arabic Text Classification
Active Learning explores the use of minimal human intervention to improve the efficiency of supervised machine learning algorithms during the learning/training phase. Active learning improves machine…
Unsupervised Instance Selection with Low-Label, Supervised Learning for Outlier Detection
This work investigates the unsupervised instance selection (UNISEL) technique followed by a Random Forest (RF) classifier on 10 outlier detection datasets under low-label conditions, and also examines the combination of UNISEL and active learning (AL).


libact: Pool-based Active Learning in Python
libact is a Python package that implements several popular active learning strategies and also features the active-learning-by-learning meta-algorithm, which helps users automatically select the best strategy on the fly.
API design for machine learning software: experiences from the scikit-learn project
The simple and elegant interface shared by all learning and processing units in the Scikit-learn library is described and its advantages in terms of composition and reusability are discussed.
JCLAL: A Java Framework for Active Learning
JCLAL is a Java Class Library for Active Learning whose architecture follows strong principles of object-oriented design, allowing developers to adapt, modify, and extend the framework according to their needs.
Scikit-learn: Machine Learning in Python
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing…
Improving generalization with active learning
A formalism for active concept learning called selective sampling is described and it is shown how it may be approximately implemented by a neural network.
On Active Learning in Multi-label Classification
A novel active learning strategy for reducing the labeling effort is proposed and an experimental study is conducted on the well-known Reuters-21578 text categorization benchmark dataset to demonstrate the efficiency of this approach.
Active Learning Strategies for Multi-Label Text Classification
A number of realistic strategies for tackling active learning for multi-label classification are examined, each consisting of a rule for combining the outputs returned by the individual binary classifiers as a result of classifying a given unlabeled document.
Active deep learning method for semi-supervised sentiment classification
Experiments on five sentiment classification datasets show that ADN and IADN outperform both classical semi-supervised learning algorithms and deep learning techniques applied to sentiment classification.
Toward Optimal Active Learning through Sampling Estimation of Error Reduction
This paper presents an active learning method that directly optimizes expected future error. This is in contrast to many other popular techniques that instead aim to reduce version space size. These…
Taking the Human Out of the Loop: A Review of Bayesian Optimization
This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.