Corpus ID: 22323514

modAL: A modular active learning framework for Python

@article{Danka2018modALAM,
  title={modAL: A modular active learning framework for Python},
  author={Tivadar Danka and P{\'e}ter Horv{\'a}th},
  journal={ArXiv},
  year={2018},
  volume={abs/1805.00979}
}
modAL is a modular active learning framework for Python, designed to make active learning research and practice simpler. Its distinguishing features are (i) a clear, modular, object-oriented design and (ii) full compatibility with scikit-learn models and workflows. These features enable fast prototyping and easy extensibility, aiding the development of both real-life active learning pipelines and novel algorithms. modAL is fully open source and hosted on GitHub. To assure code…
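The modular design described in the abstract, where an estimator and a query strategy are combined into one learner, can be illustrated with a minimal pure-Python sketch. This is not modAL's code or API: `CentroidClassifier`, `least_confident`, and `ToyActiveLearner` are hypothetical names invented for illustration, and the toy estimator only mimics the scikit-learn-style `fit`/`predict_proba` convention on 1-D data.

```python
import math
from typing import Callable, List, Sequence


class CentroidClassifier:
    """Hypothetical stand-in estimator with a scikit-learn-like fit/predict_proba interface."""

    def fit(self, X: Sequence[float], y: Sequence[int]) -> "CentroidClassifier":
        self.classes_ = sorted(set(y))
        # Mean of the 1-D feature for each class.
        self.centroids_ = [
            sum(x for x, label in zip(X, y) if label == c)
            / sum(1 for label in y if label == c)
            for c in self.classes_
        ]
        return self

    def predict_proba(self, X: Sequence[float]) -> List[List[float]]:
        # Normalised exp(-distance) to each class centroid.
        out = []
        for x in X:
            scores = [math.exp(-abs(x - c)) for c in self.centroids_]
            total = sum(scores)
            out.append([s / total for s in scores])
        return out


def least_confident(estimator, X_pool: Sequence[float]) -> int:
    """Uncertainty sampling: index of the pool instance with the lowest top-class probability."""
    probs = estimator.predict_proba(X_pool)
    return min(range(len(X_pool)), key=lambda i: max(probs[i]))


class ToyActiveLearner:
    """Hypothetical learner that pairs any estimator with any query strategy (assumed design)."""

    def __init__(self, estimator, query_strategy: Callable, X_training, y_training):
        self.estimator = estimator
        self.query_strategy = query_strategy
        self.X = list(X_training)
        self.y = list(y_training)
        self.estimator.fit(self.X, self.y)

    def query(self, X_pool: Sequence[float]) -> int:
        # Delegate the choice of the next instance to the pluggable strategy.
        return self.query_strategy(self.estimator, X_pool)

    def teach(self, x: float, label: int) -> None:
        # Add the newly labelled instance and refit on all labelled data.
        self.X.append(x)
        self.y.append(label)
        self.estimator.fit(self.X, self.y)


learner = ToyActiveLearner(CentroidClassifier(), least_confident, [0.0, 10.0], [0, 1])
pool = [1.0, 4.8, 9.0]
idx = learner.query(pool)      # the instance closest to the decision boundary
learner.teach(pool[idx], 0)    # the label 0 plays the role of an oracle's answer
```

Because the query strategy is an ordinary function, a different strategy can be swapped in without touching the learner or the estimator, which is the kind of extensibility the abstract refers to.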
Small-text: Active Learning for Text Classification in Python
We present small-text, a simple modular active learning library, which offers pool-based active learning for text classification in Python. It comes with various pre-implemented state-of-the-art…
AstronomicAL: An interactive dashboard for visualisation, integration and classification of data using Active Learning
AstronomicAL is a human-in-the-loop interactive labelling and training dashboard that allows users to create reliable datasets and robust classifiers using active learning and is a tool for experimenting with both domain-specific classifications and more general-purpose machine learning strategies.
Addressing practical challenges in Active Learning via a hybrid query strategy
This work presents a hybrid query strategy-based AL framework that addresses three practical challenges simultaneously: cold-start, oracle uncertainty and performance evaluation of Active Learner in the absence of ground truth.
Practical considerations for active machine learning in drug discovery.
  • D. Reker
  • Computer Science, Medicine
  • Drug discovery today. Technologies
  • 2019
This review recapitulates key findings from previous active learning studies to highlight the challenges and opportunities of applying adaptive machine learning to drug discovery and provides insights for scientists planning to implement active learning workflows in their discovery pipelines.
Practical Galaxy Morphology Tools from Deep Supervised Representation Learning
This work shows that deep learning models trained to answer every Galaxy Zoo DECaLS question learn meaningful semantic representations of galaxies that are useful for new tasks on which the models were never trained, and exploits these representations to outperform existing approaches at several practical tasks crucial for investigating large galaxy samples.
QuAX: Mining the Web for High-utility FAQ
This work proposes QuAX: a framework for extracting high-utility (i.e., general and self-contained) domain-specific FAQ lists from the Web, which receives a set of keywords from a user and works in a pipelined fashion to find relevant web pages and extract general and self-contained question-answer pairs.
Active Learning Query Strategies for Classification, Regression, and Clustering: A Survey
This survey reviews AL query strategies for classification, regression, and clustering under the pool-based AL scenario and presents a comparative analysis of these strategies.
ZeroER: Entity Resolution using Zero Labeled Examples
This paper investigates an important problem that vexes practitioners: is it possible to design an effective algorithm for ER that requires Zero labeled examples, yet can achieve performance comparable to supervised approaches, and presents a proposed approach dubbed ZeroER.
Active Learning for Arabic Text Classification
Active Learning explores the use of minimal human intervention to improve the efficiency of supervised machine learning algorithms during the learning/training phase. Active learning improves machine…
Unsupervised Instance Selection with Low-Label, Supervised Learning for Outlier Detection
This work investigates the unsupervised instance selection (UNISEL) technique followed by a Random Forest (RF) classifier on 10 outlier detection datasets under low-label conditions and investigates the combination of UNISEL and AL.

References

Showing 1-10 of 17 references
libact: Pool-based Active Learning in Python
libact is a Python package that implements several popular active learning strategies, but also features the active-learning-by-learning meta-algorithm that assists the users to automatically select the best strategy on the fly.
API design for machine learning software: experiences from the scikit-learn project
The simple and elegant interface shared by all learning and processing units in the Scikit-learn library is described and its advantages in terms of composition and reusability are discussed.
JCLAL: A Java Framework for Active Learning
JCLAL is a Java Class Library for Active Learning which has an architecture that follows strong principles of object-oriented design, and it allows the developers to adapt, modify and extend the framework according to their needs.
Scikit-learn: Machine Learning in Python
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing…
Improving generalization with active learning
A formalism for active concept learning called selective sampling is described and it is shown how it may be approximately implemented by a neural network.
On Active Learning in Multi-label Classification
A novel active learning strategy for reducing the labeling effort is proposed and an experimental study is conducted on the well-known Reuters-21578 text categorization benchmark dataset to demonstrate the efficiency of this approach.
Active Learning Strategies for Multi-Label Text Classification
A number of realistic strategies for tackling active learning for multi-label classification are examined, each consisting of a rule for combining the outputs returned by the individual binary classifiers as a result of classifying a given unlabeled document.
Active deep learning method for semi-supervised sentiment classification
Experiments on five sentiment classification datasets show that ADN and IADN outperform classical semi-supervised learning algorithms and deep learning techniques applied for sentiment classification.
Toward Optimal Active Learning through Sampling Estimation of Error Reduction
This paper presents an active learning method that directly optimizes expected future error. This is in contrast to many other popular techniques that instead aim to reduce version space size. These…
Taking the Human Out of the Loop: A Review of Bayesian Optimization
This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.