Integrated Weak Learning

by Peter Hayes, Mingtian Zhang, Raza Habib, Jordan Burgess, Emine Yilmaz and David Barber
We introduce Integrated Weak Learning, a principled framework that integrates weak supervision into the training process of machine learning models. Our approach jointly trains the end-model and a label model that aggregates multiple sources of weak supervision. The label model can learn to aggregate weak supervision sources differently for different datapoints and takes the performance of the end-model into consideration during training. We show that our approach outperforms…
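The per-datapoint aggregation idea can be sketched as follows. This is a minimal illustrative sketch, not the paper's label model: the function name, the `-1` abstain convention, and the fallback-to-uniform rule are assumptions, and the per-datapoint weights stand in for whatever a learned label model would output.

```python
import numpy as np

def aggregate_weak_labels(weak_votes, source_weights, n_classes):
    """Aggregate votes from m weak sources into soft labels, using
    per-datapoint source weights (as a learned label model might produce).

    weak_votes: (n, m) integer class votes, with -1 meaning "abstain".
    source_weights: (n, m) nonnegative per-datapoint source weights.
    Returns an (n, n_classes) array of soft label distributions.
    """
    n, m = weak_votes.shape
    scores = np.zeros((n, n_classes))
    for j in range(m):
        votes = weak_votes[:, j]
        mask = votes >= 0  # skip abstentions
        scores[mask, votes[mask]] += source_weights[mask, j]
    totals = scores.sum(axis=1, keepdims=True)
    # normalize to a distribution; fall back to uniform where all sources abstained
    return np.where(totals > 0, scores / np.maximum(totals, 1e-12), 1.0 / n_classes)
```

With uniform weights this reduces to a soft majority vote; per-datapoint weights are what let the aggregation differ across examples.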




Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods

The authors build FlyingSquid, a weak supervision framework that runs orders of magnitude faster than previous weak supervision approaches and requires fewer assumptions, and prove bounds on generalization error without assuming that the latent variable model can exactly parameterize the underlying data distribution.
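The closed-form trick at the heart of triplet methods can be sketched for the simplest case of three conditionally independent ±1 sources: each source's mean parameter is recoverable from pairwise agreement rates alone, with no labeled data. The function name is illustrative; this is a sketch of the idea, not the FlyingSquid implementation.

```python
import numpy as np

def triplet_mean_params(L):
    """Estimate a_i = E[lambda_i * y] for three conditionally independent
    ±1 weak sources from unlabeled votes L of shape (n, 3).

    Under conditional independence, M_ij = E[lambda_i * lambda_j] = a_i * a_j,
    so a_i = sqrt(M_ij * M_ik / M_jk).
    """
    M = (L.T @ L) / len(L)  # empirical pairwise agreement moments
    a = np.zeros(3)
    for i in range(3):
        j, k = [t for t in range(3) if t != i]
        a[i] = np.sqrt(M[i, j] * M[i, k] / M[j, k])
    return a
```

A source with accuracy p has mean parameter 2p - 1, so the estimates translate directly into accuracy estimates for each weak source.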

WRENCH: A Comprehensive Benchmark for Weak Supervision

WRENCH is a benchmark platform for thorough and standardized evaluation of weak supervision (WS) approaches, consisting of 22 varied real-world datasets for classification and sequence tagging; a range of real, synthetic, and procedurally generated weak supervision sources; and a modular, extensible framework for WS evaluation, including implementations of popular WS methods.

Denoising Multi-Source Weak Supervision for Neural Text Classification

A label denoiser is designed that estimates source reliability using a conditional soft attention mechanism and then reduces label noise by aggregating rule-annotated weak labels, addressing the rule-coverage issue.

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

BOND is a new computational framework that leverages the power of pre-trained language models to improve the prediction performance of NER models; experiments demonstrate its superiority over existing distantly supervised NER methods.

Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning

This paper proposes a novel approach that partially solves the problems of distant supervision for NER, applying partial annotation learning to reduce the effect of unknown character labels in incomplete and noisy annotations.

Learning From Incomplete and Inaccurate Supervision

This paper investigates learning from incomplete and inaccurate supervision, where only a limited subset of the training data is labeled, potentially with noise, and proposes novel approaches that effectively alleviate the negative influence of label noise with the help of a large amount of unlabeled data.

Learning to Reweight Examples for Robust Deep Learning

This work proposes a novel meta-learning algorithm that learns to assign weights to training examples based on their gradient directions. It can be implemented on any type of deep network, requires no additional hyperparameter tuning, and achieves impressive performance on class-imbalance and corrupted-label problems where only a small amount of clean validation data is available.
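The gradient-alignment intuition can be sketched in a toy setting: weight each training example by how well its loss gradient aligns with the gradient on a small clean validation set, clipping misaligned examples to zero weight. This is a simplified one-step numpy sketch using logistic regression, not the paper's meta-learning algorithm; the function name and setting are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reweight_examples(X, y, X_val, y_val, w):
    """Reweight training examples for logistic regression by the alignment
    of their per-example loss gradients with the clean validation gradient.

    Descending on example i also lowers validation loss exactly when the
    dot product of the two gradients is positive, so negative alignments
    are clipped to zero and the rest are normalized to sum to one.
    """
    # per-example gradient of the logistic loss w.r.t. w: (p - y) * x
    train_grads = (sigmoid(X @ w) - y)[:, None] * X                    # (n, d)
    val_grad = ((sigmoid(X_val @ w) - y_val)[:, None] * X_val).mean(axis=0)
    align = train_grads @ val_grad
    weights = np.maximum(align, 0.0)
    total = weights.sum()
    return weights / total if total > 0 else np.full(len(y), 1.0 / len(y))
```

An example with a flipped label produces a gradient pointing against the validation gradient and receives zero weight, which is the mechanism that makes the approach robust to corrupted labels.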

Confident Learning: Estimating Uncertainty in Dataset Labels

This work uses the assumption of a class-conditional noise process to directly estimate the joint distribution between noisy (given) labels and uncorrupted (unknown) labels, and presents a generalized confident learning (CL) procedure that is provably consistent and experimentally performant.
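The counting step at the core of confident learning can be sketched as follows: an example with noisy label i is counted toward true class j when its predicted probability for j clears a per-class threshold. This is a simplified sketch of the thresholding idea, not the full method; the function name is illustrative.

```python
import numpy as np

def confident_joint(noisy_labels, pred_probs):
    """Count-based estimate of the joint distribution between noisy (given)
    and latent true labels via class-conditional thresholding.

    noisy_labels: (n,) integer labels as observed (possibly corrupted).
    pred_probs: (n, k) out-of-sample predicted class probabilities.
    """
    n, k = pred_probs.shape
    # threshold for class j: mean predicted prob of j among examples labeled j
    thresholds = np.array(
        [pred_probs[noisy_labels == j, j].mean() for j in range(k)]
    )
    joint = np.zeros((k, k))
    for i in range(n):
        confident = np.where(pred_probs[i] >= thresholds)[0]
        if len(confident) > 0:
            true = confident[np.argmax(pred_probs[i, confident])]
            joint[noisy_labels[i], true] += 1  # row: noisy label, col: est. true
    return joint / joint.sum()
```

Off-diagonal mass in the estimated joint flags label errors: an entry at (i, j) with i ≠ j counts examples given label i that the model confidently assigns to class j.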

Learning with Noisy Labels for Sentence-level Sentiment Classification

A novel DNN model called NetAb is proposed to handle noisy labels during training for sentence-level sentiment classification. It consists of two convolutional neural networks: one with a noise transition layer for handling the noisy input labels, and the other for predicting 'clean' labels.
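The basic mechanism a noise transition layer implements can be sketched in isolation: clean-class probabilities are mapped to noisy-label probabilities through a row-stochastic transition matrix, so the network can be trained against noisy labels while still modeling the clean distribution internally. This is an illustrative sketch of the mechanism, not NetAb's architecture.

```python
import numpy as np

def noisy_label_probs(clean_probs, transition):
    """Map clean-class probabilities through a noise transition matrix.

    clean_probs: (n, k) distributions over clean classes.
    transition: (k, k) row-stochastic matrix where transition[i, j]
                is p(noisy label = j | clean label = i).
    Returns (n, k) distributions over the observed noisy labels.
    """
    return clean_probs @ transition
```

Training computes the loss between these transformed probabilities and the noisy labels, while the pre-transition probabilities serve as the clean-label predictions at inference time.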