Corpus ID: 220250564

Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings

@article{chen2020train,
  title={Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings},
  author={Mayee F. Chen and Daniel Y. Fu and Frederic Sala and Sen Wu and Ravi Teja Mullapudi and Fait Poms and K. Fatahalian and Christopher R{\'e}},
}
Our goal is to enable machine learning systems to be trained interactively. This requires models that perform well and train quickly, without large amounts of hand-labeled data. We take a step forward in this direction by borrowing from weak supervision (WS), wherein models can be trained with noisy sources of signal instead of hand-labeled data. But WS relies on training downstream deep networks to extrapolate to unseen data points, which can take hours or days. Pre-trained embeddings can…
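The core idea the abstract gestures at, replacing slow deep-network training by extending each weak source's votes to nearby points in a pre-trained embedding space, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation; `extend_votes` and its threshold parameter are hypothetical names.

```python
import numpy as np

def extend_votes(embeddings, votes, threshold=0.8):
    """Extend a weak source's votes (+1/-1, 0 = abstain) to nearby points.

    Instead of training a deep network to extrapolate, copy a vote to an
    abstained point when its nearest voted neighbor in the pre-trained
    embedding space is sufficiently similar.
    """
    votes = np.asarray(votes, dtype=float)
    out = votes.copy()
    # Cosine similarity between all pairs of (unit-normalized) embeddings.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = unit @ unit.T
    voted = np.where(votes != 0)[0]
    for i in np.where(votes == 0)[0]:        # abstained points only
        if voted.size == 0:
            continue
        j = voted[np.argmax(sim[i, voted])]  # nearest voted point
        if sim[i, j] >= threshold:           # copy vote if close enough
            out[i] = votes[j]
    return out
```

Because this only requires nearest-neighbor lookups over fixed embeddings, iterating on the weak sources takes seconds rather than the hours needed to retrain a deep end model.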
CHECKER: Detecting Clickbait Thumbnails with Weak Supervision and Co-teaching
Clickbait thumbnails on video-sharing platforms (e.g., YouTube, Dailymotion) are small catchy images that are designed to entice users to click to view the linked videos. Despite their usefulness, …
Goodwill Hunting: Analyzing and Repurposing Off-the-Shelf Named Entity Linking Systems
This work lays out and investigates two challenges faced by individuals or organizations building NEL systems, and shows how tailoring a simple technique for patching models using weak labeling can provide a 25% absolute improvement in accuracy of sport-related errors.
Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling
This work develops the first framework for interactive weak supervision in which a method proposes heuristics and learns from user feedback given on each proposed heuristic, demonstrating that only a small number of feedback iterations are needed to train models that achieve highly competitive test set performance without access to ground truth training labels.


Training Complex Models with Multi-Task Weak Supervision
This work shows that by solving a matrix completion-style problem, it can recover the accuracies of these multi-task sources given their dependency structure, but without any labeled data, leading to higher-quality supervision for training an end model.
Label Propagation for Deep Semi-Supervised Learning
This work employs a transductive label propagation method based on the manifold assumption to make predictions on the entire dataset, and uses these predictions to generate pseudo-labels for the unlabeled data on which a deep neural network is then trained.
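The transductive step summarized above can be illustrated with a toy propagation over a similarity graph: labeled points keep their labels, and unlabeled points repeatedly absorb their neighbors' scores. This is a didactic sketch under simplifying assumptions (binary labels, an unweighted affinity matrix), not the paper's exact algorithm.

```python
import numpy as np

def propagate(W, y, iters=20):
    """Toy label propagation.

    W: (n, n) symmetric affinity matrix over all points.
    y: labels in {-1, +1}, with 0 marking unlabeled points.
    Returns propagated hard labels (signs), usable as pseudo-labels.
    """
    y = np.asarray(y, dtype=float)
    f = y.copy()
    labeled = y != 0
    for _ in range(iters):
        f = W @ f              # aggregate neighbor scores
        f[labeled] = y[labeled]  # clamp known labels each round
    return np.sign(f)
```

In the semi-supervised pipeline, the resulting signs serve as pseudo-labels for training the deep network on the otherwise unlabeled points.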
Snorkel: Rapid Training Data Creation with Weak Supervision
Snorkel is a first-of-its-kind system that enables users to train state-of-the-art models without hand-labeling any training data, and proposes an optimizer for automating tradeoff decisions that gives up to 1.8× speedup per pipeline execution.
Data Programming: Creating Large Training Sets, Quickly
A paradigm for the programmatic creation of training sets called data programming is proposed in which users express weak supervision strategies or domain heuristics as labeling functions, which are programs that label subsets of the data, but that are noisy and may conflict.
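The labeling functions described above are concretely just small programs that vote on each example or abstain. A minimal sketch, using a simple majority vote in place of the paper's generative label model (the function names here are illustrative):

```python
import numpy as np

# Labeling functions: vote +1 (positive), -1 (negative), or 0 (abstain).
# Each encodes one noisy domain heuristic; they may conflict.
def lf_contains_great(x):  return 1 if "great" in x.lower() else 0
def lf_contains_awful(x):  return -1 if "awful" in x.lower() else 0
def lf_exclamation(x):     return 1 if x.endswith("!") else 0

LFS = [lf_contains_great, lf_contains_awful, lf_exclamation]

def majority_vote(texts):
    """Combine noisy labeling-function votes; 0 means no consensus."""
    votes = np.array([[lf(x) for lf in LFS] for x in texts])
    return np.sign(votes.sum(axis=1))
```

Data programming's contribution is to replace this naive majority vote with a generative model that learns each labeling function's accuracy and correlations from the agreement and conflict patterns alone, without ground-truth labels.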
Learning Dependency Structures for Weak Supervision Models
It is shown that the amount of unlabeled data needed can scale sublinearly or even logarithmically with the number of sources, improving over previous efforts that ignore the sparsity pattern in the dependency structure and scale linearly in $m$.
Neural Ranking Models with Weak Supervision
This paper proposes to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources, and suggests that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models.
Learning to Learn from Weak Supervision by Full Supervision
This paper proposes to control the magnitude of the gradient updates to the target network using scores provided by a second confidence network, which is trained on a small amount of supervised data, so that weight updates computed from noisy labels do not harm the quality of the target network model.
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
This work presents two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT, and uses a self-supervised loss that focuses on modeling inter-sentence coherence.
Exploring the Limits of Weakly Supervised Pretraining
This paper presents a unique study of transfer learning with large convolutional networks trained to predict hashtags on billions of social media images, shows improvements on several image classification and object detection tasks, and reports the highest ImageNet-1k single-crop, top-1 accuracy to date.
Deep k-Nearest Neighbors: Towards Confident, Interpretable and Robust Deep Learning
The DkNN algorithm is evaluated on several datasets, and it is shown that the confidence estimates accurately identify inputs outside the model, and that the explanations provided by nearest neighbors are intuitive and useful in understanding model failures.
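The confidence mechanism summarized above can be illustrated in miniature: instead of trusting a softmax score, score a test point by how strongly the labels of its k nearest training representations agree. A simplified sketch (a single representation layer standing in for DkNN's per-layer analysis; names are illustrative):

```python
import numpy as np

def knn_confidence(train_reps, train_labels, x, k=3):
    """Predict via k-NN over learned representations.

    Returns (predicted label, fraction of the k neighbors that agree),
    where low agreement flags inputs the model should not be trusted on.
    """
    d = np.linalg.norm(train_reps - x, axis=1)      # distances to training reps
    nn = np.argsort(d)[:k]                           # k nearest neighbors
    labels = np.asarray(train_labels)[nn]
    pred = np.bincount(labels).argmax()              # most common neighbor label
    conf = float(np.mean(labels == pred))            # neighbor agreement
    return pred, conf
```

An out-of-distribution input tends to have neighbors with mixed labels, so its agreement score drops even when a softmax would report high confidence.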