• Publications
  • Influence
Interpretable Decision Sets: A Joint Framework for Description and Prediction
TLDR
We propose interpretable decision sets, a framework for building predictive models that are highly accurate, yet also highly interpretable.
  • 311
  • 39
Snorkel: Rapid Training Data Creation with Weak Supervision
TLDR
We present Snorkel, a first-of-its-kind system that enables users to train state-of-the-art models without hand labeling any training data.
  • 283
  • 37
Hinge-Loss Markov Random Fields and Probabilistic Soft Logic: A Scalable Approach to Structured Prediction
  • Stephen H. Bach
  • Computer Science, Mathematics
  • J. Mach. Learn. Res.
  • 17 May 2015
TLDR
A fundamental challenge in developing high-impact machine learning technologies is balancing the need to model rich, structured domains with the ability to scale to big data.
  • 142
  • 33
A short introduction to probabilistic soft logic
TLDR
This paper provides an overview of the PSL language and its techniques for inference and weight learning.
  • 179
  • 31
Hinge-loss Markov Random Fields: Convex Inference for Structured Prediction
TLDR
We use hinge-loss Markov random fields (HL-MRFs), an expressive class of graphical models with log-concave density functions over continuous variables, which can represent confidences in discrete predictions.
  • 100
  • 11
Paired Learners for Concept Drift
TLDR
To cope with concept drift, we paired a stable online learner with a reactive one.
  • 112
  • 10
Snorkel: Fast Training Set Generation for Information Extraction
TLDR
Snorkel enables users to generate large volumes of training data by writing labeling functions that express heuristics and other weak supervision strategies.
  • 41
  • 4
Snorkel DryBell: A Case Study in Deploying Weak Supervision at Industrial Scale
TLDR
We present a first-of-its-kind study showing how existing knowledge resources from across an organization can be used as weak supervision in order to bring development time and cost down by an order of magnitude, and introduce Snorkel DryBell, a new weak supervision management system for this setting.
  • 37
  • 4
Learning the Structure of Generative Models without Labeled Data
TLDR
We propose a structure estimation method that maximizes the ℓ1-regularized marginal pseudolikelihood of the observed data.
  • 68
  • 3
Graph Summarization in Annotated Data Using Probabilistic Soft Logic
TLDR
We propose the use of probabilistic soft logic (PSL) as a general framework for reasoning about annotation graphs, similarities, and the possibly confounding evidence arising from these.
  • 20
  • 3