Publications (sorted by influence)
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
TLDR: We present conditional random fields, a framework for building probabilistic models to segment and label sequence data.
  • Citations: 11,839 · Highly influential citations: 1,769
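For context, a linear-chain CRF (the form most associated with this paper, written here from the standard definition rather than quoted from it) gives the conditional probability of a label sequence y for an observation sequence x as

  p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(y_{t-1}, y_t, x, t) \Big), \qquad Z(x) = \sum_{y'} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(y'_{t-1}, y'_t, x, t) \Big),

where the f_k are feature functions and the \lambda_k their learned weights; normalizing over whole label sequences, rather than per state as in MEMMs, is what lets CRFs avoid the label bias problem.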
A comparison of event models for Naive Bayes text classification
TLDR: This paper aims to clarify the confusion between the two naive Bayes event models by describing their differences and details, and by empirically comparing their classification performance on five text corpora.
  • Citations: 3,567 · Highly influential citations: 300
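The two event models compared here are the multivariate Bernoulli and the multinomial naive Bayes models; written from their standard definitions (notation assumed, not taken from the paper), the class-conditional document likelihoods are

  \text{Bernoulli: } P(d \mid c) = \prod_{w \in V} \big[ B_{w,d}\, P(w \mid c) + (1 - B_{w,d})(1 - P(w \mid c)) \big], \qquad \text{Multinomial: } P(d \mid c) \propto \prod_{w \in V} P(w \mid c)^{\,n_{w,d}},

where B_{w,d} \in \{0,1\} marks whether word w occurs in document d and n_{w,d} is its count, so the Bernoulli model ignores word frequency (and penalizes absent words) while the multinomial model uses counts.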
Text Classification from Labeled and Unlabeled Documents using EM
TLDR: We introduce an algorithm for learning from labeled and unlabeled documents based on the combination of Expectation-Maximization (EM) and a naive Bayes classifier.
  • Citations: 2,968 · Highly influential citations: 211
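A minimal sketch of that combination (standard EM-with-naive-Bayes, not quoted from the paper): train naive Bayes on the labeled documents, then alternate an E-step that probabilistically labels each unlabeled document d_i,

  P(c_j \mid d_i) \propto P(c_j) \prod_{k=1}^{|d_i|} P(w_{d_i,k} \mid c_j),

with an M-step that re-estimates P(c_j) and P(w \mid c_j) from the expected word and class counts over both labeled and unlabeled documents, repeating until the parameters converge.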
Modeling Relations and Their Mentions without Labeled Text
TLDR: We present a novel approach to distant supervision that can alleviate this problem, based on two ideas: first, we use a factor graph to explicitly model both the decision whether two entities are related and the decision whether this relation is mentioned in a given sentence; second, we apply constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in our training KB.
  • Citations: 809 · Highly influential citations: 160
Maximum Entropy Markov Models for Information Extraction and Segmentation
TLDR: This paper presents a new Markovian sequence model, closely related to HMMs, that allows observations to be represented as arbitrary overlapping features (such as word, capitalization, formatting, and part-of-speech) and defines the conditional probability of state sequences given observation sequences.
  • Citations: 1,504 · Highly influential citations: 139
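Concretely (standard MEMM form, assumed here rather than quoted), the model factors the conditional probability of a state sequence s_{1:T} given observations o_{1:T} into per-step maximum-entropy distributions:

  P(s_{1:T} \mid o_{1:T}) = \prod_{t=1}^{T} P(s_t \mid s_{t-1}, o_t), \qquad P(s \mid s', o) = \frac{1}{Z(o, s')} \exp\Big( \sum_k \lambda_k f_k(o, s) \Big),

so each next-state distribution can condition on arbitrary overlapping features of the observation.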
Topics over time: a non-Markov continuous-time model of topical trends
TLDR: We present an LDA-style topic model that captures not only the low-dimensional structure of data, but also how the structure changes over time.
  • Citations: 1,191 · Highly influential citations: 110
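A rough sketch of the step that distinguishes this model from plain LDA (notation assumed here): each topic keeps a Beta distribution over normalized timestamps, and a token's topic assignment generates the document timestamp as well as the word,

  z_{d,n} \sim \mathrm{Discrete}(\theta_d), \qquad w_{d,n} \sim \mathrm{Discrete}(\phi_{z_{d,n}}), \qquad t_{d,n} \sim \mathrm{Beta}(\psi_{z_{d,n}}),

so topics whose words cluster in time end up with sharply peaked timestamp distributions.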
An Introduction to Conditional Random Fields for Relational Learning
TLDR: A CRF can be viewed as an extension of logistic regression to arbitrary graphical structures.
  • Citations: 897 · Highly influential citations: 96
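One way to make the analogy concrete (standard notation, assumed here): multinomial logistic regression is the log-linear model

  p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_k \theta_k f_k(y, x) \Big),

and a general CRF keeps the same log-linear form but replaces the single output y with a structured output \mathbf{y} whose distribution factorizes over the factors a of a graph:

  p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{a} \exp\Big( \sum_k \theta_{a,k} f_{a,k}(\mathbf{y}_a, \mathbf{x}_a) \Big).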
Optimizing Semantic Coherence in Topic Models
TLDR: The contributions are (1) an analysis of the ways in which topics can be flawed, and (2) an automated evaluation metric for identifying such topics that does not rely on human annotators or reference collections outside the training data.
  • Citations: 909 · Highly influential citations: 92
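The metric is a co-document-frequency coherence score; in its standard form (notation assumed here), the coherence of a topic t with most-probable words V^{(t)} = (v_1^{(t)}, \dots, v_M^{(t)}) is

  C(t; V^{(t)}) = \sum_{m=2}^{M} \sum_{l=1}^{m-1} \log \frac{D(v_m^{(t)}, v_l^{(t)}) + 1}{D(v_l^{(t)})},

where D(v) counts training documents containing word v and D(v, v') counts documents containing both, so the score needs no human annotators or reference collections beyond the training data.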
Automating the Construction of Internet Portals with Machine Learning
TLDR: We present new machine learning methods for spidering in an efficient topic-directed manner, extracting topic-relevant information, and building a browseable topic hierarchy.
  • Citations: 671 · Highly influential citations: 88
An Introduction to Conditional Random Fields
TLDR: This survey describes conditional random fields, a popular probabilistic method for structured prediction, including methods for inference and parameter estimation.
  • Citations: 823 · Highly influential citations: 85