Inducing Features of Random Fields

Abstract

We present a technique for constructing random fields from a set of training samples. The learning paradigm builds increasingly complex fields by allowing potential functions, or features, that are supported by increasingly large subgraphs. Each feature has a weight that is trained by minimizing the Kullback-Leibler divergence between the model and the empirical distribution of the training data. A greedy algorithm determines how features are incrementally added to the field, and an iterative scaling algorithm is used to estimate the optimal values of the weights. The random field models and techniques introduced in this paper differ from those common to much of the computer vision literature in that the underlying random fields are non-Markovian and have a large number of parameters that must be estimated. Relations to other learning approaches, including decision trees, are given. As a demonstration of the method, we describe its application to the problem of automatic word classification in natural language processing.

DOI: 10.1109/34.588021
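The abstract describes two computational steps: a greedy search that incrementally adds features to the field, and an iterative scaling procedure that fits the feature weights by matching model expectations to empirical expectations. As a rough illustration of the second step only, the sketch below implements generalized iterative scaling (Darroch and Ratcliff, 1972) for a small log-linear model over a fully enumerable sample space. The toy sample space, the indicator features, the slack feature, and the fixed iteration count are illustrative assumptions, not the paper's setup, which addresses large fields where expectations cannot be enumerated exactly.

import math
from collections import Counter

def gis(samples, features, space, iterations=200):
    """Fit weights of p(x) proportional to exp(sum_i w_i * f_i(x)) by GIS."""
    # Empirical distribution of the training samples over the sample space.
    counts = Counter(samples)
    n = len(samples)
    p_emp = {x: counts[x] / n for x in space}

    # GIS requires sum_i f_i(x) to equal a constant C for every x;
    # append a "slack" feature so the constraint holds.
    C = max(sum(f(x) for f in features) for x in space)
    feats = list(features) + [lambda x: C - sum(f(x) for f in features)]
    w = [0.0] * len(feats)

    for _ in range(iterations):
        # Current model distribution, normalized over the whole space.
        scores = {x: math.exp(sum(wi * f(x) for wi, f in zip(w, feats)))
                  for x in space}
        z = sum(scores.values())
        p_model = {x: s / z for x, s in scores.items()}

        # Move each weight so the model expectation of its feature
        # approaches the empirical expectation.
        for i, f in enumerate(feats):
            e_emp = sum(p_emp[x] * f(x) for x in space)
            e_mod = sum(p_model[x] * f(x) for x in space)
            if e_emp > 0 and e_mod > 0:
                w[i] += math.log(e_emp / e_mod) / C
    return w[:-1]  # drop the slack feature's weight

# Toy usage: strings over {a, b} with two hypothetical indicator features.
space = ["aa", "ab", "ba", "bb"]
samples = ["aa", "aa", "ab", "ba"]
features = [lambda x: 1.0 if x[0] == "a" else 0.0,
            lambda x: 1.0 if x[1] == "a" else 0.0]
print(gis(samples, features, space))

The slack feature is a standard device for satisfying the constant-feature-sum condition that the GIS update assumes; in this sketch the model expectations are computed by exact enumeration, which is feasible only because the sample space is tiny.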
