
- John D. Lafferty, Andrew McCallum, Fernando Pereira
- ICML
- 2001

We present conditional random fields, a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid…

- Xiaojin Zhu, Zoubin Ghahramani, John D. Lafferty
- ICML
- 2003

An approach to semi-supervised learning is proposed that is based on a Gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph, with edge weights encoding the similarity between instances. The learning problem is then formulated in terms of a Gaussian random field on this graph, where the mean of the field is…

- ChengXiang Zhai, John D. Lafferty
- SIGIR
- 2001

Language modeling approaches to information retrieval are attractive and promising because they connect the problem of retrieval with that of language model estimation, which has been studied extensively in other application areas such as speech recognition. The basic idea of these approaches is to estimate a language model for each document, and then rank…

- ChengXiang Zhai, John D. Lafferty
- ACM Trans. Inf. Syst.
- 2004

Language modeling approaches to information retrieval are attractive and promising because they connect the problem of retrieval with that of language model estimation, which has been studied extensively in other application areas such as speech recognition. The basic idea of these approaches is to estimate a language model for each document, and to then…

- Peter F. Brown, John Cocke, +5 authors Paul S. Roossin
- Computational Linguistics
- 1990

In this paper, we present a statistical approach to machine translation. We describe the application of our approach to translation from French to English and give preliminary results. The field of machine translation is almost as old as the modern digital computer. In 1949 Warren Weaver suggested that the problem be attacked with statistical methods and…

- Stephen Della Pietra, Vincent J. Della Pietra, John D. Lafferty
- IEEE Trans. Pattern Anal. Mach. Intell.
- 1997

We present a technique for constructing random fields from a set of training samples. The learning paradigm builds increasingly complex fields by allowing potential functions, or features, that are supported by increasingly large subgraphs. Each feature has a weight that is trained by minimizing the Kullback-Leibler divergence between the model and the…

- David M. Blei, John D. Lafferty
- ICML
- 2006

A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections. The approach is to use state space models on the natural parameters of the multinomial distributions that represent the topics. Variational approximations based on Kalman filters and nonparametric wavelet regression are developed…

- Adam L. Berger, John D. Lafferty
- SIGIR
- 1999

We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a statistical model of how a user might distill or "translate" a given document into a query. To assess the relevance of a document to a user's query, we estimate the probability…

- ChengXiang Zhai, John D. Lafferty
- CIKM
- 2001

The language modeling approach to retrieval has been shown to perform well empirically. One advantage of this new approach is its statistical foundations. However, feedback, as one important component in a retrieval system, has only been dealt with heuristically in this new retrieval approach: the original query is usually literally expanded by adding…

Topic models, such as latent Dirichlet allocation (LDA), can be useful tools for the statistical analysis of document collections and other discrete data. The LDA model assumes that the words of each document arise from a mixture of topics, each of which is a distribution over the vocabulary. A limitation of LDA is the inability to model topic correlation…