Michael Collins

HEAD-DRIVEN STATISTICAL MODELS FOR NATURAL LANGUAGE PARSING. Michael Collins. Supervisor: Professor Mitch Marcus. Statistical models for parsing natural language have recently shown considerable success in broad coverage domains. Ambiguity often leads to an input sentence having many possible parse trees; statistical approaches assign a probability to each tree …
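To make the disambiguation idea concrete, here is a minimal Python sketch that scores candidate trees under a toy PCFG and keeps the most probable one; the grammar, its probabilities, and the (label, children...) tree encoding are illustrative assumptions, not the head-driven models the thesis develops.

```python
import math

# Hypothetical PCFG rule probabilities; the numbers are illustrative only.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("D", "N")): 0.8,
}

def tree_logprob(tree):
    """Sum of log rule probabilities over a (label, children...) tree.
    Words at the leaves are plain strings; unknown rules (including the
    lexical rules omitted here) fall back to a small default probability."""
    if isinstance(tree, str):
        return 0.0
    label, *children = tree
    child_labels = tuple(c if isinstance(c, str) else c[0] for c in children)
    score = math.log(RULE_PROB.get((label, child_labels), 1e-6))
    return score + sum(tree_logprob(c) for c in children)

def best_parse(candidates):
    """Resolve ambiguity by keeping the highest-probability candidate tree."""
    return max(candidates, key=tree_logprob)
```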
We describe a method for incorporating syntactic information in statistical machine translation systems. The first step of the method is to parse the source language string that is being translated. The second step is to apply a series of transformations to the parse tree, effectively reordering the surface string on the source language side of the …
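A minimal sketch of the reordering step, assuming the source sentence has already been parsed into a (label, children...) tuple; the single verb-fronting transformation below is a hypothetical stand-in for the paper's reordering rules.

```python
def reorder(tree):
    """Recursively apply transformations that reorder the children of
    selected nodes in the source-language parse tree."""
    if isinstance(tree, str):                       # a word at a leaf
        return tree
    label, *children = tree
    children = [reorder(c) for c in children]
    if label == "VP":
        # Hypothetical rule: move a clause-final verb to the front of its VP.
        verbs = [c for c in children if not isinstance(c, str) and c[0] == "V"]
        rest = [c for c in children if c not in verbs]
        children = verbs + rest
    return (label, *children)

def surface_string(tree):
    """Read the (reordered) surface string off the tree's leaves."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in surface_string(child)]
```

The reordered source string can then be handed to the downstream translation model in place of the original word order.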
We describe the application of kernel methods to Natural Language Processing (NLP) problems. In many NLP tasks the objects being modeled are strings, trees, graphs or other discrete structures which require some mechanism to convert them into feature vectors. We describe kernels for various natural language structures, allowing rich, high dimensional …
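As an illustration of the kind of kernel involved, here is a sketch of a subtree-counting kernel over parse trees, in the spirit of the convolution kernels described above; trees are (label, children...) tuples and the decay factor is an assumed hyperparameter.

```python
def production(node):
    """The grammar production at a node: (label, tuple of child labels)."""
    label, *children = node
    return (label, tuple(c if isinstance(c, str) else c[0] for c in children))

def nodes(tree):
    """All internal nodes of a tree; leaves are plain word strings."""
    if isinstance(tree, str):
        return []
    return [tree] + [n for c in tree[1:] for n in nodes(c)]

def common_subtrees(n1, n2, decay=0.5):
    """Decayed count of subtree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str) and not isinstance(c2, str):
            score *= 1.0 + common_subtrees(c1, c2, decay)
    return score

def tree_kernel(t1, t2, decay=0.5):
    """K(t1, t2): total shared-fragment count over all pairs of nodes."""
    return sum(common_subtrees(a, b, decay) for a in nodes(t1) for b in nodes(t2))
```

Because the recursion only ever compares productions, the kernel works implicitly in the exponentially large space of tree fragments without enumerating it.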
This paper considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking, using additional features of the tree as …
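A minimal reranking sketch, assuming a base parser that returns candidate trees with log probabilities, a feature function over trees, and a learned weight dictionary; all of these names are hypothetical.

```python
def rerank(candidates, feature_fn, weights, base_weight=1.0):
    """candidates: list of (tree, base_logprob) pairs from the base parser.
    Returns the tree with the best combined score: the base model's log
    probability plus a weighted sum of additional tree features."""
    def score(item):
        tree, logprob = item
        feats = feature_fn(tree)             # e.g. counts of tree fragments
        return base_weight * logprob + sum(
            weights.get(f, 0.0) * v for f, v in feats.items())
    return max(candidates, key=score)[0]
```

Keeping the base model's log probability as one feature lets the second model refine, rather than replace, the initial ranking.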
This paper introduces new learning algorithms for natural language processing based on the perceptron algorithm. We show how the algorithms can be efficiently applied to exponential sized representations of parse trees, such as the “all subtrees” (DOP) representation described by (Bod 1998), or a representation tracking all sub-fragments of a tagged …
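The core update is the structured perceptron, sketched here in a reranking setting with an explicit feature dictionary; the paper's point is that the same update can be applied efficiently to exponentially large fragment representations (e.g. via kernels), which this sketch does not attempt.

```python
from collections import defaultdict

def perceptron_train(data, feature_fn, epochs=5):
    """data: iterable of (candidates, gold) pairs, where candidates is a list
    of parse trees and gold is the correct tree (assumed to be in the list).
    On each mistake, move the weights toward the gold tree's features and
    away from the predicted tree's features."""
    w = defaultdict(float)
    for _ in range(epochs):
        for candidates, gold in data:
            predicted = max(
                candidates,
                key=lambda t: sum(w[f] * v for f, v in feature_fn(t).items()))
            if predicted != gold:
                for f, v in feature_fn(gold).items():
                    w[f] += v
                for f, v in feature_fn(predicted).items():
                    w[f] -= v
    return w
```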
We present a discriminative latent variable model for classification problems in structured domains where inputs can be represented by a graph of local observations. A hidden-state conditional random field framework learns a set of latent variables conditioned on local features. Observations need not be independent and may overlap in space and time.
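A rough sketch of how such a model scores class labels, simplified here to a chain of local observations; the parameter names and shapes are assumptions rather than the paper's exact parameterization. For each label, a forward pass sums in log space over all assignments of hidden states.

```python
import numpy as np
from scipy.special import logsumexp

def class_log_scores(x, obs_w, trans_w, label_w):
    """x: (T, D) local observation features.
    obs_w:   (H, D) hidden-state vs. observation weights
    trans_w: (H, H) hidden-state transition weights
    label_w: (Y, H) class-label vs. hidden-state compatibility weights
    Returns an unnormalized log score for each of the Y class labels."""
    T, _ = x.shape
    emit = x @ obs_w.T                       # (T, H) per-position state scores
    scores = []
    for y_weights in label_w:                # one forward pass per class label
        alpha = emit[0] + y_weights
        for t in range(1, T):
            alpha = emit[t] + y_weights + logsumexp(
                alpha[:, None] + trans_w, axis=0)
        scores.append(logsumexp(alpha))
    return np.array(scores)

def predict(x, obs_w, trans_w, label_w):
    """Classify by normalizing over labels: argmax_y p(y | x)."""
    s = class_log_scores(x, obs_w, trans_w, label_w)
    return int(np.argmax(s - logsumexp(s)))
```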
We present a simple and effective semisupervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague …
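To illustrate the lexical-representation idea, here is a sketch of cluster-based features for a candidate head-modifier arc, assuming hierarchical (Brown-style) cluster bit strings derived from unlabeled text; the cluster table and prefix lengths are illustrative assumptions.

```python
CLUSTERS = {          # word -> hypothetical hierarchical cluster bit string
    "bank": "001101",
    "river": "001100",
    "loan": "011010",
}

def arc_features(head, modifier, prefixes=(4, 6)):
    """Features for a candidate head -> modifier dependency arc, mixing the
    full word pair with cluster-prefix backoffs at several granularities."""
    feats = {f"w:{head}|{modifier}": 1.0}
    for p in prefixes:
        hc = CLUSTERS.get(head, "")[:p]
        mc = CLUSTERS.get(modifier, "")[:p]
        if hc and mc:
            feats[f"c{p}:{hc}|{mc}"] = 1.0
    return feats
```

Prefix features at coarser granularities let rare words share statistics with frequent words that fall in the same cluster.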
This paper addresses the problem of mapping natural language sentences to lambda-calculus encodings of their meaning. We describe a learning algorithm that takes as input a training set of sentences labeled with expressions in the lambda calculus. The algorithm induces a grammar for the problem, along with a log-linear model that represents a distribution …
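A minimal sketch of the log-linear component: candidate lambda-calculus expressions for a sentence are scored with weighted features and normalized into a distribution. Candidate generation is stubbed out with a fixed list, and the feature and weight names are hypothetical; the induced grammar that produces candidates is the part the paper's learning algorithm supplies.

```python
import math

def log_linear_probs(candidates, feature_fn, weights):
    """candidates: list of lambda-calculus expressions (as strings).
    Returns a dict mapping each candidate to its probability under a
    log-linear model with the given feature weights."""
    scores = {c: sum(weights.get(f, 0.0) * v
                     for f, v in feature_fn(c).items())
              for c in candidates}
    z = math.log(sum(math.exp(s) for s in scores.values()))
    return {c: math.exp(s - z) for c, s in scores.items()}

# Hypothetical usage: candidate meanings for "flights to Boston".
candidates = ["lambda x. flight(x) & to(x, boston)",
              "lambda x. flight(x) & from(x, boston)"]
feature_fn = lambda expr: {tok: 1.0 for tok in expr.split()}
weights = {"to(x,": 2.0, "from(x,": -1.0}
best = max(candidates,
           key=lambda c: log_linear_probs(candidates, feature_fn, weights)[c])
```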