We present a simple and effective semi-supervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and …
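A minimal sketch of how cluster-derived lexical features might look, assuming Brown-style hierarchical clusters represented as bit strings; the function name, feature templates, and prefix lengths are illustrative assumptions, not the paper's exact feature set:

```python
def cluster_features(head, modifier, clusters, prefix_lengths=(4, 6)):
    """Build dependency features from cluster bit-string prefixes.

    clusters: dict mapping word -> hierarchical cluster bit string
    (e.g. "0110100"); shorter prefixes give coarser word classes.
    Unknown words fall back to a generic "UNK" cluster.
    """
    h = clusters.get(head, "UNK")
    m = clusters.get(modifier, "UNK")
    feats = []
    for k in prefix_lengths:
        # Pair head and modifier cluster prefixes at the same granularity.
        feats.append(f"hc{k}={h[:k]}|mc{k}={m[:k]}")
    # A mixed template: surface head word combined with modifier cluster.
    feats.append(f"hw={head}|mc={m[:prefix_lengths[0]]}")
    return feats
```

Prefix features let the parser back off from sparse lexical identities to coarser word classes learned from unannotated text.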
Basic language processing tasks such as tokenization, morphological analysis, lemmatization, PoS tagging, and chunking are needed by most NLP applications, including Machine Translation, Summarization, and Dialogue systems. A large part of the effort required to develop such applications is devoted to adapting existing software resources to the platform, …
We describe a parsing approach that makes use of the perceptron algorithm, in conjunction with dynamic programming methods, to recover full constituent-based parse trees. The formalism allows a rich set of parse-tree features, including PCFG-based features, bigram and trigram dependency features, and surface features. A severe challenge in applying such an …
Log-linear and maximum-margin models are two commonly used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of parameters in these models is therefore an important problem, and becomes a key factor when learning from very large data sets. This paper describes exponentiated gradient (EG) …
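The core of an exponentiated-gradient method is a multiplicative update followed by renormalization, so the parameters stay on the probability simplex. A minimal sketch of one such step (a generic EG update on a distribution, not the paper's specific dual formulation):

```python
import math

def eg_update(alpha, grad, eta=0.5):
    """One exponentiated-gradient step.

    alpha: current distribution (non-negative, sums to 1).
    grad:  gradient of the objective at alpha.
    eta:   learning rate.
    The multiplicative update keeps every coordinate positive, and
    dividing by the normalizer keeps the result a distribution.
    """
    new = [a * math.exp(-eta * g) for a, g in zip(alpha, grad)]
    z = sum(new)
    return [v / z for v in new]
```

Coordinates with larger gradients shrink relative to the others, so probability mass flows toward directions of steeper descent without ever leaving the simplex.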
This paper introduces and analyzes a battery of inference models for the problem of semantic role labeling: one based on constraint satisfaction, and several strategies that model the inference as a meta-learning problem using discriminative classifiers. These classifiers are developed with a rich set of novel features that encode proposition and …
This paper describes an empirical study of high-performance dependency parsers based on a semi-supervised learning approach. We describe an extension of semi-supervised structured conditional models (SS-SCMs) to the dependency parsing problem, a framework originally proposed by Suzuki and Isozaki (2008). Moreover, we introduce two extensions related …
This work introduces a general phrase recognition system based on perceptrons, and a global online learning algorithm to train them together. The method applies to complex domains in which some structure has to be recognized. This global problem is broken down into two layers of local subproblems: a filtering layer, which reduces the search space by …
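A perceptron learner of this kind is mistake-driven: weights change only when the current model misclassifies an example. A generic sketch of such a training loop (the function name, feature interface, and binary setting are illustrative; the paper's actual algorithm trains the layered recognizers jointly):

```python
def perceptron_train(data, feats, epochs=5):
    """Mistake-driven perceptron training.

    data:  list of (x, y) pairs with y in {-1, +1}.
    feats: function mapping x to a dict of feature -> value.
    Returns a sparse weight vector as a dict.
    """
    w = {}
    for _ in range(epochs):
        for x, y in data:
            fx = feats(x)
            score = sum(w.get(f, 0.0) * v for f, v in fx.items())
            if y * score <= 0:  # mistake (or tie): update toward y
                for f, v in fx.items():
                    w[f] = w.get(f, 0.0) + y * v
    return w
```

In a structured setting the same update applies, except the "mistake" is an incorrect full output structure and the update is the feature-vector difference between the gold and predicted structures.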
This paper describes a set of comparative experiments for the problem of automatically filtering unwanted electronic mail messages. Several variants of the AdaBoost algorithm with confidence-rated predictions (Schapire & Singer, 1999) have been applied, which differ in the complexity of the base learners considered. Two main conclusions can be drawn from our …
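In confidence-rated AdaBoost, each weak hypothesis outputs a real value whose sign is the prediction and whose magnitude is its confidence; after each round, the example distribution is reweighted so that confidently correct examples lose weight and mistakes gain it. A minimal sketch of that reweighting step (function name assumed for illustration):

```python
import math

def adaboost_reweight(dist, preds, labels):
    """One round of distribution reweighting in AdaBoost with
    confidence-rated predictions (Schapire & Singer, 1999).

    dist:   current example weights (sums to 1).
    preds:  real-valued weak-hypothesis outputs h(x_i).
    labels: true labels y_i in {-1, +1}.
    D_{t+1}(i) is proportional to D_t(i) * exp(-y_i * h(x_i)).
    """
    new = [d * math.exp(-y * h) for d, h, y in zip(dist, preds, labels)]
    z = sum(new)  # normalizer Z_t
    return [v / z for v in new]
```

Because the exponent is -y*h, a large-magnitude correct prediction drives an example's weight sharply down, focusing later rounds on the examples the ensemble still gets wrong.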