We describe a methodology for rapid experimentation in statistical machine translation which we use to add a large number of features to a baseline system exploiting features from a wide range of levels of syntactic representation. Feature values were combined in a log-linear model to select the highest scoring candidate translation from an n-best list.(More)
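As a rough illustration of the log-linear selection step this abstract mentions, the following minimal sketch scores each candidate in an n-best list as a weighted sum of its feature values and keeps the highest-scoring one. The feature names (`lm`, `tm`, `syntax`), the weights, and the candidate sentences are hypothetical placeholders, not values taken from the paper.

```python
from typing import Dict, List, Tuple

# A candidate is (translation text, feature name -> feature value).
Candidate = Tuple[str, Dict[str, float]]


def loglinear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values; features without a weight count as zero."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rerank(nbest: List[Candidate], weights: Dict[str, float]) -> Candidate:
    """Return the candidate with the highest log-linear score."""
    return max(nbest, key=lambda cand: loglinear_score(cand[1], weights))


# Hypothetical n-best list: log-probability-style features plus one syntactic feature.
nbest = [
    ("the house is small", {"lm": -4.0, "tm": -3.0, "syntax": 1.0}),
    ("the house is little", {"lm": -5.0, "tm": -2.5, "syntax": 0.0}),
    ("small is the house", {"lm": -7.0, "tm": -2.0, "syntax": 0.0}),
]
weights = {"lm": 1.0, "tm": 1.0, "syntax": 0.5}  # assumed, untuned weights

best = rerank(nbest, weights)
print(best[0])
```

In a real system the weights would be tuned on held-out data rather than set by hand; the sketch only shows how feature values combine into a single score.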
Acknowledgments I owe my thanks to a number of people, each of whom contributed in their own way towards this research and in the preparation of this document. First of all, I thank Prof. Aravind Joshi for his continued support during the period of this research. I have benefited significantly from his deep insights and his passion for subtle details which(More)
  • Jeffrey C Reynar, Mitchell P Marcus Adviser, Mark Steedman, Breck Baldwin, Mike Collins, Jason Eisner +19 others
  • 1998
COPYRIGHT Jeffrey C. Reynar 1998. In memory of Mom and Dad. Acknowledgements: Like all research, the work described in this dissertation was not conducted in isolation. It is impossible to thank everyone who has in some way contributed to it and equally impossible to thank individuals for all of their contributions. I am indebted for technical insights,(More)
We present a practical co-training method for bootstrapping statistical parsers using a small amount of manually parsed training material and a much larger pool of raw sentences. Experimental results show that unlabelled sentences can be used to improve the performance of statistical parsers. In addition, we consider the problem of bootstrapping parsers(More)
In this paper we present TroFi (Trope Finder), a system for automatically classifying literal and nonliteral usages of verbs through nearly unsupervised word-sense disambiguation and clustering techniques. TroFi uses sentential context instead of selectional constraint violations or paths in semantic hierarchies. It also uses literal and nonliteral seed(More)
Statistical machine translation (SMT) models need large bilingual corpora for training, which are unavailable for some language pairs. This paper provides the first serious experimental study of active learning for SMT. We use active learning to improve the quality of a phrase-based SMT system, and show significant improvements in translation compared to a(More)
This paper describes the application of discriminative reranking techniques to the problem of machine translation. For each sentence in the source language, we obtain from a baseline statistical machine translation system a ranked n-best list of candidate translations in the target language. We introduce two novel perceptron-inspired reranking algorithms(More)
This paper investigates bootstrapping for statistical parsers to reduce their reliance on manually annotated training data. We consider both a mostly-unsupervised approach, co-training, in which two parsers are iteratively retrained on each other's output; and a semi-supervised approach, corrected co-training, in which a human corrects each parser's output(More)