Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques
The model combines full and partial parsing techniques to reach full grammar coverage on unseen data, and reaches 79% F-score on a gold standard of manually annotated f-structures for a subset of the WSJ treebank.
Inducing a Semantically Annotated Lexicon via EM-Based Clustering
A technique for automatic induction of slot annotations for subcategorization frames, based on induction of hidden classes in the EM framework of statistical estimation, and an interpretation of the learned representations as theoretical-linguistic decompositional lexical entries are presented.
Estimators for Stochastic "Unification-Based" Grammars
Two computationally tractable ways of estimating the parameters of Stochastic "Unification-Based" Grammars from a training corpus of syntactic analyses are described and applied to estimate a stochastic version of Lexical-Functional Grammar.
The PARC 700 Dependency Bank
The PARC 700 DEPBANK is a dependency bank containing predicate-argument relations and a wide variety of other grammatical features. It was semi-automatically produced and bootstrapped from the output of a deep parser.
Speed and Accuracy in Shallow and Deep Stochastic Parsing
This paper reports experiments that compare the accuracy and performance of two stochastic parsing systems, and finds the deep-parsing system to be more accurate than the Collins parser with only a slight reduction in parsing speed.
On Some Pitfalls in Automatic Evaluation and Significance Testing for MT
In an experimental comparison of two statistical significance tests, it is shown that p-values are estimated more conservatively by approximate randomization than by bootstrap tests, thus increasing the likelihood of type-I error for the latter.
Statistical Machine Translation for Query Expansion in Answer Retrieval
We present an approach to query expansion in answer retrieval that uses Statistical Machine Translation (SMT) techniques to bridge the lexical gap between questions and answers. SMT-based query…
Statistical Sentence Condensation using Ambiguity Packing and Stochastic Disambiguation Methods for Lexical-Functional Grammar
Overall summarization quality of the proposed system is state-of-the-art, with guaranteed grammaticality of the system output due to the use of a constraint-based parser/generator.
QUality Estimation from ScraTCH (QUETCH): Deep Learning for Word-level Translation Quality Estimation
The submitted system combines a continuous space deep neural network that learns a bilingual feature representation from scratch with a linear combination of the manually defined baseline features provided by the task organizers; the combination shows significant improvements over the combined systems.
Joint Feature Selection in Distributed Stochastic Learning for Large-Scale Discriminative Training in SMT
This paper deploys local features for SCFG-based SMT that can be read off from rules at runtime, and presents a learning algorithm that applies l1/l2 regularization for joint feature selection over distributed stochastic learning processes.