Publications
Incremental Parsing with the Perceptron Algorithm
TLDR
It is demonstrated that training a perceptron model to combine with the generative model during search provides a 2.1 percent F-measure improvement over the generative model alone, to 88.8 percent.
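As a rough illustration of the combination this summary describes, the sketch below trains a perceptron whose linear score is added to a generative model's log-probability when ranking candidates. It is a minimal toy under invented features and data, not the paper's incremental beam-search parser.

```python
# Minimal toy of a perceptron score combined with a generative model's
# log-probability when ranking candidates. Features, data, and the plain
# (non-incremental) loop are illustrative, not the paper's parser.
from collections import defaultdict

def combined_score(weights, feats, gen_logprob):
    """Generative log-probability plus the perceptron's linear score."""
    return gen_logprob + sum(weights[f] * v for f, v in feats.items())

def perceptron_epoch(weights, training_data):
    """One structured-perceptron pass over lists of candidate analyses.

    Each candidate is (feature_dict, generative_logprob, is_gold).
    """
    for candidates in training_data:
        best = max(candidates, key=lambda c: combined_score(weights, c[0], c[1]))
        gold = next(c for c in candidates if c[2])
        if best is not gold:
            for f, v in gold[0].items():   # promote gold-candidate features
                weights[f] += v
            for f, v in best[0].items():   # demote the mistaken winner
                weights[f] -= v

# Hypothetical toy: the generative model alone prefers the wrong candidate.
training_data = [[({"rule:NP->DT_NN": 1.0}, -2.3, True),
                  ({"rule:NP->NN": 1.0}, -1.9, False)]]
weights = defaultdict(float)
for _ in range(3):
    perceptron_epoch(weights, training_data)
print(dict(weights))  # gold-parse features end up with positive weight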
Probabilistic Top-Down Parsing and Language Modeling
TLDR
A lexicalized probabilistic top-down parser is presented, which performs very well, in terms of both the accuracy of returned parses and the efficiency with which they are found, relative to the best broad-coverage statistical parsers.
Spoken Language Derived Measures for Detecting Mild Cognitive Impairment
TLDR
The results indicate that using multiple, complementary measures can aid in automatic detection of MCI, and demonstrate a statistically significant improvement in the area under the ROC curve (AUC) when using automatic spoken language derived features in addition to the neuropsychological test scores.
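A hedged sketch of the evaluation idea: computing AUC for classifier scores from one feature set versus an augmented one. The rank-based AUC formula is standard; the scores and labels below are invented, and the paper's actual features and classifiers are not reproduced.

```python
# Rank-based AUC: the probability that a random positive example outscores
# a random negative one (ties count half). Data here is invented.
def auc(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
baseline_scores = [0.8, 0.4, 0.5, 0.6, 0.3, 0.2]  # e.g., test scores alone
combined_scores = [0.9, 0.7, 0.6, 0.5, 0.3, 0.2]  # plus speech measures
print(auc(baseline_scores, labels), auc(combined_scores, labels))
```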
Discriminative n-gram language modeling
TLDR
This paper describes a method based on regularized likelihood that makes use of the feature set given by the perceptron algorithm, and initialization with the perceptron's weights; this method gives an additional 0.5% reduction in word error rate (WER) over training with the perceptron alone.
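The regularized-likelihood idea can be sketched as an L2-penalized conditional log-linear objective over n-best candidates, optimized from a perceptron-initialized starting point. This is a minimal sketch under those assumptions; the feature names, data, and single gradient step are illustrative, not the paper's training setup.

```python
# L2-penalized conditional log-linear objective over n-best lists, started
# from perceptron weights. All names, weights, and data are illustrative.
import math

def nll_and_grad(weights, training_data, reg=0.1):
    """Regularized negative log-likelihood and its gradient.

    Each example is a list of (feature_dict, is_gold) candidates.
    """
    nll = reg * sum(w * w for w in weights.values())
    grad = {f: 2.0 * reg * w for f, w in weights.items()}
    for candidates in training_data:
        scores = [sum(weights.get(f, 0.0) * v for f, v in feats.items())
                  for feats, _ in candidates]
        log_z = math.log(sum(math.exp(s) for s in scores))
        for (feats, is_gold), s in zip(candidates, scores):
            p = math.exp(s - log_z)               # model probability
            for f, v in feats.items():
                grad[f] = grad.get(f, 0.0) + p * v    # expected count
                if is_gold:
                    grad[f] = grad.get(f, 0.0) - v    # observed count
            if is_gold:
                nll -= s - log_z                  # add -log P(gold)
    return nll, grad

# Perceptron-initialized weights (hypothetical), then one gradient step.
weights = {"ngram:the_cat": 1.0, "ngram:the_hat": -1.0}
training_data = [[({"ngram:the_cat": 1.0}, True),
                  ({"ngram:the_hat": 1.0}, False)]]
nll, grad = nll_and_grad(weights, training_data)
weights = {f: w - 0.5 * grad.get(f, 0.0) for f, w in weights.items()}
print(round(nll, 4), weights)
```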
Deriving lexical and syntactic expectation-based measures for psycholinguistic modeling via incremental top-down parsing
TLDR
Novel methods are presented for calculating separate lexical and syntactic surprisal measures from a single incremental parser using a lexicalized PCFG, along with an approximation to entropy measures that would otherwise be intractable to calculate for a grammar of that size.
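Surprisal itself has a standard definition that an incremental parser's prefix probabilities support directly: surprisal(w_i) = -log2 P(w_i | w_1..w_{i-1}), computable as a difference of log prefix probabilities. A minimal sketch with made-up prefix probabilities follows; the paper's lexical/syntactic decomposition restricts which derivation steps contribute, which this toy does not attempt.

```python
# Surprisal from prefix probabilities: surprisal(w_i) in bits equals
# log2(alpha(w_1..i-1)) - log2(alpha(w_1..i)), where alpha is the parser's
# total derivation probability over a prefix. Values below are invented.
import math

def surprisals(prefix_probs):
    """Per-word surprisal (bits) from a sentence's prefix probabilities."""
    out, prev = [], 1.0            # alpha of the empty prefix is 1
    for alpha in prefix_probs:
        out.append(math.log2(prev) - math.log2(alpha))
        prev = alpha
    return out

# Toy alphas for "the dog barked": each word narrows the derivation mass.
print([round(s, 2) for s in surprisals([0.05, 0.01, 0.004])])
```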
Generalized Algorithms for Constructing Statistical Language Models
TLDR
An algorithm is given for efficiently computing the expected counts of any sequence in a word lattice output by a speech recognizer, or in any arbitrary weighted automaton, and a new technique is described for creating exact representations of n-gram language models by weighted automata.
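The expected-count computation can be sketched with a forward-backward pass over an acyclic weighted lattice: the expected count of a label sums, over arcs carrying it, the forward mass into the arc, the arc weight, and the backward mass out of it, normalized by total path mass. The lattice below is a hypothetical toy, not recognizer output, and this sketch is not the paper's general algorithm.

```python
# Expected label counts in an acyclic weighted lattice via forward-backward:
# E[count(label)] = sum over arcs with that label of
#   forward(src) * weight * backward(dst) / total_path_mass.
# Toy lattice, listed in topological arc order.
from collections import defaultdict

arcs = [(0, 1, "the", 1.0), (1, 2, "cat", 0.6), (1, 2, "hat", 0.4),
        (2, 3, "sat", 1.0)]                 # (src, dst, label, weight)
start, final = 0, 3

fwd = defaultdict(float); fwd[start] = 1.0  # path mass reaching each state
for src, dst, _, w in arcs:
    fwd[dst] += fwd[src] * w

bwd = defaultdict(float); bwd[final] = 1.0  # path mass leaving each state
for src, dst, _, w in reversed(arcs):
    bwd[src] += w * bwd[dst]

total = fwd[final]                          # total mass over all paths
expected = defaultdict(float)
for src, dst, label, w in arcs:
    expected[label] += fwd[src] * w * bwd[dst] / total

print(dict(expected))  # "cat" gets 0.6, "hat" 0.4, "the"/"sat" 1.0
```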
Unsupervised language model adaptation
TLDR
Unsupervised language model adaptation from ASR transcripts shows an absolute error rate reduction of 3.9% over the unadapted baseline performance, from 28% to 24.1%, using 17 hours of unsupervised adaptation material.
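One common recipe for this kind of adaptation (a hedged sketch, not necessarily the paper's exact method) is to estimate an in-domain model from the ASR transcripts and interpolate it with the background model; the data and interpolation weight below are invented.

```python
# Linear interpolation of a background LM with an in-domain model estimated
# from ASR transcripts. Unigram MLE keeps the toy minimal; real systems use
# higher-order smoothed models. Data and lambda are illustrative.
from collections import Counter

def mle_unigram(tokens):
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def interpolate(p_bg, p_adapt, lam=0.7):
    """P(w) = lam * P_background(w) + (1 - lam) * P_adapted(w)."""
    vocab = set(p_bg) | set(p_adapt)
    return {w: lam * p_bg.get(w, 0.0) + (1 - lam) * p_adapt.get(w, 0.0)
            for w in vocab}

background = mle_unigram("the meeting is on monday the agenda is long".split())
transcripts = mle_unigram("the budget meeting covers the budget report".split())
adapted = interpolate(background, transcripts)
print(sorted(adapted.items(), key=lambda kv: -kv[1])[:3])
```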
Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm
TLDR
This paper compares two parameter estimation methods: the perceptron algorithm, which has the benefit of automatically selecting a relatively small feature set in just a couple of passes over the training data, and a method based on conditional random fields (CRFs).
Noun-Phrase Co-Occurrence Statistics for Semi-Automatic Semantic Lexicon Construction
TLDR
This paper presents an algorithm, based upon a small set of exemplars, for extracting potential entries for a category from an on-line corpus; the algorithm could be viewed as an "enhancer" of existing broad-coverage resources.
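The exemplar-driven idea can be sketched by ranking candidate nouns by how strongly they co-occur with a small seed set. The toy corpus, seed words, and same-document co-occurrence window below are invented stand-ins, not the paper's data or exact statistic.

```python
# Rank candidate nouns by co-occurrence with seed exemplars of a category.
# A word earns credit each time it appears alongside a seed word; the
# highest-scoring candidates are proposed as new lexicon entries.
from collections import Counter

docs = [["influenza", "measles", "vaccine"],
        ["measles", "mumps", "outbreak"],
        ["budget", "deficit", "vaccine"],
        ["mumps", "influenza", "clinic"]]
seeds = {"influenza", "measles"}            # exemplars for the category

cooc = Counter()
for doc in docs:
    hits = seeds & set(doc)                 # seeds present in this document
    for word in doc:
        if word not in seeds:
            cooc[word] += len(hits)         # credit co-occurring words

print(cooc.most_common(3))  # strongest candidates for the seed category
```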
Discriminative Syntactic Language Modeling for Speech Recognition
TLDR
A reranking model makes use of syntactic features together with a parameter estimation method based on the perceptron algorithm, providing an additional 0.3% reduction in test-set error rate beyond the model of (Roark et al., 2004a; Roark et al., 2004b).