This paper describes a method of semi-automatically acquiring an English HPSG grammar from the Penn Treebank. First, heuristic rules are employed to annotate the treebank with partially specified derivation trees. Lexical entries are automatically extracted from the annotated corpus by inversely applying schemata to partially specified derivation trees. …
Sentence compression is the task of creating a short grammatical sentence by removing extraneous words or phrases from an original sentence while preserving its meaning. Existing methods learn statistics on trimming context-free grammar (CFG) rules. However, these methods sometimes lose the original meaning by incorrectly removing important parts of …
This paper introduces a novel framework for the accurate retrieval of relational concepts from huge texts. Prior to retrieval, all sentences are annotated with predicate-argument structures and ontological identifiers by applying a deep parser and a term recognizer. At run time, user requests are converted into queries of region algebra on these …
Many parsing techniques, including parameter estimation, assume the use of a packed parse forest for efficient and accurate parsing. However, they have several inherent problems deriving from the restriction of locality in the packed parse forest. Deterministic parsing is one of the solutions that can achieve simple and fast parsing without the mechanisms of the …
This paper describes an extremely lexicalized probabilistic model for fast and accurate HPSG parsing. In this model, the probabilities of parse trees are defined with only the probabilities of selecting lexical entries. The proposed model is very simple, and experiments revealed that the implemented parser runs around four times faster than the previous …
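As a rough illustration of the idea in the abstract above (a parse tree's probability defined only by the probabilities of selecting lexical entries), the following minimal sketch scores a parse as the product of per-word selection probabilities. The function name `parse_log_prob`, the `(word, entry)` keys, and the toy probabilities are all hypothetical; the paper's actual model and features are not shown in this listing.

```python
import math

def parse_log_prob(lexical_choices, entry_probs):
    """Log-probability of a parse under a purely lexical model (illustrative).

    lexical_choices: list of (word, lexical_entry) pairs used in the parse.
    entry_probs: dict mapping (word, lexical_entry) to a selection probability.
    Both structures are assumptions made for this sketch.
    """
    # The parse probability is the product of selection probabilities,
    # so the log-probability is the sum of their logs.
    return sum(math.log(entry_probs[choice]) for choice in lexical_choices)

# Toy, made-up probabilities for two words.
toy_probs = {("dogs", "noun"): 0.5, ("bark", "intransitive_verb"): 0.25}
toy_parse = [("dogs", "noun"), ("bark", "intransitive_verb")]
score = parse_log_prob(toy_parse, toy_probs)
```

Working in log space avoids numerical underflow on long sentences; exponentiating `score` recovers the product of the two toy probabilities.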
We investigated the performance efficacy of beam search parsing and deep parsing techniques in probabilistic HPSG parsing using the Penn Treebank. We first tested the beam thresholding and iterative parsing developed for PCFG parsing with an HPSG. Next, we tested three techniques originally developed for deep parsing: quick check, large constituent …
This paper describes a log-linear model with an n-gram reference distribution for accurate probabilistic HPSG parsing. In the model, the n-gram reference distribution is simply defined as the product of the probabilities of selecting lexical entries, which are provided by the discriminative method with machine learning features of word and POS n-gram as …
The Passive Aggressive framework [1] is a principled approach to online linear classification that advocates minimal weight updates, i.e., the least required so that the current training instance is correctly classified. While the PA framework allows integration with different loss functions, it is yet to be combined with a multiclass loss function that …
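The "minimal weight update" described in the abstract above can be sketched for the simplest case. The snippet below is the standard binary, hard-margin Passive-Aggressive step with hinge loss, not the multiclass variant the paper goes on to discuss; the function name `pa_update` and the plain-list vector representation are choices made for this illustration.

```python
def pa_update(w, x, y):
    """One binary Passive-Aggressive step (hard-margin variant, hinge loss).

    w: current weight vector (list of floats)
    x: feature vector (list of floats)
    y: label, either -1 or +1
    Returns a new weight vector; w itself is not modified.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)          # hinge loss on this instance
    norm_sq = sum(xi * xi for xi in x)
    if loss == 0.0 or norm_sq == 0.0:
        return list(w)                     # passive: margin already satisfied
    tau = loss / norm_sq                   # smallest step that reaches margin 1
    return [wi + tau * y * xi for wi, xi in zip(w, x)]

w0 = [0.0, 0.0]
w1 = pa_update(w0, [1.0, 0.0], 1)    # aggressive step: the instance had loss 1
w2 = pa_update(w1, [1.0, 0.0], 1)    # passive step: loss is now 0, no change
```

The step size `tau` is the closed-form solution for the smallest change to `w` that gives the current instance a margin of exactly 1, which is why a second pass over the same instance leaves the weights unchanged.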
This paper describes a new default unification, lenient default unification. It works efficiently and gives more informative results because it maximizes the amount of information in the result, whereas other default unification methods maximize the amount of information in the default. We also describe robust processing within the framework of HPSG. We extract grammar rules from the …