A new method is presented to quickly adapt a given language model to local text characteristics. The basic approach is to choose the adaptive models as close as possible to the background estimates while constraining them to respect the locally estimated unigram probabilities. Several means are investigated to speed up the calculations. We measure both…
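A minimal sketch of the constrained-adaptation idea described above, under the common "unigram rescaling" interpretation: each background conditional probability is scaled by the ratio of the local to the background unigram estimate, then renormalized per history. The function name, the dictionary layout, and the exponent `beta` are all illustrative assumptions, not the paper's actual implementation.

```python
def adapt(background, p_loc_uni, p_bg_uni, beta=1.0):
    """Rescale a background model {history: {word: prob}} so that words
    favored by the locally estimated unigram distribution are boosted.
    beta (assumed parameter) controls the strength of the adaptation."""
    adapted = {}
    for h, dist in background.items():
        # Scale each word's probability by (local / background unigram)^beta.
        scaled = {w: p * (p_loc_uni[w] / p_bg_uni[w]) ** beta
                  for w, p in dist.items()}
        # Renormalize so each history again sums to 1.
        z = sum(scaled.values())
        adapted[h] = {w: s / z for w, s in scaled.items()}
    return adapted
```

For example, with a uniform background over two words and a local unigram that prefers the first, the adapted conditional shifts mass toward that word while remaining a proper distribution.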
A new method to combine language models is derived. This method of log-linear interpolation (LLI) is used for adaptation and for combining models of different context lengths. In both cases, LLI outperforms linear interpolation.
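The contrast between the two combination schemes can be sketched as follows: linear interpolation takes a weighted arithmetic mean of the component probabilities, while log-linear interpolation takes a weighted geometric mean and renormalizes, i.e. p(w) ∝ ∏ᵢ pᵢ(w)^λᵢ. The function names and the use of plain dictionaries are illustrative assumptions.

```python
import math

def linear_interp(dists, lambdas):
    # Weighted arithmetic mean of component probabilities (already normalized
    # when the lambdas sum to 1 and each component is a distribution).
    return {w: sum(l * d[w] for l, d in zip(lambdas, dists))
            for w in dists[0]}

def log_linear_interp(dists, lambdas):
    # Weighted geometric mean: p(w) proportional to prod_i p_i(w)^lambda_i.
    raw = {w: math.exp(sum(l * math.log(d[w]) for l, d in zip(lambdas, dists)))
           for w in dists[0]}
    # Log-linear combination requires explicit renormalization.
    z = sum(raw.values())
    return {w: v / z for w, v in raw.items()}
```

Note the design difference: the linear mixture never assigns a word less probability than its smallest component does, whereas the log-linear mixture can sharply suppress words that any component considers unlikely.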
In this paper, we explore the correlation of dependency relation paths to rank candidate answers in answer extraction. Using the correlation measure, we compare the dependency relations of a candidate answer and mapped question phrases in a sentence with the corresponding relations in the question. Unlike previous studies, we propose an approximate phrase mapping…
Spoken Language Systems at Saarland University (LSV) participated this year with 5 runs at the TAC KBP English slot filling track. Effective algorithms for all parts of the pipeline, from document retrieval to relation prediction and response post-processing, are bundled in a modular end-to-end relation extraction system called RelationFactory. The main run…
This paper describes approaches for decomposing words of huge vocabularies (up to 2 million) into smaller particles that are suitable for a recognition lexicon. Results on a Finnish dictation task and a flat list of German street names are given.
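One simple way to decompose words into lexicon-sized particles, sketched here as an illustration and not as the paper's actual algorithm, is greedy longest-match segmentation against a particle inventory, falling back to single characters for uncovered spans:

```python
def decompose(word, particles, max_len=10):
    """Greedy longest-match split of a word into particles from the given
    inventory; any span not covered by a particle falls back to one
    character, so every word gets some segmentation."""
    out, i = [], 0
    while i < len(word):
        # Try the longest candidate substring first, shrinking to length 1.
        for j in range(min(len(word), i + max_len), i, -1):
            if word[i:j] in particles or j == i + 1:
                out.append(word[i:j])
                i = j
                break
    return out
```

For instance, with a (hypothetical) particle inventory containing "auto" and "bahn", the German word "autobahn" splits into those two particles, which is the kind of lexicon reduction useful for highly inflecting or compounding languages such as Finnish and German.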
It is questionable whether words are really the best basic units for the estimation of stochastic language models: grouping frequent word sequences into phrases can improve language models. More generally, we have investigated various coding schemes for a corpus. In this paper, this is applied to optimize the perplexity of n-gram language models. In tests on…
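The phrase-grouping step mentioned above can be sketched as a single merge of the most frequent adjacent word pair into one token, in the spirit of iterative pair-merging schemes; this toy function is an assumption for illustration, not the coding scheme the paper actually optimizes.

```python
from collections import Counter

def merge_top_bigram(tokens):
    """Join the most frequent adjacent word pair in the token stream into a
    single phrase token; repeated application grows a phrase inventory."""
    pairs = Counter(zip(tokens, tokens[1:]))
    (a, b), _ = pairs.most_common(1)[0]
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
            out.append(a + "_" + b)  # emit the merged phrase token
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out
```

Applied to "new york is in new york state", the pair ("new", "york") is the most frequent bigram, so both occurrences collapse into the phrase token "new_york"; whether such a merge helps would then be judged by its effect on perplexity.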