Victoria Rosén

We present a relatively large-scale initiative in high-quality MT based on semantic transfer, reviewing the motivation for this approach, general architecture and components involved, and preliminary experience from a first round of system integration (to be accompanied by a hands-on system demonstration, if appropriate). The translation problem is one…
We present a hybrid MT architecture, combining state-of-the-art linguistic processing with advanced stochastic techniques. Grounded in a theoretical reflection on the division of labor between rule-based and probabilistic elements in the MT task, we summarize per-component approaches to ranking, including empirical results when evaluated in isolation.
In this paper we present a method for greatly reducing parse times in LFG parsing, while at the same time maintaining parse accuracy. We evaluate the methodology on data from English, German and Norwegian and show that the same patterns hold across languages. We achieve a speedup of 67% on the English data and 49% on the German data. On a small amount of…
This paper discusses the construction of a parallel treebank currently involving ten languages from six language families. The treebank is based on deep LFG (Lexical-Functional Grammar) grammars that were developed within the framework of the ParGram (Parallel Grammar) effort. The grammars produce output that is maximally parallelized across languages and…
Parallel grammars and parallel treebanks can be a useful method for studying linguistic diversity and commonality. We use this approach to study how arguments to similar predicates are realized across languages. To that end, we formulate formal principles for aligning at phrase and word levels based on translational correspondences at predicate-argument…
We extend discriminant-based disambiguation techniques to LFG grammars. We present the design and implementation of lexical, morphological, c-structure and f-structure discriminants for an LFG-based parser. Chief considerations in the computation of discriminants are capturing all distinctions between analyses and relating linguistic properties to words in…
We review the techniques and tools used for regression testing, the primary quality assurance measure, in a multi-site research project working towards a high-quality Norwegian–English MT demonstrator. A combination of hand-constructed test suites, domain-specific corpora, specialized software tools, and somewhat rigid release procedures is used for…