Universal Dependencies 2.1
TLDR
The annotation scheme is based on (universal) Stanford dependencies, Google universal part-of-speech tags, and the Interset interlingua for morphosyntactic tagsets.
KLcpos3 - a Language Similarity Measure for Delexicalized Parser Transfer
We present KLcpos3, a language similarity measure based on Kullback-Leibler divergence of coarse part-of-speech tag trigram distributions in tagged corpora. It has been designed for multilingual ...
Two-step translation with grammatical post-processing
TLDR
Both automatic (BLEU) and manual evaluations show that it is possible to improve the MT quality of the majority of the systems participating in the WMT shared task.
Universal Dependencies 2.0 – CoNLL 2017 Shared Task Development and Test Data
TLDR
This release contains the test data used in the CoNLL 2017 shared task on parsing Universal Dependencies, and complements the UD 2.0 release with 18 new parallel test sets and 4 test sets in surprise languages.
How Language-Neutral is Multilingual BERT?
TLDR
This work shows that mBERT representations can be split into a language-specific component and a language-neutral component, and that the language-neutral component is sufficiently general in terms of modeling semantics to allow high-accuracy word alignment and sentence retrieval but is not yet good enough for the more difficult task of MT quality estimation.
DEPFIX: A System for Automatic Correction of Czech MT Outputs
TLDR
An improved version of DEPFIX (Marecek et al., 2011), a system for automatic rule-based post-processing of English-to-Czech MT outputs that is designed to increase their fluency and is able to improve the quality of state-of-the-art MT systems.
Universal Dependencies 1.4
TLDR
The annotation scheme is based on (universal) Stanford dependencies, Google universal part-of-speech tags, and the Interset interlingua for morphosyntactic tagsets.
Adaptation of machine translation for multilingual information retrieval in the medical domain
TLDR
Most of the MT techniques employed in the experiments improve MT of medical search queries, but translation quality does not appear to correlate with IR performance: better translation does not necessarily yield better retrieval.
HamleDT 2.0: Thirty Dependency Treebanks Stanfordized
TLDR
The stanfordization of HamleDT 2.0 is regarded as generally successful, although it admits several shortcomings, especially in the distinction between direct and indirect objects, that have to be addressed in the future.
Findings of the WMT 2017 Biomedical Translation Shared Task
TLDR
The second edition of the Biomedical Translation task at the Conference on Machine Translation focused on the automatic translation of biomedical documents between English and various European languages.