Jenna Kanerva

There has been substantial recent interest in annotation schemes that can be applied consistently to many languages. Building on several recent efforts to unify morphological and syntactic annotation, the Universal Dependencies (UD) project seeks to introduce a cross-linguistically applicable part-of-speech tagset, feature inventory, and set of dependency …
This paper summarises the contributions of the teams at the University of Helsinki, Uppsala University and the University of Turku to the news translation tasks for translating from and to Finnish. Our models address the problem of treating morphology and data coverage in various ways. We introduce a new efficient tool for word alignment and discuss …
OBJECTIVES: In this paper, we study the development and domain-adaptation of statistical syntactic parsers for three different clinical domains in Finnish. METHODS AND MATERIALS: The materials include text from daily nursing notes written by nurses in an intensive care unit, physicians' notes from cardiology patients' health records, and daily nursing notes …
In this paper we present our winning system in the WMT16 Shared Task on Cross-Lingual Pronoun Prediction, where the objective is to predict a missing target language pronoun based on the target and source sentences. Our system is a deep recurrent neural network, which reads both the source language and target language context with a softmax layer making the …
In this paper, we report on the development of a large-scale Finnish Internet parsebank, currently consisting of 1.5 billion tokens in 116 million sentences. The data is fully morphologically and syntactically analyzed and it has been used to extract flat and syntactic n-gram collections, as well as verb-argument and noun-argument n-grams. Additionally, …
This paper describes baseline systems for Finnish-English and English-Finnish machine translation using standard phrase-based and factored models including morphological features. We experiment with compound splitting and morphological segmentation and study the effect of adding noisy out-of-domain data to the parallel and the monolingual training data. Our …
We present a syntactic analysis query toolkit geared specifically towards massive dependency parsebanks and morphologically rich languages. The query language allows arbitrary tree queries, including negated branches, and is suitable for querying analyses with rich morphological annotation. Treebanks of over a million words can be comfortably queried on a …
In this paper, we introduce several vector space manipulation methods that are applied to trained vector space models in a post-hoc fashion, and present an application of these techniques in semantic role labeling for Finnish and English. Specifically, we show that the vectors can be circularly shifted to encode syntactic information and subsequently …
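As a rough illustration of what circularly shifting a vector means, here is a minimal sketch. It is not the paper's implementation: the function name `circular_shift`, the toy vector, and the mapping from dependency types to shift offsets (`DEP_OFFSETS`) are all hypothetical, chosen only to show that a shift keeps every component, moves it to a new position, and can be undone.

```python
def circular_shift(vector, offset):
    """Rotate `vector` to the right by `offset` positions, wrapping around."""
    if not vector:
        return []
    k = offset % len(vector)
    if k == 0:
        return list(vector)
    return list(vector[-k:]) + list(vector[:-k])


# Hypothetical mapping from a dependency type to a shift offset (an assumption
# made for this sketch, not taken from the paper).
DEP_OFFSETS = {"nsubj": 1, "dobj": 2}

word_vec = [0.1, 0.2, 0.3, 0.4]  # toy stand-in for a trained word vector
as_subject = circular_shift(word_vec, DEP_OFFSETS["nsubj"])
as_object = circular_shift(word_vec, DEP_OFFSETS["dobj"])

# Shifting back by the same offset recovers the original vector.
restored = circular_shift(as_subject, -DEP_OFFSETS["nsubj"])
```

Because the shift only permutes components, the same word yields a distinct vector for each offset while no information is lost.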
In this paper we introduce our system capable of producing semantic parses of sentences using three different annotation formats. The system was used to participate in the SemEval-2014 Shared Task on broad-coverage semantic dependency parsing and it was ranked third with an overall F1-score of 80.49%. The system has a pipeline architecture, consisting of …
The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets …