Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation
TLDR
A model for constructing vector representations of words by composing characters using bidirectional LSTMs that requires only a single vector per character type and a fixed set of parameters for the compositional model, which yields state-of-the-art results in language modeling and part-of-speech tagging.
A linguistically motivated taxonomy for Machine Translation error analysis
TLDR
This paper significantly extends previous error taxonomies so that translation errors associated with Romance language specificities can be accommodated and carries out an extensive analysis of the errors generated by four different systems.
Automating live and batch subtitling of multimedia contents for several European languages
TLDR
This article contains a detailed description of the live and batch automatic subtitling applications developed by the SAVAS consortium for several European languages based on proprietary LVCSR technology specifically tailored to subtitling needs, together with results of their quality evaluation.
BP2EP - Adaptation of Brazilian Portuguese texts to European Portuguese
TLDR
The work carried out at LF from INESC-ID is described, in the scope of the PT-STAR project, with the objective of obtaining larger and better training resources by adapting Brazilian TED talks translations to European Portuguese.
Translation errors from English to Portuguese: an annotated corpus
TLDR
Although Google’s overall performance was better in the translation task, there are some error types that Moses was better at coping with, especially discourse-level errors.
Towards a general and extensible phrase-extraction algorithm
TLDR
This paper presents a general and extensible phrase extraction algorithm, where several control points are highlighted, which allows the simulation of previous approaches and proposes alternative heuristics, showing their impact on the final translation results.
Reordering Modeling using Weighted Alignment Matrices
TLDR
This work proposes two algorithms to generate the well-known MSD reordering model using weighted alignment matrices, and shows that these methods produce more accurate reordering models.
CLUE-Aligner : An Alignment Tool to Annotate Pairs of Paraphrastic and Translation Units
Currently available alignment tools and procedures for marking-up alignments overlook non-contiguous multiword units for being too complex within the bounds of the proposed alignment methodologies.
High-Performance High-Volume Layered Corpora Annotation
TLDR
This work proposes a framework that simplifies the integration of independently existing NLP tools to build language-independent NLP systems capable of creating layered annotations. The framework allows the development of scalable NLP systems and executes NLP tools in parallel, while offering an easy-to-use programming environment and transparent handling of distributed computing problems.
The INESC-ID machine translation system for the IWSLT 2010
TLDR
The main goal for this evaluation was to employ several state-of-the-art methods applied to phrase-based machine translation in order to improve the translation quality.