Languages are born, evolve and, eventually, die. During this evolution their spelling rules (and sometimes their syntactic and semantic rules) change, rendering old documents obsolete. In Portugal, a pair of political agreements with Brazil forced significant changes in the way the Portuguese language is written. In this article we will detail these two …
This document presents the TerminUM project and the work done in its statistical word aligner workbench (NATools). It shows a variety of alignment methods for parallel corpora and discusses the resulting terminological dictionaries and their use: evaluation of sentence translations; construction of a multi-level navigation system for linguistic studies or …
Dissertation submitted to the Universidade do Minho for the degree of Master in Informatics, prepared under the supervision of … Abstract: Parallel corpora are valuable resources for natural language processing and, in particular, for translation. They can be used not only by translators, but also analyzed and processed by computers to learn and …
According to recent research, nearly 95 percent of corporate information is stored in documents. Further studies indicate that companies spend between 6 and 10 percent of their gross revenues printing and distributing documents in several ways: web and CD-ROM publishing, database storage and retrieval, and printing. In this context documents exist in some …
Multilingual resources are useful for linguistic studies, translation, and many other tasks. Unfortunately, these resources are difficult to obtain and organize. In this document we describe a set of tools designed to help in the task of mining bilingual resources from the web, from a specific site, from a file system, from a list of URLs, or from a …
Translation Memories are very useful for translators but are difficult to share and reuse in a community of translators. This article presents the concept of Distributed Translation Memories, where all users can contribute and share translations. Implementation details using WebServices are shown, as well as an example of a distributed system between …
Abstract: In this work we present the Procura-PALavras (P-PAL) project, whose main goal is to develop an electronic tool that provides information on objective and subjective psycholinguistic indices of European Portuguese (EP) words. P-PAL will be made freely available to the scientific community in a friendly format …
Music Classification is a particular area of Computational Musicology that provides valuable insights into the evolution of composition patterns and assists in catalogue generation. The proposed work departs from previous work by classifying music based on music score information. Text Mining techniques support music score processing while Classification …
Abstract: Parallel corpora are rich sources of translation resources. This document presents a methodology for extracting bilingual noun phrases (terminology candidates) from parallel corpora using translation rules. The models proposed in this work specify the changes in word order …