José João Almeida

This document presents the TerminUM project and the work done in its statistical word aligner workbench (NATools). It shows a variety of alignment methods for parallel corpora and discusses the resulting terminological dictionaries and their use: evaluation of sentence translations; construction of a multi-level navigation system for linguistic studies or …
Languages are born, evolve and, eventually, die. During this evolution their spelling rules (and sometimes their syntactic and semantic rules) change, making old documents hard to use. In Portugal, a pair of political agreements with Brazil forced significant changes in the way the Portuguese language is written. In this article we will detail these two …
According to recent research, nearly 95 percent of corporate information is stored in documents. Further studies indicate that companies spend between 6 and 10 percent of their gross revenues printing and distributing documents in several ways: web and CD-ROM publishing, database storage and retrieval, and printing. In this context documents exist in some …
News can contain information that may indicate the future direction of a share or stock market index. The possibility of predicting future stock market prices has attracted an increasing number of industry practitioners and academic researchers to this area of investigation. Popular approaches have relied upon either: models constructed …
Abstract: Parallel corpora are rich sources of translation resources. This document presents a methodology for extracting bilingual noun phrases (terminological candidates) from parallel corpora using translation rules. The models proposed in this work specify the changes in word order …
The analysis of business and financial news has become a popular area of research because it may allow the future prospects of companies, economies and economic actors in general to be inferred from information contained in the media. The classical approaches rely upon a "coarse" polarity classification of a news story; however, this may not be an optimal solution …
In this paper we describe how Dicionário-Aberto, an online dictionary for the Portuguese language, is being used as the basis for constructing diverse resources relevant to the processing of the Portuguese language. We will briefly present its history, explaining how we got here. Then, we will describe the resources already available for download and …