Europarl Corpus

The Europarl Corpus is a corpus (set of documents) that consists of the proceedings of the European Parliament from 1996 to the present. In its first… 
Source: Wikipedia

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
2013
Due to the formal variability and the irregular behaviour of MWEs on different levels of linguistic description, they are a… 
2012
The Web is an ever increasing, dynamically changing, multilingual repository of text. There have been several approaches to… 
2012
The technique of pruning phrase tables that are used for statistical machine translation (SMT) can achieve substantial reductions… 
2011
Since the 1950s, linguists have been using short lists (40‐200 items) of basic vocabulary as the central component in a… 
2011
While several recent works on dealing with large bilingual collections of texts, e.g. (Smith et al., 2010), seek for extracting… 
2011
In this paper, we question the homogeneity of a large parallel corpus by measuring the similarity between various sub-parts. We… 
2011
State-of-the-art Machine Translation (MT) systems are still far from being perfect. An alternative is the so-called Interactive… 
2006
We are presenting a new, hybrid alignment architecture for aligning bilingual, linguistically annotated parallel corpora. It is… 
2006
We present a simple and efficient approach for deriving bilingual dictionaries from sentence-aligned parallel text by extending… 
2006
Following last year's participation in the monolingual question answering (QA) track of CLEF, where Priberam's QA system has…