Europarl Corpus

The Europarl Corpus is a corpus (set of documents) that consists of the proceedings of the European Parliament from 1996 to the present. In its first…

Wikipedia

Papers overview


2013

Using a rich feature set for the identification of German MWEs

Due to the formal variability and the irregular behaviour of MWEs on different levels of linguistic description, they are a…

2012

Harvesting Parallel Text in Multiple Languages with Limited Supervision

Luciano Barbosa, V. Sridhar, M. Yarmohammadi, S. Bangalore
International Conference on Computational…
2012
Corpus ID: 2528341

The Web is an ever increasing, dynamically changing, multilingual repository of text. There have been several approaches to…

2012

Conditional Significance Pruning: Discarding More of Huge Phrase Tables

Howard Johnson
Conference of the Association for Machine…
2012
Corpus ID: 37568633

The technique of pruning phrase tables that are used for statistical machine translation (SMT) can achieve substantial reductions…

2011

Estimating Language Relationships from a Parallel Corpus. A Study of the Europarl Corpus

Taraka Rama, L. Borin
Nordic Conference of Computational Linguistics
2011
Corpus ID: 2288422

Since the 1950s, linguists have been using short lists (40‐200 items) of basic vocabulary as the central component in a…

2011

Identifying Parallel Documents from a Large Bilingual Collection of Texts: Application to Parallel Article Extraction in Wikipedia.

Alexandre Patry, P. Langlais
BUCC@ACL
2011
Corpus ID: 7002961

While several recent works on dealing with large bilingual collections of texts, e.g. (Smith et al., 2010), seek for extracting…

2011

How Comparable are Parallel Corpora? Measuring the Distribution of General Vocabulary and Connectives

Bruno Cartoni, S. Zufferey, T. Meyer, Andrei Popescu-Belis
BUCC@ACL
2011
Corpus ID: 9331674

In this paper, we question the homogeneity of a large parallel corpus by measuring the similarity between various sub-parts. We…

2011

An Interactive Machine Translation System with Online Learning

Daniel Ortiz-Martínez, Luis A. Leiva, Vicente Alabau, I. García-Varea, F. Casacuberta
Annual Meeting of the Association for…
2011
Corpus ID: 10521251

State-of-the-art Machine Translation (MT) systems are still far from being perfect. An alternative is the so-called Interactive…

2006

ATLAS – A New Text Alignment Architecture

We are presenting a new, hybrid alignment architecture for aligning bilingual, linguistically annotated parallel corpora. It is…

2006

Dictionary acquisition using parallel text and co-occurrence statistics

Chris Biemann, U. Quasthoff
Nordic Conference of Computational Linguistics
2006
Corpus ID: 6407436

We present a simple and efficient approach for deriving bilingual dictionaries from sentence-aligned parallel text by extending…

2006

Priberam's Question Answering System in a Cross-Language Environment

A. Cassan, H. Figueira, D. Vidal
Conference and Labs of the Evaluation Forum
2006
Corpus ID: 14237969

Following last year's participation in the monolingual question answering (QA) track of CLEF, where Priberam's QA system has…
