Europarl Corpus
The Europarl Corpus is a corpus (set of documents) that consists of the proceedings of the European Parliament from 1996 to the present. In its first…
Wikipedia
Related topics
4 relations
BLEU
Statistical machine translation
Text corpus
Word-sense disambiguation
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
2013
Using a rich feature set for the identification of German MWEs
Fabienne Cap, Marion Weller, U. Heid
Machine Translation Summit
2013
Corpus ID: 29115909
Due to the formal variability and the irregular behaviour of MWEs on different levels of linguistic description, they are a…
2012
Harvesting Parallel Text in Multiple Languages with Limited Supervision
Luciano Barbosa, V. Sridhar, M. Yarmohammadi, S. Bangalore
International Conference on Computational…
2012
Corpus ID: 2528341
The Web is an ever increasing, dynamically changing, multilingual repository of text. There have been several approaches to…
2012
Conditional Significance Pruning: Discarding More of Huge Phrase Tables
Howard Johnson
Conference of the Association for Machine…
2012
Corpus ID: 37568633
The technique of pruning phrase tables that are used for statistical machine translation (SMT) can achieve substantial reductions…
2011
Estimating Language Relationships from a Parallel Corpus. A Study of the Europarl Corpus
Taraka Rama, L. Borin
Nordic Conference of Computational Linguistics
2011
Corpus ID: 2288422
Since the 1950s, linguists have been using short lists (40‐200 items) of basic vocabulary as the central component in a…
2011
Identifying Parallel Documents from a Large Bilingual Collection of Texts: Application to Parallel Article Extraction in Wikipedia.
Alexandre Patry, P. Langlais
BUCC@ACL
2011
Corpus ID: 7002961
While several recent works on dealing with large bilingual collections of texts, e.g. (Smith et al., 2010), seek for extracting…
2011
How Comparable are Parallel Corpora? Measuring the Distribution of General Vocabulary and Connectives
Bruno Cartoni, S. Zufferey, T. Meyer, Andrei Popescu-Belis
BUCC@ACL
2011
Corpus ID: 9331674
In this paper, we question the homogeneity of a large parallel corpus by measuring the similarity between various sub-parts. We…
2011
An Interactive Machine Translation System with Online Learning
Daniel Ortiz-Martínez, Luis A. Leiva, Vicente Alabau, I. García-Varea, F. Casacuberta
Annual Meeting of the Association for…
2011
Corpus ID: 10521251
State-of-the-art Machine Translation (MT) systems are still far from being perfect. An alternative is the so-called Interactive…
2006
ATLAS – A New Text Alignment Architecture
Bettina Schrader
Annual Meeting of the Association for…
2006
Corpus ID: 3196409
We are presenting a new, hybrid alignment architecture for aligning bilingual, linguistically annotated parallel corpora. It is…
2006
Dictionary acquisition using parallel text and co-occurrence statistics
Chris Biemann, U. Quasthoff
Nordic Conference of Computational Linguistics
2006
Corpus ID: 6407436
We present a simple and efficient approach for deriving bilingual dictionaries from sentence-aligned parallel text by extending…
2006
Priberam's Question Answering System in a Cross-Language Environment
A. Cassan, H. Figueira, +4 authors, D. Vidal
Conference and Labs of the Evaluation Forum
2006
Corpus ID: 14237969
Following last year's participation in the monolingual question answering (QA) track of CLEF, where Priberam's QA system has…