Results of the Translation Inference Across Dictionaries 2019 Shared Task
@inproceedings{Gracia2019ResultsOT, title={Results of the Translation Inference Across Dictionaries 2019 Shared Task}, author={Jorge Gracia and Besim Kabashi and Ilan Kernerman and Marta Lanau-Coronas and Dorielle Lonke}, booktitle={TIAD@LDK}, year={2019} }
The objective of the Translation Inference Across Dictionaries (TIAD) shared task is to explore and compare methods and techniques that infer translations indirectly between language pairs, based on other bilingual/multilingual lexicographic resources. In its second, 2019, edition the participating systems were asked to generate new translations automatically among three languages - English, French, Portuguese - based on known indirect translations contained in the Apertium RDF graph. The…
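To make the idea of indirect translation inference concrete, here is a minimal sketch of the simplest possible approach: chaining two bilingual dictionaries through a pivot language. It is not the method of any participating system. The function name `infer_via_pivot`, the choice of Spanish as pivot, and the toy entries are all assumptions made for illustration.

```python
from collections import defaultdict


def infer_via_pivot(src_to_pivot, pivot_to_tgt):
    """Chain two bilingual dictionaries through a shared pivot language.

    Returns {source word: {target word: number of pivot words supporting it}}.
    """
    candidates = defaultdict(lambda: defaultdict(int))
    for src, pivots in src_to_pivot.items():
        for pivot in pivots:
            # Every translation of the pivot word becomes a candidate.
            for tgt in pivot_to_tgt.get(pivot, ()):
                candidates[src][tgt] += 1
    return {src: dict(tgts) for src, tgts in candidates.items()}


# Toy data, invented for illustration only (not taken from Apertium RDF).
en_es = {"bank": {"banco", "orilla"}, "bench": {"banco"}}
es_fr = {"banco": {"banque", "banc"}, "orilla": {"rive"}}

inferred = infer_via_pivot(en_es, es_fr)
for word in sorted(inferred):
    print(word, sorted(inferred[word]))
```

In this toy example the polysemous pivot word "banco" produces the wrong pair "bench" to "banque" alongside the correct ones, which shows why plain pivoting is noisy and why additional evidence is needed to filter candidates.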
7 Citations
Multi-Strategy system for translation inference across dictionaries
- Computer Science
- GLOBALEX
- 2020
Four different strategies proposed for the TIAD 2020 Shared Task for automatic translation inference across dictionaries are described. They are based on analysis of the Apertium RDF graph and take advantage of characteristics such as translation using multiple paths, synonyms and similarities between lexical entries from different lexicons, and the cardinality of possible translations through the graph.
Graph Exploration and Cross-lingual Word Embeddings for Translation Inference Across Dictionaries
- Computer Science
- GLOBALEX
- 2020
The task evaluation results show that graph exploration is very effective, achieving relatively high precision and recall values in comparison with the other participating systems, while cross-lingual word embeddings reach high precision but lower recall.
Bilingual dictionary generation and enrichment via graph exploration
- Computer Science
- 2021
This work explores techniques that exploit the graph nature of bilingual dictionaries to automatically infer new links (translations) and builds upon a cycle density based method: partitioning the graph into biconnected components for a speedup, and simplifying the pipeline through a careful structural analysis that reduces hyperparameter tuning requirements.
Linking Discourse Marker Inventories
- Linguistics
- LDK
- 2021
The paper describes the first comprehensive edition of machine-readable discourse marker lexicons, to explore techniques for translation inference to be applied to this particular group of lexical resources that was previously largely neglected in the context of Linguistic Linked (Open) Data.
NUIG at TIAD: Combining Unsupervised NLP and Graph Metrics for Translation Inference
- Computer Science
- GLOBALEX
- 2020
The NUIG system at the TIAD shared task combines graph-based metrics calculated using novel algorithms with an unsupervised document embedding tool called ONETA and an unsupervised multi-way neural machine translation method.
Basic Linguistic Resources and Baselines for Bhojpuri, Magahi and Maithili for Natural Language Processing
- Linguistics
- ArXiv
- 2020
This work collected corpora for these three languages from various sources and cleaned them to the extent possible, without changing the data in them, and calculated some basic statistical measures for these corpora at character, word, syllable, and morpheme levels to give an indication of linguistic properties such as morphological, lexical, phonological, and syntactic complexities.
Linguistic Resources for Bhojpuri, Magahi, and Maithili: Statistics about Them, Their Similarity Estimates, and Baselines for Three Applications
- Linguistics, Computer Science
- ACM Trans. Asian Low Resour. Lang. Inf. Process.
- 2021
The main contribution of the work is the creation of basic resources for facilitating further language processing research for these languages, providing some quantitative measures about them and their similarities among themselves and with Hindi.
References
The apertium bilingual dictionaries on the web of data
- Computer Science, Linguistics
- Semantic Web
- 2018
This paper describes the conversion of the Apertium family of bilingual dictionaries and lexicons into RDF (Resource Description Framework) and how their data have been made accessible on the Web as linked data.
Exploring cross-lingual word embeddings for the inference of bilingual dictionaries
- Computer Science, Linguistics
- TIAD@LDK
- 2019
We describe four systems to automatically generate bilingual dictionaries based on existing ones: three transitive systems differing only in the pivot language used, and a system based on a different…
TIAD Shared Task 2019: orthonormal explicit topic analysis for translation inference across dictionaries
- Computer Science
- TIAD@LDK
- 2019
The Orthonormal Explicit Topic Analysis (ONETA) model is used, which has been shown to be the state-of-the-art explicit topic model through its elimination of correlations between topics.
Translation inference through multi-lingual word embedding similarity
- Computer Science
- TIAD@LDK
- 2019
A multi-lingual word embedding space is constructed by projecting new languages into the feature space of a language for which a pretrained embedding model exists, and the similarity of the word embeddings is used to predict candidate translations.
Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries
- Computer Science
- LREC
- 2016
The experiments presented here exploit the properties of the Apertium RDF Graph, principally cycle density and nodes’ degree, to automatically generate new translation relations between words, and…
The CQC Algorithm: Cycling in Graphs to Semantically Enrich and Enhance a Bilingual Dictionary: Extended abstract
- Computer Science, Linguistics
- IJCAI
- 2012
Cycles and Quasi-Cycles (CQC), a novel algorithm for the automated disambiguation of ambiguous translations in the lexical entries of a bilingual machine-readable dictionary, is presented and successfully applied to the task of synonym extraction.
Compiling a Massive, Multilingual Dictionary via Probabilistic Inference
- Computer Science, Linguistics
- ACL
- 2009
The paper introduces a novel algorithm that solves this problem for 10,000,000 words in more than 1,000 languages and yields PanDictionary, a novel multilingual dictionary.
TIAD 2019 shared task: Leveraging knowledge graphs with neural machine translation for automatic multilingual dictionary generation
- Computer Science
- TIAD@LDK
- 2019
Three methods based on graph analysis and neural machine translation are presented and it is shown that they can generate translations without parallel data.
Bilingual dictionary generation and enrichment via graph exploration
- Computer Science
- 2021
This work explores techniques that exploit the graph nature of bilingual dictionaries to automatically infer new links (translations) and builds upon a cycle density based method: partitioning the graph into biconnected components for a speedup, and simplifying the pipeline through a careful structural analysis that reduces hyperparameter tuning requirements.
Apertium: a free/open-source platform for rule-based machine translation
- Computer Science
- Machine Translation
- 2011
The Apertium platform is summarised: the translation engine, the encoding of linguistic data, and the tools developed around the platform are discussed.