Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation
- Antonio Toral, Sheila Castilho, Ke Hu, Andy Way
- Computer Science · Conference on Machine Translation
- 30 August 2018
We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and…
A proposal to automatically build and maintain gazetteers for Named Entity Recognition by using Wikipedia
- Antonio Toral, R. Muñoz
- Computer Science · Workshop On New Text Wikis And Blogs And Other…
- 2006
This paper describes a method to automatically create and maintain gazetteers for Named Entity Recognition (NER) based on the analysis of the entries of an on-line encyclopedia, using a noun hierarchy and optionally a PoS tagger.
Automatic Extraction of Arabic Multiword Expressions
- Mohammed Attia, Antonio Toral, L. Tounsi, Pavel Pecina, Josef van Genabith
- Computer Science · MWE@COLING
- 1 August 2010
This paper investigates the automatic acquisition of Arabic Multiword Expressions by proposing three complementary approaches to extract MWEs from available data resources and measuring the quality and coverage of the output against gold standards.
A Multifaceted Evaluation of Neural versus Phrase-Based Machine Translation for 9 Language Directions
- Antonio Toral, V. M. Sánchez-Cartagena
- Computer Science · Conference of the European Chapter of the…
- 11 January 2017
It is found that translations produced by neural machine translation systems are considerably different, more fluent and more accurate in terms of word order compared to those produced by phrase-based systems.
A Set of Recommendations for Assessing Human-Machine Parity in Language Translation
- Samuel Läubli, Sheila Castilho, Graham Neubig, Rico Sennrich, Qinlan Shen, Antonio Toral
- Computer Science · Journal of Artificial Intelligence Research
- 23 March 2020
It is shown that the professional human translations contained significantly fewer errors, and that perceived quality in human evaluation depends on the choice of raters, the availability of linguistic context, and the creation of reference translations.
Fine-Grained Human Evaluation of Neural Versus Phrase-Based Machine Translation
- Filip Klubicka, Antonio Toral, V. M. Sánchez-Cartagena
- Computer Science · Prague Bulletin of Mathematical Linguistics
- 1 June 2017
This work compares three approaches to statistical machine translation through a fine-grained manual evaluation based on error annotation of the systems’ outputs, and finds that the best-performing system reduces the errors produced by the worst system by 54%.
Translators’ perceptions of literary post-editing using statistical and neural machine translation
- Joss Moorkens, Antonio Toral, Sheila Castilho, Andy Way
- Psychology · Translation Spaces
- 28 November 2018
In the context of recent improvements in the quality of machine translation (MT) output and new use cases being found for that output, this article reports on an experiment using statistical and…
Post-editing Effort of a Novel With Statistical and Neural Machine Translation
- Antonio Toral, M. Wieling, Andy Way
- Art · Frontiers in Digital Humanities
- 15 May 2018
In this first experiment in the literature in which a novel is translated automatically and then post-edited by professional literary translators, both MT approaches result in increases in translation productivity: PBMT by 18% and NMT by 36%.
Quantitative fine-grained human evaluation of machine translation systems: a case study on English to Croatian
- Filip Klubicka, Antonio Toral, V. M. Sánchez-Cartagena
- Computer Science · Machine Translation
- 2 February 2018
This work presents a quantitative fine-grained manual evaluation approach to comparing the performance of different machine translation (MT) systems and shows that the best-performing system (neural) reduces the errors produced by the worst system (pure phrase-based) by more than half (54%).
Named Entity WordNet
- Antonio Toral, R. Muñoz, M. Monachini
- Computer Science · International Conference on Language Resources…
- 1 May 2008
This paper presents the automatic extension of Princeton WordNet with Named Entities (NEs) and explores different aspects of the methodology, such as the treatment of polysemous terms, the identification of hyponyms within the Wikipedia categorization system, the identification of Wikipedia articles which are NEs, and the design of an NE repository compliant with the LMF ISO standard.