Corpus ID: 202565504

Problems with automating translation of movie/TV show subtitles

Prabhakar Gupta, Mayank Sharma, Kartik Pitale, Keshav Kumar
We present 27 problems encountered in automating the translation of movie/TV show subtitles. [...] Key Result: We show that systems working at the frontiers of Natural Language Processing do not perform well on subtitles and require post-processing solutions to redress these problems.
DeepSubQE: Quality estimation for subtitle translations
This work shows that existing QE methods are inadequate and proposes DeepSubQE, a system that estimates translation quality given subtitle data for a pair of languages. It builds a hybrid network that learns semantic and syntactic features of bilingual data and compares it with LSTM-only and CNN-only networks.
Detecting over/under-translation errors for determining adequacy in human translations
This work presents a novel approach to detecting over- and under-translations (OT/UT) as part of adequacy error checks in translation evaluation, and aims to identify OT/UT errors in human-translated video subtitles with high error recall.
A Context-Sensitive Real-Time Spell Checker with Language Adaptability
  • Prabhakar Gupta
  • Computer Science, Mathematics
  • 2020 IEEE 14th International Conference on Semantic Computing (ICSC)
  • 2020
A novel language-adaptable spell checking system is presented that detects spelling errors and suggests context-sensitive corrections in real time, and that can be extended to new languages with minimal language-specific processing.
Tigrigna language spellchecker and correction system for mobile phone devices
This paper presents the implementation of a spellchecker and corrector system for mobile phone devices, such as smartphones, for the low-resourced Tigrigna language. It shows that the system model is efficient at spellchecking and suggesting relevant corrections, and that it reduces misspelled input when writing Tigrigna words on a mobile phone.


Machine Translation of TV Subtitles for Large Scale Production
This work describes building and employing Statistical Machine Translation systems for TV subtitles in Scandinavia, with translation systems built for Danish, English, Norwegian and Swedish.
Statistical Machine Translation of Subtitles: From OpenSubtitles to TED
The results show that OpenSubtitles and TED contain very different kinds of subtitles that warrant a subclassification of the genre. A closer look at the translation of questions, a sentence type with special word order, found the BLEU scores for questions to be higher than for random sentences.
The Automatic Translation of Film Subtitles. A Machine Translation Success Story?
  • M. Volk
  • History, Computer Science
  • J. Lang. Technol. Comput. Linguistics
  • 2009
This paper investigates whether the automatic translation of film subtitles can be considered a machine translation success story, and argues that the text genre "film subtitles" is well suited for MT, in particular for Statistical MT.
Unsupervised Quality Estimation Without Reference Corpus for Subtitle Machine Translation Using Word Embeddings
A novel automated evaluation method is proposed that calculates edits to indicate translation quality and the human-aided post-editing required to perfect machine translation.
Evaluation of Machine Translation Performance Across Multiple Genres and Languages
The results of these experiments show that the multi-genre benchmarks used to evaluate the impact of genre differences on machine translation (MT) can serve to advance research on text genre adaptation for MT.
German Compounds and Statistical Machine Translation. Can they get along?
This paper summarizes the results of experiments and attempts to yield better translations of German nominal compounds into Spanish, and shows that the approach improves by up to 1.4 BLEU points over the baseline.
Neural Machine Translation into Language Varieties
This work investigates the problem of training neural machine translation from English to specific pairs of language varieties, assuming both labeled and unlabeled parallel texts, and low-resource conditions, and shows significant BLEU score improvements over baseline systems when translation into similar languages is learned as a multilingual task with shared representations.
Addressing the Rare Word Problem in Neural Machine Translation
This paper proposes and implements an effective technique to address the problem of end-to-end neural machine translation's inability to correctly translate very rare words, and is the first to surpass the best result achieved on a WMT’14 contest task.
Improving Sequence to Sequence Neural Machine Translation by Utilizing Syntactic Dependency Information
This work proposes an approach that utilizes more grammatical information, such as syntactic dependencies, so that the output can be generated from richer information, and addresses two existing problems in Neural Machine Translation: ineffective translation of long sentences and over-translation.
Modeling Coverage for Neural Machine Translation
This paper proposes coverage-based NMT, which maintains a coverage vector to keep track of the attention history and improves both translation quality and alignment quality over standard attention-based NMT.