Dual Subtitles as Parallel Corpora

Abstract

In this paper, we leverage dual subtitles as a source of parallel data. Dual subtitles present viewers with two languages simultaneously and are generally aligned at the segment level, which removes the need to perform this alignment automatically. This is desirable because the extracted parallel data does not contain the alignment errors present in…


5 Figures and Tables
