Word Alignment by Fine-tuning Embeddings on Parallel Corpora

@article{Dou2021WordAB,
  title={Word Alignment by Fine-tuning Embeddings on Parallel Corpora},
  author={Zi-Yi Dou and Graham Neubig},
  journal={ArXiv},
  year={2021},
  volume={abs/2101.08231}
}
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs. The great majority of past work on word alignment has worked by performing unsupervised learning on parallel text. Recently, however, other work has demonstrated that pre-trained contextualized word embeddings derived from multilingually trained language models (LMs… 
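
As a rough illustration of the embedding-based alignment idea referred to above, the Python sketch below embeds a source/target sentence pair with a multilingual LM and links words that are mutual nearest neighbors under cosine similarity. It is a minimal sketch of the general approach only, not the method of this paper (which further fine-tunes the LM on parallel text); the model name, the subword-to-word pooling, and the mutual-argmax heuristic are illustrative assumptions.

# Minimal sketch: word alignment from a multilingual LM's contextual
# embeddings via a similarity matrix and mutual argmax. Illustrative only;
# this is not the fine-tuning method proposed in the paper above.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def word_embeddings(words):
    # Embed a pre-tokenized sentence and mean-pool subword vectors back to words.
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]        # (num_subwords, dim)
    word_ids = enc.word_ids(0)
    vecs = []
    for i in range(len(words)):
        idx = [k for k, w in enumerate(word_ids) if w == i]
        vecs.append(hidden[idx].mean(dim=0))
    return torch.stack(vecs)                              # (num_words, dim)

def align(src_words, tgt_words):
    # Return (i, j) pairs where source word i and target word j pick each other.
    s = torch.nn.functional.normalize(word_embeddings(src_words), dim=-1)
    t = torch.nn.functional.normalize(word_embeddings(tgt_words), dim=-1)
    sim = s @ t.T
    fwd = sim.argmax(dim=1)   # best target word for each source word
    bwd = sim.argmax(dim=0)   # best source word for each target word
    return [(i, int(fwd[i])) for i in range(len(src_words)) if int(bwd[fwd[i]]) == i]

print(align("das ist gut".split(), "this is good".split()))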

Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment

Cross-Align is proposed to model deep interactions between the input sentence pairs: the source and target sentences are encoded separately with shared self-attention modules in the shallow layers, while cross-lingual interactions are explicitly constructed by cross-attention modules in the upper layers.
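
To make the described layout concrete, here is a hypothetical PyTorch sketch of that layer arrangement: the two sentences pass separately through shared self-attention blocks in the shallow layers, and the upper layers add explicit cross-attention between them. Layer counts, dimensions, and module choices are illustrative guesses, not the authors' implementation.

# Hypothetical sketch of the Cross-Align layer layout described above.
# Shallow layers: shared self-attention encoders applied to each sentence
# separately. Upper layers: explicit cross-attention between the sentences.
import torch
import torch.nn as nn

class CrossAlignSketch(nn.Module):
    def __init__(self, dim=768, heads=12, shallow_layers=8, cross_layers=4):
        super().__init__()
        # Shared Transformer encoder blocks (same weights for both sentences).
        self.shared = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, heads, batch_first=True)
            for _ in range(shallow_layers)
        )
        # Cross-attention modules for explicit cross-lingual interaction.
        self.cross = nn.ModuleList(
            nn.MultiheadAttention(dim, heads, batch_first=True)
            for _ in range(cross_layers)
        )

    def forward(self, src, tgt):              # src: (B, S, dim), tgt: (B, T, dim)
        for layer in self.shared:             # encode separately, shared weights
            src, tgt = layer(src), layer(tgt)
        for attn in self.cross:               # each side attends to the other
            src2, _ = attn(query=src, key=tgt, value=tgt)
            tgt2, _ = attn(query=tgt, key=src, value=src)
            src, tgt = src + src2, tgt + tgt2
        return src, tgt                       # cross-lingually informed representations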

Extending Word-Level Quality Estimation for Post-Editing Assistance

Compared to the original word-level QE task, the new task is able to directly point out editing operations, thus improving the efficiency of post-editing assistance.

Multilingual Transformer Encoders: a Word-Level Task-Agnostic Evaluation

This work proposes a word-level task-agnostic method to evaluate the alignment of contextualized representations built by transformer-based models and shows that this method provides more accurate translated word pairs than previous methods to evaluate word-level alignment.

Third-Party Aligner for Neural Word Alignments

This paper proposes to use word alignments generated by a third-party word aligner to supervise the neural word alignment training, and shows that this approach can surprisingly do self-correction over the third-party supervision, leading to better performance than various third-party word aligners, including the currently best one.

Zero-shot Cross-Lingual Counterfactual Detection via Automatic Extraction and Prediction of Clue Phrases

This paper proposes a novel loss function based on the clue phrase prediction for generalising a CFD model trained on a source language to multiple target languages, without requiring any human-labelled data.

An automatic model and Gold Standard for translation alignment of Ancient Greek

A fine-tuning strategy that employs unsupervised training on mono- and bilingual texts and supervised training on manually aligned sentences is proposed, achieving good results on language pairs that were not part of the training data.

Zero-shot Cross-lingual Conversational Semantic Role Labeling

The usefulness of CSRL for non-Chinese conversational tasks, such as question-in-context rewriting in English and multi-turn dialogue response generation in English, German, and Japanese, is demonstrated by incorporating CSRL information into the downstream conversation-based models.

Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

This work explores techniques including data projection and self-training, and how different pretrained encoders impact them, and finds that a combination of approaches leads to better performance than any one cross-lingual strategy in particular.

Controllable Abstractive Summarization Using Multilingual Pretrained Language Model

This work shows that CTRLSum improves the baseline summarization system in four languages (English, Indonesian, Spanish, and French) by 1.57 points in average ROUGE-1, with the Indonesian model achieving state-of-the-art results.

Universal Proposition Bank 2.0

This paper introduces Universal Proposition Bank 2.0 (UP2.0), with significant enhancements over UP1.0: propbanks of higher quality created by using a state-of-the-art monolingual SRL model and improved auto-generation of annotations; expanded language coverage (from 7 to 9 languages); and span annotation that decouples the resource from syntactic analysis.
...

References

Showing 1-10 of 84 references

SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings

This work proposes word alignment methods that require no parallel data and finds that alignments created from embeddings are superior for four and comparable for two language pairs compared to those produced by traditional statistical aligners – even with abundant parallel data.

RoBERTa: A Robustly Optimized BERT Pretraining Approach

It is found that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.

Cross-lingual Language Model Pretraining

This work proposes two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective.

Optimization of word alignment clues

J. Tiedemann, Natural Language Engineering, 2005
The clue alignment approach and the optimization of its parameters using a genetic algorithm is described, which shows a significant improvement of about 6% in F-scores compared to the baseline produced by statistical word alignment.

Alignment-based Profiling of Europarl Data in an English-Swedish Parallel Corpus

This paper profiles the Europarl part of an English-Swedish parallel corpus and compares it with three other subcorpora of the same parallel corpus. We first describe our method for comparison which

A Simple, Fast, and Effective Reparameterization of IBM Model 2

We present a simple log-linear reparameterization of IBM Model 2 that overcomes problems arising from Model 1’s strong assumptions and Model 2’s overparameterization. Efficient inference, likelihood
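
For context, the log-linear reparameterization this entry refers to (released as the fast_align tool) replaces Model 2's full distortion table with a single precision parameter: the probability of linking position i in an m-word sentence to position j in an n-word sentence decays exponentially with |i/m - j/n|. The sketch below illustrates that parameterization under assumed parameter values; it paraphrases the fast_align formulation rather than anything stated in this excerpt.

# Illustrative sketch of the fast_align-style log-linear distortion.
# lam (precision) and p0 (null-alignment probability) are assumed values.
import math

def distortion(i, j, m, n, lam=4.0, p0=0.08):
    # p(a_i = j | i, m, n): probability that word i of an m-word sentence
    # links to word j of an n-word sentence (j = 0 denotes the null word).
    if j == 0:
        return p0
    z = sum(math.exp(-lam * abs(i / m - jp / n)) for jp in range(1, n + 1))
    return (1 - p0) * math.exp(-lam * abs(i / m - j / n)) / z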

Empirical lower bounds on translation unit error rate for the full class of inversion transduction grammars

This paper estimates the difference and shows that the average reduction in lower bounds on TUER is 2.48 in absolute difference (16.01 in average parse failure rate).

Explicit Alignment Objectives for Multilingual Bidirectional Encoders

A new method for learning multilingual encoders, AMBER (Aligned Multilingual Bidirectional EncodeR), trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities is presented.

End-to-End Neural Word Alignment Outperforms GIZA++

This work presents the first end-to-end neural word alignment method that consistently outperforms GIZA++ on three data sets and repurposes a Transformer model trained for supervised translation to also serve as an unsupervised word alignment model in a manner that is tightly integrated and does not affect translation quality.

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization

The Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark is introduced, a multi-task benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
...