Multilingual AMR Parsing with Noisy Knowledge Distillation

Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam
We study multilingual AMR parsing from the perspective of knowledge distillation, where the aim is to learn and improve a multilingual AMR parser by using an existing English parser as its teacher. We constrain our exploration in a strict multilingual setting: there is but one model to parse all different languages including English. We identify that noisy input and precise output are the key to successful distillation. Together with extensive pre-training, we obtain an AMR parser whose… 
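The abstract above hinges on pairing noisy input with precise output for distillation: the English teacher parses clean English into silver AMRs, while the multilingual student is trained on noisy versions of the input (e.g. machine translations) paired with those precise targets. A minimal sketch of how such training pairs could be constructed, where `teacher_parse` and the `translate` argument are toy stand-ins (not the paper's actual parser or MT system):

```python
def teacher_parse(english_sentence: str) -> str:
    """Toy 'teacher': emit a flat pseudo-AMR over the sentence tokens.
    In the real setting this would be a trained English AMR parser."""
    tokens = english_sentence.lower().rstrip(".").split()
    return "(" + " :op ".join(tokens) + ")"

def build_distillation_pairs(english_sentences, translate):
    """Pair NOISY student inputs (translations) with PRECISE teacher outputs
    (silver AMRs parsed from the clean English side). The English sentence is
    kept as well, since one model must parse all languages, English included."""
    pairs = []
    for sent in english_sentences:
        silver_amr = teacher_parse(sent)             # precise output from clean English
        pairs.append((translate(sent), silver_amr))  # noisy input, same precise target
        pairs.append((sent, silver_amr))             # English stays in the training mix
    return pairs

# Usage with a toy "translator" that just tags the target language:
pairs = build_distillation_pairs(
    ["The cat sleeps."],
    translate=lambda s: "[de] " + s,
)
```

The point of the sketch is the asymmetry: the teacher only ever sees clean English, so its silver targets stay precise, while noise enters exclusively on the student's input side.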


Maximum Bayes Smatch Ensemble Distillation for AMR Parsing

This paper proposes to overcome diminishing returns of silver data by combining Smatch-based ensembling techniques with ensemble distillation, and shows that it can produce gains rivaling those of human annotated data for QALD-9 and achieve a new state-of-the-art for BioAMR.

Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation

Experimental results show that retrofitting multilingual sentence embeddings with AMR leads to better state-of-the-art performance on both semantic textual similarity and transfer tasks.

xAMR: Cross-lingual AMR End-to-End Pipeline

A cross-lingual AMR (xAMR) pipeline that incorporates the intuitive translation approach to and from the English language as a baseline for further utilization of the AMR parsing and generation models and can be used as an alternative approach for abstract meaning representation of low-resource languages.

Multilingual Abstract Meaning Representation for Celtic Languages

This work presents an approach to create a multilingual text-to-AMR model for three Celtic languages: Welsh, and the closely related Irish and Scottish Gaelic (Q-Celtic), and shows that machine-translated test corpora unfairly improve the AMR evaluation.

Étiquetage ou génération de séquences pour la compréhension automatique du langage en contexte d’interaction? (Sequence tagging or sequence generation for Natural Language Understanding ?)

The task of natural language understanding in an interaction context (NLU, for Natural Language Understanding) is often reduced to the detection of intents and concepts on corpora…

From Graph to Graph: AMR to SPARQL

Using AMR graph for multilingual QA systems to generate SPARQL queries for Wikidata shows promising results and has scope for further improvement.

Spanish Abstract Meaning Representation: Annotation of a General Corpus

This work presents the first sizable, general annotation project for Spanish Abstract Meaning Representation, which makes use of Spanish rolesets from the AnCora-Net lexicon and extends English AMR with semantic features specific to Spanish.

Bootstrapping Multilingual AMR with Contextual Word Alignments

A novel technique for foreign-text-to-English AMR alignment, using the contextual word alignment between English and foreign language tokens, which achieves a highly competitive performance that surpasses the best published results for German, Italian, Spanish and Chinese.

Multilingual Neural Machine Translation with Knowledge Distillation

This distillation-based approach boosts the accuracy of multilingual machine translation, showing that one model is enough to handle multiple languages with accuracy comparable to or even better than that of individual models.

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

An architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts using a single BiLSTM encoder with a shared byte-pair encoding vocabulary for all languages, coupled with an auxiliary decoder and trained on publicly available parallel corpora.

Enabling Cross-Lingual AMR Parsing with Transfer Learning Techniques

This work explores different transfer learning techniques for producing automatic AMR annotations across languages and develops a cross-lingual AMR parser, XL-AMR, which can be trained on the produced data and does not rely on AMR aligners or source-copy mechanisms.

Translate, then Parse! A Strong Baseline for Cross-Lingual AMR Parsing

This paper revisits this simple two-step baseline and enhances it with a strong NMT system and a strong AMR parser, showing that T+P outperforms a recent state-of-the-art system across all tested languages.

Improving AMR Parsing with Sequence-to-Sequence Pre-training

This paper proposes a seq2seq pre-training approach to build pre-trained models, in both a single and a joint way, on three relevant tasks, i.e., machine translation, syntactic parsing, and AMR parsing itself, and extends the vanilla fine-tuning method to a multi-task learning fine-tuning method that optimizes for AMR parsing performance while endeavoring to preserve the response of the pre-trained models.

Ensemble Distillation for Neural Machine Translation

This work introduces a data filtering method based on the knowledge of the teacher model that not only speeds up the training, but also leads to better translation quality.

XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation

A recent cross-lingual pre-trained model Unicoder is extended to cover both understanding and generation tasks, which is evaluated on XGLUE as a strong baseline and the base versions of Multilingual BERT, XLM and XLM-R are evaluated for comparison.

Multilingual Translation with Extensible Multilingual Pretraining and Finetuning

This work shows that multilingual translation models can be created through multilingual finetuning, and demonstrates that pretrained models can be extended to incorporate additional languages without loss of performance.

Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser

This work trains the distillation parser using a structured hinge loss objective with a novel cost that incorporates ensemble uncertainty estimates for each possible attachment, thereby avoiding the intractable cross-entropy computations required by applying standard distillation objectives to problems with structured outputs.