Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

Mahsa Yarmohammadi, Shijie Wu, Marc Marone, Haoran Xu, Seth Ebner, Guanghui Qin, Yunmo Chen, Jialiang Guo, Craig Harman, Kenton W. Murray, Aaron Steven White, Mark Dredze, Benjamin Van Durme
Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English. While the advance of pretrained multilingual encoders suggests an easy optimism of "train on English, run on any language", we find through a thorough exploration and extension of techniques that a combination of approaches, both new and old, leads to better performance than any one cross-lingual… 
BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation
This paper demonstrates that simply using the output of a tailored and suitable bilingual pre-trained language model (dubbed BiBERT) as the input of the NMT encoder achieves state-of-the-art translation performance. It also proposes a stochastic layer selection approach and a dual-directional translation model to ensure sufficient utilization of the contextualized embeddings.
X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset
This work proposes a method to automatically construct an SRL corpus that is parallel in four languages (English, French, German, and Spanish), with unified predicate and role annotations that are fully comparable across languages.
Inducing Information Extraction Systems for New Languages via Cross-language Projection
This paper presents a novel method for rapidly creating IE systems for new languages by exploiting existing IE systems via cross-language projection, and explores several ways of realizing both the transfer and learning processes using off-the-shelf machine translation systems, induced word alignment, attribute projection and transformation-based learning.
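The cross-language projection idea summarized above can be made concrete: labels on source-language tokens are carried over to target-language tokens through a word alignment. The following is a minimal sketch, assuming alignments are given as (source index, target index) pairs; the function name and simplifications (token-level labels, no span merging, no alignment-noise handling) are illustrative, not the paper's actual implementation.

```python
def project_labels(src_labels, alignment, tgt_len):
    """Project token-level labels from source to target via word alignment.

    src_labels: one label per source token (e.g. BIO tags).
    alignment: list of (src_idx, tgt_idx) pairs.
    tgt_len: number of target tokens; unaligned tokens default to "O".
    """
    tgt_labels = ["O"] * tgt_len
    for src_idx, tgt_idx in alignment:
        # Copy the source token's label to its aligned target token.
        tgt_labels[tgt_idx] = src_labels[src_idx]
    return tgt_labels

# English "John left" -> (reordered) target sentence with alignment 0-1, 1-0:
# the PER label follows the aligned token.
print(project_labels(["B-PER", "O"], [(0, 1), (1, 0)], 2))
```

Real projection systems additionally repair broken spans and filter low-confidence alignment links, since induced alignments are noisy.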
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
This paper explores the broader cross-lingual potential of mBERT (multilingual BERT) as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing.
Unsupervised Cross-lingual Representation Learning at Scale
It is shown that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks, and the possibility of multilingual modeling without sacrificing per-language performance is shown for the first time.
Emerging Cross-lingual Structure in Pretrained Language Models
It is shown that transfer is possible even when there is no shared vocabulary across the monolingual corpora and also when the text comes from very different domains, and it is strongly suggested that, much like for non-contextual word embeddings, there are universal latent symmetries in the learned embedding spaces.
A Joint Neural Model for Information Extraction with Global Features
A joint neural framework that aims to extract the globally optimal IE result as a graph from an input sentence, and can be easily applied to new languages or trained in a multilingual manner, as OneIE does not use any language-specific features.
The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT
A new benchmark for machine translation is described that provides training and test data for thousands of language pairs covering over 500 languages, together with tools for creating state-of-the-art translation models from that collection, with the aim of triggering the development of open translation tools and models with much broader coverage of the world's languages.
Treebank Translation for Cross-Lingual Parser Induction
This approach draws on annotation projection but avoids the use of noisy source-side annotation of an unrelated parallel corpus and instead relies on manual treebank annotation in combination with statistical machine translation, which makes it possible to train fully lexicalized parsers.
Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
This work proposes a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages using a shared wordpiece vocabulary, and introduces an artificial token at the beginning of the input sentence to specify the required target language.
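The "artificial token" mechanism summarized above is simple enough to sketch: the target language is encoded as a special token prepended to the source sentence, and a single shared model learns to condition on it. The token format below (`<2xx>`) is illustrative and the function is a hypothetical preprocessing helper, not code from the system itself.

```python
def add_target_token(sentence: str, target_lang: str) -> str:
    """Prepend an artificial target-language token so a single
    multilingual NMT model knows which language to produce.

    target_lang: an ISO-style language code, e.g. "es" for Spanish.
    """
    return f"<2{target_lang}> {sentence}"

# The same English input, routed to different target languages:
print(add_target_token("Hello, world!", "es"))
print(add_target_token("Hello, world!", "ja"))
```

Because the routing signal lives entirely in the input, no target-language examples are strictly required for a given pair, which is what enables the zero-shot translation directions described in the summary.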
Cross-Lingual Dependency Parsing with Unlabeled Auxiliary Languages
This work explores adversarial training for learning contextual encoders that produce invariant representations across languages to facilitate cross-lingual transfer and proposes to leverage unannotated sentences from auxiliary languages to help learning language-agnostic representations.