X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset

Angel Daza and Anette Frank
Even though SRL is researched for many languages, major improvements have mostly been obtained for English, for which more resources are available. In fact, existing multilingual SRL datasets contain disparate annotation styles or come from different domains, hampering generalization in multilingual learning. In this work we propose a method to automatically construct an SRL corpus that is parallel in four languages: English, French, German, Spanish, with unified predicate and role annotations… 


UniteD-SRL: A Unified Dataset for Span- and Dependency-Based Multilingual and Cross-Lingual Semantic Role Labeling
This paper proposes UniteD-SRL, a new benchmark for multilingual and cross-lingual, span- and dependency-based SRL that provides expert-curated parallel annotations using a common predicate-argument structure inventory, allowing direct comparisons across languages and encouraging studies on cross-lingual transfer in SRL.
On the Benefit of Syntactic Supervision for Cross-lingual Transfer in Semantic Role Labeling
This work performs an empirical exploration of the helpfulness of syntactic supervision for cross-lingual SRL within a simple multitask learning scheme and shows the effectiveness of syntactic supervision in low-resource scenarios.
Unifying Cross-Lingual Semantic Role Labeling with Heterogeneous Linguistic Resources
This work presents a unified model to perform cross-lingual SRL over heterogeneous linguistic resources, which implicitly learns a high-quality mapping for different formalisms across diverse languages without resorting to word alignment and/or translation techniques.
Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data
Experimental results on three multilingual semantic parsing datasets show that data augmentation with TaF reaches accuracies competitive with similar systems which rely on traditional alignment techniques.
Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction
This work explores techniques including data projection and self-training, and how different pretrained encoders impact them, and finds that a combination of approaches leads to better performance than any one cross-lingual strategy in particular.
Comparing Span Extraction Methods for Semantic Role Labeling
With extensive experiments on PropBank SRL datasets, it is found that more structured decoding methods outperform BIO-tagging when using static (word type) embeddings across all experimental settings, but when used in conjunction with pre-trained contextualized word representations, the benefits are diminished.
Zero-shot Cross-lingual Conversational Semantic Role Labeling
The usefulness of CSRL to non-Chinese conversational tasks, such as question-in-context rewriting in English and multi-turn dialogue response generation in English, German and Japanese, is shown by incorporating the CSRL information into downstream conversation-based models.
Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction
The Alignment-Augmented Consistent Translation (AACTRANS) model is introduced to translate English sentences and their corresponding extractions consistently with each other, avoiding the changes to vocabulary or semantic meaning that may result from independent translations.
Instance-adaptive training with noise-robust losses against noisy labels
Experiments on noisy and corrupted NLP datasets show that the proposed instance-adaptive training frameworks help increase the noise-robustness provided by such losses, promoting the use of the frameworks and associated losses in NLP models trained with noisy data.


Translate and Label! An Encoder-Decoder Approach for Cross-lingual Semantic Role Labeling
A Cross-lingual Encoder-Decoder model that simultaneously translates and generates sentences with Semantic Role Labeling annotations in a resource-poor target language and offers a flexible method for leveraging SRL data in multiple languages is proposed.
ZAP: An Open-Source Multilingual Annotation Projection Framework
The ZAP framework is Java-based and includes methods for preprocessing corpora, computing word-alignments between sentence pairs, transferring different layers of linguistic annotation, and visualization, and was designed for ease-of-use with lightweight APIs.
Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling
This paper presents a two-stage method to enable the construction of SRL models for resource-poor languages by exploiting monolingual SRL and multilingual parallel data and shows that this method outperforms existing methods.
Cross-Lingual Transfer of Semantic Roles: From Raw Text to Semantic Roles
We describe a transfer method based on annotation projection to develop a dependency-based semantic role labeling system for languages for which no supervised linguistic information other than
A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling
A simple and accurate neural model for dependency-based semantic role labeling that predicts predicate-argument dependencies relying on states of a bidirectional LSTM encoder that substantially outperforms all previous local models and approaches the best reported results on the English CoNLL-2009 dataset.
Multi-source synthetic treebank creation for improved cross-lingual dependency parsing
A method of creating synthetic treebanks for cross-lingual dependency parsing using a combination of machine translation (including pivot translation), annotation projection and the spanning tree algorithm.
Improving Neural Machine Translation Models with Monolingual Data
This work pairs monolingual training data with an automatic back-translation, treats it as additional parallel training data, and obtains substantial improvements on the WMT 15 task English<->German and on the low-resourced IWSLT 14 task Turkish->English.
CoNLL-2012 Shared Task: Modeling Multilingual Unrestricted Coreference in OntoNotes
This paper describes the OntoNotes annotation (coreference and other layers) and the parameters of the shared task, including the format, pre-processing information, and evaluation criteria, and presents and discusses the results achieved by the participating systems.
Scaling up Automatic Cross-Lingual Semantic Role Annotation
This paper scales up previous efforts by using an automatic approach to semantic annotation that does not rely on a semantic ontology for the target language, and improves the quality of the transferred semantic annotations by using a joint syntactic-semantic parser.
End-to-end learning of semantic role labeling using recurrent neural networks
This work proposes to use a deep bi-directional recurrent network as an end-to-end system for SRL, which takes only the original text as input features, without using any syntactic knowledge.