End-to-end Multilingual Coreference Resolution with Mention Head Prediction

@article{Prazak2022EndtoendMC,
  title={End-to-end Multilingual Coreference Resolution with Mention Head Prediction},
  author={Ond{\v{r}}ej Pra{\v{z}}{\'a}k and Miloslav Konop{\'i}k},
  journal={ArXiv},
  year={2022},
  volume={abs/2209.12516}
}
This paper describes our approach to the CRAC 2022 Shared Task on Multilingual Coreference Resolution. Our model is based on a state-of-the-art end-to-end coreference resolution system. Apart from joint multilingual training, we improved our results with mention head prediction. We also tried to integrate dependency information into our model. Our system ended up in third place. Moreover, we reached the best performance on two datasets out of 13.
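
As a rough illustration of the mention head prediction mentioned above, the sketch below is a hypothetical PyTorch snippet, not the authors' published implementation: it selects a head token inside a candidate mention span with a learned attention over the encoder states, in the spirit of the head-finding attention used by end-to-end coreference models.

import torch
import torch.nn as nn

class SpanHeadPredictor(nn.Module):
    # Hypothetical sketch: score each token inside a candidate mention span and
    # predict the span's head as the highest-scoring token. This mirrors the
    # head-finding attention of end-to-end coreference models and is not
    # necessarily the exact formulation used in this paper.
    def __init__(self, hidden_size: int):
        super().__init__()
        self.head_scorer = nn.Linear(hidden_size, 1)  # one scalar score per token

    def forward(self, token_states: torch.Tensor, span_start: int, span_end: int):
        # token_states: (seq_len, hidden_size) contextual embeddings from the encoder
        span_states = token_states[span_start : span_end + 1]   # tokens inside the span
        scores = self.head_scorer(span_states).squeeze(-1)      # (span_len,)
        attention = torch.softmax(scores, dim=-1)               # soft head distribution
        head_offset = int(torch.argmax(attention).item())       # predicted head token
        # attention-weighted span representation, as in Lee et al. (2017)
        span_repr = (attention.unsqueeze(-1) * span_states).sum(dim=0)
        return span_start + head_offset, span_repr

# toy usage with random encoder states (12 subword tokens, 768-dim encoder)
encoder_out = torch.randn(12, 768)
predictor = SpanHeadPredictor(768)
head_index, span_vector = predictor(encoder_out, span_start=3, span_end=6)
print(head_index, span_vector.shape)
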
2 Citations

Findings of the Shared Task on Multilingual Coreference Resolution

This paper presents an overview of the shared task on multilingual coreference resolution associated with the CRAC 2022 workshop. Shared task participants were supposed to develop trainable systems capable of identifying mentions and clustering them according to identity coreference.

ÚFAL CorPipe at CRAC 2022: Effectivity of Multilingual Models for Coreference Resolution

One large multilingual model with a sufficiently large encoder is found to increase performance on all datasets across the board, with the benefit not limited to underrepresented languages or groups of typologically related languages.

References

End-to-end Neural Coreference Resolution

This work introduces the first end-to-end coreference resolution model, which is trained to maximize the marginal likelihood of gold antecedent spans from coreference clusters and is factored to enable aggressive pruning of potential mentions.
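
For reference, the training objective summarized above is the marginal log-likelihood of all correct antecedents, as formulated by Lee et al. (2017); here \mathcal{Y}(i) is the set of candidate antecedents of span i, \mathrm{GOLD}(i) its gold antecedents, and s(i, y) the pairwise coreference score:

\log \prod_{i=1}^{N} \sum_{\hat{y} \in \mathcal{Y}(i) \cap \mathrm{GOLD}(i)} P(\hat{y}),
\qquad
P(y) = \frac{e^{s(i, y)}}{\sum_{y' \in \mathcal{Y}(i)} e^{s(i, y')}}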

Multilingual Coreference Resolution with Harmonized Annotations

This paper combines the training data in multilingual experiments and trains two joint models, one for Slavic languages and one for all the languages together, relying on an end-to-end deep learning model slightly adapted for the CorefUD corpus.

Findings of the Shared Task on Multilingual Coreference Resolution

This paper presents an overview of the shared task on multilingual coreference resolution associated with the CRAC 2022 workshop. Shared task participants were supposed to develop trainable systems capable of identifying mentions and clustering them according to identity coreference.

CorefUD 1.0: Coreference Meets Universal Dependencies

This paper presents CorefUD, a multilingual collection of corpora and a standardized format for coreference resolution, compatible with morphosyntactic annotations in the UD framework and including facilities for related tasks such as named entity recognition, which forms a first step towards convergence for coreference resolution across languages.

Anaphora and Coreference Resolution: A Review

Czert – Czech BERT-like Model for Language Representation

This paper describes the training process of the first Czech monolingual language representation models based on the BERT and ALBERT architectures, and establishes new state-of-the-art results on nine datasets.

CorefUD 0.1 Coreference meets Universal Dependencies – a pilot experiment on harmonizing coreference datasets for 11 languages

This report describes a pilot experiment aimed at harmonizing diverse data resources that contain coreference-related annotations, and the results of the harmonization procedure valid as of March 2021.

CamemBERT: a Tasty French Language Model

This paper investigates the feasibility of training monolingual Transformer-based language models for other languages, taking French as an example and evaluating their language models on part-of-speech tagging, dependency parsing, named entity recognition and natural language inference tasks.

German’s Next Language Model

This work presents the experiments that led to the creation of the BERT- and ELECTRA-based German language models GBERT and GELECTRA, and shows that these models are the best German models to date.

Training dataset and dictionary sizes matter in BERT models: the case of Baltic languages

A trilingual LitLat BERT-like model for Lithuanian, Latvian, and English, and a monolingual Est-RoBERTa model for Estonian, improve on the results of existing models on all tested tasks in most situations.