Corpus ID: 64787732

Universal Dependencies 2.1

  title={Universal Dependencies 2.1},
  author={Joakim Nivre and Zeljko Agic and Lars Ahrenberg and Lene Antonsen and Mar{\'i}a Jes{\'u}s Aranzabe and Masayuki Asahara and Luma Ateyah and Mohammed Attia and Aitziber Atutxa and Liesbeth Augustinus and Elena Badmaeva and Miguel Ballesteros and Esha Banerjee and Sebastian Bank and Verginica Barbu Mititelu and John Bauer and Kepa Bengoetxea and Riyaz Ahmad Bhat and Eckhard Bick and Victoria Bobicev and Carl B{\"o}rstell and Cristina Bosco and Gosse Bouma and Sam Bowman and Aljoscha Burchardt and Marie Candito and Gauthier Caron and G. Eryigit and Giuseppe G. A. Celano and Savaş Çetin and Fabricio Chalub and Jinho Choi and Silvie Cinkov{\'a} and Çagri Ç{\"o}ltekin and Miriam Connor and Elizabeth Davidson and Marie-Catherine de Marneffe and Valeria C V de Paiva and Arantza D{\'i}az de Ilarraza and Peter Dirix and Kaja Dobrovoljc and Timothy Dozat and Kira Droganova and Puneet Dwivedi and Marhaba Eli and Ali Mamdouh Elkahky and T. Erjavec and Rich{\'a}rd Farkas and H{\'e}ctor Fern{\'a}ndez Alcalde and Jennifer Foster and Cl{\'a}udia Freitas and Katar{\'i}na Gajdo{\v{s}}ov{\'a} and Daniel Galbraith and Marcos Garc{\'i}a and Moa G{\"a}rdenfors and Kim Gerdes and Filip Ginter and Iakes Goenaga and Koldo Gojenola and Memduh G{\"o}kirmak and Yoav Goldberg and Xavier G{\'o}mez Guinovart and Berta Gonz{\'a}lez Saavedra and Matias Grioni and Normunds Gruzitis and Bruno Guillaume and Nizar Habash and Jan Hajic and Linh Ha My and Kim Harris and Dag Trygve Truslew Haug and Barbora Hladk{\'a} and Jaroslava Hlav{\'a}cov{\'a} and Florinel Hociung and Petter Hohle and Radu Ion and Elena Irimia and Tom{\'a}s Jel{\'i}nek and Anders Johannsen and Fredrik J{\o}rgensen and H{\"u}ner Kaşıkara and Hiroshi Kanayama and Jenna Kanerva and Tolga Kayadelen and V{\'a}clava Kettnerov{\'a} and Jesse Kirchner and Natalia Kotsyba and Simon Krek and Veronika Laippala and Lorenzo Lambertino and Tatiana Lando and John Lee and Phương L{\^e} Hồng and Alessandro Lenci and Saran Lertpradit and Herman Leung and Cheuk Ying Li and Josie Li and Keying Li and Nikola Ljubesic and Olga Aleksandrovna Loginova and O. Lyashevskaya and Teresa Lynn and Vivien Macketanz and Aibek Makazhanov and Michael Mandl and Christopher D. Manning and Catalina Maranduc and David Mare{\v{c}}ek and Katrin Marheinecke and H{\'e}ctor Mart{\'i}nez Alonso and Andr{\'e} Martins and Jan Masek and Yuji Matsumoto and Ryan T. McDonald and Gustavo Mendonça and Niko Miekka and Anna Missil{\"a} and Cătălina Mititelu and Yusuke Miyao and Simonetta Montemagni and Amir More and Laura Moreno Romero and Shinsuke Mori and Bohdan Moskalevskyi and K. Muischnek and Kaili M{\"u}{\"u}risep and Pinkey Nainwani and Anna Nedoluzhko and Gunta Nespore-Berzkalne and Lương Nguyễn Thị and Huyen Nguyen Thi Minh and Vitaly Nikolaev and Hanna M. Nurmi and Stina Ojala and Petya N. Osenova and Robert {\"O}stling and Lilja {\O}vrelid and Elena Oliete Pascual and Marco Passarotti and Cenel-Augusto Perez and Guy Perrier and Slav Petrov and Jussi Piitulainen and Emily Pitler and Barbara Plank and Martin Popel and Lauma Pretkalnina and Prokopis Prokopidis and Tiina Puolakainen and Sampo Pyysalo and Alexandre Rademaker and Loganathan Ramasamy and Taraka Rama and Vinit Ravishankar and Livy Real and Siva Reddy and Georg Rehm and Larissa Rinaldi and Laura Rituma and M D Romanenko and Rudolf Rosa and Davide Rovati and Beno{\^i}t Sagot and Shadi Saleh and Tanja Samardzic and Manuela Sanguinetti and Baiba Saulite and Sebastian Schuster and Djam{\'e} Seddah and Wolfgang Seeker and Mojgan Seraji and Mo Shen and Atsuko Shimada and Dmitry V. Sichinava and Natalia Silveira and Maria Simi and Radu Simionescu and Katalin Ilona Simk{\'o} and Marie {\v{S}}imkov{\'a} and Kiril Ivanov Simov and Aaron Smith and Antonio Stella and Milan Straka and Jana Strnadov{\'a} and Alane Suhr and Umut Sulubacak and Zsolt Sz{\'a}nt{\'o} and Dima Taji and Takaaki Tanaka and Trond Trosterud and A. A. Trukhina and Reut Tsarfaty and Francis M. Tyers and Sumire Uematsu and Zdenka Uresov{\'a} and Larraitz Uria and Hans Uszkoreit and Sowmya Vajjala and Daniel R. van Niekerk and Gertjan van Noord and Viktor Varga and Eric Villemonte de la Clergerie and Veronika Vincze and Lars Wallin and Jonathan North Washington and Mats Wir{\'e}n and Tak-sum Wong and Zhuoran Yu and Z. Žabokrtsk{\'y} and Amir Zeldes and Daniel Zeman and Hanzhi Zhu},
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic… 
Multilingual Word Segmentation: Training Many Language-Specific Tokenizers Smoothly Thanks to the Universal Dependencies Corpus
It is argued in this paper that: (1) tokenization should be considered as a supervised task; (2) language scalability requires a streamlined software engineering process across languages.
On Learning Universal Representations Across Languages
Hierarchical Contrastive Learning (HiCTL) is proposed to learn universal representations for parallel sentences distributed in one or multiple languages and distinguish the semantically-related words from a shared cross-lingual vocabulary for each sentence.
Reflexives in Universal Dependencies
The goal of the paper is to contribute to more consistent annotation of reflexives in future releases of UD, which, in turn, will enable broader cross-linguistic studies of this phenomenon.
All Roads Lead to UD: Converting Stanford and Penn Parses to English Universal Dependencies with Multilayer Annotations
It is shown that constituent-based conversion using CoreNLP (with automatic NER) performs substantially worse in all genres, including when using gold constituent trees, primarily due to underspecification of phrasal grammatical functions.
Quantitative Linguistic Investigations across Universal Dependencies Treebanks
Preliminary results show interesting differences rooted either in language-specific peculiarities or crosslingual annotation inconsistencies, with a potential impact on different application scenarios.
Delexicalised Multilingual Discourse Segmentation for DISRPT 2021 and Tense, Mood, Voice and Modality Tagging for 11 Languages
This paper describes our participating system for the Shared Task on Discourse Segmentation and Connective Identification across Formalisms and Languages. Key features of the presented approach are
Unsupervised Cross-Lingual Adaptation of Dependency Parsers Using CRF Autoencoders
This paper proposes to utilize unsupervised discriminative parsers based on the CRF autoencoder framework for the task of cross-lingual adaptation of dependency parsers without annotated target corpora and parallel corpora.
An Error Analysis Framework for Shallow Surface Realization
A framework for error analysis that identifies which features of the input affect the models' results is proposed, and it is shown that dependency edge accuracy correlates with automatic metrics, thereby providing a more interpretable basis for evaluation.
Syntax-augmented Multilingual BERT for Cross-lingual Transfer
This work shows that explicitly providing language syntax and training mBERT using an auxiliary objective to encode the universal dependency tree structure helps cross-lingual transfer.
Universal Dependencies for Amharic
This paper describes the process of creating an Amharic Dependency Treebank, which is the first attempt to introduce Universal Dependencies (UD) into Amharic, and describes the annotation processes for POS tagging, morphological information and dependency relations.