URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors

Abstract

We introduce the URIEL knowledge base for massively multilingual NLP and the lang2vec utility, which provides information-rich vector identifications of languages drawn from typological, geographical, and phylogenetic databases that are normalized to have straightforward and consistent formats, naming, and semantics. The goal of URIEL and lang2vec is to…
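
As a rough illustration of what a vector identification of a language could look like, the sketch below concatenates typological, geographical, and phylogenetic feature blocks into one vector per language and compares two languages over the features known for both. It is a minimal sketch under stated assumptions, not the lang2vec implementation: the language keys, feature values, block sizes, and the helper names `language_vector` and `shared_distance` are all hypothetical and are not taken from URIEL.

```python
from typing import Dict, List, Optional

Vector = List[Optional[float]]

# Hypothetical toy data, NOT taken from URIEL. Each placeholder language key
# maps to named feature blocks; None marks a value unknown for that language.
TOY_FEATURES: Dict[str, Dict[str, Vector]] = {
    "lang_a": {
        "typology": [1.0, 0.0, None],
        "geography": [0.2, 0.8],
        "phylogeny": [1.0, 0.0],
    },
    "lang_b": {
        "typology": [1.0, 1.0, 0.0],
        "geography": [0.3, 0.7],
        "phylogeny": [1.0, 0.0],
    },
}

# Fixed block order so every language vector has the same layout.
BLOCK_ORDER = ["typology", "geography", "phylogeny"]


def language_vector(code: str, blocks: List[str] = BLOCK_ORDER) -> Vector:
    """Concatenate the requested feature blocks for one language."""
    vector: Vector = []
    for block in blocks:
        vector.extend(TOY_FEATURES[code][block])
    return vector


def shared_distance(a: Vector, b: Vector) -> float:
    """Mean absolute difference over positions known in both vectors."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("the two vectors share no known features")
    return sum(abs(x - y) for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    vec_a = language_vector("lang_a")
    vec_b = language_vector("lang_b")
    print(vec_a)
    print(shared_distance(vec_a, vec_b))
```

Keeping the block order fixed means every language gets a vector with the same layout, and skipping positions where either value is unknown is one simple way to compare languages whose database coverage differs.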

Cite this paper

@inproceedings{Levin2017URIELAL,
  title={URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors},
  author={Lori S. Levin and Patrick Littell and David R. Mortensen and Ke Lin and Katherine Kairis and Carlisle Turner},
  booktitle={EACL},
  year={2017}
}