Tracking Knowledge Propagation Across Wikipedia Languages

Rodolfo Valentim, Giovanni V. Comarela, Souneil Park, Diego Sáez-Trumper
In this paper, we present a dataset of inter-language knowledge propagation in Wikipedia. Covering all 309 language editions and 33M articles, the dataset aims to track the full propagation history of Wikipedia concepts and to enable follow-up research on predictive models of that propagation. To this end, we align all Wikipedia articles in a language-agnostic manner according to the concept they cover, which yields 13M propagation instances. To the best of our knowledge, this…

