The Wikipedia XML Corpus

@inproceedings{Denoyer2006TheWX,
  title={The Wikipedia XML Corpus},
  author={Ludovic Denoyer and Patrick Gallinari},
  booktitle={INEX},
  year={2006}
}
Wikipedia is a well-known, free-content, multilingual encyclopedia written collaboratively by contributors around the world. Anybody can edit an article using a wiki markup language that offers a simplified alternative to HTML. The encyclopedia comprises millions of articles in different languages.

Semantic Scholar estimates that this publication has 478 citations based on the available data.