Multiword Expressions and Named Entities in the Wiki50 Corpus

Veronika Vincze, István Nagy T., Gábor Berend
Multiword expressions (MWEs) and named entities (NEs) exhibit unique and idiosyncratic features, and thus often pose a problem for NLP systems. To facilitate their identification, we developed the first corpus of Wikipedia articles in which several types of multiword expressions and named entities are manually annotated at the same time. The corpus can be used for training or testing MWE detectors or NER systems, which we illustrate with experiments, and it also makes it possible to…

