Normalized (pointwise) mutual information in collocation extraction
@inproceedings{Bouma2009NormalizedM, title={Normalized (pointwise) mutual information in collocation extraction}, author={G. Bouma}, year={2009} }
In this paper, we discuss the related information theoretical association measures of mutual information and pointwise mutual information, in the context of collocation extraction. [...] We introduce normalized variants of these measures in order to make them more easily interpretable and at the same time less sensitive to occurrence frequency. We also provide a small empirical study to give more insight into the behaviour of these new measures in a collocation extraction setup.
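As a rough illustration of the normalized pointwise variant mentioned in the abstract, the sketch below computes PMI and divides it by -log p(x, y), which is the commonly cited form of normalized PMI. The function names (`pmi`, `npmi`, `npmi_from_counts`) and the count-based interface are assumptions made for this example and are not taken from the paper.

```python
import math


def pmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Pointwise mutual information: log of p(x, y) / (p(x) * p(y))."""
    return math.log(p_xy / (p_x * p_y))


def npmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Normalized PMI: PMI divided by -log p(x, y).

    Values lie in [-1, 1]: 1 when x and y only occur together,
    0 under independence, and -1 in the limit of never co-occurring.
    """
    if p_xy == 0.0:
        # Limit value when the pair is never observed (assumed convention).
        return -1.0
    if p_xy >= 1.0:
        # -log(1) is 0, so the ratio is undefined here.
        raise ValueError("npmi is undefined when p(x, y) is 1")
    return pmi(p_xy, p_x, p_y) / -math.log(p_xy)


def npmi_from_counts(n_xy: int, n_x: int, n_y: int, n_total: int) -> float:
    """Hypothetical helper: estimate probabilities by relative frequency."""
    return npmi(n_xy / n_total, n_x / n_total, n_y / n_total)


if __name__ == "__main__":
    # Toy counts invented for illustration only.
    print(npmi_from_counts(n_xy=30, n_x=40, n_y=50, n_total=10_000))
```

Dividing by -log p(x, y) bounds the score, which is what makes it easier to interpret than raw PMI, whose upper bound depends on how frequent the pair is.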
595 Citations
Clustering-based Approach to Multiword Expression Extraction and Ranking
- Computer Science
- MWE@NAACL-HLT
- 2015
- 1
Improving Pointwise Mutual Information (PMI) by Incorporating Significant Co-occurrence
- Computer Science
- CoNLL
- 2013
- 14
Handling the Impact of Low Frequency Events on Co-occurrence based Measures of Word Similarity - A Case Study of Pointwise Mutual Information
- Computer Science
- KDIR
- 2011
- 22
Evaluating Topic Coherence Using Distributional Semantics
- Computer Science
- IWCS
- 2013
- 161
- Highly Influenced
Comparing Similarity Measures for Distributional Thesauri
- Computer Science
- LREC
- 2014
- 10
- Highly Influenced
All That Glitters is Not Gold: A Gold Standard of Adjective-Noun Collocations for German
- Computer Science
- LREC
- 2020
References
SHOWING 1-10 OF 21 REFERENCES
An Evaluation of Methods for the Extraction of Multiword Expressions
- Computer Science
- LREC 2008
- 2008
- 69
Word Association Norms, Mutual Information and Lexicography
- Computer Science, Psychology
- ACL
- 1989
- 4,070
The Statistics of Word Cooccurrences: Word Pairs and Collocations
- Computer Science
- 2004
- 535
- Highly Influential
Europarl: A Parallel Corpus for Statistical Machine Translation
- Computer Science
- 2005
- 3,116
- Highly Influential
Accurate Methods for the Statistics of Surprise and Coincidence
- Computer Science
- Comput. Linguistics
- 1993
- 2,705