Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL


This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, called PMI-IR, uses Pointwise Mutual Information (PMI) and Information Retrieval (IR) to measure the similarity of pairs of words. PMI-IR is empirically evaluated using 80 synonym test…
DOI: 10.1007/3-540-44795-4_42
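
As a rough illustration of the idea in the abstract, the sketch below scores candidate synonyms by pointwise mutual information estimated from document hit counts. It is a minimal sketch under stated assumptions, not the paper's exact scoring function: the names `pmi`, `best_synonym`, and `toy_hits` are hypothetical, the hit counts are made-up toy numbers standing in for search engine results, and the query operators a real search engine would need are left out.

```python
import math
from typing import Callable, Sequence


def pmi(joint_hits: int, hits_a: int, hits_b: int, total_docs: int) -> float:
    """PMI in bits: log2(p(a, b) / (p(a) * p(b))), with each probability
    estimated as a document count divided by the collection size."""
    if min(joint_hits, hits_a, hits_b) <= 0:
        return float("-inf")  # no co-occurrence evidence at all
    p_ab = joint_hits / total_docs
    p_a = hits_a / total_docs
    p_b = hits_b / total_docs
    return math.log2(p_ab / (p_a * p_b))


def best_synonym(
    problem: str,
    choices: Sequence[str],
    hits: Callable[..., int],
    total_docs: int,
) -> str:
    """Return the choice whose co-occurrence with the problem word has the highest PMI."""
    return max(
        choices,
        key=lambda c: pmi(hits(problem, c), hits(problem), hits(c), total_docs),
    )


# Hypothetical toy counts (NOT real search results), keyed by sorted word tuples.
TOY_COUNTS = {
    ("big",): 1000,
    ("large",): 800,
    ("tiny",): 900,
    ("big", "large"): 200,
    ("big", "tiny"): 20,
}


def toy_hits(*words: str) -> int:
    """Stand-in for a search engine query: documents containing all the given words."""
    return TOY_COUNTS.get(tuple(sorted(words)), 0)


print(best_synonym("big", ["large", "tiny"], toy_hits, total_docs=100_000))
```

With these toy counts the sketch selects "large", because "big" and "large" co-occur far more often than their individual frequencies would predict. Swapping `toy_hits` for a function that queries a real index is the only change needed to try it on actual data.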



Semantic Scholar estimates that this publication has 1,429 citations based on the available data.
