Aligning Needles in a Haystack: Paraphrase Acquisition Across the Web

Marius Pasca and Péter Dienes
This paper presents a lightweight method for unsupervised extraction of paraphrases from arbitrary textual Web documents. The method differs from previous approaches to paraphrase acquisition in that (1) it removes the assumptions on the quality of the input data, by using inherently noisy, unreliable Web documents rather than clean, trustworthy, properly formatted documents; and (2) it does not require any explicit clue indicating which documents are likely to encode parallel paraphrases, as…
This paper has 76 citations.


Publications citing this paper.

Automatic Extraction of Synonymous Collocation Pairs from a Text Corpus

2018 Federated Conference on Computer Science and Information Systems (FedCSIS) • 2018

Building the Semantic Similarity Model for Social Network Data Streams

2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP) • 2018

Expanding Paraphrase Lexicons by Exploiting Generalities

ACM Trans. Asian & Low-Resource Lang. Inf. Process. • 2018

