MapDupReducer: detecting near duplicates over massive datasets

@inproceedings{Wang2010MapDupReducerDN,
  title={MapDupReducer: detecting near duplicates over massive datasets},
  author={Changping Wang and Jianmin Wang and Xuemin Lin and Wei Wang and Haixun Wang and Hongsong Li and Wanpeng Tian and Jun Xu and Rui Li},
  booktitle={SIGMOD Conference},
  year={2010}
}
Near duplicate detection benefits many applications, e.g., on-line news selection over the Web by keyword search. The purpose of this demo is to show the design and implementation of MapDupReducer, a MapReduce-based system capable of efficiently detecting near duplicates over massive datasets.
This paper has 64 citations.
