Hashing and Merging Heuristics for Text Reuse Detection

@inproceedings{Alvi2014HashingAM,
  title={Hashing and Merging Heuristics for Text Reuse Detection},
  author={Faisal Alvi and Mark Stevenson and Paul D. Clough},
  booktitle={CLEF},
  year={2014}
}
[Figure: flowchart of the extension and merging stages. Recoverable labels: pairs file; find all char n-grams in each suspicious file from the multihashmap to generate exact matches to source files; list(s) of position pairs; sort according to position pairs; sorted pairs file as a list; apply rule-based merging based on pre-defined classes for each position pair list.]
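The stages named in the figure labels (character n-grams, a multihashmap, position pairs, sorting, merging) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_index`, `position_pairs`, and `merge_pairs`, the n-gram length `n`, and the single `max_gap` rule are all assumptions made for illustration. In particular, the paper's rule-based merging uses pre-defined classes that are not described here, so a simple gap threshold stands in for it.

```python
from collections import defaultdict


def build_index(source, n):
    """Map each character n-gram of the source to all of its start offsets.

    A dict of lists plays the role of the multihashmap (assumed structure).
    """
    index = defaultdict(list)
    for i in range(len(source) - n + 1):
        index[source[i:i + n]].append(i)
    return index


def position_pairs(suspicious, index, n):
    """Return sorted (suspicious_offset, source_offset) pairs of exact n-gram matches."""
    pairs = []
    for j in range(len(suspicious) - n + 1):
        for i in index.get(suspicious[j:j + n], ()):
            pairs.append((j, i))
    pairs.sort()
    return pairs


def merge_pairs(pairs, n, max_gap=0):
    """Greedily merge sorted pairs into passages.

    Each passage is (susp_start, susp_end, src_start, src_end) with exclusive
    ends. A pair joins the previous passage when it starts no more than
    max_gap characters past that passage's end in both texts. This gap rule
    is a hypothetical stand-in for the paper's class-based merging rules.
    """
    passages = []
    for j, i in pairs:
        if passages:
            s0, s1, r0, r1 = passages[-1]
            if j <= s1 + max_gap and r0 <= i <= r1 + max_gap:
                passages[-1] = (s0, max(s1, j + n), r0, max(r1, i + n))
                continue
        passages.append((j, j + n, i, i + n))
    return passages
```

A typical call would build the index once per source file, then run `position_pairs` and `merge_pairs` for each suspicious file. Larger values of `max_gap` tolerate small edits between matched n-grams at the cost of merging unrelated matches.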

