Corpus ID: 14986824

Scalable Document Fingerprinting (Extended Abstract)

@inproceedings{Nevin1996ScalableDF,
  title={Scalable Document Fingerprinting (Extended Abstract)},
  author={Nevin Heintze},
  year={1996}
}
  • Nevin Heintze
  • Published 1996
  • As more information becomes available electronically, document search based on textual similarity is becoming increasingly important, not only for locating documents online, but also for addressing internet variants of old problems such as plagiarism and copyright violation. This paper presents an online system that provides reliable search results using modest resources and scales up to data sets of the order of a million documents. Our system provides a practical compromise between storage…
    21 Citations
    Methods for Identifying Versioned and Plagiarized Documents
    • 370
    Winnowing, a Document Fingerprinting Algorithm
    • 6
    ELIMINATION OF REDUNDANT EMAILS
    • NGUYEN THE HUY
    • 2006
    The Similarity Index
    • 5
    A Sentence-Based Copy Detection Approach for Web Documents
    • 62
    A Dual-Method Model for Copy Detection
    • Yunyi Liu, L. Liang
    • Computer Science
    • 2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops
    • 2006
    • 8
    An efficient method to detect duplicates of web documents with the use of inverted index
    • 25
    Detecting similar HTML documents using a fuzzy set information retrieval approach
    • Rajiv Yerra, Y. Ng
    • Computer Science
    • 2005 IEEE International Conference on Granular Computing
    • 2005
    • 24
