TOPSIG: topology preserving document signatures

@inproceedings{Geva2011TOPSIGTP,
  title={TOPSIG: topology preserving document signatures},
  author={S. Geva and Christopher M. De Vries},
  booktitle={CIKM '11},
  year={2011}
}
  • S. Geva, Christopher M. De Vries
  • Published in CIKM '11, 2011
  • Computer Science
  • Comparisons between file signatures and inverted files for text retrieval have shown the shortcomings of traditional file signatures. It has been widely accepted that traditional file signatures are inferior alternatives to inverted files. This paper describes TopSig, a new approach to the construction of file signatures that extends recent advances in semantic hashing and dimensionality reduction. These advances had not previously been linked to general-purpose, signature-file-based search engines. We…
    Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS)
    • 247
    BitFunnel: Revisiting Signatures for Search
    • 31
    Asymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment
    • 51
    Document Clustering Evaluation: Divergence from a Random Baseline
    • 24
    Pairwise similarity of TopSig document signatures
    • 11
    Efficient top-k retrieval with signatures
    • 8
    Clustering and Labeling a Web Scale Document Collection using Wikipedia clusters
    • 14
    A Signature Approach to Patent Classification
    • 8
    Random Manhattan Indexing
    • 9
    • Highly Influenced
    Parallel Streaming Signature EM-tree: A Clustering Algorithm for Web Scale Applications
    • 9

    References

    Publications referenced by this paper.
    Indexing by Latent Semantic Analysis
    • 8,312
    • Highly Influential
    Inverted files versus signature files for text indexing
    • 370
    • Highly Influential
    Signature files: an access method for documents and its analytical performance evaluation
    • 391
    • Highly Influential
    Self-taught hashing for fast similarity search
    • 338
    • Highly Influential
    Rank-biased precision for measurement of retrieval effectiveness
    • 452
    • Highly Influential
    User performance versus precision measures for simple search tasks
    • 344
    • Highly Influential
    Calculating the singular values and pseudo-inverse of a matrix
    • 1,192
    • Highly Influential
    Extensions of Lipschitz maps into a Hilbert space
    • 1,092
    • Highly Influential