A Short Survey of Document Structure Similarity Algorithms

  • David Buttler
  • Published 2004 in International Conference on Internet Computing


This paper provides a brief survey of document structural similarity algorithms, including the optimal Tree Edit Distance algorithm and various approximation algorithms. The approximation algorithms include the simple weighted tag similarity algorithm, Fourier transforms of the structure, and a new application of the shingle technique to structural…
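To make the shingle idea concrete, here is a minimal Python sketch of one plausible way to apply shingling to document structure. It is not taken from the paper: the abstract is truncated and does not give the details, so the helper names (`tag_sequence`, `shingles`, `shingle_similarity`), the shingle width of 3, the use of start tags only, and the Jaccard overlap as the score are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class TagSequence(HTMLParser):
    """Collects start-tag names in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Keep only the tag name; text content and attributes are ignored,
        # so the sequence reflects structure rather than content.
        self.tags.append(tag)


def tag_sequence(html):
    """Return the list of start-tag names found in an HTML string."""
    parser = TagSequence()
    parser.feed(html)
    parser.close()
    return parser.tags


def shingles(tags, width=3):
    """Return the set of contiguous tag windows of the given width."""
    if len(tags) < width:
        # Assumption: a short document is treated as a single shingle.
        return {tuple(tags)} if tags else set()
    return {tuple(tags[i:i + width]) for i in range(len(tags) - width + 1)}


def shingle_similarity(html_a, html_b, width=3):
    """Jaccard overlap of the two documents' structural shingle sets."""
    a = shingles(tag_sequence(html_a), width)
    b = shingles(tag_sequence(html_b), width)
    if not a and not b:
        return 1.0  # two tag-free documents are structurally identical
    return len(a & b) / len(a | b)
```

Under these assumptions, two pages with the same tag layout but different text score 1.0, and pages that share no three-tag window score 0.0. Comparing sets of short windows is much cheaper than computing a full tree edit distance, which is the usual motivation for approximations of this kind.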

7 Figures and Tables


