Edit Distance for XML Information Retrieval: Some Experiments on the Datacentric Track of INEX 2011

Abstract

In this paper we present our structured information retrieval model based on subgraph similarity. Our approach combines a content propagation technique that handles sibling relationships with a document-query matching process on structure. The latter is based on tree edit distance (TED), the minimum-cost sequence of insert, delete, and replace operations needed to turn one tree into another. As the effectiveness of TED depends on both the input trees and the edit costs, we experimented with various subtree extraction techniques as well as different costs based on the DTD associated with the Datacentric collection.
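
To make the TED definition concrete, the following is a minimal sketch of an ordered tree edit distance, written as a memoized recursion over forests. It is not the authors' implementation: the unit costs, the `tree` helper, and the element names `movie`, `title`, and `actor` are assumptions chosen for illustration, and the paper itself derives its costs from the DTD.

```python
from functools import lru_cache


def tree(label, *children):
    """Build a hashable tree node: (label, tuple_of_child_nodes)."""
    return (label, tuple(children))


def tree_edit_distance(t1, t2, insert_cost=1, delete_cost=1, replace_cost=1):
    """Edit distance between two ordered, labeled trees.

    Unit costs are an assumption for this sketch; cost values could
    instead be derived from a schema such as a DTD.
    """

    @lru_cache(maxsize=None)
    def dist(f1, f2):
        # f1 and f2 are forests: tuples of trees, ordered left to right.
        if not f1 and not f2:
            return 0
        if not f2:
            # Delete the rightmost root of f1; its children take its place.
            _, kids = f1[-1]
            return dist(f1[:-1] + kids, f2) + delete_cost
        if not f1:
            # Insert the rightmost root of f2.
            _, kids = f2[-1]
            return dist(f1, f2[:-1] + kids) + insert_cost
        (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
        relabel = 0 if label1 == label2 else replace_cost
        return min(
            dist(f1[:-1] + kids1, f2) + delete_cost,
            dist(f1, f2[:-1] + kids2) + insert_cost,
            # Match the two rightmost roots, then solve both sub-problems.
            dist(kids1, kids2) + dist(f1[:-1], f2[:-1]) + relabel,
        )

    return dist((t1,), (t2,))


# Hypothetical document subtree and structural query.
document = tree("movie", tree("title"), tree("actor"))
query = tree("movie", tree("actor"))
print(tree_edit_distance(document, query))  # → 1 (delete "title")
```

The recursion is exponential without memoization and is meant only to mirror the definition; dedicated algorithms such as Zhang and Shasha's are the usual choice for larger trees.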

DOI: 10.1007/978-3-642-35734-3_11

Cite this paper

@inproceedings{Laitang2011EditDF,
  title={Edit Distance for XML Information Retrieval: Some Experiments on the Datacentric Track of INEX 2011},
  author={Cyril Laitang and Karen Pinel-Sauvagnat and Mohand Boughanem},
  booktitle={INEX},
  year={2011}
}