Monadic Datalog and the Expressive Power of Languages for Web Information Extraction

@article{Gottlob2002MonadicDA,
  title={Monadic Datalog and the Expressive Power of Languages for Web Information Extraction},
  author={Georg Gottlob and Christoph Koch},
  journal={J. ACM},
  year={2002},
  volume={51},
  pages={74--113}
}
Research on information extraction from Web pages (wrapping) has seen much activity recently (particularly systems implementations), but little work has been done on formally studying the expressiveness of the formalisms proposed or on the theoretical foundations of wrapping. In this paper, we first study monadic datalog over trees as a wrapping language. We show that this simple language is equivalent to monadic second order logic (MSO) in its ability to specify wrappers. We believe that MSO…
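To make the idea of monadic datalog over trees concrete, here is a minimal sketch in Python. It is an illustration only, not the paper's formalism or implementation. It assumes a document tree encoded by unary `label_*` predicates and binary `firstchild` / `nextsibling` relations, and evaluates a small program by naive fixpoint iteration. All intensional predicates (`below`, `cell`) are unary, which is what makes the program monadic. The toy document, the predicate names, and the `evaluate` helper are all hypothetical.

```python
from itertools import product

# Hypothetical toy document: each node is (label, [children]).
doc = ("html", [
    ("table", [
        ("tr", [("td", []), ("td", [])]),
    ]),
    ("td", []),  # a stray td outside any table
])

def build_relations(tree):
    """Number nodes in preorder; return labels and the firstchild/nextsibling pairs."""
    labels, firstchild, nextsibling = {}, set(), set()

    def visit(node):
        node_id = len(labels)
        labels[node_id] = node[0]
        child_ids = [visit(child) for child in node[1]]
        if child_ids:
            firstchild.add((node_id, child_ids[0]))
        nextsibling.update(zip(child_ids, child_ids[1:]))
        return node_id

    visit(tree)
    return labels, firstchild, nextsibling

def evaluate(rules, facts, nodes):
    """Naive bottom-up evaluation: apply every rule until no new fact is derived."""
    facts = {name: set(rows) for name, rows in facts.items()}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            variables = sorted({v for atom in body for v in atom[1:]} | set(head[1:]))
            # Brute-force join: try every assignment of nodes to variables.
            for values in product(nodes, repeat=len(variables)):
                env = dict(zip(variables, values))
                if all(tuple(env[v] for v in atom[1:]) in facts.get(atom[0], set())
                       for atom in body):
                    row = tuple(env[v] for v in head[1:])
                    if row not in facts.setdefault(head[0], set()):
                        facts[head[0]].add(row)
                        changed = True
    return facts

labels, firstchild, nextsibling = build_relations(doc)
facts = {"firstchild": firstchild, "nextsibling": nextsibling}
for node_id, label in labels.items():
    facts.setdefault("label_" + label, set()).add((node_id,))

# Monadic program: every head predicate is unary.
rules = [
    (("below", "x"), [("label_table", "p"), ("firstchild", "p", "x")]),
    (("below", "x"), [("below", "y"), ("firstchild", "y", "x")]),
    (("below", "x"), [("below", "y"), ("nextsibling", "y", "x")]),
    (("cell", "x"), [("below", "x"), ("label_td", "x")]),
]

result = evaluate(rules, facts, list(labels))
selected = sorted(node_id for (node_id,) in result["cell"])
print(selected)  # ids of td nodes that lie inside a table
```

The wrapper here is the unary predicate `cell`: it selects the `td` nodes below a `table` and skips the stray `td`. The brute-force join keeps the sketch short; it is not meant to reflect any efficient evaluation strategy.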
This paper has highly influenced 20 other papers.
