Open Information Extraction from the Web

  • M. Banko, Michael J. Cafarella, S. Soderland, M. Broadhead, Oren Etzioni
  • Published 2008
  • Computer Science
  • Commun. ACM
  • Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora (e.g., extract the location and time of seminars from a set of announcements). [...] Key Method: The paper also introduces TEXTRUNNER, a fully implemented, highly scalable Open IE (OIE) system in which tuples are assigned a probability and indexed to support efficient extraction and exploration via user queries. We report on experiments over a corpus of 9,000,000 Web pages that compare…
    2,111 Citations
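
    The abstract describes extracted tuples that are assigned a probability and indexed for user queries. As a rough illustration only, the sketch below stores (arg1, relation, arg2) tuples in a toy in-memory inverted index and ranks matches by probability. It is not TEXTRUNNER's implementation: the names `Extraction` and `TupleIndex`, the whitespace tokenization, and the sample tuples and probability values are all assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    """One (arg1, relation, arg2) tuple with an assigned probability."""
    arg1: str
    relation: str
    arg2: str
    probability: float


class TupleIndex:
    """Toy inverted index from lowercase tokens to extractions (hypothetical)."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, extraction):
        # Index every whitespace-separated token of the tuple's text.
        text = f"{extraction.arg1} {extraction.relation} {extraction.arg2}"
        for token in text.lower().split():
            self._postings[token].add(extraction)

    def query(self, *terms, min_probability=0.0):
        """Return extractions containing every term, most probable first."""
        if not terms:
            return []
        candidate_sets = [self._postings.get(t.lower(), set()) for t in terms]
        matches = set.intersection(*candidate_sets)
        kept = [e for e in matches if e.probability >= min_probability]
        return sorted(kept, key=lambda e: e.probability, reverse=True)


# Illustrative tuples; the probabilities are invented for this example.
index = TupleIndex()
index.add(Extraction("Edison", "invented", "the phonograph", 0.93))
index.add(Extraction("Edison", "was born in", "Ohio", 0.88))
index.add(Extraction("Bell", "invented", "the telephone", 0.95))

results = index.query("edison", "invented", min_probability=0.5)
```

    Here `results` holds the tuples that mention both query terms and meet the probability threshold. A Web-scale system would need a disk-backed or distributed index rather than Python sets; the in-memory version is chosen only to keep the sketch short.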


    TextRunner: Open Information Extraction on the Web
    • 301
    Open Information Extraction Using Wikipedia
    • 609
    Open Language Learning for Information Extraction
    • 644
    RDR-based open IE for the web document
    • 10
    Unsupervised Relation Extraction with General Domain Knowledge
    • 19
    The Tradeoffs Between Open and Traditional Relation Extraction
    • 396
    Prioritization of Domain-Specific Web Information Extraction
    • 10
    • Highly Influenced
    Redundancy in web-scaled information extraction: probabilistic model and experimental results
    • 4
    Navigating Extracted Data with Schema Discovery
    • 38


    YAGO: A Large Ontology from Wikipedia and WordNet
    • 806
    • Highly Influential
    Automatically semantifying Wikipedia
    • Proceedings of the 16th Conference on Information and Knowledge Management (CIKM)
    • 2007