Joining Extractions of Regular Expressions

@inproceedings{Freydenberger2018JoiningEO,
  title={Joining Extractions of Regular Expressions},
  author={Dominik D. Freydenberger and B. Kimelfeld and L. Peterfreund},
  booktitle={Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems},
  year={2018}
}
  • Dominik D. Freydenberger, B. Kimelfeld, L. Peterfreund
  • Published 2018
  • Computer Science
  • Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
  • Regular expressions with capture variables, also known as "regex formulas," extract relations of spans (interval positions) from text. These relations can be further manipulated via the relational algebra, as studied in the context of "document spanners," Fagin et al.'s formal framework for information extraction. We investigate the complexity of querying text by Conjunctive Queries (CQs) and Unions of CQs (UCQs) on top of regex formulas. Such queries have been investigated in prior work on…
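
As a rough illustration of the idea in the abstract, the sketch below uses Python's `re` module with named groups as a stand-in for regex formulas, and joins two extracted span relations on their shared variable. This is not the paper's construction: the helper names `extract` and `natural_join`, the sample text, and the patterns are all assumptions made for illustration. Note also that `re.finditer` yields only leftmost, non-overlapping matches, whereas a document spanner considers every valid assignment of spans.

```python
import re

def extract(pattern, text):
    """Return one row per match, mapping each capture variable to its span."""
    rows = []
    for m in re.finditer(pattern, text):
        # m.span(var) is the (start, end) interval of the named group.
        rows.append({var: m.span(var) for var in m.groupdict()})
    return rows

def natural_join(left, right):
    """Join two span relations on the variables they have in common."""
    joined = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[v] == r[v] for v in shared):
                joined.append({**l, **r})
    return joined

# Hypothetical sample document and patterns.
text = "Ann ann@x.org, Bob bob@y.org"

# Relation over variables (name, mail).
name_mail = extract(r"(?P<name>[A-Z][a-z]+) (?P<mail>[a-z]+@[a-z.]+)", text)
# Relation over variables (mail, domain).
mail_domain = extract(r"(?P<mail>[a-z]+@(?P<domain>[a-z.]+))", text)

# A conjunctive query with two atoms that share the variable `mail`.
joined = natural_join(name_mail, mail_domain)

for row in joined:
    # Map spans back to the substrings they denote.
    print({var: text[s:e] for var, (s, e) in row.items()})
```

Each row is a tuple of spans rather than strings, so the join compares positions in the text and not just equal substrings.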
    Complexity Bounds for Relational Algebra over Document Spanners
    A Logic for Document Spanners
    Constant Delay Algorithms for Regular Document Spanners
    Recursive Programs for Document Spanners
    Constant-Delay Enumeration for Nondeterministic Document Spanners
    Split-Correctness in Information Extraction
    Efficient Enumeration Algorithms for Regular Document Spanners
