Table Union Search on Open Data

@article{Nargesian2018TableUS,
  title={Table Union Search on Open Data},
  author={F. Nargesian and Erkang Zhu and Ken Q. Pu and R. Miller},
  journal={Proc. VLDB Endow.},
  year={2018},
  volume={11},
  pages={813--825}
}
  • F. Nargesian, Erkang Zhu, Ken Q. Pu, R. Miller
  • Published 2018
  • Computer Science
  • Proc. VLDB Endow.
  • We define the table union search problem and present a probabilistic solution for finding tables that are unionable with a query table within massive repositories. [...] Key Method: We propose a data-driven approach that automatically determines the best model to use for each pair of attributes. Through a distribution-aware algorithm, we are able to find the optimal number of attributes in two tables that can be unioned. To evaluate accuracy, we created and open-sourced a benchmark of Open Data tables. We show…
    Data-driven domain discovery for structured datasets
    Pytheas: Pattern-based Table Discovery in CSV Files
    Web Table Extraction, Retrieval and Augmentation: A Survey
    7
    Optimizing Organizations for Navigating Data Lakes
    2
    Web Table Extraction, Retrieval, and Augmentation
    1
    Data Lake Organization
    1
    Google Dataset Search by the Numbers
    Finding Related Tables in Data Lakes for Interactive Data Science
