Sato: Contextual Semantic Type Detection in Tables
@article{Zhang2020SatoCS, title={Sato: Contextual Semantic Type Detection in Tables}, author={Dan Zhang and Yoshihiko Suhara and Jinfeng Li and M. Hulsebos and Çağatay Demiralp and W. Tan}, journal={Proc. VLDB Endow.}, year={2020}, volume={13}, pages={1835-1848} }
Detecting the semantic types of data columns in relational tables is important for various data preparation and information retrieval tasks such as data cleaning, schema matching, data discovery, and semantic search. However, existing detection approaches either perform poorly with dirty data, support only a limited number of semantic types, fail to incorporate the table context of columns, or rely on large sample sizes for training data. We introduce Sato, a hybrid machine learning model to…
10 Citations
Relational Pretrained Transformers towards Democratizing Data Preparation [Vision]
- Computer Science
- ArXiv
- 2020
- 1
Efficient Joinable Table Discovery in Data Lakes: A High-Dimensional Similarity-Based Approach
- Computer Science
- ArXiv
- 2020
- 1
Loch Prospector: Metadata Visualization for Lakes of Open Data
- Computer Science
- 2020 IEEE Visualization Conference (VIS)
- 2020
SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle
- Computer Science
- CIDR
- 2020
- 4
NL4DV: A Toolkit for Generating Analytic Specifications for Data Visualization from Natural Language Queries
- Computer Science, Medicine
- IEEE Transactions on Visualization and Computer Graphics
- 2021
- 5
References
SHOWING 1-10 OF 54 REFERENCES
Sherlock: A Deep Learning Approach to Semantic Data Type Detection
- Computer Science, Mathematics
- KDD
- 2019
- 27
Meimei: An Efficient Probabilistic Approach for Semantically Annotating Tables
- Computer Science
- AAAI
- 2019
- 10
Annotating and searching web tables using entities, types and relationships
- Computer Science
- Proc. VLDB Endow.
- 2010
- 344
- Highly Influential
Seeping Semantics: Linking Datasets Using Word Embeddings for Data Discovery
- Computer Science
- 2018 IEEE 34th International Conference on Data Engineering (ICDE)
- 2018
- 45
Synthesizing Type-Detection Logic for Rich Semantic Data Types using Open-source Code
- Computer Science
- SIGMOD Conference
- 2018
- 15
Semantic Labeling: A Domain-Independent Approach
- Computer Science
- International Semantic Web Conference
- 2016
- 56
Recovering Semantics of Tables on the Web
- Computer Science
- Proc. VLDB Endow.
- 2011
- 311
- Highly Influential
Learning to Match the Schemas of Data Sources: A Multistrategy Approach
- Computer Science
- Machine Learning
- 2004
- 290