Sato: Contextual Semantic Type Detection in Tables

@article{Zhang2020SatoCS,
  title={Sato: Contextual Semantic Type Detection in Tables},
  author={Dan Zhang and Yoshihiko Suhara and Jinfeng Li and M. Hulsebos and {\c{C}}a{\u{g}}atay Demiralp and W. Tan},
  journal={Proc. VLDB Endow.},
  year={2020},
  volume={13},
  pages={1835--1848}
}
Detecting the semantic types of data columns in relational tables is important for various data preparation and information retrieval tasks such as data cleaning, schema matching, data discovery, and semantic search. However, existing detection approaches either perform poorly with dirty data, support only a limited number of semantic types, fail to incorporate the table context of columns or rely on large sample sizes for training data. We introduce Sato, a hybrid machine learning model to…
10 Citations
Semantic Annotation for Tabular Data
  • Highly Influenced
Auto-Transform: Learning-to-Transform by Patterns
Loch Prospector: Metadata Visualization for Lakes of Open Data
NL4DV: A Toolkit for Generating Analytic Specifications for Data Visualization from Natural Language Queries
  • 5
Auto-Validate: Unsupervised Data Validation Using Data-Domain Patterns Inferred from Data Lakes
  • Jie Song, Yeye He
  • Computer Science
  • 2021
DomainNet: Homograph Detection for Data Lake Disambiguation
