Bridging the Semantic Gap with SQL Query Logs in Natural Language Interfaces to Databases

@article{Baik2019BridgingTS,
  title={Bridging the Semantic Gap with SQL Query Logs in Natural Language Interfaces to Databases},
  author={Christopher Baik and H. V. Jagadish and Yunyao Li},
  journal={2019 IEEE 35th International Conference on Data Engineering (ICDE)},
  year={2019},
  pages={374--385}
}
A critical challenge in constructing a natural language interface to database (NLIDB) is bridging the semantic gap between a natural language query (NLQ) and the underlying data. Two specific ways this challenge exhibits itself are keyword mapping and join path inference. Keyword mapping is the task of mapping individual keywords in the original NLQ to database elements (such as relations, attributes or values). It is challenging due to the ambiguity in mapping the user's mental model…
MyNLIDB: A Natural Language Interface to Database
This paper presents a system MyNLIDB that has a good performance with respect to keyword mapping, which uses a Schema-Graph created from the underlying database, Stanford part-of-speech parser and dependency parser to convert NL Query to SQL using pipeline processing.
Towards a Natural Language Query Processing System
The novelty in the study lies in defining a graph database as a middle layer to store necessary metadata needed to transform a natural language query into structured query language that can be executed on backend databases.
ATHENA++: Natural Language Querying for Complex Nested SQL Queries
This paper presents ATHENA++, an end-to-end system that can answer complex queries in natural language by translating them into nested SQL queries, and combines linguistic patterns from NL queries with deep domain reasoning using ontologies to enable nested query detection and generation.
Disambiguating Natural Language Queries with Tuples
Enabling natural language access to relational databases is challenging and often requires the disambiguation of a user’s natural language query by selecting the target interpretation from among many…
Duoquest: A Dual-Specification System for Expressive SQL Queries
This work introduces the novel dual-specification Duoquest system, which leverages guided partial query enumeration to efficiently explore the space of possible queries and presents results from user studies in which Duoquest demonstrates a 62.5% absolute increase in query construction accuracy over a state-of-the-art NLI and comparable accuracy to a PBE system on a limited workload supported by the PBE system.
Structuring Natural Language to Query Language: A Review
This work has analyzed the existing models in Natural Language Processing, which convert a native-language query into an SQL query, and found that any novice user can use the SQL program and eliminate the need to generate any complex queries.
Natural language query handling using extended knowledge provider system
An automated query-response model termed Extended Automated Knowledge Provider System (EAKPS) is proposed that can manage various types of natural language queries from users, handling assertive, interrogative, imperative, compound and complex query sentences.
State of the Art and Open Challenges in Natural Language Interfaces to Data
This tutorial will review natural language interface solutions in terms of their interpretation approach, as well as the complexity of the queries they can generate, and discuss open research challenges.
SpeakQL: Towards Speech-driven Multimodal Querying of Structured Data
This work designs a speech-driven querying system and interface for structured data called SpeakQL that supports a practically useful subset of regular SQL and allows users to query in any domain with novel touch/speech based human-in-the-loop correction mechanisms.
Summary of Natural Language Generated SQL Statements
The entry cost of database query SQL statement is high, which is difficult for most database users. Therefore, natural language automatic generation of SQL sentences has gradually become the leading…

References

ATHENA: An Ontology-Driven System for Natural Language Querying over Relational Data Stores
ATHENA is presented, an ontology-driven system for natural language querying of complex relational databases that uses domain specific ontologies, which describe the semantic entities, and their relationships in a domain, through a unique two-stage approach.
NaLIX: an interactive natural language interface for querying XML
It is shown that NaLIX, while far from being able to pass the Turing test, is perfectly usable in practice, and able to handle even quite complex queries in a variety of application domains.
Constructing an Interactive Natural Language Interface for Relational Databases
The architecture of an interactive natural language query interface for relational databases is described, able to correctly interpret complex natural language queries, in a generic manner across a range of domains, and is good enough to be usable in practice.
Towards a theory of natural language interfaces to databases
This paper proves that, for a broad class of semantically tractable natural language questions, Precise is guaranteed to map each question to the corresponding SQL query, and shows that Precise compares favorably with Mooney's learning NLI and with Microsoft's English Query product.
SQLizer: query synthesis from natural language
This paper presents a new technique for automatically synthesizing SQL queries from natural language (NL). At the core of our technique is a new NL-based program synthesis methodology that combines…
Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
This work proposes Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries, and releases WikiSQL, a dataset of 80654 hand-annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia that is an order of magnitude larger than comparable datasets.
An End-to-end Neural Natural Language Interface for Databases
DBPal uses a deep model to translate natural language statements to SQL, making the translation process more robust to paraphrasing and other linguistic variations and provides a learned auto-completion model that suggests partial query extensions to users during query formulation and thus helps to write complex queries.
Keyword searching and browsing in databases using BANKS
BANKS is described, a system which enables keyword-based search on relational databases, together with data and schema browsing, and presents an efficient heuristic algorithm for finding and ranking query results.
SQAK: doing more with keywords
SQAK provides a novel and exciting way to trade-off some of the expressive power of SQL in exchange for the ability to express a large class of aggregate queries using simple keywords.
Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
This work defines a new complex and cross-domain semantic parsing and text-to-SQL task so that different complicated SQL queries and databases appear in train and test sets and experiments with various state-of-the-art models show that Spider presents a strong challenge for future research.