Bridging the Semantic Gap with SQL Query Logs in Natural Language Interfaces to Databases

@inproceedings{Baik2019BridgingTS,
  title={Bridging the Semantic Gap with SQL Query Logs in Natural Language Interfaces to Databases},
  author={Christopher Baik and H. V. Jagadish and Yunyao Li},
  booktitle={2019 IEEE 35th International Conference on Data Engineering (ICDE)},
  year={2019},
  pages={374--385}
}
A critical challenge in constructing a natural language interface to database (NLIDB) is bridging the semantic gap between a natural language query (NLQ) and the underlying data. This challenge shows up in two specific tasks: keyword mapping and join path inference. Keyword mapping is the task of mapping individual keywords in the original NLQ to database elements (such as relations, attributes or values). It is challenging due to the ambiguity in mapping the user's mental model…
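To make the keyword mapping task concrete, here is a minimal sketch that ranks database elements (relations, attributes, values) for a single keyword by plain string similarity. It is not the method of the paper, which draws on SQL query logs; the toy schema, the sample values, and the names `SCHEMA`, `VALUES`, `candidates`, and `map_keyword` are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy schema (assumption): relation name -> attribute names.
SCHEMA = {
    "author": ["name", "affiliation"],
    "publication": ["title", "year", "venue"],
}

# Hypothetical sample values (assumption): (relation, attribute) -> values.
VALUES = {
    ("publication", "venue"): ["ICDE", "VLDB"],
}


def candidates():
    """Yield (kind, element, label) for every mappable database element."""
    for relation, attributes in SCHEMA.items():
        yield ("relation", relation, relation)
        for attribute in attributes:
            yield ("attribute", f"{relation}.{attribute}", attribute)
    for (relation, attribute), values in VALUES.items():
        for value in values:
            yield ("value", f"{relation}.{attribute}={value}", value)


def map_keyword(keyword, top_k=3):
    """Return the top_k (score, kind, element) candidates for a keyword.

    The score is a case-insensitive character-level similarity ratio,
    a naive stand-in for whatever similarity model a real system uses.
    """
    scored = [
        (SequenceMatcher(None, keyword.lower(), label.lower()).ratio(), kind, element)
        for kind, element, label in candidates()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for keyword in ["authors", "icde"]:
        print(keyword, "->", map_keyword(keyword))
```

Even on this toy schema the ranking returns several plausible candidates per keyword, which hints at the ambiguity the abstract describes: surface similarity alone cannot tell which database element the user had in mind.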
MyNLIDB: A Natural Language Interface to Database
TLDR
This paper presents MyNLIDB, a system that performs well on keyword mapping and converts NL queries to SQL through pipeline processing, using a schema graph created from the underlying database together with the Stanford part-of-speech parser and dependency parser.
Towards a Natural Language Query Processing System
TLDR
The novelty in the study lies in defining a graph database as a middle layer to store necessary metadata needed to transform a natural language query into structured query language that can be executed on backend databases.
ATHENA++: Natural Language Querying for Complex Nested SQL Queries
TLDR
This paper presents ATHENA++, an end-to-end system that can answer complex queries in natural language by translating them into nested SQL queries, and combines linguistic patterns from NL queries with deep domain reasoning using ontologies to enable nested query detection and generation.
Disambiguating Natural Language Queries with Tuples
Enabling natural language access to relational databases is challenging and often requires the disambiguation of a user’s natural language query by selecting the target interpretation from among many
Duoquest: A Dual-Specification System for Expressive SQL Queries
TLDR
This work introduces the novel dual-specification Duoquest system, which leverages guided partial query enumeration to efficiently explore the space of possible queries, and presents results from user studies in which Duoquest demonstrates a 62.5% absolute increase in query construction accuracy over a state-of-the-art NLI and comparable accuracy to a PBE system on a limited workload supported by the PBE system.
Structuring Natural Language to Query Language: A Review
TLDR
This work has analyzed the existing models in Natural Language Processing, which convert a native-language query into an SQL query, and found that any novice user can use the SQL program and eliminate the need to generate any complex queries.
Natural language query handling using extended knowledge provider system
TLDR
An automated query-response model termed Extended Automated Knowledge Provider System (EAKPS) is proposed that can manage various types of natural language queries from users, including assertive, interrogative, imperative, compound and complex query sentences.
Weakly Supervised Mapping of Natural Language to SQL through Question Decomposition
TLDR
This work uses the recently proposed question decomposition representation called QDMR, an intermediate between NL and formal query languages, and uses NL-QDMR pairs, along with the question answers, as supervision for automatically synthesizing SQL queries.
State of the Art and Open Challenges in Natural Language Interfaces to Data
TLDR
This tutorial will review natural language interface solutions in terms of their interpretation approach, as well as the complexity of the queries they can generate, and discuss open research challenges.
PIPELINE AND DEEP LEARNING APPROACH FOR NLIDB: A COMPARATIVE STUDY
Databases are an integral part of today's technology-rich world. A large share of the world's data is stored in databases. That stored data can be utilized for

References

SHOWING 1-10 OF 42 REFERENCES
ATHENA: An Ontology-Driven System for Natural Language Querying over Relational Data Stores
TLDR
ATHENA is presented, an ontology-driven system for natural language querying of complex relational databases that uses domain specific ontologies, which describe the semantic entities, and their relationships in a domain, through a unique two-stage approach.
NaLIX: an interactive natural language interface for querying XML
TLDR
It is shown that NaLIX, while far from being able to pass the Turing test, is perfectly usable in practice, and able to handle even quite complex queries in a variety of application domains.
Constructing an Interactive Natural Language Interface for Relational Databases
TLDR
The architecture of an interactive natural language query interface for relational databases is described, able to correctly interpret complex natural language queries, in a generic manner across a range of domains, and is good enough to be usable in practice.
Towards a theory of natural language interfaces to databases
TLDR
This paper proves that, for a broad class of semantically tractable natural language questions, Precise is guaranteed to map each question to the corresponding SQL query, and shows that Precise compares favorably with Mooney's learning NLI and with Microsoft's English Query product.
SQLizer: query synthesis from natural language
This paper presents a new technique for automatically synthesizing SQL queries from natural language (NL). At the core of our technique is a new NL-based program synthesis methodology that combines
Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
TLDR
This work proposes Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries, and releases WikiSQL, a dataset of 80654 hand-annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia that is an order of magnitude larger than comparable datasets.
An End-to-end Neural Natural Language Interface for Databases
TLDR
DBPal uses a deep model to translate natural language statements to SQL, making the translation process more robust to paraphrasing and other linguistic variations and provides a learned auto-completion model that suggests partial query extensions to users during query formulation and thus helps to write complex queries.
Keyword searching and browsing in databases using BANKS
TLDR
BANKS is described, a system which enables keyword-based search on relational databases, together with data and schema browsing, and presents an efficient heuristic algorithm for finding and ranking query results.
SQAK: doing more with keywords
TLDR
SQAK provides a novel and exciting way to trade-off some of the expressive power of SQL in exchange for the ability to express a large class of aggregate queries using simple keywords.
Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
TLDR
This work defines a new complex and cross-domain semantic parsing and text-to-SQL task so that different complicated SQL queries and databases appear in train and test sets and experiments with various state-of-the-art models show that Spider presents a strong challenge for future research.