A comparative survey of recent natural language interfaces for databases


Affolter, Katrin; Stockinger, Kurt; Bernstein, Abraham (2019). A comparative survey of recent natural language interfaces for databases. VLDB Journal, 28(5):793-819.

Abstract

Over the last few years, natural language interfaces (NLIs) for databases have gained significant traction in both academia and industry. These systems use very different approaches, as described in recent survey papers. However, they have not been systematically compared against a set of benchmark questions in order to rigorously evaluate their functionalities and expressive power. In this paper, we give an overview of 24 recently developed NLIs for databases. Each of the systems is evaluated using a curated list of ten sample questions to show its strengths and weaknesses. We categorize the NLIs into four groups based on the methodology they use: keyword-, pattern-, parsing- and grammar-based NLIs. Overall, we learned that keyword-based systems suffice to answer simple questions. To solve more complex questions involving subqueries, a system needs to apply some sort of parsing to identify structural dependencies. Grammar-based systems are overall the most powerful ones, but are highly dependent on their manually designed rules. In addition to providing a systematic analysis of the major systems, we derive lessons learned that are vital for designing NLIs that can answer a wide range of user questions.

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Language: English
Date: 1 October 2019
Deposited On: 06 Sep 2019 09:07
Last Modified: 15 Oct 2019 01:27
Publisher: Springer
ISSN: 1066-8888
OA Status: Green
Publisher DOI: https://doi.org/10.1007/s00778-019-00567-8
Official URL: https://link.springer.com/article/10.1007/s00778-019-00567-8
Other Identification Number: merlin-id:18121

Download

Content: Published Version
Filetype: PDF
Size: 1MB
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)