Zurich Open Repository and Archive


Permanent URL to this publication: http://dx.doi.org/10.5167/uzh-24448

Glavic, B; Alonso, G (2009). Provenance for Nested Subqueries. In: 12th International Conference on Extending Database Technology, Saint Petersburg, Russia, 24 March 2009 - 26 March 2009, 982-993.



Data provenance is essential in applications such as scientific computing, curated databases, and data warehouses. Several systems have been developed that provide provenance functionality for the relational data model. These systems support only a subset of SQL, a severe limitation in practice since most of the application domains that benefit from provenance information use complex queries. Such queries typically involve nested subqueries, aggregation, and/or user-defined functions. Without support for these constructs, a provenance management system is of limited use.

In this paper we address this limitation by exploring the problem of provenance derivation when complex queries are involved. More precisely, we demonstrate that the widely used definition of Why-provenance fails in the presence of nested subqueries, and show how the definition can be modified to produce meaningful results for nested subqueries. We further present query rewrite rules to transform an SQL query into a query propagating provenance. The solution introduced in this paper allows us to track provenance information for a far wider subset of SQL than any of the existing approaches. We have incorporated these ideas into the Perm provenance management system engine and used it to evaluate the feasibility and performance of our approach.




Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Uncontrolled Keywords: provenance, query rewrite, nested subqueries
Event End Date: 26 March 2009
Deposited On: 16 Dec 2009 08:14
Last Modified: 05 Apr 2016 13:34
Series Name: ACM International Conference Proceeding Series (AICPS)
Official URL: http://dblp.uni-trier.de/db/conf/edbt/edbt2009.html
