UZH-Logo

Maintenance Infos

Efficient evaluation of ad-hoc range aggregates


Ammendola, Christian; Böhlen, Michael Hanspeter; Gamper, Johann (2013). Efficient evaluation of ad-hoc range aggregates. In: DaWaK, Prague, Czech Republic, 26 August 2013 - 29 August 2013, 46-59.

Abstract

θ-MDA is a flexible and efficient operator for complex ad-hoc multi-dimensional aggregation queries. It separates the specification of aggregation groups for which aggregate values are computed (base table b) and the specification of aggregation tuples from which aggregate values are computed. Aggregation tuples are subsets of the detail table r and are defined by a general θ-condition. The θ-MDA requires one scan of r, during which the aggregates are incrementally updated. In this paper, we propose a two-step evaluation strategy for θ-MDA to optimize the computation of ad-hoc range aggregates by reducing them to point aggregates. The first step scans r and computes point aggregates as a partial intermediate result x̃, which can be done efficiently. The second step combines the point aggregates to the final aggregates. This transformation significantly reduces the number of incremental updates to aggregates and reduces the runtime from (∣∣r∣∣⋅∣∣b∣∣) to (∣∣r∣∣) , provided that ∣∣b∣∣<∣∣r∣∣‾‾‾√ and |x̃| ≈ |b|, which is common for OLAP. An empirical evaluation confirms the analytical results and shows the effectiveness of our optimization: range queries are evaluated with almost the same efficiency as point queries.

θ-MDA is a flexible and efficient operator for complex ad-hoc multi-dimensional aggregation queries. It separates the specification of aggregation groups for which aggregate values are computed (base table b) and the specification of aggregation tuples from which aggregate values are computed. Aggregation tuples are subsets of the detail table r and are defined by a general θ-condition. The θ-MDA requires one scan of r, during which the aggregates are incrementally updated. In this paper, we propose a two-step evaluation strategy for θ-MDA to optimize the computation of ad-hoc range aggregates by reducing them to point aggregates. The first step scans r and computes point aggregates as a partial intermediate result x̃, which can be done efficiently. The second step combines the point aggregates to the final aggregates. This transformation significantly reduces the number of incremental updates to aggregates and reduces the runtime from (∣∣r∣∣⋅∣∣b∣∣) to (∣∣r∣∣) , provided that ∣∣b∣∣<∣∣r∣∣‾‾‾√ and |x̃| ≈ |b|, which is common for OLAP. An empirical evaluation confirms the analytical results and shows the effectiveness of our optimization: range queries are evaluated with almost the same efficiency as point queries.

Altmetrics

Downloads

94 downloads since deposited on 07 Oct 2013
34 downloads since 12 months
Detailed statistics

Additional indexing

Item Type:Conference or Workshop Item (Paper), refereed, original work
Communities & Collections:03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification:000 Computer science, knowledge & systems
Language:English
Event End Date:29 August 2013
Deposited On:07 Oct 2013 14:52
Last Modified:05 Apr 2016 17:01
Publisher:Springer
Series Name:Lecture Notes in Computer Science
ISSN:0302-9743
ISBN:978-3-642-40130-5
Publisher DOI:https://doi.org/10.1007/978-3-642-40131-2_5
Other Identification Number:merlin-id:8372
Permanent URL: https://doi.org/10.5167/uzh-81613

Download

[img]
Preview
Content: Accepted Version
Filetype: PDF
Size: 372kB
View at publisher

TrendTerms

TrendTerms displays relevant terms of the abstract of this publication and related documents on a map. The terms and their relations were extracted from ZORA using word statistics. Their timelines are taken from ZORA as well. The bubble size of a term is proportional to the number of documents where the term occurs. Red, orange, yellow and green colors are used for terms that occur in the current document; red indicates high interlinkedness of a term with other terms, orange, yellow and green decreasing interlinkedness. Blue is used for terms that have a relation with the terms in this document, but occur in other documents.
You can navigate and zoom the map. Mouse-hovering a term displays its timeline, clicking it yields the associated documents.

Author Collaborations