Permanent URL to this publication: http://dx.doi.org/10.5167/uzh-44858
Nguyen, M K; Basca, C; Bernstein, A (2010). B+Hash Tree: optimizing query execution times for on-disk semantic web data structures. In: Proceedings Of The 6th International Workshop On Scalable Semantic Web Knowledge Base Systems (SSWS2010), Shanghai, China, 08 November 2010 - 08 November 2010, 96-111.
| PDF 764Kb |
Abstract
The increasing growth of the Semantic Web has substantially enlarged the amount of data available in RDF format. One proposed solution is to map RDF data to relational databases (RDBs). The lack of a common schema, however, makes this mapping inefficient. Some RDF-native solutions use B+Trees, which are potentially becoming a bottleneck, as the single key-space approach of the Semantic Web may even make their O(log(n)) worst case performance too costly. Alternatives, such as hash-based approaches, suffer from insufficient update and scan performance. In this paper we propose a novel type of index structure called a B+Hash Tree, which combines the strengths of traditional B-Trees with the speedy constant-time lookup of a hash-based structure. Our main research idea is to enhance the B+Tree with a Hash Map to enable constant retrieval time instead of the common logarithmic one of the B+Tree. The result is a scalable, updatable, and lookup-optimized, on-disk index-structure that is especially suitable for the large key-spaces of RDF datasets. We evaluate the approach against existing RDF indexing schemes using two commonly used datasets and show that a B+Hash Tree is at least twice as fast as its competitors - an advantage that we show should grow as dataset sizes increase.
| Item Type: | Conference or Workshop Item (Paper), refereed, original work |
|---|---|
| Communities & Collections: | 03 Faculty of Economics > Department of Informatics |
| DDC: | 000 Computer science, knowledge & systems |
| Language: | English |
| Event End Date: | 08 November 2010 |
| Deposited On: | 24 Feb 2011 16:05 |
| Last Modified: | 17 Oct 2012 10:09 |
| Other Identification Number: | 1459 |
Users (please log in): suggest update or correction for this item
Repository Staff Only: item control page