Abstract
Processing streams of Linked Data, rather than static files, has gained increasing importance in the Web of Data. When processing data streams, system builders face the conundrum of guaranteeing a constant maximum response time with limited resources and, possibly, no prior information on the data arrival frequency. One approach to this problem is to delete data from a cache during processing, a process we call eviction. The goal of this paper is to show that data-driven eviction outperforms today’s dominant data-agnostic approaches such as first-in-first-out or random deletion. Specifically, we first introduce a method called Clock that evicts data from a join cache based on an estimate of the likelihood that the data will contribute to a join in the future. Second, using the well-established SR-Bench benchmark as well as a data set from the IPTV domain, we show that Clock outperforms data-agnostic approaches, indicating its usefulness for resource-limited Linked Data stream processing.