Abstract
In temporal-probabilistic (TP) databases, the combination of the temporal and the probabilistic dimension adds significant overhead to the computation of set operations. Although set queries are guaranteed to yield linearly sized output relations, all of the existing solutions exhibit a quadratic runtime complexity. They suffer from redundant interval comparisons and additional joins for the formation of lineage expressions. In this paper, we formally define TP set operations and study their properties. For their efficient computation, we introduce the lineage-aware temporal window, a mechanism that binds intervals with lineage expressions. We suggest the lineage-aware window advancer (LAWA) for producing lineage-aware temporal windows, which enable direct filtering of irrelevant intervals and finalization of output lineage expressions. This way, we compute TP set operations in linearithmic time. A series of experiments over both synthetic and real-world datasets show that (a) our approach has predictable performance, which depends only on the size of the input relations and not on the number of time intervals per fact or the overlap of the time intervals, and that (b) it outperforms state-of-the-art approaches.