Abstract
Histograms are a fundamental tool for multidimensional data analysis and processing, and many applications in graphics and visualization rely on computing histograms over large regions of interest (ROI). Integral histograms (IH) greatly accelerate the calculation in the case of rectangular regions, but come at a large extra storage cost. Based on the tensor train decomposition model, we propose a new compression and approximate retrieval algorithm to reduce the overall IH memory usage by several orders of magnitude at a user-defined accuracy. To this end we propose an incremental tensor decomposition algorithm that allows us to compress integral histograms of hundreds of gigabytes. We then encode the borders of any desired rectangular ROI in the IH tensor-compressed domain and reconstruct the target histogram at a high speed which is independent of the region size. We furthermore generalize the algorithm to support regions of arbitrary shape rather than only rectangles, as well as histogram field computation, i.e., recovering many histograms at once. We test our method with several multidimensional data sets and demonstrate that it radically speeds up costly histogram queries while avoiding storing massive, uncompressed IHs.