Recovery of Missing Values using Matrix Decomposition Techniques


Khayati, Mourad. Recovery of Missing Values using Matrix Decomposition Techniques. 2015, University of Zurich, Faculty of Economics.

Abstract

Time series data is prominent in many real-world applications, e.g., hydrology or financial stock markets. In many of these applications, time series data is missing in blocks, i.e., multiple consecutive values are missing. For example, in the hydrology field around 20% of the data is missing in blocks. However, many time series analysis tasks, such as prediction, require complete data. The recovery of blocks of missing values in time series is challenging if the missing block is a peak or a valley. The problem is more challenging in real-world time series because of the irregularity in the data. The state-of-the-art recovery techniques are suitable either for the recovery of single missing values or for the recovery of blocks of missing values in regular time series. The goal of this thesis is to propose an accurate recovery of blocks of missing values in irregular time series. The recovery solution we propose is based on matrix decomposition techniques. The main idea of the recovery is to represent correlated time series as columns of an input matrix in which the missing values have been initialized, and to iteratively apply a matrix decomposition technique to refine the initialized missing values. A key property of our recovery solution is that it learns the shape, the width and the amplitude of the missing blocks from the history of the time series that contains the missing blocks and from the history of its correlated time series. Our experiments on real-world hydrological time series show that our approach outperforms the state-of-the-art recovery techniques for the recovery of missing blocks in irregular time series. The recovery solution is implemented as a graphical tool that displays, browses and accurately recovers missing blocks in irregular time series. The proposed approach supports learning from both highly and weakly correlated time series.
This is important since weakly correlated time series, e.g., shifted time series, that exhibit shape and/or trend similarities are beneficial for the recovery process. We reduce the space complexity of the proposed solution from quadratic to linear. This allows the use of time series with long histories without prior segmentation. We prove the scalability and the correctness of the solution.
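
The main idea stated in the abstract (correlated time series as matrix columns, initialized missing values, iterative refinement by a matrix decomposition) can be illustrated with a minimal sketch. The abstract does not say which decomposition or which initialization the thesis uses, so the truncated SVD, the column-mean initialization, the function name `recover_missing` and its parameters `rank`, `max_iter` and `tol` are all assumptions made for illustration only, not the method of the thesis.

```python
import numpy as np


def recover_missing(X, rank=1, max_iter=100, tol=1e-6):
    """Refine missing entries (NaN) by repeated low-rank approximation.

    Illustrative only: truncated SVD stands in for the unspecified
    matrix decomposition, and column means for the initialization.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()

    # Initialize every missing value with the mean of the observed
    # values in the same column (an assumed initialization).
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])

    if not missing.any():
        return filled

    for _ in range(max_iter):
        # Decompose the current matrix and keep the `rank` strongest components.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

        # Overwrite only the missing positions; observed values stay fixed.
        change = np.max(np.abs(approx[missing] - filled[missing]))
        filled[missing] = approx[missing]
        if change < tol:
            break
    return filled


# Three perfectly correlated synthetic series; a block covering a peak
# is removed from the first one.
t = np.linspace(0.0, 4.0 * np.pi, 200)
base = np.sin(t)
truth = np.column_stack([base, 2.0 * base, -0.5 * base])
observed = truth.copy()
observed[20:30, 0] = np.nan

recovered = recover_missing(observed, rank=1)
print("max abs error on the missing block:",
      np.max(np.abs(recovered[20:30, 0] - truth[20:30, 0])))
```

In this synthetic case the series are exact multiples of one another, so a rank-1 approximation can reproduce the missing peak from the other columns. Real hydrological series are irregular and only partially correlated, which is the harder setting the thesis addresses.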



Additional indexing

Item Type: Dissertation
Referees: Böhlen Michael Hanspeter
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Language: English
Date: 2015
Deposited On: 14 Jan 2016 13:31
Last Modified: 05 Apr 2016 19:55
Other Identification Number: merlin-id:12953
Permanent URL: https://doi.org/10.5167/uzh-119431

Download

Filetype: PDF - Registered users only
Size: 2MB
