A case study competition among methods for analyzing large spatial data


Heaton, Matthew J; Datta, Abhirup; Finley, Andrew O; Furrer, Reinhard; Guinness, Joseph; Guhaniyogi, Rajarshi; Gerber, Florian; Gramacy, Robert B; Hammerling, Dorit; Katzfuss, Matthias; Lindgren, Finn; Nychka, Douglas W; Sun, Furong; Zammit-Mangion, Andrew (2019). A case study competition among methods for analyzing large spatial data. Journal of Agricultural, Biological, and Environmental Statistics, 24(3):398-425.

Abstract

The Gaussian process is an indispensable tool for spatial data analysts. The onset of the “big data” era, however, has led to the traditional Gaussian process being computationally infeasible for modern spatial data. As such, various alternatives to the full Gaussian process that are more amenable to handling big spatial data have been proposed. These modern methods often exploit low-rank structures and/or multi-core and multi-threaded computing environments to facilitate computation. This study provides, first, an introductory overview of several methods for analyzing large spatial data. Second, this study describes the results of a predictive competition among the described methods as implemented by different groups with strong expertise in the methodology. Specifically, each research group was provided with two training datasets (one simulated and one observed) along with a set of prediction locations. Each group then wrote its own implementation of its method to produce predictions at the given locations, and each implementation was subsequently run on a common computing environment. The methods were then compared in terms of various predictive diagnostics. Supplementary materials regarding implementation details of the methods and code are available for this article online.

Statistics

Citations

21 citations in Web of Science®
19 citations in Scopus®

Downloads

42 downloads since deposited on 17 Jan 2019
32 downloads in the past 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Mathematics
Dewey Decimal Classification: 510 Mathematics
Scopus Subject Areas:
  • Physical Sciences > Statistics and Probability
  • Life Sciences > Agricultural and Biological Sciences (miscellaneous)
  • Physical Sciences > General Environmental Science
  • Life Sciences > General Agricultural and Biological Sciences
  • Social Sciences & Humanities > Statistics, Probability and Uncertainty
  • Physical Sciences > Applied Mathematics
Uncontrolled Keywords: Agricultural and Biological Sciences (miscellaneous), Statistics, Probability and Uncertainty, Statistics and Probability, Applied Mathematics, General Agricultural and Biological Sciences, General Environmental Science
Language: English
Date: 1 September 2019
Deposited On: 17 Jan 2019 11:50
Last Modified: 15 Apr 2020 22:35
Publisher: Springer
ISSN: 1085-7117
OA Status: Green
Publisher DOI: https://doi.org/10.1007/s13253-018-00348-w
Project Information:
  • Funder: SNSF
  • Grant ID: 200021_175529
  • Project Title: Disentangling evidence from huge multivariate space-time data from the earth sciences

Download

Green Open Access

Content: Published Version
Language: English
Filetype: PDF
Size: 774kB
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)