CORE: Nonparametric Clustering of Large Numeric Databases


Taliun, Andrej; Böhlen, Michael Hanspeter; Mazeika, Arturas (2009). CORE: Nonparametric Clustering of Large Numeric Databases. In: SDM 2009: Proceedings of the SIAM International Conference on Data Mining, Sparks, Nevada, USA, 30 April 2009 - 2 May 2009, 14-25.

Abstract

Current clustering techniques are able to identify arbitrarily shaped clusters in the presence of noise, but depend on carefully chosen model parameters. The choice of model parameters is difficult: it depends on the data and the clustering technique at hand, and finding good model parameters often requires time-consuming human interaction. In this paper we propose CORE, a new nonparametric clustering technique that explicitly computes the local maxima of the density and represents them with cores. CORE uses an adaptive grid and gradients to define and compute the cores of clusters. The incrementally constructed adaptive grid and the gradients make the identification of cores robust, scalable, and independent of small density fluctuations. Our experimental studies show that CORE, without any carefully chosen model parameters, produces better-quality clusterings than related techniques and is efficient for large datasets.
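
To make the idea of "local maxima of the density" concrete, the following is a minimal sketch of grid-based hill climbing. It is not the CORE algorithm from the paper: it assumes a fixed uniform grid with a user-chosen `cell_size` (exactly the kind of parameter CORE is designed to avoid), uses point counts per cell as a crude density estimate, and replaces gradients with a steepest-ascent step to the densest neighbouring cell. The function names `grid_density`, `climb`, and `cluster` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import product


def cell_of(point, cell_size):
    """Map a point to the index of the uniform grid cell that contains it."""
    return tuple(int(x // cell_size) for x in point)


def grid_density(points, cell_size):
    """Count points per grid cell; the count is a crude density estimate."""
    return Counter(cell_of(p, cell_size) for p in points)


def climb(cell, density):
    """Follow the steepest density ascent from `cell` to a local maximum."""
    offsets = [o for o in product((-1, 0, 1), repeat=len(cell)) if any(o)]
    while True:
        neighbours = [tuple(c + d for c, d in zip(cell, o)) for o in offsets]
        # Rank cells by (density, index); the index breaks ties so that
        # plateaus of equal density cannot cause cycles.
        best = max(neighbours, key=lambda n: (density.get(n, 0), n))
        if (density.get(best, 0), best) <= (density[cell], cell):
            return cell  # no neighbour ranks higher: local maximum reached
        cell = best


def cluster(points, cell_size):
    """Label every point with the local-maximum cell its own cell climbs to."""
    density = grid_density(points, cell_size)
    peaks = {cell: climb(cell, density) for cell in density}
    return [peaks[cell_of(p, cell_size)] for p in points]
```

Calling `cluster(points, cell_size)` returns one label per input point, where each label is the grid index of a density peak; points sharing a label belong to the same cluster. With a fixed grid the result depends strongly on `cell_size`, which illustrates the parameter sensitivity the abstract describes.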

Statistics

Downloads

133 downloads since deposited on 01 Jun 2012
21 downloads in the past 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Event End Date: 2 May 2009
Deposited On: 01 Jun 2012 15:47
Last Modified: 07 Dec 2017 11:34
Publisher: SIAM (Society for Industrial and Applied Mathematics)
Official URL: http://www.siam.org/proceedings/datamining/2009/dm09_003_taliuna.pdf
Other Identification Number: merlin-id:2296

Download

Content: Published Version
Filetype: PDF (Copyright: SIAM (Society for Industrial and Applied Mathematics))
Size: 1MB