CORE: Nonparametric Clustering of Large Numeric Databases


Taliun, Andrej; Böhlen, Michael Hanspeter; Mazeika, Arturas (2009). CORE: Nonparametric Clustering of Large Numeric Databases. In: SDM 2009: Proceedings of the SIAM International Conference on Data Mining, Sparks, Nevada, USA, 30 April 2009 - 2 May 2009. SIAM (Society for Industrial and Applied Mathematics), 14-25.

Abstract

Current clustering techniques are able to identify arbitrarily shaped clusters in the presence of noise, but depend on carefully chosen model parameters. The choice of model parameters is difficult: it depends on the data and the clustering technique at hand, and finding good model parameters often requires time-consuming human interaction. In this paper we propose CORE, a new nonparametric clustering technique that explicitly computes the local maxima of the density and represents them with cores. CORE uses an adaptive grid and gradients to define and compute the cores of clusters. The incrementally constructed adaptive grid and the gradients make the identification of cores robust, scalable, and independent of small density fluctuations. Our experimental studies show that CORE, without any carefully chosen model parameters, produces better-quality clusterings than related techniques and is efficient for large datasets.
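The general idea of locating density maxima on a grid and following gradients toward them can be illustrated with a small sketch. This is not the CORE algorithm from the paper: it assumes a fixed uniform grid (CORE builds its grid adaptively and incrementally), approximates the gradient by a discrete steepest-ascent step between neighbouring cells, and takes a `bins` parameter, which is exactly the kind of model parameter CORE is designed to avoid. The function name `cluster_by_density_modes` and all other identifiers are hypothetical.

```python
from itertools import product

import numpy as np


def cluster_by_density_modes(points, bins=5):
    """Label points by the local density maximum their grid cell climbs to.

    Illustrative only: uniform grid and discrete steepest ascent,
    not the adaptive grid used by CORE.
    """
    pts = np.asarray(points, dtype=float)
    n, d = pts.shape

    # Map every point to a cell of a uniform grid over the bounding box.
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    idx = np.minimum(((pts - lo) / span * bins).astype(int), bins - 1)

    # Cell counts serve as a crude density estimate.
    counts = np.zeros((bins,) * d, dtype=int)
    np.add.at(counts, tuple(idx.T), 1)

    offsets = list(product((-1, 0, 1), repeat=d))

    def climb(cell):
        # Move to the strictly densest neighbouring cell until none is denser.
        while True:
            best = cell
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cell, off))
                if all(0 <= v < bins for v in nb) and counts[nb] > counts[best]:
                    best = nb
            if best == cell:
                return cell
            cell = best

    modes = []   # cells that are local maxima, in order of discovery
    labels = []  # index into `modes` for every point
    for row in idx:
        mode = climb(tuple(int(v) for v in row))
        if mode not in modes:
            modes.append(mode)
        labels.append(modes.index(mode))
    return np.array(labels), modes
```

On two well-separated groups of points, each group climbs to its own densest cell and receives its own label. Cells of equal density are not merged in this sketch, so a flat density plateau would be split into several modes.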

Statistics

Downloads

139 downloads since deposited on 01 Jun 2012
5 downloads in the past 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Scopus Subject Areas: Physical Sciences > Computational Theory and Mathematics; Physical Sciences > Software; Physical Sciences > Applied Mathematics
Language: English
Event End Date: 2 May 2009
Deposited On: 01 Jun 2012 15:47
Last Modified: 15 Nov 2021 08:17
Publisher: SIAM (Society for Industrial and Applied Mathematics)
OA Status: Green
Official URL: http://www.siam.org/proceedings/datamining/2009/dm09_003_taliuna.pdf
Other Identification Number: merlin-id:2296
  • Content: Published Version
  • Description: Copyright: SIAM (Society for Industrial and Applied Mathematics)