Combining Different Clustering Techniques for Improved Knowledge Discovery

Archive - Central European Conference on Information and Intelligent Systems, CECIIS - 2009

Zita Boshnjak, Olivera Grljevic

Last modified: 2009-08-19

Abstract

Applying different clustering techniques to the same initial data set, or even applying the same algorithm repeatedly, can produce different partitions of that data set, each emphasizing a specific aspect of the resulting clusters. Apart from producing diverse outputs, clustering tools visualize the derived clusters in different forms. These visualizations give better insight into cluster structure and into the relations that group similar entities together, and they describe cluster centers, the most typical representatives of each cluster, the least typical instances, and so on. Analysts bear much of the responsibility for interpreting the results obtained from the available tools and for giving meaningful explanations of the derived clusters, so any additional information gained by using several clustering tools is of great use in shaping the conclusions.

This paper presents the results of clustering data on small and medium-sized enterprises (SMEs) in the province of Vojvodina by means of the DataEngine, Weka and iDataAnalyzer tools. Each tool supports a different clustering algorithm. The described composite approach, which relies on a diversity of tools and outputs, significantly simplifies the task of knowledge discovery, helps analysts interpret the results, and supports the formulation of detailed and clear conclusions.
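
As an illustration of the point that different tools can partition the same data differently, the following minimal sketch measures how far two partitions agree using the Rand index, the fraction of item pairs that both partitions treat the same way. The abstract does not say that the authors used this measure; the function name `rand_index` and the cluster labels below are hypothetical and are not taken from the SME data.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Return the fraction of item pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two items
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical cluster labels for six enterprises, as two different
# tools might assign them (illustrative values only).
tool_one = [0, 0, 0, 1, 1, 1]
tool_two = ["a", "a", "b", "b", "c", "c"]

score = rand_index(tool_one, tool_two)
print(f"Rand index: {score:.3f}")
```

A value of 1.0 means the two partitions are identical up to the naming of clusters. Lower values point to the item pairs on which the tools disagree, which is where an analyst might look more closely.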
