Archive - Central European Conference on Information and Intelligent Systems, CECIIS - 2009

Combining Different Clustering Techniques for Improved Knowledge Discovery
Zita Boshnjak, Olivera Grljevic

Last modified: 2009-08-19

Abstract



Applying different clustering techniques, or even repeatedly applying the same algorithm, to an initial data set can yield different partitions of the source data, each emphasizing a specific aspect of the resulting clusters. Apart from producing diverse outputs, clustering algorithms present the derived clusters in different visualization forms, which give better insight into their structure and into the relations that group similar entities together, and which describe cluster centers, the most typical cluster representatives, the least typical cluster instances, etc. Because analysts bear great responsibility for interpreting the results produced by the available tools and for giving meaningful explanations of the derived clusters, any additional information gained by using different clustering tools is of great use in shaping the conclusions.

This paper presents the results of clustering data on small and medium-sized enterprises (SMEs) in the province of Vojvodina by means of the DataEngine, Weka, and iDataAnalyzer tools. Each tool supports a different clustering algorithm. The described composite approach, which implies a diversity of tools and outputs, significantly simplifies the task of knowledge discovery, helps analysts interpret the results, and facilitates the formulation of detailed and clear conclusions.

