Aplication of Validity Index in K Means and Fuzzy C Means

Jontinus Manullang; Pahala Sirait; Andri Andri

doi:10.35335/mantik.Vol4.2020.958.pp1430-1438

Published: Aug 31, 2020

Keywords:

Cluster validity index, Davies-Bouldin index, k-d tree, K-Means, Fuzzy C-Means

Issue

Vol. 4 No. 2 (2020): Augustus: Manajemen, Teknologi Informatika dan Komunikasi (Mantik)

Section

Computer Science

Jontinus Manullang

STMIK Mikroskil

Pahala Sirait

STMIK Mikroskil

Andri Andri

STMIK Mikroskil

Abstract

K-Means and Fuzzy C-Means are unsupervised data analysis methods that group data by partitioning it. Applied to the same dataset, K-Means and Fuzzy C-Means can produce different clusters, and a cluster validity index is a measure that can be used to evaluate, and thereby help improve, the results produced by a clustering method. This study applies a cluster validity index, the Davies-Bouldin index (DBI), to both algorithms by computing the index for each K-Means result with k = 2, ..., kmax (kmax fixed in advance) and for each Fuzzy C-Means result with c = 2, ..., cmax (cmax fixed in advance). Using the cluster validity index, the best result is obtained at the second cluster with a DBI value of 0.45 for K-Means and at the second cluster with a DBI value of 0.5 for Fuzzy C-Means, and the clustering results are consistent.
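
The selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard Davies-Bouldin definition, DBI = (1/k) Σ_i max_{j≠i} (S_i + S_j) / d(c_i, c_j), where S_i is the mean distance of cluster i's members to its centroid c_i and lower values are better. The toy dataset, the function names (`kmeans`, `fuzzy_cmeans`, `davies_bouldin`, `scan`), the fuzzifier m = 2, and the deterministic farthest-first seeding are all assumptions made for illustration; for Fuzzy C-Means the index is computed on hard labels obtained by assigning each point to its nearest center, which is the center with the highest membership.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def nearest(p, centers):
    """Index of the center closest to p (lowest index wins ties)."""
    return min(range(len(centers)), key=lambda j: dist(p, centers[j]))

def farthest_first(data, k):
    """Deterministic seeding, an assumption made here for repeatability."""
    centers = [data[0]]
    while len(centers) < k:
        centers.append(max(data, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def kmeans(data, k, iters=100):
    """Plain Lloyd iterations; returns one hard label per point."""
    centers = farthest_first(data, k)
    labels = None
    for _ in range(iters):
        new_labels = [nearest(p, centers) for p in data]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(data, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = centroid(members)
    return labels

def fuzzy_cmeans(data, c, m=2.0, iters=200, tol=1e-9):
    """Standard FCM updates; returns hard labels (nearest final center)."""
    centers = farthest_first(data, c)
    power = 2.0 / (m - 1.0)
    for _ in range(iters):
        weights = []  # weights[n][i] = (membership of point n in cluster i) ** m
        for p in data:
            d = [max(dist(p, v), 1e-12) for v in centers]  # avoid division by zero
            weights.append([(1.0 / sum((d[i] / d[j]) ** power for j in range(c))) ** m
                            for i in range(c)])
        new_centers = []
        for i in range(c):
            total = sum(w[i] for w in weights)
            new_centers.append(tuple(sum(w[i] * x for w, x in zip(weights, col)) / total
                                     for col in zip(*data)))
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return [nearest(p, centers) for p in data]

def davies_bouldin(data, labels):
    """Davies-Bouldin index of a hard partition; lower is better."""
    groups = {}
    for p, l in zip(data, labels):
        groups.setdefault(l, []).append(p)
    if len(groups) < 2:
        return float("inf")  # a degenerate partition is never selected
    cents = {l: centroid(g) for l, g in groups.items()}
    scatter = {l: sum(dist(p, cents[l]) for p in g) / len(g) for l, g in groups.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in groups if j != i) for i in groups]
    return sum(worst) / len(worst)

def scan(cluster, data, k_max):
    """DBI for every cluster count from 2 to k_max."""
    return {k: davies_bouldin(data, cluster(data, k)) for k in range(2, k_max + 1)}

# Toy data (hypothetical): three tight 3x3 grids of points.
BLOBS = [(0, 0), (10, 0), (5, 9)]
OFFSETS = (0.0, 0.5, 1.0)
DATA = [(bx + dx, by + dy) for bx, by in BLOBS for dx in OFFSETS for dy in OFFSETS]

for name, algo in (("k-means", kmeans), ("fuzzy c-means", fuzzy_cmeans)):
    scores = scan(algo, DATA, k_max=5)
    best = min(scores, key=scores.get)
    print(name, "best cluster count:", best, "DBI:", round(scores[best], 3))
```

The loop at the end mirrors the procedure in the abstract: each algorithm is run for every cluster count up to a maximum fixed in advance, and the count with the lowest index is kept. Scoring both algorithms with the same index on hard labels makes their results directly comparable.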

How to Cite

Manullang, J., Sirait, P. and Andri, A. (2020) “Aplication of Validity Index in K Means and Fuzzy C Means”, Jurnal Mantik, 4(2), pp. 1430-1438. doi: 10.35335/mantik.Vol4.2020.958.pp1430-1438.

