International Journal of Soft Computing

ISSN: Online
ISSN: Print 1816-9503

Genetic Algorithm Based Dimensionality Reduction for Improving Performance of K-Means Clustering: A Case Study for Categorization of Medical Dataset

Asha Gowda Karegowda, Vidya T. Shama, M.A. Jayaram and A.S. Manjunath
Page: 249-255 | Received 21 Sep 2022, Published online: 21 Sep 2022

Abstract

Medical data mining is the process of extracting hidden patterns from medical data. Among the various clustering algorithms, k-means is one of the most widely used. The performance of k-means clustering depends on the initial cluster centers, and the algorithm may converge to a local optimum. k-means does not guarantee a unique clustering, because different runs with randomly chosen initial cluster centers produce different results. In addition, the performance of any data mining technique depends on feature subset selection. This study attempts to improve the performance of k-means clustering in two stages. In the first stage, it investigates a wrapper approach to feature selection for clustering, in which a Genetic Algorithm (GA) serves as a random search technique for subset generation, wrapped with k-means clustering. In the second stage, GA and Entropy based Fuzzy Clustering (EFC) are used to find the initial centroids for k-means clustering. Experiments were conducted on two standard medical datasets, the Pima Indians Diabetes Dataset (PIDD) and Heart Statlog. Results show a marked reduction of 8.42% and 18.89% in the classification error of k-means clustering for the PIDD and Heart Statlog datasets, respectively, when using features identified by the proposed wrapper approach and initial centroids identified by GA, compared with k-means using all features and randomly initialized centroids.
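
The first stage described above, a GA searching over feature subsets with k-means as the wrapped evaluator, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans`, `clustering_error`, `fitness`, `ga_select`), the GA settings (population size, truncation selection, one-point crossover, bit-flip mutation), the fixed-seed centroid initialization, and the majority-class error measure are all assumptions chosen for brevity. The second stage (GA or EFC for initial centroids) is not covered.

```python
import random

def kmeans(points, k, init, iters=20):
    """Plain Lloyd's k-means from the given initial centroids; returns labels."""
    centroids = [list(c) for c in init]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [min(range(k),
                      key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
                  for p in points]
        # Move each non-empty cluster's centroid to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def clustering_error(labels, truth, k):
    """Fraction of points outside the majority class of their cluster (assumed measure)."""
    wrong = 0
    for j in range(k):
        classes = [t for l, t in zip(labels, truth) if l == j]
        if classes:
            wrong += len(classes) - max(classes.count(c) for c in set(classes))
    return wrong / len(truth)

def fitness(mask, data, truth, k):
    """Error of k-means run only on the features switched on in mask (lower is better)."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # an empty feature subset is treated as the worst case
    points = [[row[i] for i in cols] for row in data]
    # Assumption: a fixed seed keeps the fitness deterministic for a given mask.
    init = random.Random(0).sample(points, k)
    return clustering_error(kmeans(points, k, init), truth, k)

def ga_select(data, truth, k, pop_size=12, generations=15, p_mut=0.1, seed=0):
    """Search feature masks with a small elitist GA; returns the best mask found."""
    rng = random.Random(seed)
    n = len(data[0])  # needs at least 2 features for one-point crossover
    # Include the full feature set so the result is never worse than that baseline.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    score = lambda m: fitness(m, data, truth, k)
    for _ in range(generations):
        pop.sort(key=score)
        parents = pop[: pop_size // 2]            # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)             # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit ^ (rng.random() < p_mut) for bit in child])  # bit-flip mutation
        pop = parents + children
    return min(pop, key=score)
```

Calling `ga_select(data, truth, k=2)` on a list of numeric feature rows with known class labels returns a 0/1 mask over the columns; `fitness(mask, data, truth, 2)` can then be compared against `fitness([1] * n, data, truth, 2)` to see whether the selected subset lowers the error relative to using all features.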


How to cite this article:

Asha Gowda Karegowda, Vidya T. Shama, M.A. Jayaram and A.S. Manjunath. Genetic Algorithm Based Dimensionality Reduction for Improving Performance of K-Means Clustering: A Case Study for Categorization of Medical Dataset.
DOI: https://doi.org/10.36478/ijscomp.2012.249.255
URL: https://www.makhillpublications.co/view-article/1816-9503/ijscomp.2012.249.255