Optimal Number of Clusters by Using Four Indexes
DOI:
https://doi.org/10.24996/ijs.2026.67.2.35
Keywords:
clustering, K-means, machine learning, Elbow method, Silhouette score, Gap statistic, Davies-Bouldin index
Abstract
In data analysis, clustering is a machine learning technique that groups similar data points or objects based on their features, attributes, or characteristics. It attempts to detect underlying patterns or structures in data without prior knowledge of group labels. Many algorithms are used for clustering; K-means is one of the most widely used, and its performance depends on the choice of initial points and the value of K. Most clustering techniques require the number of clusters to be specified in advance. However, in most cases, determining that value is computationally expensive. In this paper, an algorithm is designed to compute the proper number of clusters for a dataset using several cluster validity indexes (CVIs). The most popular CVIs are the Elbow method, the Silhouette score, the Gap statistic, and the Davies-Bouldin index. The paper also proposes a new technique for estimating the appropriate number of clusters (k) based on the values of these indexes and their ranks. The best result of the ONC algorithm, measured by the average silhouette score, is 0.501.
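The idea of choosing k from several indexes and their ranks can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's ONC algorithm: the function names `rank_scores` and `select_k_by_rank` are hypothetical, the score values are made-up placeholders rather than results from the paper, and the aggregation rule (lowest sum of ranks, ties broken toward the smaller k) is an assumption. The Gap statistic is treated simply as "higher is better", which simplifies its usual selection rule, and the Elbow method is left out because it needs a curvature-type score before it can be ranked.

```python
from typing import Dict


def rank_scores(scores: Dict[int, float], higher_is_better: bool) -> Dict[int, int]:
    """Rank candidate k values for one index; rank 1 is the best."""
    ordered = sorted(scores, key=lambda k: scores[k], reverse=higher_is_better)
    return {k: position for position, k in enumerate(ordered, start=1)}


def select_k_by_rank(
    index_scores: Dict[str, Dict[int, float]],
    higher_is_better: Dict[str, bool],
) -> int:
    """Return the k with the lowest rank sum across all indexes.

    Assumption: every index was evaluated on the same candidate k values.
    """
    candidates = sorted(next(iter(index_scores.values())))
    total_rank = {k: 0 for k in candidates}
    for name, scores in index_scores.items():
        ranks = rank_scores(scores, higher_is_better[name])
        for k in candidates:
            total_rank[k] += ranks[k]
    # Lowest rank sum wins; ties are resolved toward the smaller k.
    return min(candidates, key=lambda k: (total_rank[k], k))


# Placeholder scores for k = 2, 3, 4 (illustrative only, not from the paper).
toy_scores = {
    "silhouette": {2: 0.42, 3: 0.50, 4: 0.41},      # higher is better
    "davies_bouldin": {2: 0.95, 3: 0.70, 4: 0.88},  # lower is better
    "gap": {2: 0.30, 3: 0.45, 4: 0.47},             # treated as higher is better
}
direction = {"silhouette": True, "davies_bouldin": False, "gap": True}

print(select_k_by_rank(toy_scores, direction))  # prints 3
```

With these placeholder values, k = 3 ranks first on two indexes and second on the third, so it has the lowest rank sum. In practice the per-index scores would come from running the clustering algorithm for each candidate k.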
License
Copyright (c) 2026 Iraqi Journal of Science

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



