Finding Best Clustering For Big Networks with Minimum Objective Function by Using Probabilistic Tabu Search

Fuzzy C-means (FCM) is a clustering method that groups similar data elements according to specific measurements. Tabu search is a heuristic algorithm. In this paper, Probabilistic Tabu Search for FCM is implemented to find a global clustering based on the minimum value of the fuzzy objective function. The experiments are designed for different networks and different numbers of clusters. The results show the best performance, based on a comparison of the objective function values obtained with standard FCM and with Tabu-FCM, averaged over ten runs.


Introduction
Networks appear in many fields, for example social media, electrical power networks, communication networks, politics, and biology. In general, the structures of networks are found by applying mathematical techniques that describe the suitable patterns. Clustering is an important unsupervised technique used to search for structures in data. Clustering methods partition a set of elements into clusters such that objects in the same cluster are more similar to each other than to objects in different clusters, according to some defined criteria. Fuzzy c-means (FCM) is a clustering method based on the minimization of an objective function.

Al-Sultan and Chawki [1] studied the problem that the fuzzy clustering mathematical program possesses many local minima, and proposed a heuristic fuzzy c-means approach to this problem based on the Tabu search technique. This technique explores the solution space beyond local optimality in order to find the global solution of the fuzzy clustering problem. Ng et al. [2] presented a Tabu search based clustering algorithm that extends the k-means paradigm to categorical domains and to domains with both numeric and categorical values. Zhang et al. [3] compared three techniques for extending fuzzy c-means (FCM) clustering to very large data, using both loadable and very large datasets in numerical experiments that allow comparisons based on time and space complexity, speed, and quality of approximation. Zhu et al. [4] generalized an algorithm called GIFP-FCM to obtain an effective clustering; they introduced a membership constraint function based on a norm distance measure and competitive learning, and showed the robustness and convergence of their proposed algorithm. Shang et al. [5] proposed a self-adaptive method to determine the optimal number of clusters; the algorithm automatically determines the possible maximum number of clusters instead of using an empirical rule, and obtains the optimal initial cluster centroids. Kochenberger et al. [6] addressed max-cut problems via Tabu search; they applied the TS algorithm to large-scale max-cut test problems and to the unconstrained quadratic binary program.

The rest of the paper is organized as follows. Section 3 gives the main idea of fuzzy c-means clustering via minimization of the objective function, which we call the fuzzy objective function. Section 4 explains the algorithm of Tabu Search for FCM. Section 5 presents the results and discussion.

ISSN: 0067-2904

Clustering via Fuzzy C-means
Fuzzy c-means (FCM) is a data clustering technique in which each data point belongs to a cluster to some degree, specified by a membership measure. The technique was originally introduced by Bezdek in 1981. The FCM algorithm attempts to partition a finite collection of elements into a collection of fuzzy clusters with respect to some given criterion. It aims to minimize the objective function, a total weighted mean-square error, where $u_{ij}$ is the degree to which an observation $x_i$ belongs to cluster $j$ and $c_j$ is the center of cluster $j$ [7], [8]:

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{K} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2} \qquad (1)$$

Equation (1) is the fuzzy c-means objective, where $N$ is the number of observations, $K$ is the number of clusters, and the parameter $m > 1$ is the fuzziness of the clustering.

The centroid of each cluster

For fuzzy clustering, the centroid is the mean of all points, weighted by their degree of belonging to the cluster:

$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}$$

where $c_j$ is the centroid of cluster $j$ and $u_{ij}$ is the degree to which observation $x_i$ belongs to cluster $j$.

Algorithm 1. Fuzzy clustering procedure
1. Determine K, the number of clusters.
2. Assign randomly to each point coefficients for being in the clusters.
3. Repeat until the maximum number of iterations (No.Iter) is reached, or until the convergence condition is met:
   Compute the centroid of each cluster as the weighted mean given above.
   Find the coefficients of each point of being in the clusters, using Equation (2).
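The coefficient update that Equation (2) refers to is not legible in this copy. As an assumption, and not necessarily the authors' exact form, the standard FCM membership update for the degree $u_{ij}$ to which observation $x_i$ belongs to cluster $j$, with centroids $c_j$, $K$ clusters, and fuzziness $m > 1$, reads:

```latex
u_{ij} = \frac{1}{\displaystyle\sum_{k=1}^{K}
         \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}
```

With this form the coefficients of each point sum to one over the clusters, and a point that lies closer to a centroid receives a larger coefficient for that cluster.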
The algorithm minimizes intra-cluster variance as well, but has the same problems as k-means; the minimum is a local minimum, and the results depend on the initial choice of weights. Hence, different initializations may lead to different results.
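The procedure of this section can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names, the default values of `m`, `max_iter`, and `tol`, and the use of the standard FCM membership update are all assumptions.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroids(points, u, m):
    """Weighted mean of all points, with weights u[i][j] ** m."""
    cents = []
    for j in range(len(u[0])):
        w = [row[j] ** m for row in u]
        total = sum(w)
        cents.append([sum(wi * p[d] for wi, p in zip(w, points)) / total
                      for d in range(len(points[0]))])
    return cents

def memberships(points, cents, m):
    """Standard FCM membership update (assumed form of the coefficient step)."""
    power = 1.0 / (m - 1.0)
    u = []
    for p in points:
        d2 = [sq_dist(p, c) for c in cents]
        if min(d2) == 0.0:
            # The point coincides with a centroid: full membership there.
            raw = [1.0 if d == 0.0 else 0.0 for d in d2]
        else:
            raw = [(1.0 / d) ** power for d in d2]
        s = sum(raw)
        u.append([v / s for v in raw])
    return u

def objective(points, u, cents, m):
    """Fuzzy objective: sum of u[i][j] ** m times squared distance."""
    return sum(u[i][j] ** m * sq_dist(points[i], cents[j])
               for i in range(len(points)) for j in range(len(cents)))

def fcm(points, k, m=2.0, max_iter=100, tol=1e-9, seed=0):
    """Alternate centroid and membership updates until the objective stalls."""
    rng = random.Random(seed)
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(k)]  # random coefficients
        s = sum(row)
        u.append([v / s for v in row])
    prev = float("inf")
    for _ in range(max_iter):
        cents = centroids(points, u, m)
        u = memberships(points, cents, m)
        j = objective(points, u, cents, m)
        if prev - j < tol:  # improvement below tolerance: stop
            break
        prev = j
    return u, cents, j
```

Calling `fcm(points, 2)` with different `seed` values shows the dependence on the initial weights: the returned objective value can differ between runs when the data admit several local minima.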

Tabu Search
Tabu Search is a global optimization algorithm and a metaheuristic, or meta-strategy, for controlling an embedded heuristic technique. The basic concept was described by Glover (1986), who presented it as a metaheuristic superimposed on another heuristic. The overall approach is to avoid entrainment in cycles by forbidding or penalizing moves that take the solution, in the next iteration, to points in the solution space previously visited (hence "Tabu") [9].

The idea of the Tabu method is modeled on human behavior, which appears to operate with a random element that leads to inconsistent behavior in similar circumstances. The resulting tendency to deviate from a path might be regretted as a source of error, but it can also prove to be a source of gain. The Tabu method operates in this way, with the exception that a new path is not chosen randomly. Instead, the Tabu search proceeds on the supposition that there is no point in accepting a new (weak) solution unless it is to avoid a path already investigated. This ensures that new regions of the search space are explored, with the goal of avoiding local minima and ultimately finding the optimal solution.

The Tabu search begins by moving to a local minimum. To avoid retracing the steps used, the recent moves are kept in one or more Tabu lists. The purpose of such a list is not to prevent a previous move from being repeated, but rather to ensure that it is not reversed. Tabu lists record the history of the search and form the Tabu search memory. The role of the memory can change as the algorithm proceeds; for more details see [9,10,11]. Implementations of the Tabu method differ in the size, variability, and adaptability of the Tabu memory to the problem search space.

Step 1. Find the centers of fuzzy c-means.
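The memory mechanism described in this section can be illustrated with a generic Tabu search over binary vectors. This is a sketch under assumptions, not the Tabu-FCM algorithm of the paper: the single-bit-flip neighborhood, the fixed `tenure`, the aspiration rule, and the function name `tabu_search` are all illustrative choices.

```python
from collections import deque

def tabu_search(cost, x0, tenure=3, max_iter=100):
    """Minimize cost(x) over binary lists using a short-term Tabu list."""
    x = list(x0)
    best, best_cost = list(x), cost(x)
    tabu = deque(maxlen=tenure)  # positions flipped most recently
    for _ in range(max_iter):
        candidates = []
        for i in range(len(x)):
            y = x[:]
            y[i] = 1 - y[i]  # flipping bit i again would reverse a recent move
            c = cost(y)
            # Aspiration: a Tabu move is still allowed if it beats the best so far.
            if i not in tabu or c < best_cost:
                candidates.append((c, i, y))
        if not candidates:
            break
        # Take the best admissible neighbor, even if it is worse than x.
        c, i, x = min(candidates, key=lambda t: (t[0], t[1]))
        tabu.append(i)
        if c < best_cost:
            best, best_cost = list(x), c
    return best, best_cost
```

Because the best admissible neighbor is accepted even when it is worse than the current point, the search can leave a local minimum, and the Tabu list keeps it from stepping straight back.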

Probabilistic Tabu search
The probabilistic version of Tabu search is used to reduce the dependence on memory. The probabilities governing the acceptance of moves from a specified candidate set derive from three sources: move attractiveness, related to the changes induced in c(x); Tabu status, related to tenure on a Tabu list; and aspiration level, related to the value of c(x) achieved in relation to a historical standard. Let N(x) be a neighborhood of the point x, and assume that it contains all neighboring points with Hamming distance [...]; then all points may be forbidden and [...]
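One possible reading of how the three sources interact is sketched below. The rule itself is an assumption made for illustration and is not taken from the paper: a move that beats the best known value is always admitted (aspiration), a non-Tabu move is always admitted, and a Tabu move is admitted with probability `p`. The names `admissible` and `pick_move` are hypothetical.

```python
import random

def admissible(is_tabu, new_cost, best_cost, p, rng):
    """Decide whether a candidate move may be considered (assumed rule)."""
    if new_cost < best_cost:  # aspiration level: better than the historical best
        return True
    if not is_tabu:           # Tabu status: free moves are always admitted
        return True
    return rng.random() < p   # Tabu moves are admitted with probability p

def pick_move(moves, best_cost, p, rng):
    """Return the label of the cheapest admitted move, or None.

    Each move is a tuple (new_cost, is_tabu, label); the lowest new_cost
    plays the role of move attractiveness.
    """
    admitted = [(c, label) for c, is_tabu, label in moves
                if admissible(is_tabu, c, best_cost, p, rng)]
    if not admitted:
        return None
    return min(admitted, key=lambda t: t[0])[1]
```

Under this rule a small `p` keeps the search close to a strict Tabu list, while a `p` near 1 almost ignores the memory.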

Setting of the experiments
This section describes the experimental part of the paper. The results show the ability of the proposed algorithm to find the optimal solution, that is, the best clusters, based on the values of the fuzzy objective function. The experiments are designed to find the clusters of real, large networks of different types, topics, and complexity; the details of the networks are given in Table-1 [12].
Here, the maximum number of iterations is 10000 and the minimum amount of improvement is 1e-20. The clustering computation stops when the maximum number of iterations is reached, or when the improvement of the objective function between two iterations is less than this minimum amount.
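The stopping rule can be written as a small predicate. This is a sketch: the two constants come from the text, while the function names and the assumption of double-precision arithmetic are illustrative.

```python
import math

MAX_ITER = 10000         # maximum number of iterations, from the text
MIN_IMPROVEMENT = 1e-20  # minimum amount of improvement, from the text

def should_stop(prev_obj, curr_obj, iteration,
                max_iter=MAX_ITER, min_improvement=MIN_IMPROVEMENT):
    """Stop at the iteration limit or when the objective barely improves."""
    if iteration >= max_iter:
        return True
    return (prev_obj - curr_obj) < min_improvement

def is_resolvable(value, min_improvement=MIN_IMPROVEMENT):
    """True if a change of min_improvement is representable near value."""
    return min_improvement >= math.ulp(value)
```

In double precision, the spacing between adjacent numbers near 1e5 is about 1.5e-11, so for objective values of that size a threshold of 1e-20 in effect stops the loop only when the objective no longer changes at all.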
The values of the fuzzy objective function are compared in Table-2, which lists the values computed with standard FCM and the values computed with the Tabu method. Two settings are adopted in the experiments: in the first, the number of clusters is known and assumed to be K=2; in the second, the number of clusters is selected automatically. The results are given in Table-2 and Table-3, which give the average best values of the objective function over 10 runs with the number of clusters equal to 2 or auto-selected. The influential parameter is the probability threshold P, whose values lie in the range (0, 1); a small value of P gives the minimum fuzzy objective function. The results are given in Figure-1 to Figure-11.

Dolphin: 1232 × 10^(5), 2387 × 10^(5), 9411 × 10^(6), 5039 × 10^(-24)
AF C: 1856 × 10^(2), 3541 × 10^(4), 2491 × 10^(6), 30945 × 10^(-

Conclusions
In this paper we present Tabu Search for fuzzy c-means to estimate the best clustering by finding the best values of the fuzzy objective function, and we apply it to different types of real networks. The results show the ability of Tabu search to find the global solution and to determine the centroids. This step is important for community detection in big networks.