A New Efficient Hybrid Approach for Machine Learning-Based Firefly Optimization

Optimization is the task of minimizing or maximizing an objective function f(x) parameterized by x. A number of effective numerical optimization methods, characterized by high-quality solutions and fast convergence, have become popular for improving the performance and efficiency of other methods. In recent years there has been considerable interest in hybrid metaheuristics, in which two or more methods are combined into a new method able to solve many problems rapidly and efficiently. The basic concept of the proposed method is to add the acceleration component of the Gravitational Search Algorithm (GSA) to the Firefly Algorithm (FA) model when creating new individuals. Several standard objective functions are used to compare the hybrid method (FAGSA) with FA and the traditional GSA in finding the optimal solution. Simulation results obtained with MATLAB R2015a indicate that the hybrid algorithm can escape local optima with faster convergence than the firefly algorithm and the ordinary gravitational search algorithm. This paper therefore proposes a new numerical optimization method based on integrating the properties of the two methods. In most cases, the proposed method gives better results than either original method individually.


Introduction
In machine learning (ML), classical optimization algorithms fail to provide adequate solutions to optimization problems whose high-dimensional search spaces grow exponentially with problem size. Exact techniques (such as exhaustive search) become impractical for such problems, so approximate methods are recommended instead [1]. Hybrid approaches can be used to speed up the search for a satisfactory solution and to improve the sustainability of energy production infrastructures in many settings, such as households. Several metaheuristic hybridization schemes are presented in [2] and [3]. Accordingly, two or more algorithms can be hybridized as homogeneous or heterogeneous, in high-level or low-level systems, via relay or co-evolutionary methods.
According to [15], exploration and exploitation are the two fundamental aspects that describe the working of any such algorithm. Exploration is an algorithm's ability to search the entire problem space, whereas exploitation is its ability to refine a solution toward the near optimum. All heuristic optimization algorithms strive to find a global optimum by keeping an efficient balance between exploration and exploitation. This balance controls how much effort goes into investigating unknown regions of the search space (exploration) versus further investigating already known regions to obtain low-energy structures (exploitation) [16].
Because of the aforementioned limitations, current heuristic optimization algorithms can each solve only a limited set of problems; that is, no single algorithm is capable of solving all optimization problems [17]. A hybrid optimization algorithm is therefore a way to balance comprehensive exploration against exploitation when designing a new algorithm. Owing to its simplicity, convergence speed, and ability to search for a global optimum, FA is one of the most extensively utilized evolutionary algorithms in hybrid approaches.
Various studies in the literature have combined FA with other algorithms. In hybrid FireFly Particle Swarm Optimization (FFPSO) [18], the main feature contributed by FA is the brightness-based attractiveness that models the optimal solution and improves exploitation, since FA has no velocity vector (V) or personal-best (pbest) terms; PSO ([12] and [13]), by contrast, derives the velocity from two components, the best personal position (pbest) and the best global position (gbest) [19]. Firefly Algorithm Differential Evolution (FADE) enhances search efficiency by executing FA and DE in parallel to promote information sharing within the firefly population [20]. The Hybrid FireFly Algorithm and Cuckoo Search (HFFACS) balances exploration and exploitation with simplicity and low computational cost across a wide range of problems [21]. These hybrid algorithms are designed to reduce the likelihood of being trapped in a local optimum. GSA, a more recent heuristic optimization method, was introduced in [22]. This study introduces a new hybrid model that combines FA and GSA, named FAGSA, and uses twenty-three benchmark functions to compare the performance of FA and GSA with the new hybrid algorithm.
Instead of traditional feature extraction and optimization techniques, several metaheuristics have been developed for feature optimization, such as the Firefly Sequential Quadratic Programming (FaSqp) algorithm introduced by [23], the FFPSO algorithm introduced by [18], and the FADE algorithm introduced by [20]. For real-world optimization problems, metaheuristic algorithms have been shown to outperform gradient-based techniques. According to [24], the firefly algorithm is one of the optimization algorithms that handles multimodal functions naturally and efficiently. Fireflies are created artificially and deployed randomly in the decision space; each firefly then signals to the others through a flashing mechanism. This paper introduces the metaheuristic FAGSA as a feature extraction method. The FA and GSA sections below contain a detailed formulation and explanation of each algorithm.

Standard Firefly Algorithm
FA is an evolutionary computation algorithm proposed by X.S. Yang in 2007. It was developed by simulating the brightness (mating) behavior of fireflies. Although this algorithm is similar to Particle Swarm Optimization (PSO) ([12] and [13]), Artificial Bee Colony (ABC) Optimization [14], and Ant Colony Optimization (ACO) [11], it is significantly easier to implement [21].
Fireflies are small insects that emit a bright light to attract other fireflies and to hunt prey [25]. They emit short series of light flashes in a repetitive pattern. From elementary physics, the intensity of light I is inversely proportional to the square of the distance r from the source, so the attractiveness of a firefly decreases as the distance r increases; hence most fireflies can be seen only from a few hundred meters away. The fitness function in this algorithm is built on the brightness of the light emitted by the fireflies. For simplicity, it is assumed that a firefly's attractiveness is determined by its brightness I, which correlates with the fitness function.

a. Attractiveness and light intensity
The brightness of a firefly is determined from the encoded objective function according to the equations below. For a minimization problem,

Min f(x), x = (x_1, ..., x_d)^T (1)

where x is a position in the d-dimensional search space. The light intensity I(r) seen at distance r from a source of intensity I_0 follows the inverse square law and, with a fixed light absorption coefficient γ, can be approximated as:

I(r) = I_0 e^(-γ r^2) (2)

The attractiveness β of a firefly, as perceived by nearby fireflies at distance r, is proportional to I(r) and is defined as:

β(r) = β_0 e^(-γ r^2) (3)

where β_0 is the attractiveness at r = 0, based on brightness. In theory γ ∈ [0, ∞), but in practice γ = O(1) is determined by the characteristic length Γ of the system to be optimized; in most applications it varies between 0.01 and 100. The distance r between fireflies i and j at positions x_i and x_j is estimated using the following formula:

r_ij = ||x_i − x_j|| = sqrt( Σ_{k=1}^{d} (x_{i,k} − x_{j,k})^2 ), j > i (4)

where d is the dimension of the search space. The movement of firefly i attracted to a brighter (more attractive) firefly j is determined by:

x_i(t+1) = x_i(t) + β_0 e^(−γ r_ij^2) (x_j(t) − x_i(t)) + α ε_i (5)

If there is no brighter firefly j (β_0 = 0), the firefly moves randomly (a simple random walk). In the pseudo-code for the firefly algorithm, the solutions are sorted at each iteration and the best solutions found so far are retained.
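As an illustration (the paper's own implementation is in MATLAB, so this is only a sketch), the movement rule of Eqs. (3)-(5) can be written in Python/NumPy. The function name firefly_step and the parameter defaults are ours, not the authors':

```python
import numpy as np

def firefly_step(X, fitness, beta0=1.0, gamma=1.0, alpha=0.2, rng=None):
    """One iteration of the standard FA movement rule (Eqs. 3-5):
    each firefly i moves toward every brighter firefly j with
    attractiveness beta0 * exp(-gamma * r_ij^2), plus a random walk."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    X_new = X.copy()
    for i in range(n):
        # minimization: a lower fitness value means a brighter firefly
        brighter = np.where(fitness < fitness[i])[0]
        if brighter.size == 0:
            # no brighter firefly: simple random walk
            X_new[i] += alpha * (rng.random(d) - 0.5)
            continue
        for j in brighter:
            r2 = np.sum((X[i] - X[j]) ** 2)       # squared distance r_ij^2 (Eq. 4)
            beta = beta0 * np.exp(-gamma * r2)    # attractiveness (Eq. 3)
            X_new[i] += beta * (X[j] - X_new[i]) + alpha * (rng.random(d) - 0.5)  # Eq. 5
    return X_new
```

Repeatedly applying this step to a population, while tracking the best solution found, gives the basic FA loop described above.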

Standard Gravitational Search Algorithm
The authors in [10] and [26] proposed GSA, a heuristic optimization method based on Isaac Newton's law of gravitation: every particle in the universe attracts every other particle with a gravitational force directly proportional to the product of their masses and inversely proportional to the square of the distance between them. The GSA is modeled mathematically as follows. Assume there are N particles in a system. The process begins by distributing all particles in the search space at random. The gravitational force exerted by particle j on particle i in dimension d at time t is defined as [10]:

F_ij^d(t) = G(t) (M_pi(t) × M_aj(t)) / (R_ij(t) + ɛ) (x_j^d(t) − x_i^d(t)) (8)

where M_aj is the active gravitational mass of particle j, M_pi is the passive gravitational mass of particle i, G(t) is the gravitational constant at time t, ɛ is a small constant, and R_ij is the Euclidean distance between particles i and j:

R_ij(t) = ||X_i(t), X_j(t)||_2 (9)

The gravitational constant decreases over time:

G(t) = G_0 e^(−α · iter / maxiter) (10)

where α and G_0 represent the descending coefficient and the initial value, iter is the current iteration, and maxiter is the maximum number of iterations. The total force acting on particle i in dimension d is a randomly weighted sum of the forces exerted by the other particles:

F_i^d(t) = Σ_{j=1, j≠i}^{N} rand_j F_ij^d(t) (11)

By Newton's law of motion, the acceleration produced in an object by a net force is inversely proportional to its mass, so the acceleration of each particle is computed as:

a_i^d(t) = F_i^d(t) / M_i(t) (12)

In equation (12), t represents a specific time and M_i represents the mass of object i. The following equations are used to update the velocity and position of the objects:

v_i^d(t+1) = rand_i × v_i^d(t) + a_i^d(t) (13)
x_i^d(t+1) = x_i^d(t) + v_i^d(t+1) (14)

where rand_i is uniformly distributed in the interval [0, 1].
In each iteration, Equations (15) and (16) are used to update the mass of each agent i from its fitness:

m_i(t) = (fit_i(t) − worst(t)) / (best(t) − worst(t)) (15)
M_i(t) = m_i(t) / Σ_{j=1}^{N} m_j(t) (16)

where fit_i(t) is the fitness of agent i, and best(t) and worst(t) are the best and worst fitness values in the population at time t.
First, random values are assigned to the GSA masses; each mass is a potential solution. After initialization, the velocities of all masses are updated using Equation (13), while the gravitational constant, total forces, and accelerations are computed using Equations (10), (11), and (12), respectively. The positions of the masses are updated using Equation (14). Finally, the GSA terminates once the stopping criteria are met.
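The GSA steps described above can be sketched in Python/NumPy as follows. This is an illustrative reconstruction, not the paper's MATLAB code; the function name gsa_step and the defaults (taken from the experimental settings later in the paper, G0 = 100, α = 20) are our choices:

```python
import numpy as np

def gsa_step(X, V, fitness, t, max_iter, G0=100.0, alpha=20.0, eps=1e-12, rng=None):
    """One GSA iteration: masses from fitness, decaying gravitational
    constant, randomly weighted pairwise forces, accelerations
    a_i = F_i / M_i, then velocity and position updates."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    # Masses: normalized so the best (minimum) fitness gets the largest mass
    worst, best = fitness.max(), fitness.min()
    m = (worst - fitness) / (worst - best + eps)
    M = m / (m.sum() + eps)
    G = G0 * np.exp(-alpha * t / max_iter)      # gravitational constant decays over time
    A = np.zeros_like(X)
    for i in range(n):
        F = np.zeros(d)
        for j in range(n):
            if j == i:
                continue
            R = np.linalg.norm(X[i] - X[j])     # Euclidean distance R_ij
            # randomly weighted force of particle j on particle i
            F += rng.random() * G * M[i] * M[j] / (R + eps) * (X[j] - X[i])
        A[i] = F / (M[i] + eps)                 # acceleration a_i = F_i / M_i
    V = rng.random((n, 1)) * V + A              # velocity update
    X = X + V                                   # position update
    return X, V
```

Iterating this step until the iteration budget is exhausted reproduces the GSA loop; the acceleration array A is also the quantity that FAGSA borrows in the next section.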

8: Repeat steps 3-7 until the stopping criterion is met.

Proposed approach: FAGSA Algorithm
In this section, the acceleration component of the GSA is incorporated into the FA, in a spirit similar to previous work in [18], [20], and [23]. This integration improves convergence and helps prevent the search from falling into local minima. The FA has an advantage over other optimization techniques in its simplicity: only a few parameters need to be tuned. The FAGSA algorithm works in the same way as the FA, except that the FA position update is modified as follows:

x_i(t+1) = x_i(t) + β_0 e^(−γ r^2) (x_j(t) − x_i(t)) + a_i^d(t) + α ε_i (17)

If there is no brighter firefly (β_0 = 0), the firefly moves randomly (a simple random walk) while still carrying the GSA acceleration term:

x_i(t+1) = x_i(t) + a_i^d(t) + α ε_i (18)

The pseudo-code for the proposed algorithm is given below.
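A minimal Python sketch of the hybrid update of Eqs. (17)-(18) is shown below (again, an illustration rather than the authors' MATLAB implementation; fagsa_step and its defaults are our naming). The array A holds the per-firefly accelerations computed by the GSA part:

```python
import numpy as np

def fagsa_step(X, A, fitness, beta0=1.0, gamma=1.0, alpha=0.2, rng=None):
    """FAGSA position update: the FA attraction term plus the GSA
    acceleration a_i^d(t) (Eq. 17); when no brighter firefly exists,
    a random walk plus the acceleration term (Eq. 18)."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    X_new = X.copy()
    for i in range(n):
        brighter = np.where(fitness < fitness[i])[0]  # lower fitness = brighter
        if brighter.size == 0:
            X_new[i] += A[i] + alpha * (rng.random(d) - 0.5)   # Eq. 18
            continue
        for j in brighter:
            r2 = np.sum((X[i] - X[j]) ** 2)
            beta = beta0 * np.exp(-gamma * r2)
            # FA attraction + GSA acceleration + random walk (Eq. 17)
            X_new[i] += beta * (X[j] - X_new[i]) + A[i] + alpha * (rng.random(d) - 0.5)
    return X_new
```

Setting A to zeros recovers plain FA behavior, which makes the role of the borrowed acceleration term easy to isolate when experimenting.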

2: Set the initial values of the randomization parameter α, the firefly attractiveness β0, the size of the firefly population P, and the maximum generation number T.

4: Compute the fitness of the initial population based on the intensity of the fireflies' light.

13: Else
14: Move firefly i randomly (Eq. 18).

A Firefly Algorithm (FA) operator mutates the light-intensity attraction step of each particle in the proposed method; that is, each particle is attracted at random toward the best position in the entire group. The modified attractiveness step of FAGSA performs local searches in different regions. The main goal of the FAGSA feature selection stage is to reduce the problem's features before supervised neural network classification. FAGSA is distinguished by its applicability to optimization problems through firefly flashing behavior, making it a promising wrapper algorithm among those considered. To demonstrate FAGSA's efficiency, the following remarks are made. The quality of the results and of good solutions (fitness) is taken into account when FAGSA updates its population: particles near good solutions attract others that explore the search space, and when all particles get close to good solutions they move slowly. In this case, the acceleration property of the GSA is combined with the FA to exploit the best solution and increase the acceleration toward the brightest firefly. The best FAGSA solutions are stored in memory, where they can be retrieved at any time; therefore, each particle can be observed and attracted toward the optimal solution.

Experimental results and discussion
The new hybrid (FAGSA), FA, and GSA models are implemented individually in MATLAB R2015a. FAGSA's performance is evaluated on twenty-three standard benchmark functions used for testing FA and GSA. Tables 1, 2, 3, and 4 list these benchmark functions, their dimensions, their search space ranges, and their optimal fitness values; a detailed description of the functions is available in [27]. The objective in this paper is minimization. Several parameters must therefore be initialized for FA, GSA, and FAGSA. FA was run with the following parameters: absorption coefficient gamma = 0.01, randomness reduction factor theta = 10^(-5/maxiter) ≈ 0.97, attractiveness constant (light amplitude) beta0 = 1.0, and a maximum of maxiter = 1000 iterations, stopping when the iteration count reaches maxiter. For GSA and FAGSA, the following settings were used: population size = 50, G0 = 100, α = 20, and a maximum of 1000 iterations, again stopping at maxiter. FA dominated FAGSA in terms of best fitness on two functions (F1 and F15) and in terms of best mean on nine functions (F2, F3, F5, F6, F10, F11, F13, F14, and F16).
GSA dominated FAGSA in terms of best fitness values on only eight functions (F8, F9, F14, F16-F18, F20, and F23) and in terms of best mean values on only nine functions (F8, F15-F17, and F19-F23).
That is, for the best fitness values over fifty runs on the twenty-three fitness functions, FAGSA performs best on seventeen functions, FA performs best on two, GSA performs best on four, and the algorithms reach the same best value on the remaining four functions (F16-F18, F20, and F23).
As a result, FAGSA performs best on nearly twice as many functions as FA and GSA. FAGSA achieves the global minimum on all benchmark functions except F8, F14, F15, and F19 for the best fitness over 50 runs. F19 is a noisy problem, on which all algorithms exhibit similar behavior.
FAGSA, like FA and GSA, can find the global minima within 50 runs for the functions F8, F22, and F23. FAGSA generally performs better on functions with wide domains, and F22 and F23 both have narrow domains. Among the high-dimensional functions (F1 to F13), FAGSA finds the global minima only on F8 and F12, which implies that FAGSA performs well on high-dimensional functions. Figures 1, 2, and 3 show, for the selected functions, that the new FAGSA algorithm outperforms the standard FA and GSA in terms of convergence rate.

Conclusion
This paper proposes a new numerical optimization method based on hybridizing two methods, namely FA and GSA. In most cases, the proposed method gives better results than either original method. The core concept of the new approach is to combine the exploitation capability of the Firefly Algorithm with the exploration capability of the Gravitational Search Algorithm. Twenty-three benchmark functions are used to validate the performance of the new FAGSA algorithm in comparison with standard FA and GSA. According to the results, FAGSA performs better than both FA and GSA on most minimization tasks. The results also show that FAGSA converges faster than FA and GSA.
The main findings of this study can be summarized as follows:
• The new numerical hybrid optimization technique FAGSA, based on FA and GSA, is proposed in this paper to address energy consumption forecasting in residential households.
• The performance of the proposed algorithm is compared with other well-known optimization techniques, namely FA, GSA, and PSO, using the benchmark objective functions.
• The computation time is almost the same for all algorithms analyzed in this study, so the complexity of the proposed algorithm remains comparable to that of the basic algorithms used as benchmarks for evaluating performance.
Hence, it can be inferred that FAGSA outperforms the other optimization techniques (FA, GSA, and PSO) and can therefore be successfully applied to energy consumption forecasting.
• We hereby confirm that all Figures and Tables in the manuscript are ours. Figures and images that are not ours have been given permission for re-publication and are attached with the manuscript.
• The author has signed an animal welfare statement.
• The authors have signed the ethical considerations approval.
• Ethical Clearance: The project was approved by the local ethical committee at Cairo University.

Authors' Contribution:
A. Mohamed, Z. Hegazy, Naglaa, and O. Eyman contributed to the design and implementation of the research, the analysis of the results, and the writing of the manuscript (Mohamed 50%, Hegazy 20%, Naglaa 15%, and Eyman 15%). All authors discussed the results and contributed to the final manuscript.
In Equation (5), x_i(t) is the current position of firefly i, β_0 e^(−γ r^2) (x_j(t) − x_i(t)) is the attractiveness term (the attraction toward the neighboring firefly x_j), and α ε_i is the random-walk term, where α ∈ [0, 1] is the randomization parameter and ε_i is Gaussian or uniformly distributed in [0, 1]. If the scales differ significantly across dimensions, such as −10^5 to 10^5 in one dimension and, say, −0.001 to 0.01 in another, it is recommended to replace α with α S_k, where S_k (k = 1, ..., d) are scaling parameters.

Figure 3: Average best of benchmark functions F14 and F18.

Table 4: Results of the twenty-three benchmark functions.

Table 4 shows the experimental results. The results are averages of 50 runs, ranked according to dominated and non-dominated rules, with the best results highlighted in bold.