Hybrid Whale Optimization Algorithm with Modified Conjugate Gradient Method to Solve Global Optimization Problems


1. Introduction

Optimization is the branch of knowledge concerned with finding the best solution to a given problem from among a set of alternatives.

Methods for solving optimization problems are divided into two classes: deterministic algorithms and stochastic algorithms.

Most classical algorithms are deterministic. For example, the Simplex method in linear programming is a deterministic algorithm. Some deterministic algorithms use gradient information and are called gradient-based algorithms; the Newton-Raphson algorithm, for example, is based on the slope (derivative) [1].

Stochastic algorithms, in turn, fall into two types whose difference is small: heuristic algorithms and meta-heuristic algorithms [2].

The Whale Optimization Algorithm (WOA) is inspired by the food-search and hunting behavior of humpback whales and was first proposed by Mirjalili and Lewis (2016) [3].

In the same year, WOA was improved by Trivedi et al., who incorporated an adaptive technique into the algorithm (adaptive WOA) [4].

In the same year, Touma studied the economic dispatch problem on the IEEE 30-Bus system using the whale optimization algorithm, which gave good results compared with other algorithms [5].

In 2017, Hu et al. proposed an improved whale optimization algorithm with inertia weights (WOAs). The new algorithm was tested on 27 functions and applied to predicting the daily air quality index; it proved efficient compared with other algorithms [6].

In the same year, Prakash and Lakshminarayana used this algorithm to determine the optimal location and size of capacitors in a radial distribution network in order to reduce line losses, since placing capacitors at optimal locations improves system performance, stability, and reliability [7].

In the same year, Desuky used the whale optimization algorithm to improve two-level classification of male fertility. Diseases and health problems that were once common only among the elderly have recently become common among young people, and some of their causes are behavioral, environmental, and lifestyle factors. The whale optimization algorithm was combined with the Pegasos algorithm to enhance the two-level male fertility classification, and this integration improved the results by 90% [8].

The whale optimization algorithm was also used in the same year by Reddy et al. to optimally place renewable resources in order to reduce losses in electricity distribution systems [9].

In the same year, Mafarja and Mirjalili hybridized the whale optimization algorithm with the simulated annealing algorithm and applied it to classification. The results confirmed the efficiency of the hybrid algorithm in improving classification accuracy [10].

The aim of this research is to propose a new hybrid algorithm that combines the whale optimization algorithm with a modified conjugate gradient method (WOA-MCG). Table 1 defines the variables used in this study.

2. Conjugate Gradient Method

In unconstrained optimization, we minimize an objective function that depends on real variables, with no restrictions on the values of those variables. The unconstrained optimization problem is:

$\mathrm{min}f\left(x\right):x\in {R}^{n}$, (1)

where $f:{R}^{n}\to R$ is a continuously differentiable function, bounded from below. A nonlinear conjugate gradient method generates a sequence $\left\{{x}_{k}\right\}$, $k\ge 0$ an integer. Starting from an initial point ${x}_{0}$, the iterate ${x}_{k+1}$ is calculated by the following equation:

${x}_{k+1}={x}_{k}+{\lambda}_{k}{d}_{k}$, (2)

where the positive step size ${\lambda}_{k}>0$ is obtained by a line search, and the directions ${d}_{k}$ are generated as:

${d}_{k+1}=-{g}_{k+1}+{\beta}_{k}{d}_{k}$, (3)

where ${d}_{0}=-{g}_{0}$ and the value of ${\beta}_{k}$, known as the conjugate gradient parameter, is determined according to the particular conjugate gradient (CG) algorithm. Here ${s}_{k}={x}_{k+1}-{x}_{k}$, ${g}_{k}=\nabla f\left({x}_{k}\right)={f}^{\prime}\left({x}_{k}\right)$, ${y}_{k}={g}_{k+1}-{g}_{k}$, and $\Vert .\Vert $ denotes the Euclidean norm. The termination conditions for the conjugate gradient line search are often based on some version of the Wolfe conditions. The standard Wolfe conditions are:

$f\left({x}_{k}+{\lambda}_{k}{d}_{k}\right)-f\left({x}_{k}\right)\le \rho {\lambda}_{k}{g}_{k}^{\text{T}}{d}_{k}$, (4)

$g{\left({x}_{k}+{\lambda}_{k}{d}_{k}\right)}^{\text{T}}{d}_{k}\ge \sigma {g}_{k}^{\text{T}}{d}_{k}$, (5)

Table 1. Represents a definition of the variables used.

where ${d}_{k}$ is a descent search direction and $0<\rho \le \sigma <1$. The parameter ${\beta}_{k}$ is defined by one of the following formulas:

${\beta}_{k}^{\left(HS\right)}=\frac{{y}_{k}^{\text{T}}{g}_{k+1}}{{y}_{k}^{\text{T}}{d}_{k}};\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\beta}_{k}^{\left(FR\right)}=\frac{{g}_{k+1}^{\text{T}}{g}_{k+1}}{{g}_{k}^{\text{T}}{g}_{k}};\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\beta}_{k}^{\left(PRP\right)}=\frac{{y}_{k}^{\text{T}}{g}_{k+1}}{{g}_{k}^{\text{T}}{g}_{k}}$ (6)

${\beta}_{k}^{\left(CD\right)}=-\frac{{g}_{k+1}^{\text{T}}{g}_{k+1}}{{g}_{k}^{\text{T}}{d}_{k}};\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\beta}_{k}^{\left(LS\right)}=-\frac{{y}_{k}^{\text{T}}{g}_{k+1}}{{g}_{k}^{\text{T}}{d}_{k}};\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\beta}_{k}^{\left(DY\right)}=\frac{{g}_{k+1}^{\text{T}}{g}_{k+1}}{{y}_{k}^{\text{T}}{s}_{k}}$ (7)

Al-Bayati and Al-Assady (1986) proposed three forms for the scalar ${\beta}_{k}$ defined by:

${\beta}_{k}^{AB1}=\frac{{\Vert {y}_{k}\Vert}^{2}}{{\Vert {g}_{k}\Vert}^{2}};\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\beta}_{k}^{AB2}=-\frac{{\Vert {y}_{k}\Vert}^{2}}{{d}_{k}^{\text{T}}{g}_{k}};\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\beta}_{k}^{AB3}=\frac{{\Vert {y}_{k}\Vert}^{2}}{{d}_{k}^{\text{T}}{y}_{k}}$ (8) [11]
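As an illustration, the conjugacy coefficients in Equations (6)-(8) can be collected in a small helper. This is a sketch, not code from the paper: the gradients and directions are assumed to be NumPy vectors, and no safeguards against zero denominators are included.

```python
import numpy as np

def cg_betas(g, g_new, d, s):
    """Classical conjugate gradient parameters, Equations (6)-(8).

    g, g_new : gradients at x_k and x_{k+1}; d : search direction d_k;
    s : step s_k = x_{k+1} - x_k; y = g_new - g.
    """
    y = g_new - g
    return {
        "HS":  (y @ g_new) / (y @ d),
        "FR":  (g_new @ g_new) / (g @ g),
        "PRP": (y @ g_new) / (g @ g),
        "CD":  -(g_new @ g_new) / (g @ d),
        "LS":  -(y @ g_new) / (g @ d),
        "DY":  (g_new @ g_new) / (y @ s),
        "AB1": (y @ y) / (g @ g),
        "AB2": -(y @ y) / (d @ g),
        "AB3": (y @ y) / (d @ y),
    }
```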

3. Proposed a New Conjugacy Coefficient

We have the quasi-Newton condition

${y}_{k}={G}_{k}{s}_{k}$ (9)

We multiply both sides of Equation (9) by ${s}_{k}^{\text{T}}$ and get

${s}_{k}^{\text{T}}{y}_{k}={s}_{k}^{\text{T}}{G}_{k}{s}_{k}$. Treating ${G}_{k}$ as a scalar multiple G of the identity matrix, this reads ${y}_{k}^{\text{T}}{s}_{k}=G{s}_{k}^{\text{T}}{s}_{k}$, so that

$G=\frac{{y}_{k}^{\text{T}}{s}_{k}}{{\Vert {s}_{k}\Vert}^{2}}\cdot {I}_{n\times n}$ (10)

Let ${d}_{k+1}^{N}=-\lambda {G}_{k}{g}_{k+1}$ (11)

${d}_{k+1}^{N}=-\lambda \frac{{y}_{k}^{\text{T}}{s}_{k}}{{\Vert {s}_{k}\Vert}^{2}}{g}_{k+1}$ (12)

Multiplying both sides of Equation (12) by ${y}_{k}^{\text{T}}$ gives

${y}_{k}^{\text{T}}{d}_{k+1}^{N}=-\lambda \left[\frac{{y}_{k}^{\text{T}}{s}_{k}}{{\Vert {s}_{k}\Vert}^{2}}\right]{y}_{k}^{\text{T}}{g}_{k+1}$ (13)

while multiplying Equation (3) by ${y}_{k}^{\text{T}}$ gives:

${y}_{k}^{\text{T}}{d}_{k+1}^{CG}=-{y}_{k}^{\text{T}}{g}_{k+1}+{\beta}_{k}{d}_{k}^{\text{T}}{y}_{k}$ (14)

From (13) and (14) we have

$-{y}_{k}^{\text{T}}{g}_{k+1}+{\beta}_{k}{d}_{k}^{\text{T}}{y}_{k}=-\lambda \left[\frac{{y}_{k}^{\text{T}}{s}_{k}}{{\Vert {s}_{k}\Vert}^{2}}\right]{y}_{k}^{\text{T}}{g}_{k+1}$ (15)

We assume that ${\beta}_{k}={\beta}_{k}^{\left(DY\right)}=\frac{{g}_{k+1}^{\text{T}}{g}_{k+1}}{{y}_{k}^{\text{T}}{d}_{k}}$

Then we have

$-{y}_{k}^{\text{T}}{g}_{k+1}+{\beta}_{k}^{DY}{d}_{k}^{\text{T}}{y}_{k}=-\lambda \left[\frac{{y}_{k}^{\text{T}}{s}_{k}}{{\Vert {s}_{k}\Vert}^{2}}\right]{y}_{k}^{\text{T}}{g}_{k+1}$ (16)

$-{y}_{k}^{\text{T}}{g}_{k+1}+\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{d}_{k}^{\text{T}}{y}_{k}}{d}_{k}^{\text{T}}{y}_{k}=-\lambda \left[\frac{{y}_{k}^{\text{T}}{s}_{k}}{{\Vert {s}_{k}\Vert}^{2}}\right]{y}_{k}^{\text{T}}{g}_{k+1}$ (17)

Replacing the term ${\Vert {g}_{k+1}\Vert}^{2}$ in Equation (17) by ${\beta}_{k}{\tau}_{k}$, where ${\tau}_{k}$ is a positive scalar to be chosen, we get:

$-{y}_{k}^{\text{T}}{g}_{k+1}+{\beta}_{k}{\tau}_{k}=-\lambda \left[\frac{{y}_{k}^{\text{T}}{s}_{k}}{{\Vert {s}_{k}\Vert}^{2}}\right]{y}_{k}^{\text{T}}{g}_{k+1}$ (18)

Solving Equation (18) for ${\beta}_{k}$ and choosing $\lambda =\frac{{\Vert {s}_{k}\Vert}^{2}}{2\left[{f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right]}$, we have

${\beta}_{k}=\frac{-\left[\frac{{\Vert {s}_{k}\Vert}^{2}}{2\left[{f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right]}\right]\left[\frac{{y}_{k}^{\text{T}}{s}_{k}}{{\Vert {s}_{k}\Vert}^{2}}\right]{y}_{k}^{\text{T}}{g}_{k+1}+{y}_{k}^{\text{T}}{g}_{k+1}}{{\tau}_{k}}$ (19)

${\beta}_{k}=\frac{-\left[\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]{y}_{k}^{\text{T}}{g}_{k+1}+{y}_{k}^{\text{T}}{g}_{k+1}}{{\tau}_{k}}$ (20)

${\beta}_{k}=\frac{\left[1-\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]{y}_{k}^{\text{T}}{g}_{k+1}}{{\tau}_{k}}$ (21)

Since ${\tau}_{k}>0$, we take ${\tau}_{k}={\Vert {g}_{k}\Vert}^{2}$; then:

${\beta}_{k}=\frac{\left[1-\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]{y}_{k}^{\text{T}}{g}_{k+1}}{{\Vert {g}_{k}\Vert}^{2}}$. (22)
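The proposed coefficient in Equation (22) can be transcribed directly. This sketch assumes NumPy vectors and scalar objective values; the variable names (`f_k`, `g_k1`, etc.) are illustrative, and no guard against a vanishing denominator is added.

```python
import numpy as np

def beta_new(f_k, f_k1, g_k, g_k1, s):
    """Proposed conjugacy coefficient of Equation (22).

    f_k, f_k1 : objective values at x_k and x_{k+1};
    g_k, g_k1 : gradients at x_k and x_{k+1};
    s : step s_k = x_{k+1} - x_k; y = g_k1 - g_k.
    """
    y = g_k1 - g_k
    scale = 1.0 - (y @ s) / (2.0 * (f_k - f_k1 + g_k1 @ s))
    return scale * (y @ g_k1) / (g_k @ g_k)
```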

3.1. Outlines of the Proposed Algorithm

Step (1): *The initial step*: Select the starting point ${x}_{0}\in {R}^{n}$ and the solution accuracy $\epsilon >0$ (a small positive real number); set ${d}_{0}=-{g}_{0}$, ${\lambda}_{0}=Minary\left({g}_{0}\right)$, and $k=0$.

Step (2): *The convergence test*: If $\Vert {g}_{k}\Vert \le \epsilon $, stop; the optimal solution is ${x}_{k}$. Else, go to step (3).

Step (3): *The line search*: Compute the value of ${\lambda}_{k}$ by the cubic method such that it satisfies the Wolfe conditions in Equations (4), (5), and go to step (4).

Step (4): *Update the variables*:
${x}_{k+1}={x}_{k}+{\lambda}_{k}{d}_{k}$ and compute
$f\left({x}_{k+1}\right),{g}_{k+1}$ and
${s}_{k}={x}_{k+1}-{x}_{k}$,
${y}_{k}={g}_{k+1}-{g}_{k}$.

Step (5): *Check*: if
$\Vert {g}_{k+1}\Vert \le \epsilon $ then stop. Else continue.

Step (6): *The search direction*: Compute the scalar ${\beta}_{k}^{\left(New\right)}$ using Equation (22), compute the new direction ${d}_{k+1}$ from Equation (3), set $k=k+1$, and go to step (3).
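The steps above can be sketched as follows. The paper computes ${\lambda}_{k}$ by a cubic line search satisfying the Wolfe conditions; as an assumption for this sketch, a simple backtracking (Armijo) search stands in for it, and the function names are illustrative.

```python
import numpy as np

def mcg_minimize(f, grad, x0, eps=1e-6, max_iter=500):
    """Sketch of the modified CG algorithm of Section 3.1
    with the conjugacy coefficient of Equation (22)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                     # Step 1: d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:           # Steps 2/5: convergence test
            break
        lam, rho = 1.0, 1e-4                   # Step 3: backtracking Armijo search
        while f(x + lam * d) > f(x) + rho * lam * (g @ d) and lam > 1e-12:
            lam *= 0.5
        x_new = x + lam * d                    # Step 4: update the iterate
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        denom = 2.0 * (f(x) - f(x_new) + g_new @ s)
        beta = 0.0                             # Step 6: Equation (22)
        if abs(denom) > 1e-12 and g @ g > 1e-12:
            beta = (1.0 - (y @ s) / denom) * (y @ g_new) / (g @ g)
        d = -g_new + beta * d                  # Equation (3)
        x, g = x_new, g_new
    return x
```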

3.2. Flowchart of Conjugated Gradient Algorithm

Figure 1 shows the flowchart of the standard conjugated gradient method.

3.3. Theoretical Properties for the New CG-Method

In this section, we focus on the convergence behavior of the ${\beta}_{k}^{New}$ method with exact line searches. Hence, we make the following basic assumption on the objective function.

Figure 1. Flowchart of the standard conjugated gradient method.

Assumption (1):

(i) f is bounded below in the level set ${L}_{{x}_{0}}=\left\{x\in {R}^{n}|f\left(x\right)\le f\left({x}_{0}\right)\right\}$ ; (ii) in some neighborhood U of the level set ${L}_{{x}_{0}}$, f is continuously differentiable and its gradient $\nabla f$ is Lipschitz continuous on ${L}_{{x}_{0}}$, namely, there exists a constant L > 0 such that:

$\Vert \nabla f\left(x\right)-\nabla f\left(y\right)\Vert \le L\Vert x-y\Vert $ forall $x,y\in {L}_{{x}_{0}}$. (23)

3.3.1. Sufficient Descent Property

In this section, we show that the proposed algorithm defined by Equations (22) and (3) satisfies the sufficient descent property, which is required for the convergence analysis.

Theorem (1):

The search direction ${d}_{k}$ generated by the proposed modified CG algorithm satisfies the descent property for all k when the step size ${\lambda}_{k}$ satisfies the Wolfe conditions (4), (5).

Proof: We prove the descent property by induction. For $k=0$, ${d}_{0}=-{g}_{0}\Rightarrow {d}_{0}^{\text{T}}{g}_{0}=-{\Vert {g}_{0}\Vert}^{2}<0$, so the theorem holds for $k=0$. We assume that $\Vert {s}_{k}\Vert \le \eta $, $\Vert {g}_{k+1}\Vert \le \Gamma $ and $\Vert {g}_{k}\Vert \le \eta 2$, and assume that the theorem is true for some k, i.e. ${d}_{k}^{\text{T}}{g}_{k}<0$ or, equivalently, ${s}_{k}^{\text{T}}{g}_{k}<0$ since ${s}_{k}={\lambda}_{k}{d}_{k}$. We now prove that the theorem is true for $k+1$:

${d}_{k+1}=-{g}_{k+1}+{\beta}_{k}^{\left(New\right)}{d}_{k}$ (24)

i.e. ${d}_{k+1}=-{g}_{k+1}+\frac{\left[1-\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]{y}_{k}^{\text{T}}{g}_{k+1}}{{\Vert {g}_{k}\Vert}^{2}}{d}_{k}$ (25)

Multiplying both sides of Equation (25) by ${g}_{k+1}^{\text{T}}$ we get:

${g}_{k+1}^{\text{T}}{d}_{k+1}=-{\Vert {g}_{k+1}\Vert}^{2}+\frac{\left[1-\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]{y}_{k}^{\text{T}}{g}_{k+1}}{{\Vert {g}_{k}\Vert}^{2}}{g}_{k+1}^{\text{T}}{d}_{k}$ (26)

Dividing both sides by ${\Vert {g}_{k+1}\Vert}^{2}$ :

$\frac{{g}_{k+1}^{\text{T}}{d}_{k+1}+{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {g}_{k+1}\Vert}^{2}}=\frac{\left[1-\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]{y}_{k}^{\text{T}}{g}_{k+1}}{{\Vert {g}_{k}\Vert}^{2}}\frac{{g}_{k+1}^{\text{T}}{d}_{k}}{{\Vert {g}_{k+1}\Vert}^{2}}$ (27)

$\frac{{g}_{k+1}^{\text{T}}{d}_{k+1}+{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {g}_{k+1}\Vert}^{2}}\le \frac{\left[1-\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]\Vert {y}_{k}\Vert \Vert {g}_{k+1}\Vert}{{\Vert {g}_{k}\Vert}^{2}}\frac{\Vert {g}_{k+1}\Vert \Vert {d}_{k}\Vert}{{\Vert {g}_{k+1}\Vert}^{2}}$ (28)

$\frac{{g}_{k+1}^{\text{T}}{d}_{k+1}+{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {g}_{k+1}\Vert}^{2}}\le \frac{\left[1-\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]\Vert {y}_{k}\Vert \Vert {d}_{k}\Vert}{{\Vert {g}_{k}\Vert}^{2}}$ (29)

$\frac{{g}_{k+1}^{\text{T}}{d}_{k+1}+{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {g}_{k+1}\Vert}^{2}}\le \frac{\left[1-\frac{\Vert {y}_{k}\Vert \Vert {s}_{k}\Vert}{2\left({f}_{k}-{f}_{k+1}+\Vert {g}_{k+1}\Vert \Vert {s}_{k}\Vert \right)}\right]\Vert {y}_{k}\Vert \Vert {d}_{k}\Vert}{{\Vert {g}_{k}\Vert}^{2}}$ (30)

$\frac{{g}_{k+1}^{\text{T}}{d}_{k+1}+{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {g}_{k+1}\Vert}^{2}}\le \frac{\Vert {y}_{k}\Vert \Vert {d}_{k}\Vert}{{\Vert {g}_{k}\Vert}^{2}}$ (31)

$\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{g}_{k+1}^{\text{T}}{d}_{k+1}+{\Vert {g}_{k+1}\Vert}^{2}}\ge \frac{{\Vert {g}_{k}\Vert}^{2}}{\Vert {y}_{k}\Vert \Vert {d}_{k}\Vert}=\delta >1$ (32)

$\frac{{g}_{k+1}^{\text{T}}{d}_{k+1}+{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {g}_{k+1}\Vert}^{2}}\le \frac{1}{\delta}$ (33)

${g}_{k+1}^{\text{T}}{d}_{k+1}+{\Vert {g}_{k+1}\Vert}^{2}\le \frac{1}{\delta}{\Vert {g}_{k+1}\Vert}^{2}$ (34)

${g}_{k+1}^{\text{T}}{d}_{k+1}\le -\left(1-\frac{1}{\delta}\right){\Vert {g}_{k+1}\Vert}^{2}$

Let $c=1-\frac{1}{\delta}$ (35)

Then ${g}_{k+1}^{\text{T}}{d}_{k+1}\le -c{\Vert {g}_{k+1}\Vert}^{2}$ (36)

for some positive constant c > 0. This condition is often used to analyze the global convergence of conjugate gradient methods with inexact line searches.

3.3.2. Global Convergence Property

The conclusion of the following lemma is used to prove the global convergence of nonlinear conjugate gradient methods under the general Wolfe line search.

Lemma 1:

Suppose that parts (i) and (ii) of Assumption (1) hold and consider any conjugate gradient method of the form (3) with (22), where ${d}_{k}$ is a descent direction and ${\lambda}_{k}$ is obtained by the strong Wolfe line search. If

$\underset{k\ge 1}{\sum}\frac{1}{{\Vert {d}_{k}\Vert}^{2}}=\infty $ (37)

Then $\underset{k\to \infty}{\mathrm{lim}\mathrm{inf}}\Vert {g}_{k}\Vert =0$ (38)

For uniformly convex functions satisfying the above assumptions, we can prove that the norm of ${d}_{k+1}$ given by (25) is bounded above. Assume that the function f is uniformly convex, i.e. there exists a constant $\mu >0$ such that for all $x,y\in S$,

${\left(g\left(x\right)-g\left(y\right)\right)}^{\text{T}}\left(x-y\right)\ge \mu {\Vert x-y\Vert}^{2},$ (39)

Using Lemma 1 the following result can be proved.

Theorem 2:

Suppose that parts (i) and (ii) of Assumption (1) hold. Consider the algorithm (3), (22). If $\Vert {s}_{k}\Vert $ tends to zero and there exist nonnegative constants $\eta 1$ and $\eta 2$ such that:

${\Vert {g}_{k}\Vert}^{2}\ge \eta 1{\Vert {s}_{k}\Vert}^{2},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\Vert {g}_{k+1}\Vert}^{2}\ge \eta 2\Vert {s}_{k}\Vert $ (40)

and *f* is a uniformly convex function, then

$\underset{k\to \infty}{\mathrm{lim}\mathrm{inf}}\Vert {g}_{k}\Vert =0$ (41)

Proof: From Equation (22) We have:

${\beta}_{k}^{new}=\frac{\left[1-\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]{y}_{k}^{\text{T}}{g}_{k+1}}{{\Vert {g}_{k}\Vert}^{2}}$

From the Cauchy-Schwarz inequality we get:

$\left|{\beta}_{k+1}^{New}\right|=\left|\frac{\left[1-\frac{{y}_{k}^{\text{T}}{s}_{k}}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]{y}_{k}^{\text{T}}{g}_{k+1}}{{\Vert {g}_{k}\Vert}^{2}}\right|$ (42)

$\left|{\beta}_{k+1}^{New}\right|\le \frac{\left[1-\frac{\Vert {y}_{k}\Vert \Vert {s}_{k}\Vert}{2\left({f}_{k}-{f}_{k+1}+{g}_{k+1}^{\text{T}}{s}_{k}\right)}\right]\Vert {y}_{k}\Vert \Vert {g}_{k+1}\Vert}{{\Vert {g}_{k}\Vert}^{2}}$ (43)

But $\Vert {y}_{k}\Vert \le L\Vert {s}_{k}\Vert $ by the Lipschitz condition (23). Then

$\left|{\beta}_{k+1}^{New}\right|\le \frac{\left[1-\frac{L\Vert {s}_{k}\Vert \Vert {s}_{k}\Vert}{2\left({f}_{k}-{f}_{k+1}+\Vert {g}_{k+1}\Vert \Vert {s}_{k}\Vert \right)}\right]L\Vert {s}_{k}\Vert \Vert {g}_{k+1}\Vert}{{\Vert {g}_{k}\Vert}^{2}}$ (44)

From Equation (40):

$\left|{\beta}_{k+1}^{New}\right|\le \frac{\left[1-\frac{L{\eta}^{2}}{2\left(\left|{f}_{k}-{f}_{k+1}\right|+\eta \Gamma \right)}\right]L\eta \Gamma}{\eta 1\eta \Vert {s}_{k}\Vert}$ (46)

Letting $A={f}_{k}-{f}_{k+1}$, we get:

$\left|{\beta}_{k+1}^{New}\right|\le \frac{\left[1-\frac{L{\eta}^{2}}{2\left(\left|A\right|+\eta \Gamma \right)}\right]L\eta \Gamma}{\eta 1\eta \Vert {s}_{k}\Vert}$ (47)

$\left|{\beta}_{k+1}^{New}\right|\le \frac{L\eta \Gamma}{\eta 1\eta \Vert {s}_{k}\Vert}$ (48)

Hence,

$\Vert {d}_{k+1}\Vert \le \Vert {g}_{k+1}\Vert +\left|{\beta}_{k}^{N}\right|\Vert {s}_{k}\Vert $ (49)

$\Vert {d}_{k+1}\Vert \le \gamma +\frac{L\eta \Gamma}{\eta 1\eta \Vert {s}_{k}\Vert}\Vert {s}_{k}\Vert =\gamma +\frac{L\eta \Gamma}{\eta 1\eta}$ (50)

$\underset{k\ge 1}{\sum}\frac{1}{{\Vert {d}_{k+1}\Vert}^{2}}\ge \frac{1}{{\left(\gamma +\frac{L\eta \Gamma}{\eta 1\eta}\right)}^{2}}{\displaystyle \underset{k\ge 1}{\sum}1}=\infty $ (51)

Therefore, by Lemma 1, $\underset{k\to \infty}{\mathrm{lim}\mathrm{inf}}\Vert {g}_{k}\Vert =0$. (52)

4. Whale Optimization Algorithm (WOA)

Whales are the largest animals in the world; some reach 30 meters in length and weigh 180 tons. There are several major species, such as killer whales, humpback whales, and blue whales. Whales are often predators, and they never fully sleep because they must breathe at the ocean surface; in fact, only half of the brain sleeps. An interesting fact about whales is that they are very intelligent and emotional animals [3].

The hunting technique used by these whales, called bubble-net feeding, is one of the most interesting foraging methods. Figure 2 represents the bubble-net feeding behavior of humpback whales.

Figure 2. Represents the feeding behavior using the bubble trap in humpback whales.

The humpback whale dives about 12 meters down, creates bubbles in a circle or spiral encircling the prey, and then swims toward the surface. This process consists of three distinct stages:

1: Coral loop.

2: Lob tail.

3: Capture loop.

This feeding style is observed only in humpback whales.

4.1. Mathematical Model

In this section, we describe the mathematical model of encircling the prey, the spiral maneuver, and the search for prey, and then present the Whale Optimization Algorithm (WOA).

1) Encircling prey

Humpback whales can recognize the location of prey and encircle it in the search space. Since the optimal position is not known in advance, the whale algorithm assumes that the target prey is the current best solution, or close to it; the remaining agents then update their positions with respect to the best position, as represented by the following equations:

$D=\left|C\cdot {X}^{\ast}\left(t\right)-X\left(t\right)\right|$ (53)

$X\left(t+1\right)={X}^{\ast}\left(t\right)-A\cdot D$ (54)

where:

t: the current iteration.

A, C: coefficient vectors.

$X$ : the position vector.

${X}^{\ast}$ : the position vector of the best solution obtained so far; it is updated in every iteration whenever a better solution is found.

To calculate the values of vectors A and C, we use the following formulas:

$A=2a\cdot r-a$ (55)

$C=2\cdot r$ (56)

The value of $a$ decreases from 2 to 0 over the course of the iterations, and $r$ is a random vector with values in [0, 1].
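Equations (53)-(56) translate into a few lines of code. This is an illustrative sketch, not the authors' implementation: the random vectors $r$ are drawn with NumPy, and the function updates a single whale position.

```python
import numpy as np

rng = np.random.default_rng(0)

def encircle(x, x_best, a):
    """Encircling-prey update of Equations (53)-(56).

    x : current position of a whale; x_best : best position found so far;
    a : scalar that decreases from 2 to 0 over the iterations.
    """
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    A = 2.0 * a * r1 - a          # Equation (55)
    C = 2.0 * r2                  # Equation (56)
    D = np.abs(C * x_best - x)    # Equation (53)
    return x_best - A * D         # Equation (54)
```

Note that as $a$ (and hence $A$) shrinks toward 0, the update collapses onto the best position, which is exactly the shrinking-encircling behavior described above.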

2) Bubble-net attacking method (Exploitation phase)

The bubble-net feeding behavior of humpback whales is modeled mathematically in two parts:

a. Shrinking encircling mechanism

This mechanism is governed by the value of $a$, which decreases as in Equation (57), causing the range of $A$ to shrink as well.

The value of $a$ is found from the following formula:

$a=2-t\frac{2}{MaxIter}$ (57)

where

t: the current iteration.

MaxIter: the maximum number of iterations allowed [10].

From this, we conclude that $A$ is a random value in [−a, a]; since $a$ decreases from 2 to 0 over the iterations, $A$ eventually takes random values in [−1, 1].

The new position of a search agent can lie anywhere between its original position and the position of the current best agent. Figure 3 shows the possible positions from (X, Y) toward (X*, Y*) that can be reached in a two-dimensional space by setting $0\le A\le 1$; it also illustrates the shrinking encircling mechanism:

b. Spiral updating position

In this method, we calculate the distance between the whale at (X, Y) and the prey at (X*, Y*), as shown in Figure 4, and then create a spiral equation between the position of the whale and that of the prey to mimic the helix-shaped movement of humpback whales:

$X\left(t+1\right)={D}^{\prime}\cdot {\text{e}}^{bl}\mathrm{cos}\left(2\pi l\right)+{X}^{*}\left(t\right)$ (58)

Figure 3. Represents the reduction of the encircling mechanism.

Figure 4. Represents the spiral updated location.

where

${D}^{\prime}=\left|{X}^{*}\left(t\right)-X\left(t\right)\right|$ (59)

${D}^{\prime}$ represents the distance between the whale and the best solution (prey) obtained so far, b is a constant that defines the shape of the logarithmic spiral, and l is a random number in [−1, 1].

Humpback whales swim around their prey simultaneously within a shrinking circle and along a spiral-shaped path. To model these two behaviors, we assume a probability of 50% of choosing between the shrinking encircling mechanism and the spiral model when updating the position of the whales. Mathematically:

$X\left(t+1\right)=\left\{\begin{array}{ll}{X}^{*}\left(t\right)-A\cdot D& P<0.5\\ {D}^{\prime}{\text{e}}^{bl}\mathrm{cos}\left(2\pi l\right)+{X}^{*}\left(t\right)& P\ge 0.5\end{array}\right.$ (60)

where P is a random number in [0, 1] [6].
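The combined update rule of Equation (60) can be sketched as follows, with the spiral constant $b$ and the random draws as stated in the text. This is an illustration under those assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def update_position(x, x_best, a, b=1.0):
    """Position update of Equation (60): with probability 0.5 apply the
    shrinking-encircling move, otherwise the spiral move of Equation (58)."""
    if rng.random() < 0.5:                        # P < 0.5: encircling
        A = 2.0 * a * rng.random(x.shape) - a     # Equation (55)
        C = 2.0 * rng.random(x.shape)             # Equation (56)
        D = np.abs(C * x_best - x)                # Equation (53)
        return x_best - A * D                     # Equation (54)
    D_prime = np.abs(x_best - x)                  # Equation (59)
    l = rng.uniform(-1.0, 1.0)                    # random number in [-1, 1]
    return D_prime * np.exp(b * l) * np.cos(2 * np.pi * l) + x_best
```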

3) Search for prey:

The same approach based on the variation of the vector $A$ can be used to search for prey. Humpback whales search randomly according to each other's positions; therefore, we use $A$ with random values greater than 1 or less than −1 to force a search agent to move far away from a reference whale. In contrast to the exploitation phase, in the exploration phase we update the position of a search agent according to a randomly chosen agent rather than the best agent found so far. This mechanism, together with $\left|A\right|>1$, emphasizes exploration and allows the WOA algorithm to perform a global search. The mathematical representation is:

$D=\left|C\cdot {X}_{rand}-X\right|$ (61)

$X\left(t+1\right)={X}_{rand}-A\cdot D$ (62)

where ${X}_{rand}$ is a random position vector (a random whale) chosen from the current population.

4.2. Whale Optimization Algorithm

The WOA algorithm begins each run with a set of random solutions.

Figure 5. Represents the exploration mechanism in the WOA algorithm.

In each iteration, the search agents update their positions according to either a randomly chosen search agent or the best solution found so far. To balance exploration and exploitation, a random agent is chosen when $\left|A\right|>1$ and the best solution when $\left|A\right|<1$, so that the positions of the search agents are improved [3].

The WOA algorithm switches between the spiral movement and the circular movement based on the value of P, and it terminates when the stop condition is met.

Viewed theoretically, the WOA can be considered a complete optimization algorithm because it possesses both exploration and exploitation capabilities. On this basis, the proposed method concentrates the search on the best solutions while allowing the remaining search agents to exploit the best position obtained so far.

In the WOA algorithm, the search vector $A$ allows a smooth transition between exploration and exploitation: exploration is emphasized when $\left|A\right|\ge 1$ and exploitation when $\left|A\right|<1$. It should be noted that the WOA has only two main internal parameters to be adjusted, A and C [3].
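Putting Sections 4.1 and 4.2 together, a minimal WOA loop might look like this. The population size, the bounds, and the clipping of positions to the search region are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def woa(fitness, dim, lb, ub, n_whales=10, max_iter=500, b=1.0, seed=0):
    """Sketch of the basic WOA loop described in Section 4.2.

    fitness : objective to minimize; lb, ub : bounds of the search space.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n_whales, dim))   # random initial population
    scores = np.array([fitness(x) for x in X])
    best = X[np.argmin(scores)].copy()
    best_score = float(scores.min())
    for t in range(max_iter):
        a = 2.0 - t * (2.0 / max_iter)              # Equation (57)
        for i in range(n_whales):
            A = 2.0 * a * rng.random(dim) - a       # Equation (55)
            C = 2.0 * rng.random(dim)               # Equation (56)
            if rng.random() < 0.5:
                if np.max(np.abs(A)) < 1.0:         # exploitation: encircle best
                    X[i] = best - A * np.abs(C * best - X[i])
                else:                               # exploration: random whale
                    x_rand = X[rng.integers(n_whales)]
                    X[i] = x_rand - A * np.abs(C * x_rand - X[i])
            else:                                   # spiral update, Equation (58)
                l = rng.uniform(-1.0, 1.0)
                X[i] = np.abs(best - X[i]) * np.exp(b * l) * np.cos(2 * np.pi * l) + best
            X[i] = np.clip(X[i], lb, ub)
            s = fitness(X[i])
            if s < best_score:
                best_score, best = s, X[i].copy()
    return best, best_score
```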

4.3. Whale Optimization Algorithm Features

1. The algorithm is easy to implement.

2. The algorithm is highly flexible.

3. It does not need many parameters.

4. It can switch easily between exploration and exploitation based on a single parameter.

5. Owing to its simplicity and small number of parameters, and thanks to the logarithmic spiral function, it covers a broad area of the search space.

6. In the exploration phase, the positions of the elements (solutions) are updated based on randomly selected solutions rather than the best solution obtained so far [10].

4.4. Proposed Hybrid Algorithm

In this section, a new hybrid method called WOA-CG is proposed to solve optimization problems; it links the evolutionary ideas of the WOA algorithm with the classical conjugate gradient algorithm. Figure 6 represents the proposed algorithm

Figure 6. Represents the proposed algorithm (WOA-CG).

(WOA-CG). In this algorithm, each iteration is divided into two stages: in the first stage, the random population of the WOA is generated, and in the second stage, the conjugate gradient algorithm is applied. The steps of the proposed hybrid algorithm (WOA-CG) can be summarized as follows:

Step 1: Create an initial population by random generation and configure the parameters A and C.

Step 2: Feed the random population into the classical conjugate gradient algorithm to improve it and obtain the best solution.

Step 3: Calculate the fitness function of the resulting new population (the output of the conjugate gradient algorithm, used as the initial population of the whale optimization algorithm) for each search agent; the fitness represents the distance between the whale and its prey.

Step 4: Determine the best position among the search agents; from it, a new generation of solutions can be produced.

Step 5: Update the position of each search agent using the algorithm's operators: searching for prey, encircling prey, and attacking prey.

Step 6: Update the position of the new generation using Equations (6) and (7).

Step 7: Repeat the WOA iterations until the stop condition is met.
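The hybrid steps above can be sketched as follows. As assumptions of this illustration: the gradient is approximated by central differences, the CG refinement of Step 2 uses the FR coefficient with a crude backtracking search as stand-ins for the modified method, and the WOA loop is a compact version of Section 4.2.

```python
import numpy as np

def num_grad(f, x, h=1e-6):
    # central-difference gradient, used instead of an analytic gradient
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def cg_refine(f, x, n_steps=5):
    # Step 2: improve one member of the population with a few CG iterations
    g = num_grad(f, x)
    d = -g
    for _ in range(n_steps):
        lam, rho = 1.0, 1e-4
        while f(x + lam * d) > f(x) + rho * lam * (g @ d) and lam > 1e-8:
            lam *= 0.5                                # crude backtracking
        x_new = x + lam * d
        g_new = num_grad(f, x_new)
        beta = (g_new @ g_new) / max(g @ g, 1e-12)    # FR formula as a stand-in
        d = -g_new + beta * d
        x, g = x_new, g_new
    return x

def woa_cg(f, dim, lb, ub, n_whales=10, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: random initial population
    X = rng.uniform(lb, ub, size=(n_whales, dim))
    # Step 2: refine the population with the conjugate gradient algorithm
    X = np.array([cg_refine(f, x) for x in X])
    X = np.clip(X, lb, ub)
    # Steps 3-7: run the WOA iterations on the refined population
    best = min(X, key=f).copy()
    for t in range(max_iter):
        a = 2.0 - t * (2.0 / max_iter)
        for i in range(n_whales):
            A = 2.0 * a * rng.random(dim) - a
            C = 2.0 * rng.random(dim)
            if rng.random() < 0.5:
                ref = best if np.max(np.abs(A)) < 1.0 else X[rng.integers(n_whales)]
                X[i] = ref - A * np.abs(C * ref - X[i])
            else:
                l = rng.uniform(-1.0, 1.0)
                X[i] = np.abs(best - X[i]) * np.exp(l) * np.cos(2 * np.pi * l) + best
            X[i] = np.clip(X[i], lb, ub)
            if f(X[i]) < f(best):
                best = X[i].copy()
    return best
```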

5. Practical Aspect

To evaluate the performance of the proposed algorithm in solving optimization problems, the proposed WOA-CG algorithm was tested on 10 standard benchmark functions and compared with the whale optimization algorithm itself. Table 2 lists the lower and upper bounds of each function and the value at which each function attains its minimum; the maximum number of iterations for all runs is 500.

Tables 3-5 show the results of the WOA-MCG algorithm compared with those of the WOA algorithm. The proposed WOA-MCG algorithm proved successful, improving the results on most of the standard test functions, which confirms the success of the hybridization process.

The tests were carried out on a laptop with the following characteristics: a 2.70 GHz CPU and 8 GB of RAM, running Matlab R2014a on Windows 8.

6. Conclusions

1. Hybridizing a meta-heuristic algorithm with a classical algorithm improved its performance by increasing the speed of convergence.

2. Hybridizing a meta-heuristic algorithm with a classical algorithm improved the quality of the resulting solutions by increasing its exploratory and exploitative capabilities; the numerical results show the ability of the hybrid algorithm to solve different optimization problems.

Table 2. Details of test functions.

Table 3. Comparison of results between WOA and WOA-MCG using 5 search elements and 500 iterations.

Table 4. Comparison of results between WOA and WOA-MCG using 10 search elements and 500 iterations.

Table 5. (a) Comparison of results between WOA and WOA-MCG using 15 search elements and 500 iterations; (b) Comparison of results between WOA and WOA-MCG using 30 search elements and 500 iterations.

The results of the WOA-MCG algorithm compared with the WOA algorithm itself, which led to encouraging results as good solutions were obtained for most test functions.

References

[1] Yang, X.-S. (2010) Engineering Optimization: An Introduction with Metaheuristic Applications. John Wiley & Sons, New Jersey.
https://doi.org/10.1002/9780470640425

[2] Meng, X., Liu, Y., Gao, X. and Zhang, H. (2014) A New Bio-Inspired Algorithm: Chicken Swarm Optimization. In: Tan, Y., Shi, Y., Coello, C.A.C., Eds., Advances in Swarm Intelligence, ICSI 2014. Lecture Notes in Computer Science, Vol. 8794. Springer, Cham, 86-94.

https://doi.org/10.1007/978-3-319-11857-4_10

[3] Mirjalili, S. and Lewis, A. (2016) The Whale Optimization Algorithm. Advances in Engineering Software, 95, 51-67. https://doi.org/10.1016/j.advengsoft.2016.01.008

[4] Trivedi, I.N., Pradeep, J., Narottam, J., Arvind, K. and Dilip, L. (2016) Novel Adaptive Whale Optimization Algorithm for Global Optimization. Indian Journal of Science and Technology, 9, 1-6. https://doi.org/10.17485/ijst/2016/v9i38/101939

[5] Touma, H.J. (2016) Study of the Economic Dispatch Problem on IEEE 30-Bus System Using Whale Optimization Algorithm. International Journal of Engineering Technology and Sciences (IJETS), 5, 11-18.

[6] Abhiraj, T. and Aravindhababu, P. (2017) Dragonfly Optimization Based Reconfiguration for Voltage Profile Enhancement in Distribution Systems. International Journal of Computer Applications, 158, 1-4. https://doi.org/10.5120/ijca2017912758

[7] Andrea, R., Blesa, M., Blum, C. and Michael, S. (2008) Hybrid Metaheuristics: An Emerging Approach to Optimization, Springer, Berlin.

[8] Manoharan, N., Dash, S.S., Rajesh, K.S. and Panda, S. (2017) Automatic Generation Control by Hybrid Invasive Weed Optimization and Pattern Search Tuned 2-DOF PID Controller. International Journal of Computers, Communications & Control, 12, 533-549.

https://doi.org/10.15837/ijccc.2017.4.2751

[9] Mirjalili, S. (2016) Dragonfly Algorithm: A New Meta-Heuristic Optimization Technique for Solving Single-Objective, Discrete, and Multi-Objective Problems. Neural Computing and Applications, 27, 1053-1073.
https://doi.org/10.1007/s00521-015-1920-1

[10] Mafarja, M.M. and Mirjalili, S. (2017) Hybrid Whale Optimization Algorithm with Simulated Annealing for Feature Selection. Neurocomputing, 260, 302-312.

https://doi.org/10.1016/j.neucom.2017.04.053

[11] Bayati, A.Y. and Assady, N.H. (1986) Conjugate Gradient Method. Technical Research, School of Computer Studies, Leeds University.