Global Convergence Property with Inexact Line Search for a New Hybrid Conjugate Gradient Method

Show more

1. Introduction

In unconstrained optimization, we minimize an objective function that depends on real variables with no restrictions at all on the value of these variables. The unconstrained optimization problem is stated by:

${\mathrm{min}}_{x\in {R}^{n}}f\left(x\right)$ (1)

where $x\in {R}^{n}$ is a real vector with $n\ge 1$ component and $f:{R}^{n}\to R$ is a smooth function and its gradient g is available [1]. A nonlinear conjugate gradient method generates a sequence ${x}_{k}$ Starting from an initial guess ${x}_{0}\in {R}^{n}$ Using the recurrence

${x}_{k+1}={x}_{k}+{\alpha}_{k}{d}_{k},\text{\hspace{0.17em}}\text{\hspace{0.17em}}k=0,1,2,\cdots $ (2)

where ${\alpha}_{k}$ is the positive step size obtained by carrying out a one dimensional search, known as the line searches [2]. Among them, the so-called strong wolf line search conditions require that [3] [4].

$f\left({x}_{k}+{\alpha}_{k}{d}_{k}\right)\le f\left({x}_{k}\right)+\sigma {\alpha}_{k}{d}_{k}$, (3)

$\left|g\left({x}_{k}+{\alpha}_{k}{d}_{k}\right)\right|\le \delta \left|{g}_{k}^{\text{T}}{d}_{k}\right|$ (4)

where $0<\sigma <\delta <\text{1}$, is to find an approximation of ${\alpha}_{k}$ where the descent property must be satisfied and no longer searching in the direction when ${x}_{k}$ is far from the solution. Thus by strong Wolfe line search conditions we in herit the advantages of exact line search with inexpensive and low computational cost [5].

The search direction ${d}_{k}$ is generated by:

${d}_{k}=\{\begin{array}{l}-{g}_{k},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k=1\\ -{g}_{k}+{\beta}_{k}{d}_{k},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k>1\end{array}$ (5)

where ${g}_{k}$ and ${\beta}_{k}$ is the gradient and conjugate gradient coefficient of f(x) respectively at the point ${x}_{k}$. The different choices for the parameter ${\beta}_{k}$ correspond to different conjugate gradient methods. The most popular formulas for ${\beta}_{k}$ is Hestenes Stiefel method (HS), Fletcher-Reeves method (FR), Polak-Ribiere- Polyak method (PR), conjugate―Descent method (CD), Liu―Storey method (LS), and Dai-Yuan method (DY), etc

These methods are identical when f is a strongly convex quadratic function and the line search is exact, since the gradient are mutually orthogonal, and the parameters ${\beta}_{k}$ in these methods are equal. When applied to general nonlinear function with inexact line searches, however, the behavior of these methods is marked different [1]. We are going to summarize some well known conjugate gradient method in Table 1.

An important class of conjugate gradient methods is the hybrid conjugate gradient algorithms. The hybrid computational schemes perform better than the classical conjugate gradient methods. They are defined by (2) and (5) where the parameter ${\beta}_{k}$ is computed as projections or as convex combinations of different conjugate gradient methods [14].

We are going to summarize some well known hybrid conjugate gradient method in Table 2.

We propose a new hybrid CG method based on combination of MMWU [24] and RMAR [25] conjugate gradient methods for solving unconstrained optimization method with suitable conditions. The corresponding conjugate gradient parameters are

${B}_{k}^{MMWU}=\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}$ (6)

and

${\beta}_{k}^{RMAR}=\frac{{\Vert {g}_{k+1}\Vert}^{2}-\frac{\Vert {g}_{k+1}\Vert}{\Vert {d}_{k}\Vert}{g}_{k+1}^{\text{T}}{d}_{k}}{{\Vert {d}_{k}\Vert}^{2}}$ (7)

Table 1. Some well known conjugate gradient coefficients.

Table 2. Hybrid conjugate gradient methods.

We defined the parameter ${\beta}_{k}$ in the proposed method by:

${\beta}_{k}^{FG}=\left(1-{\phi}_{k}\right){\beta}_{k}^{MMWU}+{\phi}_{k}{\beta}_{k}^{RMAR}$ (8)

Observe that if ${\phi}_{k}=0$, then ${\beta}_{k}^{FG}={\beta}_{k}^{MMWU}$, and if ${\phi}_{k}=1$, then ${\beta}_{k}^{FG}={\beta}_{k}^{RMAR}$.

By choosing the appropriate value of the parameter ${\phi}_{k}$ In the convex combination, the search direction ${d}_{k}$ of our algorithm not only is the Newton direction, but also satisfies the famous DL conjugate condition proposed by Dai and Liao [26]. Under the strong Wolfe line search conditions, we prove the global convergence of our algorithm. The numerical results also show the feasibility and effectiveness of our algorithm.

This paper is organized as follows. Section 2 we introduce our new hybrid conjugate gradient method (HFG), and we obtain the parameter ${\phi}_{k}$ using some approaches and give us a specific algorithm. Section 3, we prove that it generates direction satisfying the sufficient descent condition under strong Wolfe line search conditions. The global convergence property of the proposed method is established in Section 4. Some numerical results are reported in Section 5.

2. A New Hybrid Conjugate Gradient Method

In this section, we will describe a new proposed hybrid conjugate gradient method. In order to obtain the sufficient descent direction, we will compute ${\phi}_{k}$ as follows. We combine ${\beta}_{k}^{MMWU}$ and ${\beta}_{k}^{RMAR}$ in a convex combination in order to have a good algorithm for unconstrained optimization.

The direction ${d}_{k+1}$ is generated by the rule

${d}_{k+1}=-{g}_{k+1}+{\beta}_{k}^{HFG}{d}_{k}$ (9)

where ${\beta}_{k}^{HFG}$ defined in (8), the iterates ${x}_{1},{x}_{2},{x}_{3},\cdots $ of our method are computed by means of the recurrence (2), where the step size ${\alpha}_{k}$ Is determined according to the strong Wolf conditions (3) and (4).

The scale parameter ${\phi}_{k}$ satisfying $0\le {\phi}_{k}\le 1$, which will be determined in a specific way to be described later. Observe that if ${\phi}_{k}=0$, then ${\beta}_{k}^{HFG}={\beta}_{k}^{MMWU}$, and

If ${\phi}_{k}=1$, then ${\beta}_{k}^{HFG}={\beta}_{k}^{RMAR}$. On the other hand, if $0<{\phi}_{k}<1$, then ${\beta}_{k}^{HFG}$ is a convex combination of ${\beta}_{k}^{MMWU}$ and ${\beta}_{k}^{RMAR}$.

From (8) and (9) it is obvious that:

${d}_{k+1}=\{\begin{array}{l}-{g}_{k+1},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k=1\\ -{g}_{k+1}+\left(1-{\phi}_{k}\right)\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}{d}_{k}+{\phi}_{k}\frac{{\Vert {g}_{k+1}\Vert}^{2}-\frac{\Vert {g}_{k+1}\Vert}{\Vert {d}_{k}\Vert}{g}_{k+1}^{\text{T}}{d}_{k}}{{\Vert {d}_{k}\Vert}^{2}}{d}_{k},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}k>1\end{array}$, (10)

Our motivation to select the parameter ${\phi}_{k}$ in such a manner that the defection ${d}_{k+1}$ given in (10) is equal to the Newton direction ${d}_{k+1}^{N}=-{\nabla}^{2}f{\left({x}_{k+1}\right)}^{-1}{g}_{k+1}$. There for

$\begin{array}{l}-{\nabla}^{2}f{\left({x}_{k+1}\right)}^{-1}{g}_{k+1}\\ =-{g}_{k+1}+\left(1-{\phi}_{k}\right)\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}{d}_{k}+{\phi}_{k}\frac{{\Vert {g}_{k+1}\Vert}^{2}-\frac{\Vert {g}_{k+1}\Vert}{\Vert {d}_{k}\Vert}{g}_{k+1}^{\text{T}}{d}_{k}}{{\Vert {d}_{k}\Vert}^{2}}{d}_{k}\end{array}$ (11)

Now multiplying (11) by ${s}_{k}^{\text{T}}{\nabla}^{2}f\left({x}_{k+1}\right)$ from the left, we get

$\begin{array}{c}-{s}_{k}^{\text{T}}{g}_{k+1}=-{s}_{K}^{\text{T}}{\nabla}^{2}f\left({x}_{k+1}\right){g}_{k+1}+\left(1-{\phi}_{k}\right)\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}{s}_{K}^{\text{T}}{\nabla}^{2}f\left({x}_{k+1}\right){d}_{k}\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{\phi}_{k}\frac{{\Vert {g}_{k+1}\Vert}^{2}-\frac{\Vert {g}_{k+1}\Vert}{\Vert {d}_{k}\Vert}{g}_{k+1}^{\text{T}}{d}_{k}}{{\Vert {d}_{k}\Vert}^{2}}{s}_{k}^{\text{T}}{\nabla}^{2}f\left({x}_{k+1}\right){d}_{k}\end{array}$

Therefore, in order to have an algorithm for solving large scale problems we assume that pair $\left({s}_{k},{y}_{k}\right)$ satisfies the secant equation

${\nabla}^{2}f\left({x}_{k+1}\right){s}_{k}={y}_{k}.$ (12)

From (12), we get

${s}_{k}^{\text{T}}{\nabla}^{2}f\left({x}_{k+1}\right)={y}_{k}^{\text{T}}.$

Denoting ${\phi}_{k}^{FG}={\phi}_{k}$ we get

$-{s}_{k}^{\text{T}}{g}_{k+1}=-{y}_{K}^{\text{T}}{g}_{k+1}+\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}{y}_{k}^{\text{T}}{d}_{k}+{\phi}_{k}^{FG}\left(\frac{\Vert {g}_{k+1}\Vert \left({g}_{k+1}^{\text{T}}{d}_{k}\right)}{{\Vert {d}_{k}\Vert}^{2}}\right)\left({y}_{k}^{\text{T}}{d}_{k}\right)$

after some algebra, we get

${\phi}_{k}^{FG}=\frac{\left({s}_{k}^{\text{T}}{g}_{k+1}-{y}_{k}^{\text{T}}{g}_{k+1}\right)\cdot {\Vert {d}_{k}\Vert}^{3}+{\Vert {g}_{k+1}\Vert}^{2}\cdot \Vert {d}_{k}\Vert \left({y}_{k}^{\text{T}}{d}_{k}\right)}{\Vert {g}_{k+1}\Vert \cdot \left({g}_{k+1}^{\text{T}}{d}_{k}\right)\cdot \left({y}_{k}^{\text{T}}{d}_{k}\right)}$ (13)

Now, we specify a complete hybrid conjugate gradient method (HFG) which posses some nice properties of conjugate gradient and Newton method.

Algorithm HFG

Step 1: Select ${x}_{0}\in {R}^{n}$, $\in \text{\hspace{0.17em}}>0$, set $k=0$. Compute $f\left({x}_{0}\right)$ and ${g}_{0}=-\nabla f\left({x}_{0}\right)$, set ${d}_{0}=-{g}_{0}$.

Step 2: Test the stopping criteria, i.e. if $\Vert {g}_{k}\Vert \le \text{\hspace{0.17em}}\in $, then stop.

Step 3: Compute ${\alpha}_{k}$ by strong Wolfe line search conditions in (3) & (4).

Step 4: Compute ${x}_{k+1}={x}_{k}+{\alpha}_{k}{d}_{k}$, ${g}_{k+1}=g\left({x}_{k+1}\right)$. Compute ${s}_{k}={x}_{k+1}-{x}_{k}$ And ${y}_{k}={g}_{k+1}-{g}_{k}$

Step 5: If ${\phi}_{k}\ge 1$ then set ${\phi}_{k}=1$. If ${\phi}_{k}\le 0$, then set ${\phi}_{k}=0$, otherwise compute ${\phi}_{k}$ as (13).

Step 6: Compute ${\beta}_{k}^{FG}$ by (8).

Step 7: Generate $d=-{g}_{k+1}+{\beta}_{k}^{FG}{d}_{k}$

Step 8: If the restart criteria of Powell $\left|{g}_{k}^{\text{T}}{g}_{k}\right|\ge 0.2{\Vert {g}_{k+1}\Vert}^{2}$, is satisfied, then set ${d}_{k}=-{g}_{k+1}$, otherwise define ${d}_{k+1}=d$

Step 9: Set $k=k+1$, and continue with step 2.

3. The Sufficient Descent Condition

In this section, we are going to apply the following theorem to illustrate that the search direction ${d}_{k}$ Obtained by hybrid FG satisfies the sufficient descent condition which plays Avit of role in analyzing the global convergence.

For further considerations we need the following assumptions

3.1. Assumption

The level sets $S=\left\{x\in {R}^{n},f\left({x}_{n}\right)\right\}$ are bounded.

3.2. Assumption

In a neighborhood N of S, the function f is continuously differentiable and its gradient is Lipschitz continuous, i.e., there exists a constant $L>0$, such that

$\Vert \nabla f\left(x\right)-\nabla f\left(y\right)\Vert \le L\Vert x-y\Vert ,\text{\hspace{0.17em}}\forall x,y\in N$

Under these assumptions of if there exists a positive constant ( $\gamma ,\stackrel{\xaf}{\gamma},\omega $ & $\stackrel{\xaf}{\omega}$ ) & such that

$\stackrel{\xaf}{\gamma}\le \Vert {g}_{k+1}\Vert \le \gamma \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\stackrel{\xaf}{\omega}\le \Vert {g}_{k}\Vert \le \omega ,\text{\hspace{0.17em}}\text{\hspace{0.17em}}\forall x\in S$ [27].

Theorem.

Let the sequences $\left\{{g}_{k}\right\}$ and $\left\{{d}_{k}\right\}$ be generated by a hybrid FG method. Then the search direction ${d}_{k}$ satisfies the sufficient descent condition:

${g}_{k+1}^{\text{T}}{d}_{k+1}\le -\mu {\Vert {g}_{k+1}\Vert}^{2},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\forall \mu \ge 0$ (14)

where $\mu =1-\left({E}_{4}-{E}_{3}\right)$, with $0<\left({E}_{4}-{E}_{3}\right)<1$.

Proof. We shall show that ${d}_{k}$ satisfies the sufficient descent condition holds for $k=0$, the proof is a trivial one, i.e. ${d}_{0}=-{g}_{0}$ and so ${g}_{0}^{\text{T}}{d}_{0}=-{\Vert {g}_{0}\Vert}^{2}$. Now we have

${d}_{k+1}=-{g}_{k+1}+{\beta}_{k}^{FG}{d}_{k},$

i.e.

${d}_{k+1}=-{g}_{k+1}+\left[\left(1-{\phi}_{k}\right){\beta}_{k}^{MMWU}+{\phi}_{k}{\beta}_{k}^{RMAR}\right]{d}_{k}$

We can rewrite the above direction by the following manner:

${d}_{k+1}=-\left({\phi}_{k}{g}_{k+1}+\left(1-{\phi}_{k}\right){g}_{k+1}\right)+\left(\left(1-{\phi}_{k}\right){\beta}_{k}^{MMWU}+{\phi}_{k}{\beta}_{k}^{RMAR}\right){d}_{k}$.

So,

${d}_{k+1}={\phi}_{k}\left(-{g}_{k+1}+{\beta}_{k}^{RMAR}{d}_{k}\right)+\left(1-{\phi}_{k}\right)\left(-{g}_{k}+{\beta}_{k}^{MMWU}{d}_{k}\right)$,_{ }

After some arrangement, we get

${d}_{k+1}={\phi}_{k}{d}_{k+1}^{RMAR}+\left(1-{\phi}_{k}\right){d}_{k+1}^{MMWU}$ (15)

Multiplying (15) by ${g}_{k+1}^{\text{T}}$ from the left, we get

${g}_{k+1}^{\text{T}}{d}_{k+1}={\phi}_{k}{g}_{k+1}^{\text{T}}{d}_{k+1}^{RMAR}+\left(1-{\phi}_{k}\right){g}_{k+1}^{\text{T}}{d}_{k}^{MMWU}$

Firstly, if ${\phi}_{k}=0$, then ${d}_{k+1}={d}_{k+1}^{MMWU}$, we are going to prove that the sufficient descent condition holds for MMWU method in the presence of the strong Wolfe line search condition, because in [24] they proved this method satisfied the sufficient descent condition with exact line search.

i.e.

${g}_{k+1}^{\text{T}}{d}_{k+1}^{MMWU}=-{\Vert {g}_{k+1}\Vert}^{2}+\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}{g}_{k+1}^{\text{T}}{d}_{k}$ (16)

Since,

${g}_{k+1}^{\text{T}}{d}_{k}\le {y}_{k}^{\text{T}}{d}_{k}$ and ${y}_{k}^{\text{T}}{d}_{k}\le {\alpha}_{k}L{\Vert {d}_{k}\Vert}^{2}$ (17)

Applications (17) in (16), we get

$\begin{array}{c}{g}_{k+1}^{\text{T}}{d}_{k+1}^{MMWU}\le -{\Vert {g}_{k+1}\Vert}^{2}+\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}{\alpha}_{k}L{\Vert {d}_{k}\Vert}^{2}\\ =-\left(1-{\alpha}_{k}L\right){\Vert {g}_{k+1}\Vert}^{2}\\ =-{E}_{1}{\Vert {g}_{k+1}\Vert}^{2}\end{array}$ (18)

where ${E}_{1}=\left(1-{\alpha}_{k}L\right)>0$, with $0<{\alpha}_{k}L<1$.

So, it is proved that ${d}_{k+1}^{MMWU}$ satisfies the sufficient descent condition.

Now let ${\phi}_{k}=1$ then ${d}_{k}={d}_{k}^{RMAR}$, we are going to prove that the sufficient descent condition holds for RMAR method in the presence of the strong Wolfe line search condition because in [25] they proved this method satisfied the sufficient descent condition with exact line search.

${d}_{k+1}^{RMAR}=-{g}_{k+1}+{\beta}_{k}^{RMAR}{d}_{k}$

Multiplying the above equation from left by ${g}_{k+1}^{\text{T}}$ we get

${g}_{k+1}^{\text{T}}{d}_{k+1}^{RMAR}=-{\Vert {g}_{k+1}\Vert}^{2}+\frac{{\Vert {g}_{k+1}\Vert}^{2}-\frac{\Vert {g}_{k+1}\Vert}{\Vert {d}_{k}\Vert}{g}_{k+1}^{\text{T}}{d}_{k}}{{\Vert {d}_{k}\Vert}^{2}}{g}_{k+1}^{\text{T}}{d}_{k}$.

In [25], they proved that

$0\le \frac{{\Vert {g}_{k+1}\Vert}^{2}-\frac{\Vert {g}_{k+1}\Vert}{\Vert {d}_{k}\Vert}{g}_{k+1}^{\text{T}}{d}_{k}}{{\Vert {d}_{k}\Vert}^{2}}\le 2\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}$ (19)

Used (17), and (19) the direction become

$\begin{array}{c}{g}_{k+1}^{\text{T}}{d}_{k+1}\le -{\Vert {g}_{k+1}\Vert}^{2}+2{\alpha}_{k}L{\Vert {g}_{k+1}\Vert}^{2}\\ =-\left(1-2{\alpha}_{k}L\right)\cdot {\Vert {g}_{k+1}\Vert}^{2}\\ =-{E}_{2}\cdot {\Vert {g}_{k+1}\Vert}^{2}\end{array}$ (20)

where ${E}_{2}=\left(1-2{\alpha}_{k}L\right)>0$ with $0<2{\alpha}_{k}L<1$ and $0<L<\frac{1}{2}$.

So, it is proved that ${d}_{k+1}^{RMAR}$ satisfied the sufficient descent condition.

Now, we are going to prove the direction satisfy the sufficient descent condition when $0<{\phi}_{k}<1$, firstly for

$\begin{array}{l}\left(1-{\phi}_{k}\right){\beta}_{K}^{MMWU}{g}_{k+1}^{\text{T}}{d}_{k}\\ =\frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}{g}_{k}^{\text{T}}{d}_{k}-\left[\frac{{s}_{k}^{\text{T}}{g}_{k+1}{\Vert {d}_{k}\Vert}^{3}-{y}_{k}^{\text{T}}{g}_{k+1}{\Vert {d}_{k}\Vert}^{3}+{\Vert {g}_{k+1}\Vert}^{2}\Vert {d}_{k}\Vert {y}_{k}^{\text{T}}{d}_{k}}{\Vert {g}_{k+1}\Vert \left({g}_{k+1}^{\text{T}}{d}_{k}\right){y}_{k}^{\text{T}}{d}_{k}}\right]\ast \frac{{\Vert {g}_{k+1}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}{g}_{k+1}^{\text{T}}{d}_{k}\end{array}$

We have from Lipschitz condition ${g}_{k+1}^{\text{T}}{d}_{k}<{y}_{k}^{\text{T}}{d}_{k}$ and

$-\left(1-\sigma \right)\Vert {g}_{k}\Vert \le {y}_{k}^{\text{T}}{d}_{k}\le {\alpha}_{k}L{\Vert {d}_{k}\Vert}^{2}$

with a mathematical calculation, we get

$\begin{array}{l}\left(1-{\phi}_{k}\right){\beta}_{k}^{MMWU}{g}_{k+1}^{\text{T}}{d}_{k}\\ \le \left[\frac{{\alpha}_{k}L{\Vert {d}_{k}\Vert}^{2}}{{\Vert {d}_{k}\Vert}^{2}}-\frac{\Vert {s}_{k}\Vert \Vert {g}_{k+1}\Vert \Vert {d}_{k}\Vert -{\alpha}_{k}L{\Vert {d}_{k}\Vert}^{2}+{\Vert {g}_{k+1}\Vert}^{2}{\alpha}_{k}L\Vert {d}_{k}\Vert}{\Vert {g}_{k+1}\Vert \cdot \left(-\left(1-\sigma \right)\Vert {g}_{k}\Vert \right)}\right]{\Vert {g}_{k+1}\Vert}^{2}\\ \le \left[{\alpha}_{k}L+\frac{L}{\left(1-\sigma \right)}\frac{{\Vert {s}_{k}\Vert}^{2}\Vert {d}_{k}\Vert -{\alpha}_{k}L{\Vert {d}_{k}\Vert}^{3}+{\Vert {g}_{k+1}\Vert}^{2}\Vert {d}_{k}\Vert}{\Vert {g}_{k+1}\Vert {\Vert {g}_{k}\Vert}^{2}}\right]{\Vert {g}_{k+1}\Vert}^{2}\\ \le \left[{\alpha}_{k}L+\frac{A\gamma B-{\alpha}_{k}L{B}^{2}+{Y}^{2}{\alpha}_{k}LB}{\left(1-\sigma \right)\stackrel{\xaf}{Y}{\stackrel{\xaf}{W}}^{2}}\right]{\Vert {g}_{k+1}\Vert}^{2}\end{array}$

Let
${E}_{1}={\alpha}_{k}L+\frac{LAB-{\alpha}_{k}L{B}^{3}+{\gamma}^{2}{\alpha}_{k}LB}{\left(1-\sigma \right)\stackrel{\xaf}{\gamma}\text{\hspace{0.05em}}{\stackrel{\xaf}{\omega}}^{2}}$ ^{ }

$\therefore \text{\hspace{0.17em}}\left(1-{\phi}_{k}\right){\beta}^{MMWU}{g}_{k+1}^{\text{T}}{d}_{k}\le {E}_{3}{\Vert {g}_{k+1}\Vert}^{2}$ (21)

Now, secondly for

$\begin{array}{l}{\phi}_{k}{\beta}_{k}^{RMAR}{g}_{k+1}^{\text{T}}{d}_{k}\\ =\left[\frac{{s}_{k}^{\text{T}}{g}_{k+1}{\Vert {d}_{k}\Vert}^{3}-{y}_{k}^{\text{T}}{g}_{k+1}{\Vert {d}_{k}\Vert}^{3}+{\Vert {g}_{k+1}\Vert}^{2}\Vert {d}_{k}\Vert {y}_{k}^{\text{T}}\Vert {d}_{k}\Vert}{\Vert {g}_{k+1}\Vert \left({g}_{k+1}^{\text{T}}{d}_{k}\right)\left({y}_{k}^{\text{T}}{d}_{k}\right)}\right]\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}\cdot \left[\frac{{\Vert {g}_{k+1}\Vert}^{2}-\frac{\Vert {g}_{k+1}\Vert}{\Vert {d}_{k}\Vert}{g}_{k+1}^{\text{T}}{d}_{k}}{{\Vert {d}_{k}\Vert}^{2}}\right]{g}_{k+1}^{\text{T}}{d}_{k}\end{array}$

From (19), Lipschitz condition
${s}_{k}^{\text{T}}{g}_{k+1}\le {y}_{k}^{\text{T}}{s}_{k}\le L{\Vert {s}_{k}\Vert}^{2}$ _{ }and
${s}_{k}={\alpha}_{k}{d}_{k}$, we get

${\phi}_{k}{\beta}_{k}^{RMAR}{g}_{k+1}^{\text{T}}{d}_{k}=2\left[\frac{L{\Vert {s}_{k}\Vert}^{2}\Vert {d}_{k}\Vert -\Vert {y}_{k}\Vert \Vert {g}_{k+1}\Vert {\Vert {d}_{k}\Vert}^{3}+{\alpha}_{k}L{\Vert {g}_{k+1}\Vert}^{2}{\Vert {d}_{k}\Vert}^{2}}{{\Vert {g}_{k+1}\Vert}^{2}\left(-\left(1-\sigma \right)\right){\Vert {g}_{k}\Vert}^{2}{\Vert {d}_{k}\Vert}^{2}}\right]\cdot {\Vert {g}_{k+1}\Vert}^{2}$

Since $\Vert {y}_{k}\Vert \le \Vert {g}_{k+1}\Vert +\Vert {g}_{k}\Vert $, so

$\begin{array}{l}{\phi}_{k}{\beta}_{k}^{RMAR}{g}_{k+1}^{\text{T}}{d}_{k}\\ \le \frac{-2}{1-\sigma}\left[\frac{L{\Vert {s}_{k}\Vert}^{2}\Vert {d}_{k}\Vert -0.8{\Vert {g}_{k+1}\Vert}^{2}\Vert {d}_{k}\Vert +{\alpha}_{k}L{\Vert {g}_{k+1}\Vert}^{2}\Vert {d}_{k}\Vert}{{\Vert {g}_{k+1}\Vert}^{2}{\Vert {g}_{k}\Vert}^{2}}\right]\cdot {\Vert {g}_{k+1}\Vert}^{2}\\ \le \frac{-2B}{1-\sigma}\left[\frac{LA-0.8{\gamma}^{2}+{\alpha}_{k}L{\omega}^{2}}{\stackrel{\xaf}{\gamma}\text{\hspace{0.05em}}{\stackrel{\xaf}{\omega}}^{2}}\right]\cdot {\Vert {g}_{k+1}\Vert}^{2}\end{array}$

where ${E}_{4}=\frac{2B}{1-\sigma}\left[\frac{LA-0.8{\gamma}^{2}+{\alpha}_{k}L{\omega}^{2}}{\stackrel{\xaf}{\gamma}\text{\hspace{0.05em}}{\stackrel{\xaf}{\omega}}^{2}}\right]$

$\therefore \text{\hspace{0.17em}}{\phi}_{k}{\beta}_{k}^{RMAR}{g}_{k+1}^{\text{T}}{d}_{k}\le -{E}_{4}{\Vert {g}_{k+1}\Vert}^{2}$ (22)

From (18), (20), (21) and (22) we get

$\begin{array}{c}{g}_{k+1}^{\text{T}}{d}_{k+1}\le -{\Vert {g}_{k+1}\Vert}^{2}+{E}_{3}{\Vert {g}_{k+1}\Vert}^{2}-{E}_{4}{\Vert {g}_{k+1}\Vert}^{2}\\ =-\left[1-\left({E}_{4}-{E}_{3}\right)\right]{\Vert {g}_{k+1}\Vert}^{2}\\ =-E{\Vert {g}_{k+1}\Vert}^{2}\end{array}$

with $E=1-\left({E}_{4}-{E}_{3}\right)$ and $0<{E}_{4}-{E}_{3}<1$.

So, it is proved that ${d}_{k+1}$ Satisfied the sufficient descent condition.

4. Converge Analysis

Let Assumption 2.1 and 2.2 hold. In [26] it is proved that for any conjugate gradient method with strong Wolfe line search conditions, it holds:

4.1. Lemma

Let Assumption 2.1 and 2.2 holds. Consider the method (2) and (5) where the
${d}_{k}$ Is a descent direction and α_{k} is received from the strong wolf line search. If

${\sum}_{k\ge 1}\frac{1}{{\Vert {d}_{k}\Vert}^{2}}}=\infty $.

Then

${\mathrm{lim}}_{k\to \infty}\mathrm{inf}\Vert {g}_{k}\Vert =0$.

4.2. Theorem

Suppose that assumption 2.1 and 2.2 holds. Consider the algorithm HFG were $0\le {\phi}_{k}\le 1$ and ${\alpha}_{k}$ is obtained by the strong Wolfe line search and ${d}_{k+1}$ is the descent direction. Then

${\mathrm{lim}}_{k\to \infty}\mathrm{inf}\Vert {g}_{k}\Vert =0$.

Proof. Because the descent condition holds, we have. So using lemma 3.1, it is sufficient to prove that is bounded above. From (10).

They proved that in [24] and [25].

,

And

.

Now for

,

By (4), we have.

Since and

,

with some mathematical calculation, we get

5. Numerical Experiments

In this section we selected some of test functions in Table 3 from CUTE library, along with other large scale optimization problems presented in Andrei [28] [29] and Bongartz et al. [30].

All codes are written in double precision FORTRAN Language and compiled Visual F90 (default compiler settings) on a Workstation Intel Pentium 4. The value of is always computed by cubic fitting procedure.

We selected 26 large scale unconstrained optimization problems in the extended

Table 3. It gives the comparison depending in the NOI and NOF between, and the proposed method.

Table 4. The percentage performance of the proposed methods.

Figure 1. The comparison between the three methods.

or generalized form. Each problem was tested three times for a gradually increasing number of variables: N = 1000, 5000 and 10,000, all algorithms implemented the strong Wolfe line search (3) and (4) conditions with and and the same stopping criterion is used.

In some cases, the computation stopped due to the failure of the line search to find the positive step size, and thus it was considered as a failure denoted by (F).

We record the number of iteration calls (NOI), the number of function evaluations calls (NOF), and the dimensions of test problems calls (N), for the purpose of our comparisons.

Table 3 gives the comparison depending in the NOI and NOF between, and the proposed method.

Table 4 gives the percentage performance of the proposed methods against and. We have seen that. Method saves (NOI 0.8%), (NOF 7.6%), and method saves (NOI 28.7%), (NOF 40.0%) compared with method.

While Figure 1 gives the comparison between, and, using a well-known Wood test function.

References

[1] Liu, A.J. and Li, S. (2014) New Hybrid Conjugate Gradient Method for Unconstrained Optimization. Applied Mathematics and Computation, 245, 36-43.

https://www.sciencedirect.com/science/article/abs/pii/S0096300314010637

https://doi.org/10.1016/j.amc.2014.07.096

[2] Salleh, Z. and Alhawarat, A. (2016) An Efficient Modification of the Hes-tenes-Stiefel Nonlinear Conjugate Gradient Method with Restart Property. Journal of Inequalities and Applications, 2016, Article No. 110.

https://link.springer.com/article/10.1186/s13660-016-1049-5

https://doi.org/10.1186/s13660-016-1049-5

[3] Wolfe, P. (1969) Convergence Condition for Ascent Methods. SIAM Review, 11, 226-235. https://doi.org/10.1137/1011036

[4] Wolfe, P. (1971) Convergence Conditions for Ascent Methods II: Some Corrections. SIAM Review, 13, 185-188. https://doi.org/10.1137/1013035

[5] Alhawarat, A., Mamat, M., Rivaie, M. and Salleh, Z. (2015) An Efficient Hybrid Conjugate Gradient Method with the Strong Wolfe-Powell Line Search. Mathematical Problems in Engineering, 2015, Article ID: 103517.

https://www.hindawi.com/journals/mpe/2015/103517

https://doi.org/10.1155/2015/103517

[6] Hestenes, M.R. and Stiefel, E. (1952) Methods of Conjugate Gradients for Solving Linear Systems. Journal of Research of the National Bureau of Standards, 49, 409-436. https://nvlpubs.nist.gov/nistpubs/jres/049/jresv49n6p409_A1b.pdf

https://doi.org/10.6028/jres.049.044

[7] Fletcher, R. and Reeves, C. (1964) Function Minimization by Conjugate Gradients. The Computer Journal, 7, 149-154. https://doi.org/10.1093/comjnl/7.2.149

https://academic.oup.com/comjnl/article/7/2/149/335311

[8] Polak, E. and Ribie’re, G. (1969) Note sur la convergence de me’thodes de directions conjugue’s. Revue Francsis d’Infermatique et de recherché Operationnelle, 3, 35-43.

http://www.numdam.org/item?id=M2AN_1969__3_1_35_0

https://doi.org/10.1051/m2an/196903R100351

[9] Polyak, B.T. (1969) The Conjugate Gradient Method in Extreme Problems. USSR Computational Mathematics and Mathematical Physics, 9, 94-112.

https://www.researchgate.net/publication/222365587

https://doi.org/10.1016/0041-5553(69)90035-4

[10] Fletcher, R. (1987) Practical Methods of Optimization. 2nd Edition, John Wiley & Sons, Inc., Hoboken. https://www.wiley.com/en-ba

[11] Liu, D. and Story, C. (1991) Efficient Generalized Conjugate Gradient Algorithms. Part 1: Theory. Journal of Optimization Theory and Applications, 69, 129-137.

https://link.springer.com/article/10.1007/BF00940464

https://doi.org/10.1007/BF00940464

[12] Dai, Y.H. and Yuan, Y. (1999) A Nonlinear Conjugate Gradient Method with a Strong Global Convergence Property. SIAM Journal on Optimization, 10, 177-182.

https://doi.org/10.1137/S1052623497318992

[13] Al-Naemi, Gh.M. and Hamed, E.T. (2013) New Conjugate Method with Wolfe Type Line Searches for Nonlinear Programming. Australian Journal of Basic and Applied Sciences, 7, 622-632. https://www.researchgate.net/publication/330686295

[14] Andrei, N. (2010) Acceleration Hybrid Conjugate Gradient Algorithm with Modified Secant Condition for Unconstrained Optimization. Numerical Algorithms, 54, 23-46. https://link.springer.com/article/10.1007/s11075-009-9321-0

https://doi.org/10.1007/s11075-009-9321-0

[15] Andrei, N. (2008) Another Nonlinear Conjugate Gradient Algorithm for Unconstrained Optimization. Optimization Methods and Software, 24, 89-104.

https://www.tandfonline.com/doi/abs/10.1080/10556780802393326

https://doi.org/10.1007/s11075-007-9152-9

[16] Yan, H., Chen, L. and Jiao, B. (2009) HS-LS-CD Hybrid Conjugate Gradient Algorithm for Unconstrained Optimization. 2nd International Workshop on Computer Science and Engineering, Vol. 1, Qingdao, 28-30 October 2009, 264-268.

https://ieeexplore.ieee.org/document/5403315

https://doi.org/10.1109/WCSE.2009.667

[17] Li, S. and Sun, Z.B. (2010) A New Hybrid Conjugate Gradient Method and Its Global Convergence for Unconstrained Optimization. International Journal of Pure and Applied Mathematics, 63, 84-93.

https://ijpam.eu/contents/2010-63-3/4/index.html

[18] Djordjevic, S.S. (2019) New Hybrid Conjugate Gradient Method as a Convex Combination of LS and CD Methods. Filomat, 31, 1813-1825.

https://doi.org/10.2298/FIL1706813D

[19] Djordjevic, S.S. (2018) New Hybrid Conjugate Gradient Method as a Convex Combination of HS and FR Conjugate Gradient Methods. Journal of Applied Mathematics and Computation, 2, 366-378. https://doi.org/10.26855/jamc.2018.09.002

https://www.researchgate.net/publication/328125868

[20] Djordjevi?, S.S. (2019) New Hybrid Conjugate Gradient Method as a Convex Combination of LS and FR Conjugate Gradient Methods. Acta Mathematica Scientia, 39, 214-228.

[21] Zheng, X.Y., Dong, X.L., Shi, J.R. and Yang, W. (2019) Further Comment on Another Hybrid Conjugate Gradient Algorithm for Unconstrained Optimization by Andrei. Numerical Algorithm, 1-6. https://doi.org/10.1007/s11075-019-00771-1

[22] Abdullahi, I. and Ahmad, R. (2016) Global Convergence Analysis of a Nonlinear Conjugate Gradient Method for Unconstrained Optimization Problems. Indian Journal of Science and Technology, 9, 1-9.

http://www.indjst.org/index.php/indjst/article/view/90175/74779

https://doi.org/10.17485/ijst/2016/v9i41/90175

[23] Livieris, I.E., Tampakas, V. and Pintelas, P. (2018) A Descent Hybrid Conjugate Gradient Method Based on the Memoryless BFGS Update. Numerical Algorithms, 79, 1169-1185. https://link.springer.com/article/10.1007/s11075-018-0479-1

https://doi.org/10.1007/s11075-018-0479-1

[24] Mandara, A.V., Mamat, M., Waziri, M.Y., Mohamed, M.A. and Yakubu, U.A. (2018) A New Conjugate Gradient Coefficient with Exact Line Search for Unconstrained Optimization. Far East Journal of Mathematical Sciences, 105, 193-206.

https://www.researchgate.net/publication/329522075

https://doi.org/10.17654/MS105020193

[25] Liu, J.K. and Li, S.J. (2014) New Hybrid Conjugate Gradient Method for Unconstrained Optimization. Applied Mathematics and Computation, 245, 36-43.

https://doi.org/10.1016/j.amc.2014.07.096

https://www.sciencedirect.com/science/article/abs/pii/S0096300314010637

[26] Yunus, R.B., Mamat, M. and Abashar, A. (2018) Comparative Study of Some New Conjugate Gradient Methods. UniSZA Research Conference (URC 2015), Kuala Terengganu, 14-16 April 2015, 616-621.

https://www.researchgate.net/publication/326301618

[27] Al-Naemi, Gh.M. and Ahmed, H.I. (2013) Modified Nonlinear Conjugate Gradient Algorithms with Application in Neural Networks. LAP Lambert Academic Publishing, Saarbrucken.

[28] Andrei, N. (2008) An Unconstrained Optimization Test Functions Collection. Advanced Modeling and Optimization, 10, 147-161.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.665.3152&rep=rep1&type=pdf

[29] Andrei, N. (2014) Test Functions for Unconstrained Global Optimization. 3-5.

[30] Bongartz, I., Conn, A.R., Gould, N. and Toint, P.L. (1995) CUTE: Constrained and Unconstrained Testing Environment. ACM Transactions on Mathematical Software, 21, 123-160. https://doi.org/10.1145/200979.201043