A Continuous Approach to Binary Quadratic Problems

Show more

1. Introduction

Binary quadratic programming (BQP) problem is a kind of typical combinatorial optimization problem, and has a variety of applications in computer aided design, traffic management, cellular mobile communication frequency allocation, operations research and engineering. And many combinatorial optimization problems with constraints can be transformed into BQP problems form by certain transformation, so these problems can be on behalf of kinds of important problems in combinatorial optimization. Hammer [1] pointed out that any integer programming problems where the objective function is quadratic or linear and the constraint condition is linear can be described as BQP problems. Glover et al. [2] have successfully used this transformation method to transform the quadratic knapsack problem into BQP problem for solving. Considering the feasibility of this transformation, BQP problem has more extensive practical application. Due to the wide application background of this problem and the difficulties caused by NP (Non-Deterministic Polynomial) hard properties, it is of great academic value to study effective algorithms for solving such problems.

Binary quadratic programming can be uniformly written in the following general

$\begin{array}{l}\mathrm{min}\text{\hspace{1em}}f\left(x\right)={x}^{\text{T}}Qx\\ \text{s}\text{.t}\text{.}\text{\hspace{1em}}{x}_{i}\in \left\{{l}_{i},{u}_{i}\right\},i=1,2,\cdots ,n\end{array}$ (1)

where Q is a real symmetric $n\times n$ matrix, and the objective function can be followed by another linear term, but it can be omitted because there is no substantial influence on the method introduced later. Of course, in the following discussion, we will mainly focus on the so-called 0 - 1 quadratic programming:

$\begin{array}{l}\mathrm{min}\text{\hspace{1em}}f\left(x\right)={x}^{\text{T}}Qx\\ \text{s}\text{.t}\text{.}\text{\hspace{1em}}{x}_{i}\in \left\{0,1\right\},i=1,2,\cdots ,n\end{array}$ (2)

However, the continuous method proposed later in this paper is also applicable to the general binary quadratic programming (1), which can be easily accomplished by simple variable transformation. It is well known that 0 - 1 programming problem will increase exponentially with the increase of the scale of the problem, and within limited computer resources and time the traditional solution method will be difficult to realize. Because of the wide application prospect and difficult characteristic of 0 - 1 programming, how to solve this kind of combinatorial optimization problem effectively has been the focus of many scholars. The classical methods for solving 0 - 1 programming problems are mainly divided into two types: accurate algorithms and random search algorithms. The accurate algorithms include implicit enumeration method, branch and bound algorithm, cutting plane algorithm, dynamic programming method, etc. Random search algorithms include genetic algorithm, simulated algorithm, artificial neural network algorithm, etc. [3] [4] [5] [6] [7] . But continuous method has become a new tendency of the 0 - 1 programming problem in recent years, whose advantage is to avoid inherent characteristics of the combinatorial optimization problem, and no longer limited to the size of the problem. This method is to transform the combinatorial optimization Aproblem into equivalent continuous optimization problem for solving, thus effectively avoid the so-called combination explosion problem, which can solve some large combinatorial optimization problems efficiently. While this approach is still in the exploratory stage with few research work in this area, but we think it is a noteworthy new research direction. Interested readers can see the literature [8] - [13] .

In this article, we will first turn the binary constraints of the 0 - 1 programming problem into equivalent complementarity problem, that is to say, a mathematical programming problem with equilibrium constraints (MPEC); then NCP function is used to transform the complementary condition into equivalent equation. Finally, the non-smooth equation is smoothed by using the aggregate function and we finally get a nonlinear smooth optimization problem.

2. Continuous Formulation of BQP

The simplest and most intuitive way to solve 0 - 1 planning problem is to adopt the round integration technology, that is, to consider 0 - 1 variable as a continuous variable, and to replace the variable constraint ${x}_{i}\in \left\{0,1\right\},i=1,2,\cdots ,n$ in the original problem with the following interval constraint:

$0\le {x}_{i}\le 1$ (3)

Then a relaxation optimization problem is obtained, and the components of the relaxation solution are rounded to the nearest discrete value. Although this method is easy to implement, but the theory of this technology is not strict. It cannot guarantee that the optimal solution obtained is the global optimal solution, or even a feasible solution.

The other method is the penalty function method, which is easy to see that ${x}_{i}\in \left\{0,1\right\},i=1,2,\cdots ,n$ is always equivalent to a complementarity condition:

${x}_{i}\left(1-{x}_{i}\right)=0,i=1,2,\cdots ,n$ (4)

Therefore, the punishment function can be constructed as follows:

$P\left(\alpha ,x\right)=\alpha {\displaystyle \underset{i=1}{\overset{n}{\sum}}}\text{\hspace{0.05em}}{x}_{i}\left(1-{x}_{i}\right)$ (5)

where $\alpha >0$ is a large enough penalty parameter. Adding this penalty function to the object function can obtain an equivalent continuous optimization problem:

$\mathrm{min}\text{\hspace{1em}}f\left(x\right)+\alpha {x}^{\text{T}}\left(1-x\right)$ (6)

Due to the strong concavity of $\alpha {x}^{\text{T}}\left(1-x\right)$, the object function in (6) is concave. The equivalence of (2) and (6) is based on the fact that the concavity function obtains the minimum value at a vertex, as well as ${x}^{\text{T}}\left(1-x\right)=0$ has included $x=0$ and $x=1$ the sum is included. Although the vertex of the feasible domain is not necessarily the vertex of the unit hypercube, if $\alpha $ is sufficiently large, the global minimum can only be obtained while ${x}^{\text{T}}\left(1-x\right)=0$ .

However, the specific value of $\alpha $ cannot be determined. If the value is too small, the penalty term will not play its role. If the value is too large, due to the strong concavity of the function, this penalty function method cannot effectively obtain the optimal solution of the original problem, so it is not an effective method. It is pointed out in literature [14] , but only a new viewpoint is provided, and no more effective method is given to solve the original problem. Therefore, it is not ideal to solve the original problem directly with (6). Great-Ming Ng later constructed a logarithmic barrier term:

$\Psi \left(x\right)=-{\displaystyle \underset{i=1}{\overset{n}{\sum}}}\left[\mathrm{ln}{x}_{i}+\mathrm{ln}\left(1-{x}_{i}\right)\right]$ (7)

This barrier term can be used to replace the boxed constraint conditions $0\le {x}_{i}\le 1,i=1,2,\cdots ,n$ as well as to avoid falling into the local minimum during the iteration, so as to find the global optimal solution for the original problem. At this point, the following unconstrained optimization problems can be obtained:

$\mathrm{min}\text{\hspace{1em}}f\left(x\right)+\alpha {x}^{\text{T}}\left(1-x\right)-\mu {\displaystyle \underset{i=1}{\overset{n}{\sum}}}\left[\mathrm{ln}{x}_{i}+\mathrm{ln}\left(1-{x}_{i}\right)\right]$ (8)

where $\alpha >0$ is a penalty parameter and $\mu >0$ is a barrier parameter. In a sense, (8) is an improvement on (6), but in essence they both replace ${x}_{i}\in \left\{0,1\right\},i=1,2,\cdots ,n$ constraints with penalty terms or logarithmic barrier terms.

For the BQP problem, another continuous method is proposed in this paper. Consider the following two sets of constraints:

$\begin{array}{l}0\le {x}_{i}\le 1,i=1,2,\cdots ,n,\\ {x}_{i}\left(1-{x}_{i}\right)=0,i=1,2,\cdots ,n.\end{array}$ (9)

Obviously, these two sets of constraint conditions and binary constraint conditions are equivalent. In fact, these two constraints skillfully reform complementary constraints:

${x}_{i}\ge 0,1-{x}_{i}\ge 0,{x}_{i}\left(1-{x}_{i}\right)=0,i=1,2,\cdots ,n$ (10)

The advantage of constructing constraint conditions in this way is that: the two constraint conditions of (9) work at the same time, and to each variable there is a unique complementary constraint condition ${x}_{i}\ge 0,1-{x}_{i}\ge 0,{x}_{i}\left(1-{x}_{i}\right)=0$

And more importantly, complementary constraints (10) and the general constraint conditions are different: for general complementary constraints, each component is a function about $x=\left\{{x}_{1},{x}_{2},\cdots ,{x}_{n}\right\}$, and for (10), the value of every ${F}_{i}\left(x\right)={x}_{i}\left(1-{x}_{i}\right)$ only has relationship with the values of ${x}_{i}$, nothing to do with the other variables. Namely, each of the complementary constraint of (10) is independent of each other.

Through the above, the problem (2) can be equivalently transformed into the following mathematical programming problems with complementary constraints:

$\begin{array}{l}\mathrm{min}\text{\hspace{1em}}f\left(x\right)={x}^{\text{T}}Qx\\ {x}_{i}\ge 0,1-{x}_{i}\ge 0,{x}_{i}\left(1-{x}_{i}\right)=0,i=1,2,\cdots ,n\end{array}$ (11)

For the above complementary constraints, we can use NCP function [15] to replace them further. To better describe NCP functions, we first introduce the concept of them:

Definition (NCP function): If a function is set $\varphi \mathrm{:}{R}^{2}\to R$, and $\varphi \left(a,b\right)=0\iff a\ge 0,b\ge 0,ab=0$, then the function of $\varphi $ is called an NCP function.

Therefore, we can replace the complementary constraint in (11) with a series of equation equations. Finally, we get the continuous optimization model as follows:

$\begin{array}{l}\mathrm{min}\text{\hspace{1em}}f\left(x\right)={x}^{\text{T}}Qx\\ \text{s}\text{.t}\text{.}\text{\hspace{1em}}\varphi \left({x}_{i},1-{x}_{i}\right)=0,i=1,2,\cdots ,n\end{array}$ (12)

Thus, the global optimal solution of (12) is the exact solution of problem (2).

3. Smoothing Method for BQP

According to the definition of NCP function given above, functions satisfying conditions can take various forms. Several commonly used NCP functions can be given:

$\begin{array}{l}{\varphi}_{M}=\mathrm{min}\left(a,b\right)\\ {\varphi}_{FB}=\sqrt{{a}^{2}+{b}^{2}}-\left(a+b\right)\\ \varphi \left(a,b\right)=-ab+\frac{1}{2}{\mathrm{min}}^{2}\left\{0,a+b\right\}\\ \varphi \left(a,b\right)={\left(a-b\right)}^{2}-a\left|a\right|-b\left|b\right|0\end{array}$ (13)

In the following section, we mainly choose the minimum value function to study.

It is easy to know $\mathrm{min}\left\{a,b\right\}=-\mathrm{max}\left\{-a,-b\right\}$, and for the non-differentiable maximum function:

${f}_{\mathrm{max}}\left(x\right)=\mathrm{max}\left\{{f}_{i}\left(x\right)\right\},i=1,2,\cdots ,n$ (14)

we often use aggregate function to carry out smooth approximation to it [16] [17] . The so-called aggregate function is a function in the following form:

${F}_{\mu}\left(x\right)={\mu}^{-1}\left\{\mathrm{ln}{\displaystyle \underset{i=1}{\overset{m}{\sum}}}\mathrm{exp}\left[\mu {f}_{i}\left(x\right)\right]\right\}$ (15)

where the smooth parameter $\mu $ is large enough, and ${f}_{i}\left(x\right),i=1,2,\cdots ,n$ are continuous differentiable (i.e., smooth) real value function. The aggregate function has very good properties [18] , which can be found from the following properties:

Lemma 3.1 The function ${F}_{\mu}\left(x\right)$ which is defined by (15) satisfies the following conditions:

1) ${f}_{\mathrm{max}}\left(x\right)<{F}_{\mu}\left(x\right)\le {f}_{\mathrm{max}}\left(x\right)+{\mu}^{-1}\mathrm{ln}m$

2) $\underset{\mu \to \infty}{\mathrm{lim}}{F}_{\mu}\left(x\right)={f}_{\mathrm{max}}(x)$

Proof 1) ${F}_{\mu}\left(x\right)$ can be equivalent to deformation ${F}_{\mu}\left(x\right)={f}_{\mathrm{max}}\left(x\right)+{\mu}^{-1}\mathrm{ln}{\displaystyle {\sum}_{i=1}^{m}}\mathrm{exp}\left\{\mu \left[{f}_{i}\left(x\right)-{f}_{\mathrm{max}}\left(x\right)\right]\right\}$, because ${f}_{i}\left(x\right)\le {f}_{\mathrm{max}}\left(x\right)$, so $0<\mathrm{exp}\left\{\mu \left[{f}_{i}\left(x\right)-{f}_{\mathrm{max}}\left(x\right)\right]\right\}\le 1$, and there is at least one indicator that makes ${f}_{i}\left(x\right)={f}_{\mathrm{max}}\left(x\right)$, so exist $\mathrm{exp}\left\{\mu \left[{f}_{i}\left(x\right)-{f}_{\mathrm{max}}\left(x\right)\right]\right\}=1$ . So we have

$0<{f}_{i}\left(x\right)-{f}_{\mathrm{max}}\left(x\right)\le {\mu}^{-1}\mathrm{ln}m$ (16)

2) When $\mu \to \infty $, the above conclusion can be proved by formula (16).

Based on the above introduction, we can handle the minimum function as follows [19] :

$\begin{array}{c}{\varphi}_{M}\left(a,b\right)=\mathrm{min}\left(a,b\right)=-\mathrm{max}\left(-a,-b\right)\approx -{F}_{\mu}\left(-a,-b\right)\\ =-{\mu}^{-1}\mathrm{ln}\left[\mathrm{exp}\left(-\mu a\right)+\mathrm{exp}\left(-\mu b\right)\right]\end{array}$ (17)

where the smooth parameter $\mu $ is large enough, as shown by lemma 3.1, when $\mu \to \infty $, $-{F}_{\mu}\left(-a,-b\right)$ and ${\varphi}_{M}\left(a,b\right)$ are equivalent.

For ease of expression, let’s call:

$\begin{array}{l}{\varphi}_{M}\left({x}_{i},1-{x}_{i}\right)=\mathrm{min}\left\{{x}_{i},1-{x}_{i}\right\},i=1,2,\cdots ,n\\ {\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)={\mu}^{-1}\mathrm{ln}\left[\mathrm{exp}\left(-\mu {x}_{i}\right)+\mathrm{exp}\left(-\mu \left(1-{x}_{i}\right)\right)\right],i=1,2,\cdots ,n\end{array}$ (18)

At this point, the problem (12) can be further transformed into an equivalent continuous and smooth nonlinear constraint optimization problem:

$\begin{array}{l}\mathrm{min}\text{\hspace{1em}}f\left(x\right)={x}^{\text{T}}Qx\\ \text{s}\text{.t}\text{.}\text{\hspace{1em}}{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)=0,i=1,2,\cdots ,n\end{array}$ (19)

By this continuous method, the global optimal solution of the original problem (2) can be solved by solving the corresponding nonlinear constrained optimization problem (19). Many mature optimization algorithms have been developed for solving nonlinear equality constraint optimization problems. This paper mainly uses the multiplier penalty function method to solve the above problems (19). Multiplier method is an optimization algorithm independently proposed by Powell and Hestenes in 1969 to solve equality constraint optimization problems. Later, it was extended to solve inequality constraint optimization problems by Rockfellar in 1973. The basic idea is to start from the Lagrangian function of the original problem and add an appropriate penalty function, so as to transform the original problem into a series of unconstrained optimization subproblems.

For convenience’s sake, let’s first:

$\begin{array}{l}\Phi \left(x\right)={\left({\varphi}_{M}\left({x}_{1},1-{x}_{1}\right),{\varphi}_{M}\left({x}_{2},1-{x}_{2}\right),\cdots ,{\varphi}_{M}\left({x}_{n},1-{x}_{n}\right)\right)}^{\text{T}}\\ \Phi \left(x,\mu \right)={\left({\varphi}_{\mu}\left({x}_{1},1-{x}_{1}\right),{\varphi}_{\mu}\left({x}_{2},1-{x}_{2}\right),\cdots ,{\varphi}_{\mu}\left({x}_{n},1-{x}_{n}\right)\right)}^{\text{T}}\end{array}$ (20)

The Lagrange function of problem (19) is [20] :

$L\left(x\mathrm{,}\lambda \right)=f\left(x\right)+{\lambda}^{\text{T}}\Phi \left(x\mathrm{,}\mu \right)$ (21)

where $\lambda ={\left({\lambda}_{1}\mathrm{,}{\lambda}_{2}\mathrm{,}\cdots \mathrm{,}{\lambda}_{n}\right)}^{\text{T}}$ is the Lagrangian multiplier vector, and $\left({x}^{\mathrm{*}}\mathrm{,}{\lambda}^{\mathrm{*}}\right)$ is set as KT pairs of problem (19), then it can be known from the optimality condition:

${\nabla}_{x}L\left({x}^{*},{\lambda}^{*}\right)=0,\text{\hspace{1em}}{\nabla}_{\lambda}L\left({x}^{*},{\lambda}^{*}\right)=\Phi \left({x}^{*},\mu \right)=0$ (22)

In addition, it is not difficult to find any x in the feasible domain is satisfied:

$L\left({x}^{*},{\lambda}^{*}\right)=f\left({x}^{*}\right)\le f\left(x\right)=f\left(x\right)+{\left({\lambda}^{*}\right)}^{\text{T}}\Phi \left(x,\mu \right)=L\left(x,{\lambda}^{*}\right)$ (23)

The above equation shows that if the multiplier vector ${\lambda}^{\mathrm{*}}$ is known, the problem (19) can be equivalently transformed into:

$\begin{array}{l}\mathrm{min}\text{\hspace{1em}}L\left(x,{\lambda}^{*}\right)\\ \text{s}\text{.t}\text{.}\text{\hspace{1em}}{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)=0,i=1,2,\cdots ,n\end{array}$ (24)

Then the external penalty function method is considered to solve the problem (24). The augmented objective function is

${L}_{\mu}\left(x,\lambda ,\alpha \right)=L\left(x,\lambda \right)+\frac{\alpha}{2}{\Vert \Phi \left(x,\mu \right)\Vert}^{2}=f\left(x\right)+{\lambda}^{\text{T}}\Phi \left(x,\mu \right)+\frac{\alpha}{2}{\Vert \Phi \left(x,\mu \right)\Vert}^{2}$ (25)

In this way, we can fix $\lambda =\stackrel{\xaf}{\lambda}$ and find a minimum of ${L}_{\mu}\left(x\mathrm{,}\stackrel{\xaf}{\lambda}\mathrm{,}\alpha \right)$, then change the value of $\lambda $ appropriately to find a new value $\stackrel{\xaf}{x}$, until we get the ${x}^{\mathrm{*}}$ and ${\lambda}^{\mathrm{*}}$ that we want. Specifically, when solving the minimum ${x}^{\left(k\right)}$ of $\mathrm{min}{L}_{\mu}\left(x,{\lambda}^{\left(k\right)},\alpha \right)$ in the k-th iteration of the unconstrained subproblem, the necessary conditions for taking the extreme values are known:

${\nabla}_{x}{L}_{\mu}\left({x}^{\left(k\right)},{\lambda}^{\left(k\right)},\alpha \right)=\nabla f\left({x}^{\left(k\right)}\right)+\nabla \Phi \left({x}^{\left(k\right)},\mu \right)\left[{\lambda}^{\left(k\right)}+\alpha \Phi \left({x}^{\left(k\right)},\mu \right)\right]=0$ (26)

And the KT-point of $\left({x}^{\mathrm{*}}\mathrm{,}{\lambda}^{\mathrm{*}}\right)$ the original problem satisfies:

$\nabla f\left({x}^{*}\right)+\nabla \Phi \left({x}^{*},\mu \right){\lambda}^{*}=0,\text{\hspace{1em}}\Phi \left({x}^{*},\mu \right)=0$ (27)

In order for $\left\{{x}^{\left(k\right)}\right\}\to {x}^{\mathrm{*}}$ and $\left\{{\lambda}^{\left(k\right)}\right\}\to {\lambda}^{\mathrm{*}}$, after comparing the above two expressions, the updating formula of the multiplier sequence $\left\{{\lambda}^{\left(k\right)}\right\}$ is:

${\lambda}_{i}^{\left(k+1\right)}={\lambda}_{i}^{\left(k\right)}+{\alpha}^{\left(k\right)}{\varphi}_{{\mu}^{\left(k\right)}}\left({x}_{i}^{\left(k\right)},1-{x}_{i}^{\left(k\right)}\right),i=1,2,\cdots ,n$ (28)

It can be seen from Equation (28) that the sufficient and necessary condition for $\left\{{\lambda}^{\left(k\right)}\right\}$ convergence is $\left\{{\varphi}_{\mu}\left({x}^{\left(k\right)},1-{x}^{\left(k\right)}\right)\right\}\to 0$ And then we proof that ${\varphi}_{\mu}\left({x}^{\left(k\right)},1-{x}^{\left(k\right)}\right)=0$ is also a necessary and sufficient condition for judging the KT-point $\left({x}^{\mathrm{*}}\mathrm{,}{\lambda}^{\mathrm{*}}\right)$ .

Theorem 3.1 Let ${x}^{\left(k\right)}$ be the minimum point of the unconstrained optimization problem:

$\mathrm{min}\text{\hspace{1em}}{L}_{\mu}\left(x,{\lambda}^{\left(k\right)},\alpha \right)=L\left(x,{\lambda}^{\left(k\right)}\right)+\frac{\alpha}{2}{\Vert \Phi \left(x,\mu \right)\Vert}^{2}$ (29)

then $\left({x}^{\left(k\right)}\mathrm{,}{\lambda}^{\left(k\right)}\right)$ being a KT-point to the (19) has a necessary and sufficient condition of ${\varphi}_{\mu}\left({x}^{\left(k\right)}\mathrm{,1}-{x}^{\left(k\right)}\right)=0$ .

Proof The necessity is obvious. The following proof is sufficient, since ${x}^{\left(k\right)}$ is a minimum of (29) and ${\varphi}_{\mu}\left({x}^{\left(k\right)}\mathrm{,1}-{x}^{\left(k\right)}\right)=0$, therefore, for any feasible point:

$f\left(x\right)={L}_{\mu}\left(x,{\lambda}^{\left(k\right)},\alpha \right)\ge {L}_{\mu}\left({x}^{\left(k\right)},{\lambda}^{\left(k\right)},\alpha \right)=f\left({x}^{\left(k\right)}\right)$ (30)

that is, ${x}^{\left(k\right)}$ is also a minimum of (19). On the other hand, it is noted that ${x}^{\left(k\right)}$ is also the stable point of (29), therefore

${\nabla}_{x}{L}_{\mu}\left({x}^{\left(k\right)},{\lambda}^{\left(k\right)},\alpha \right)=\nabla f\left({x}^{\left(k\right)}\right)+\nabla \Phi \left({x}^{\left(k\right)},\mu \right){\lambda}^{\left(k\right)}=0$ (31)

The above formula indicates that ${\lambda}^{\left(k\right)}$ is the Lagrangian multiplier vector with respect to ${x}^{\left(k\right)}$, that is to say $\left({x}^{\left(k\right)}\mathrm{,}{\lambda}^{\left(k\right)}\right)$ is also the KT-point of (19).

Based on the above discussion, before giving the detailed steps to solve the multiplier method of equation constraint problem (19), the following lemmas are given by studying some properties of ${\varphi}_{\mu}\left({x}_{i}\mathrm{,1}-{x}_{i}\right)$ .

Lemma 3.2 For each $i=1,2,\cdots ,n$, the following formula holds:

${\varphi}_{M}\left({x}_{i},1-{x}_{i}\right)-{\mu}^{-1}\mathrm{ln}2\le {\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\le {\varphi}_{M}\left({x}_{i},1-{x}_{i}\right)$ (32)

Proof According to the properties of aggregate function:

${f}_{\mathrm{max}}\left(-{x}_{i},-1+{x}_{i}\right)\le {F}_{\mu}\left({x}_{i},1-{x}_{i}\right)\le {f}_{\mathrm{max}}\left(-{x}_{i},-1+{x}_{i}\right)+{\mu}^{-1}\mathrm{ln}2$ (33)

thus $-{f}_{\mathrm{max}}\left(-{x}_{i},-1+{x}_{i}\right)-{\mu}^{-1}\mathrm{ln}2\le {\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\le -{f}_{\mathrm{max}}\left(-{x}_{i},-1+{x}_{i}\right)$ . That is to say, $\mathrm{min}\left\{{x}_{i},1-{x}_{i}\right\}-{\mu}^{-1}\mathrm{ln}2\le {\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\le \mathrm{min}\left\{{x}_{i},1-{x}_{i}\right\}$ . Therefore, ${\varphi}_{M}\left({x}_{i},1-{x}_{i}\right)-{\mu}^{-1}\mathrm{ln}2\le {\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\le {\varphi}_{M}\left({x}_{i},1-{x}_{i}\right)$ .

Lemma 3.3 For any $\mu >0$, there is

$\begin{array}{l}\left|{\varphi}_{M}\left({x}_{i},1-{x}_{i}\right)-{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\right|\le {\mu}^{-1}\mathrm{ln}2\\ \Vert \Phi \left(x\right)-\Phi \left(x,\mu \right)\Vert \le {\mu}^{-1}\sqrt{n}\mathrm{ln}2\end{array}$ (34)

Proof Easy to obtain by lemma 3.2 that: $0\le {\varphi}_{M}\left({x}_{i},1-{x}_{i}\right)-{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\le {\mu}^{-1}\mathrm{ln}2$, so

$\left|{\varphi}_{M}\left({x}_{i},1-{x}_{i}\right)-{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\right|\le {\mu}^{-1}\mathrm{ln}2,i=1,2,\cdots ,n.$ (35)

From the above equation we know:

$\begin{array}{c}0\le \Vert \Phi \left(x\right)-\Phi \left(x,\mu \right)\Vert ={\left\{{\displaystyle \underset{i=1}{\overset{n}{\sum}}}{\left[{\varphi}_{M}\left({x}_{i},1-{x}_{i}\right)-{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\right]}^{2}\right\}}^{\frac{1}{2}}\\ \le {\left[{\left({\mu}^{-1}\mathrm{ln}2\right)}^{2}n\right]}^{\frac{1}{2}}={\mu}^{-1}\sqrt{n}\mathrm{ln}2\end{array}$ (36)

Lemma 3.4 For each $i=1,2,\cdots ,n$, ${\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)$ is strictly convex on the interval $\left(-\infty \mathrm{,}+\infty \right)$ ; When $\mu \ge {10}^{2}$, ${\varphi}_{\mu}^{2}\left({x}_{i}\mathrm{,1}-{x}_{i}\right)$ is convex on the interval $\left(-\infty \mathrm{,0.4737}\right)\cup \left(\mathrm{0.5263,}+\infty \right)$ .

Proof According to the definition of ${\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)$, it can be seen that:

$\begin{array}{l}{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)={\mu}^{-1}\mathrm{ln}\left\{\mathrm{exp}\left(-\mu {x}_{i}\right)+\left[-\mu \left(1-{x}_{i}\right)\right]\right\}\\ {\varphi}_{\mu}^{2}\left({x}_{i},1-{x}_{i}\right)={\langle {\mu}^{-1}\mathrm{ln}\left\{\mathrm{exp}\left(-\mu {x}_{i}\right)+\left[-\mu \left(1-{x}_{i}\right)\right]\right\}\rangle}^{2}\end{array}$ (37)

let ${r}_{1}={\text{e}}^{-\mu {x}_{i}}/\left({\text{e}}^{-\mu {x}_{i}}+{\text{e}}^{-\mu \left(1-{x}_{i}\right)}\right),{r}_{2}={\text{e}}^{-\mu \left(1-{x}_{i}\right)}/\left({\text{e}}^{-\mu {x}_{i}}+{\text{e}}^{-\mu \left(1-{x}_{i}\right)}\right)$ . We could get,

$\begin{array}{l}{{\varphi}^{\u2033}}_{\mu}\left({x}_{i},1-{x}_{i}\right)=4\mu {r}_{1}{r}_{2}>0,i=1,2,\cdots ,n;\\ {\left({\varphi}_{\mu}^{2}\left({x}_{i},1-{x}_{i}\right)\right)}^{\prime \text{}\prime}=8\mu \left[{\left({r}_{1}-{r}_{2}\right)}^{2}/\left(4\mu {r}_{1}{r}_{2}\right)+{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\right].\end{array}$ (38)

So ${\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)$ is strictly convex on the interval $\left(-\infty \mathrm{,}+\infty \right)$ ; for ${\varphi}_{\mu}^{2}\left({x}_{i},1-{x}_{i}\right)$, when $\mu \ge {10}^{2}$ all the $x\notin \left(\mathrm{0.4737,0.5263}\right)$ satisfy ${\left({\varphi}_{\mu}^{2}\left({x}_{i},1-{x}_{i}\right)\right)}^{\prime \text{}\prime}>0$, so when $\mu \ge {10}^{2}$, ${\varphi}_{\mu}^{2}\left({x}_{i},1-{x}_{i}\right)$ is convex on the interval $\left(-\infty \mathrm{,0.4737}\right)\cup \left(\mathrm{0.5263,}+\infty \right)$ .

Lemma 3.5 For any $\mu >0$ and $\alpha >0$ that are large enough, as long as the region where x satisfies the following equation: ${\left({r}_{2}-{r}_{1}\right)}^{2}/\left(4\mu {r}_{1}{r}_{2}\right)+{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\mathrm{>0}$, the augmented Lagrange function ${L}_{\mu}\left(x\mathrm{,}\lambda \mathrm{,}\alpha \right)$ defined by Equation (25) is convex in this region.

Proof Let $l\left(x\mathrm{,}\lambda \right)={x}^{\text{T}}Qx+{\lambda}^{\text{T}}\Phi \left(x\mathrm{,}\mu \right)$ then the hesse matrix of ${L}_{\mu}\left(x\mathrm{,}\lambda \mathrm{,}\alpha \right)$ is

${\nabla}_{xx}^{2}{L}_{\mu}\left(x\mathrm{,}\lambda \mathrm{,}\alpha \right)={\nabla}_{xx}^{2}l\left(l\mathrm{,}p\right)+\alpha B\left(x\mathrm{,}p\right)$ (39)

where

$B\left(x,p\right)=Diag\left[{\left({{\varphi}^{\prime}}_{\mu}\left({x}_{i},1-{x}_{i}\right)\right)}^{2}+{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right){{\varphi}^{\u2033}}_{\mu}\left({x}_{i},1-{x}_{i}\right)\right],i=1,2,\cdots ,n$ (40)

and

$\begin{array}{l}{\left({{\varphi}^{\prime}}_{\mu}\left({x}_{i},1-{x}_{i}\right)\right)}^{2}+{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right){{\varphi}^{\u2033}}_{\mu}\left({x}_{i},1-{x}_{i}\right)\\ =8\mu \left[{\left({r}_{2}-{r}_{1}\right)}^{2}/\left(4\mu {r}_{1}{r}_{2}\right)+{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\right]\end{array}$ (41)

So we just have to satisfy ${\left({r}_{2}-{r}_{1}\right)}^{2}/\left(4\mu {r}_{1}{r}_{2}\right)+{\varphi}_{\mu}\left({x}_{i},1-{x}_{i}\right)\mathrm{>0}$, the matrix $B\left(x,p\right)$ is positive, therefore for sufficiently large $\alpha >0$, ${L}_{\mu}\left(x\mathrm{,}\lambda \mathrm{,}\alpha \right)$ is convex in the region that satisfies the above conditions. When $\mu >{10}^{3}$ the interval is ${\varphi}_{\mu}^{2}\left({x}_{i},1-{x}_{i}\right)$ is convex on the interval $\left(-\infty \mathrm{,0.4962}\right)\cup \left(\mathrm{0.5038,}+\infty \right)$ ; when $\mu >{10}^{4}$ the interval is $\left(-\infty \mathrm{,0.4995}\right)\cup \left(\mathrm{0.5005,}+\infty \right)$ .

4. Algorithm

According to lemma 3.5, we can use the augmented Lagrange penalty function to solve the problem (19). For a strictly monotonic increasing sequence $\left\{{\mu}^{k}\right\}$

and $\left\{{\alpha}^{k}\right\}$, the solution of unconstrained optimization $\underset{x\in {R}^{n}}{\mathrm{min}}{L}_{\mu}\left(x,{\lambda}^{\left(k\right)},{\alpha}^{\left(k\right)}\right)$ is

$\left\{{x}^{k}\right\}$, the corresponding Lagrange multiplier is $\left\{{\lambda}^{k}\right\}$, and is modified according to (28) in the iteration.

Based on the previous analysis, the basic algorithm for solving the problem (19) is as follows [19] :

Algorithm 1

Step 1 Given parameters ${\mu}^{\left(0\right)}>0$, ${\alpha}^{\left(0\right)}>0$, ${\epsilon}_{1}>0$, ${\epsilon}_{2}>0$ and ${\sigma}_{1}>1$, ${\sigma}_{2}>1$ . Select a starting point ${x}^{\left(s\mathrm{,0}\right)}\mathrm{,}{\lambda}^{\left(0\right)}$ and set ${t}^{\left(0\right)}=\Vert \Phi \left({x}^{\left(s,0\right)},{\mu}^{\left(0\right)}\right)\Vert $, k = 0;

Step 2 Using BFGS algorithm solve the unconstrained minimization problem

$\underset{x\in {R}^{n}}{\mathrm{min}}{L}_{\mu}\left(x,{\lambda}^{\left(k\right)},{\alpha}^{\left(k\right)}\right)$ with the starting point ${x}^{\left(s\mathrm{,}k\right)}$, and denote by ${x}^{\left(k\right)}$ its optimal solution;

Step 3 If $=\Vert \Phi \left({x}^{\left(k\right)},{\mu}^{\left(k\right)}\right)\Vert \le {\epsilon}_{1},\Vert f\left({x}^{\left(k\right)}\right)-f\left({x}^{\left(s,k\right)}\right)\Vert \le {\epsilon}_{2}$, set ${x}^{*}={x}^{\left(k\right)}$ go to Step 6; else go to Step 4;

Step 4 If $=\Vert \Phi \left({x}^{\left(k\right)},{\mu}^{\left(k\right)}\right)\Vert \le 0,1{t}^{\left(k\right)}$ go to Step 5; else update the parameter ${\alpha}^{\left(k+1\right)}={\sigma}_{1}{\alpha}^{\left(k\right)}$, ${\mu}^{\left(k+1\right)}={\sigma}_{2}{\mu}^{\left(k\right)}$ set $k=k+1$, and go to Step 1;

Step 5 Set ${t}^{\left(k\right)}=\Vert \Phi \left({x}^{\left(k\right)},{\mu}^{\left(k\right)}\right)\Vert $, ${x}^{\left(s,k+1\right)}={x}^{\left(k\right)}$, modify ${\lambda}^{\left(k+1\right)}$ according to Equation (28), set $k=k+1$, and go to Step 1;

Step 6 Finish.

The optimal solution sequence ${x}^{\left(k\right)}$ required by the above algorithm makes $\Vert \Phi \left({x}^{\left(k\right)},{\mu}^{\left(k\right)}\right)\Vert $ converge to 0 at a speed of at least 0.1. If this requirement is not met in an iteration, the penalty factor and smooth parameter should be increased automatically.

Then we introduce the BFGS algorithm mentioned in step 2. It is the most popular and effective quasi-newton methods at present. It was proposed by Broyden, Fletcher, Goldfarb, and Shanno independently in 1970, so it is called BFGS algorithm. We know that the basic idea of quasi-newton method is to replace the Hesse matrix ${G}^{\left(l\right)}={\nabla}^{2}f\left({x}^{\left(l\right)}\right)$ with one of its approximate matrixs ${B}^{\left(l\right)}$ in the steps of basic Newton method, and the following relational expression is required:

${B}^{\left(l+1\right)}{s}^{\left(l\right)}={y}^{\left(l\right)}$ (42)

where displacement ${s}^{\left(l\right)}={x}^{\left(s,l\right)}-{x}^{\left(s,l-1\right)}$, gradient difference ${y}^{\left(l\right)}={g}^{\left(l\right)}-{g}^{\left(l-1\right)}$, (42) is usually referred to as quasi-newton equation or quasi-newton condition.

Algorithm 2 (BFGS)

Step 1 Given parameters $\delta \in \left(\mathrm{0,1}\right)$, $\sigma \in \left(\mathrm{0,0.5}\right)$ select a starting point ${x}^{\left(s,l\right)}$ and positive matrices ${B}^{\left(0\right)}$ ( usually set as $G\left({x}_{0}\right)$ or the unit matrix ${I}_{n}$ ),and set $l=0$, $f\left(x\right)={L}_{\mu}\left(x,{\lambda}^{\left(k\right)},{\alpha}^{\left(k\right)}\right)$ ;

Step 2 Calculate ${g}^{\left(l\right)}=\nabla f\left({x}^{\left(s,l\right)}\right)$, if $\Vert {g}^{\left(l\right)}\Vert \le \epsilon $ go to Step 6, output ${x}^{\left(s,l\right)}$ as approximate minimum point; else go to Step 3;

Step 3 Solve the linear equations ${B}^{\left(l\right)}d=-{g}^{\left(l\right)}$ and get the solution ${d}^{\left(l\right)}$ ;

Step 4 Suppose ${m}^{\left(l\right)}$ is the minimum non-negative solution that satisfies the following inequality:

$f\left({x}^{\left(s,l\right)}\right)+{\delta}^{m}{d}^{\left(l\right)}\le f\left({x}^{\left(s,l\right)}\right)+\sigma {\delta}^{m}{\left({g}^{\left(l\right)}\right)}^{\text{T}}{d}^{(l)}$

set ${\delta}^{\left(l\right)}={\delta}^{{m}^{\left(l\right)}}$, ${x}^{\left(s,l+1\right)}={x}^{\left(s,l\right)}+{\alpha}^{\left(l\right)}{d}^{\left(l\right)}$ ;

Step 5 Set ${s}^{\left(l\right)}={x}^{\left(s,l\right)}-{x}^{\left(s,l-1\right)}$ ${y}^{\left(l\right)}={g}^{\left(l\right)}-{g}^{\left(l-1\right)}$,by the correction formula:

${B}^{\left(l+1\right)}=(\begin{array}{ll}{B}^{\left(l\right)}\hfill & {\left({y}^{\left(l\right)}\right)}^{\text{T}}{s}^{\left(l\right)}\le 0\hfill \\ {B}^{\left(l\right)}-\frac{{B}^{\left(l\right)}{s}^{\left(l\right)}{\left({s}^{\left(l\right)}\right)}^{\text{T}}{B}^{\left(l\right)}}{{\left({s}^{\left(l\right)}\right)}^{\text{T}}{B}^{\left(l\right)}{s}^{\left(l\right)}}+\frac{{y}^{\left(l\right)}{\left({y}^{\left(l\right)}\right)}^{\text{T}}}{{\left({y}^{\left(l\right)}\right)}^{\text{T}}{s}^{\left(l\right)}}\hfill & {\left({y}^{\left(l\right)}\right)}^{\text{T}}{s}^{\left(l\right)}>0\hfill \end{array}$ (43)

identify the ${B}^{\left(l+1\right)}$, set $l=l+1$ and go to Step 1;

Step 6 Finish.

5. Number Experiments

About BQP questions, the current relatively popular in the question bank is J. E. Beasleys OR-library (http://people.brunel.ac.uk/~%20mastjjb/jeb/orlib/bqpinfo.html), in this paper, we has carried on the numerical experiment on parts of question bank BQP questions, and the calculation results of our continuous algorithm are compared with those of the best solution of known, as shown in Table 1.

The first column Mu in the Table 1 is the question number; The second column m denotes the number of variables; The third is the numerical result of this paper; The fourth is the best known solution; The fifth column represents our numerical result as a percentage of the best solution.

Table 1. Comparison of numerical results.

It’s not hard to see from the Table 1 that in the process of solving these problems, the optimal solution obtained by our algorithm is basically close to the best known solution. Through the comparison in the table above, it can be seen that the proposed continuous algorithm is basically close to the known best solution for solving the general unconstrained 0 - 1 quadratic programming problem.

6. Conclusion

In this paper, we have reformulated an unconstrained BQP problem as a MPEC problem by the equivalent complementarity conditions of a binary vector. To seek a global minimizer of the resulting continuous optimization problem, we construct a global smoothing function and develop a global continuation algorithm via a sequence of unconstrained minimization. And the numerical results indicate that the continuous approach proposed is extremely promising, especially for those large problems, in terms of the quality of the optimal values generated and the computational work involved. In addition, this new approach can be extended to general nonlinear binary optimization problems, and we can leave it as a future research topic.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Hammer, P. and Rudeanu, S. (1968) Boolean Methods in Operations Research. Elsevier B.V., New York.

https://doi.org/10.1007/978-3-642-85823-9

[2] Glover, F., Kochenberger, G., Alidaee, B. and Amini, M.M. (1999) Tabu with Search Critical Event Memory: An Enhance Application for Binary Quadratic Programs. Kluwer Academic Publisher, Boston.

[3] Goffin, J.L. (1997) Solving Nonlinear Multicommodity Flow Problems by the Analytic Center Cutting Plane Method. Mathematical Programming, 76, 131-154.

https://doi.org/10.1007/BF02614381

[4] Christoph, H. and Rendl, F. (1998) Solving Quadratic (0,1)-Problems by Semidefinite Programs and Cutting Planes. Mathematical Programming, 82, 291-315.

https://doi.org/10.1007/BF01580072

[5] Wolkowicz, H. and Anjos, M.F. (2002) Semidefinite Programming for Discrete Optimization and Matrix Completion Problem. Discrete Applied Mathematics, 123, 513-577.

https://doi.org/10.1016/S0166-218X(01)00352-3

[6] Beasley, J.E. (1998) Heuristic Algorithms for the Unconstrained Binary Quadratic. Management School of Imperial College of U.K., London, 1-36.

[7] Glover, F. (2002) One-Pass Heuristics for Large-Scale Unconstrained Binary Quadratic Problems. European Journal of Operational Research, 137, 272-287.

https://doi.org/10.1016/S0377-2217(01)00209-0

[8] Warners, J.P. (1997) Potential Reduction Algorithms for Structured Combinatorial Optimization Problems. Operations Research Letters, 21, 55-64.

https://doi.org/10.1016/S0167-6377(97)00031-X

[9] Audet, C. (1997) Links between Linear Bilevel and Mixed 0 - 1 Programming Problems. Journal of Optimization Theory and Applications, 93, 273-300.

https://doi.org/10.1023/A:1022645805569

[10] Kiwiel, K.C. (2000) Bregman Proximal Relaxation of Large-Scale 0 - 1 Problems. Computational Optimization and Applications, 15, 33-44.

https://doi.org/10.1023/A:1008770914218

[11] Pardalos, P.M. (2000) Recent Developments and Trends in Global Optimization. Journal of Computational and Applied Mathematics, 124, 209-228.

https://doi.org/10.1016/S0377-0427(00)00425-8

[12] Pardalos, P.M. (1996) Continuous Approaches to Discrete Optimization Problems. Nonlinear Optimization and Applications. Plenum Publishing, New York, 313-328.

https://doi.org/10.1007/978-1-4899-0289-4_22

[13] Ng, K.M. (2002) A Continuation Approach for Solving Nonlinear Optimization Problems with Discrete Variables. Stanford University, Stanford.

[14] Horst, R., Pardalos, P.M. and Thoai, N.V. (1995) Introduction to Global Optimization. Kluwer Academic Publisher, Boston.

[15] Mangasarian, O.L. (1976) Equivalence of the Complementarity Problem to a System of Nonlinear Equations. SIAM Journal on Applied Mathematics, 31, 89-92.

https://doi.org/10.1137/0131009

[16] Li, X. (1991) An Aggregate Constrained Method for Nonlinear Programming. The Operations Research Society, 42, 1003-1010.

https://doi.org/10.1057/jors.1991.190

[17] Li, X. (1992) An Entropy-Based Aggregate Method for Minimax Optimization. Engineering Optimization, 18, 277-285.

https://doi.org/10.1080/03052159208941026

[18] Li, Y. and Li, X. (2009) Continuous Approaches to 0 - 1 Programming Problem with Applications. PhD Thesis, Dalian University of Technology, Dalian.

[19] Tao, T. and Li, X. (2006) Continuous Approaches to Discrete Optimum Design. PhD Thesis, Dalian University of Technology, Dalian.

[20] Ma, C. (2010) Optimization Method and Matlab Program Design. Science Press, Beijing.