A Smoothing Penalty Function Method for the Constrained Optimization Problem

Show more

1. Introduction

Many problems in industry design, management science and economics can be modeled as the following constrained optimization problem:

$\left(P\right)\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\begin{array}{ll}\mathrm{min}\hfill & f\left(x\right)\hfill \\ \text{\hspace{0.05em}}\text{s}\text{.t}\text{.}\text{\hspace{0.05em}}\hfill & {g}_{j}\left(x\right)\le \mathrm{0,}j=\mathrm{1,2,}\cdots \mathrm{,}m\mathrm{,}\hfill \end{array}$ (1)

where $f,{g}_{j}:{\Re}^{n}\to \Re ,j=1,2,\cdots ,m$ are continuously differentiable functions. Let ${\mathcal{F}}_{0}$ be the feasible solution set, that is, ${\mathcal{F}}_{0}=\left\{x\in {\Re}^{n}|{g}_{j}\left(x\right)\le 0,j=1,2,\cdots ,m\right\}$ . Here we assume that ${\mathcal{F}}_{0}$ is nonempty.

The penalty function methods based on various penalty functions have been proposed to solve problem (P) in the literatures. One of the popular penalty functions is the quadratic penalty function with the form

${F}_{2}\left(x,\rho \right)=f\left(x\right)+\rho {\displaystyle \underset{j=1}{\overset{m}{\sum}}}\mathrm{max}{\left\{{g}_{j}\left(x\right),0\right\}}^{2},$ (2)

where $\rho >0$ is a penalty parameter. Clearly, ${F}_{2}\left(x\mathrm{,}\rho \right)$ is continuously differentiable, but is not an exact penalty function. In Zangwill [1], an exact penalty function was defined as

${F}_{1}\left(x,\rho \right)=f\left(x\right)+\rho {\displaystyle \underset{j=1}{\overset{m}{\sum}}}\mathrm{max}\left\{{g}_{j}\left(x\right),0\right\}.$ (3)

The corresponding penalty problem is

$\left({P}_{\rho}\right)\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\begin{array}{ll}\mathrm{min}\hfill & {F}_{1}\left(x\mathrm{,}\rho \right)\hfill \\ \text{\hspace{0.05em}}\text{s}\text{.t}\text{.}\text{\hspace{0.05em}}\hfill & x\in {\Re}^{n}\mathrm{.}\hfill \end{array}$ (4)

We say that ${F}_{1}\left(x\mathrm{,}\rho \right)$ is an exact penalty function for Problem (P) partly because it satisfies one of the main characteristics of exactness, that is, under some constraint qualifications, there exists a sufficiently large ${\rho}^{\mathrm{*}}$ such that for each $\rho >{\rho}^{*}$ , the optimal solutions of Problem ( ${P}_{\rho}$ ) are all the feasible solutions of Problem (P), therefore, they are all the optimal solution of (P) (Di Pillo [2], Han [3] ).

The obvious difficulty with the exact penalty functions is that it is nondifferentiable, which prevents the use of efficient minimization methods that are based on Gradient-type or Newton-type algorithms, and may cause some numerical instability problems in its implementation. In practice, an approximately optimal solution to (P) is often only needed. Differentiable approximations to the exact penalty function have been obtained in different contexts such as in BeaTal and Teboulle [4], Herty et al. [5] and Pinar and Zenios [6]. Penalty methods based on functions of this class were studied by Auslender, Cominetti and Haddou [7] for convex and linear programming problems, and by Gonzaga and Castillo [8] for nonlinear inequality constrained optimization problems, respectively. In Xu et al. [9] and Lian [10], smoothing penalty functions are proposed for nonlinear inequality constrained optimization problems. This kind of functions is also described by Chen and Mangasarian [11] who constructed them by integrating probability distributions to study complementarity problems, by Herty et al. [5] to study the optimization problems with box and equality constraints, and by Wu et al. [12] to study global optimization problem. Meng et al. [13] propose two smoothing penalty functions to the exact penalty function

${F}_{3}\left(x,\rho \right)=f\left(x\right)+\rho {\displaystyle \underset{i=1}{\overset{m}{\sum}}}\sqrt{\mathrm{max}\left\{{g}_{j}\left(x\right),0\right\}}.$ (5)

In Wu et al. [14] and Lian [15], some smoothing techniques for (5) are also given.

Moreover, smoothed penalty methods can be applied to solve optimization problems with large scale such as network-structured problems and minimax problems in [6], and traffic flow network models in [5].

In this paper, we consider another simpler method for smoothing the exact penalty function ${F}_{1}\left(x\mathrm{,}\rho \right)$ , and construct the corresponding smoothed penalty problem. We show that our smooth penalty function can approximate ${F}_{1}\left(x\mathrm{,}\rho \right)$ well and has better smoothness. Based on our smooth penalty function, we give for (P) a simple smoothed penalty algorithm which is different from the existing literature in that the convergence of it can be obtained without the compactness of the feasible region of (P). We also give an approximate algorithm which enjoys some convergence under mild conditions.

The rest of this paper is organized as follows. In Section 2, we propose a method for smoothing the ${l}_{1}$ exact penalty function (3). The approximation function we give is convex and smooth. We give some error estimates among the optimal objective function values of the smoothed penalty problem, of the nonsmooth penalty problem and of the original constrained optimization problem. In Section 3, we present an algorithm to compute a solution to (P) based on our smooth penalty function and show the convergence of the algorithm. In particular, we give an approximate algorithm. Some computational aspects are discussed and some experiment results are given in Section 4.

2. A Smooth Penalty Function

We define a function ${P}_{\rho \epsilon}\left(t\right)$ :

${P}_{\rho \epsilon}\left(t\right)=\{\begin{array}{ll}\frac{1}{2}\epsilon {\text{e}}^{\frac{\rho t}{\epsilon}},\hfill & \text{\hspace{0.05em}}\text{if}\text{\hspace{0.05em}}\text{\hspace{0.17em}}t\le 0;\hfill \\ \rho t+\frac{1}{2}\epsilon {\text{e}}^{-\frac{\rho t}{\epsilon}},\hfill & \text{\hspace{0.05em}}\text{if}\text{\hspace{0.05em}}\text{\hspace{0.17em}}t>0,\hfill \end{array}$ (6)

given any $\epsilon >0,\rho >0$ .

Let ${P}_{\rho}\left(t\right)=\rho \mathrm{max}\left\{t,0\right\}$ , for any $\rho >0$ . It is easy to show that $\underset{\epsilon \to 0}{\mathrm{lim}}{P}_{\rho \epsilon}\left(t\right)={P}_{\rho}\left(t\right)$ .

The function ${P}_{\rho \epsilon}\left(t\right)$ is different from the function ${P}_{\epsilon}\left(t\right)$ given in [16] since here we use two parameters $\rho $ and. The function has the following abstractive properties.

(I) is at least twice continuously differentiable in t for. In fact, we have that

(7)

and

(8)

(II) is convex and monotonically increasing in t for any given.

Property (II) can follow from (I) immediately.

Note that. Consider the penalty function for (P) given by

(9)

where is a penalty parameter. Clearly, is at least twice continuously differentiable in any, if are all at least twice continuously differentiable, and is convex, if are all convex functions.

The corresponding penalty problem to is given as follows:

.

Since for any given, we will first study the relationship between () and ().

The following Lemma is easily to prove.

Lemma 2.1 For any given, and,

Two direct results of Lemma 2.1 are given as follows.

Theorem 2.1 Let be a sequence of positive numbers and assume is a solution to for some given. Let be an accumulating point of the sequence, then is an optimal solution to.

Theorem 2.2 Let be an optimal solution of () and an optimal solution of (). Then

It follows from this conclusion that can approximate well.

Theorem 2.1 and Theorem 2.2 show that an approximate solution to () is also an approximate solution to () when is sufficiently small.

Definition 2.1 A point is a -feasible solution or a -solution if,

Under this definition, we get the following result.

Theorem 2.3 Let be an optimal solution of () and an optimal solution of (). Furthermore, let be feasible to (P) and be -feasible to (P). Then,

Proof Since is -feasible to (P), then

(10)

Since is an optimal solution to (P), we have

Then by Theorem 2.2, we get

Thus,

Therefore, by (10), we obtain that

This completes the proof.

By Theorem 2.3, if an approximate optimal solution of () is -feasible, then it is an approximate optimal solution of (P).

For penalty function, there is a well known result of its exactness (see [3] ):

(*) There exists a, such that whenever, each optimal solution of is also an optimal solution of (P).

From the above conclusion, we can get the following result.

Theorem 2.4 For the constant in (*), let be an optimal solution of (). Suppose that for any, is an optimal solution of () where, then is a - feasible solution of (P).

Proof Suppose the contrary that the theorem does not hold, then there exists a, and, such that is an optimal solution for (), and the set is not empty.

Since is an optimal solution for () when, then for any, it holds that

(11)

Because is an optimal solution of (),is a feasible solution of (P). Therefore, we have that

On the other side,

which contradicts (11).

Theorem 2.4 implies that any optimal solution of the smoothed penalty problem is an approximately feasible solution of (P).

3. The Smoothed Penalty Algorithm

In this section, we give an algorithm based on the smoothed penalty function given in Section 2 to solve the nonlinear programming problem (P).

For, we denote

For Problem (P), let, for some. We consider the following algorithm.

Algorithm 3.1

Step 1. Given, and, let, go to Step 2.

Step 2. Take as the initial point, and compute

(12)

Step 3. Let, and

(13)

Let, go to Step 2.

We now give a convergence result for this algorithm under some mild conditions. First, we give the following assumption.

(A1) For any, , the set.

By this assumption, we obtain the following lemma firstly.

Lemma 3.1 Suppose that (A1) holds. Let be the sequence generated by Algorithm 3.1. Then there exists a natural number, such that for any,

Proof Suppose the contrary that there exists a subset, such that for any k, , and

Then there exists, such that for any, , where is given in Theorem 2.4. Therefore, by Theorem 2.4, when, it holds that

This contradicts.

Remark 3.1 From Lemma 3.1 we know that remains unchanged after finite iterations.

Lemma 3.2 Suppose that (A1) holds. Let be the sequence generated by Algorithm 3.1, and be any limit point of. Then

Lemma 3.3 Suppose that (A1) holds. Let be the sequence generated by Algorithm 3.1. Then for any k,

(14)

From Lemma 3.2 and Lemma 3.3, we have the following theorem.

Theorem 3.1 Suppose that (A1) holds. Let be the sequence generated by Algorithm 3.1. If is any limit point of, then is the optimal solution of (P).

Before giving another conclusion, we need the following assumption.

(A2) The function with respect to is lower semi-continuous at.

Theorem 3.2 Suppose that (A1) and (A2) holds. Let be the sequence generated by Algorithm 3.1. Then

Proof By Lemma 3.1, there exists, such that for any,. Thus,

Therefore,

From Assumption (A2), we know that. Therefore,

(15)

On the other side, by Lemma 3.3, when, we have

Then

(16)

Therefore, from (15) and (16),

Therefore,

The above theorem is different from the conventional conclusion in other literatures with respect to the convergence of penalty method.

In the following we give an approximate smoothed penalty algorithm for Problem (P).

Algorithm 3.2

Step 1. Let,. Given, and, let, go to Step 2.

Step 2. Take as the initial point, and compute

Step 3. If is an -feasible solution of (P), and, then stop. Otherwise, update and by applying the following rules:

if is an -feasible solution of (P), and, let, and

;

if is not an -feasible solution of (P), let and.

Let, go to Step 2.

Remark 3.1 By the analysis of the error estimates in Section 2, We know that whenever the penalty parameter is larger than some threshold, then for any, an optimal solution of the smoothed penalty problem is also an -feasible solution, which conversely gives an error bound for the optimal objective function value of the original problem.

4. Computational Aspects and Numerical Results

In this section, we will discuss some computational aspects and give some numerical results.

We apply Algorithm 3.2 to nonconvex nonlinear programming problem (P), for which we do not need to compute a global optimal solution but a local one. And in this case, we can also obtain the convergence by the following theorem.

For, we denote and

Theorem 4.1 Suppose Algorithm 3.2 does not terminate after finite iterations and the sequence is bounded. Then is bounded and any limit point of is feasible to (P), and there exist, and, such that

(17)

Proof First we show that is bounded. By the assumptions, there is some number L such that

Suppose to the contrary that is unbounded. Choose a subsequence of if necessary and we assume that

Then we get

which results in a contradiction since f is coercive.

We now show that any limit point of belongs to. Without loss of generality, we assume that

Suppose to the contrary that, then there exists some such that. Note that, are all continuous.

Note that

(18)

If, then for any k, the set is not empty. Because J is finite, then there exists a such that for any k is sufficiently large,.

It follows from (18) that

which contradicts the assumption that is bounded.

We now show that (17) holds.

Since for,

that is,

(19)

For, set

(20)

then,.

It follows from (19) and (20) that

Let

Then we have

When, we have that, , and

For, we get that. Therefore,. So (17) holds, and this completes the proof.

Theorem 4.1 implies that the sequence generated by Algorithm 3.2 may converge to a FJ point [17] to (P). The speed of convergence depends on the speed of the subprogram employed in Step 2 to solve the unconstrained optimization problem. Since the Function is continuously differentiable, we may use a Gradient-type method to get rapid convergence of our algorithm. In the following we will see some numerical experiments.

Example 4.1 (Hock and Schittkowski [18] ) Consider

The optimal solution to (P4.1) is given by with the optimal objective function value 22.627417. Let, , , , and in Algorithm 3.2. We choose for -feasibility. Numerical results for (P4.1) are given in Table 1, where for Table 1 we use a Gradient-type algorithm to solve the subproblem in Step 2.

Example 4.2 (Hock and Schittkowski [18] ) Consider

The optimal solution to (P4.2) is given by with the optimal objective function value −44. Let, , , , and in Algorithm 3.2. We choose for -feasibility. Numerical results for (P4.2) are given in Table 2, where for Table 2 we use a Gradient-type algorithm to solve the subproblem in Step 2.

Example 4.3 (Hock and Schittkowski [18] ) Consider

Table 1. Results with a starting point.

Table 2. Results with a starting point.

Table 3. Results with a starting point.

The optimal solution to (P4.3) is given by

with the optimal objective function value 680.6300573. Let, , , , and in Algorithm 3.2. We choose for -feasibility. Numerical results for (P4.3) are given in Table 3, where for Table 3 we use a Gradient-type algorithm to solve the subproblem in Step 2.

From the above classical examples, we can see that our approximate algorithm can produce the approximate optimal solutions of the corresponding problem successfully. But the convergent speed can be improved if we use the Newton-type method in Step 2 of Algorithm 3.2, which will be researched in our future work.

Funding

This research is supported by the Natural Science Foundations of Shandong Province (ZR2015AL011).

References

[1] Zangwill, W.I. (1967) Nonlinear Programming via Penalty Function Management. Science, 13, 334-358.

https://doi.org/10.1287/mnsc.13.5.344

[2] Di Pillo, G., Liuzzi, G. and Lucidi, S. (2011) An Exact Penalty-Lagrangian Approach for Large-Scale Nonlinear Programming. Optimization, 60, 223-252.

https://doi.org/10.1080/02331934.2010.505964

[3] Han, S.P. and Mangasarian, O.L. (1979) Exact Penalty Functions in Nonlinear Programming. Mathematical Programming, 17, 251-269.

https://doi.org/10.1007/BF01588250

[4] Ben-tal, A. and Teboulle, M. (1989) A Smoothing Technique for Non-Differentiable Optimization Problems. Lecture Notes in Mathematics, Springer Verlag, Berlin, Vol. 1405, 1-11.

https://doi.org/10.1007/BFb0083582

[5] Herty, M., Klar, A., Singh, A.K. and Spellucci, P. (2007) Smoothed Penalty Algorithms for Optimization of Nonlinear Models. Computational Optimization and Applications, 37, 157-176.

https://doi.org/10.1007/s10589-007-9011-6

[6] Pinar, M.C. and Zenios, S.A. (1994) On Smoothing Exact Penalty Functions for Convex Constarained Optimization. SIAM Journal on Optimization, 4, 486-511.

https://doi.org/10.1137/0804027

[7] Auslender, A., Cominetti, R. and Haddou, M. (1997) Asymptotic Analysis for Penalty and Barrier Methods in Convex and Linear Programming. Mathematics of Operational Research, 22, 43-62.

https://doi.org/10.1287/moor.22.1.43

[8] Gonzaga, C.C. and Castillo, R.A. (2003) A Nonlinear Programming Algorithm Based on Non-Coercive Penalty Functions. Mathematical Programming, 96, 87-101.

https://doi.org/10.1007/s10107-002-0332-z

[9] Xu, X.S., Meng, Z.Q., Sun, J.W., Huang, L.G. and Shen, R. (2013) A Second-Order Smooth Penalty Function Algorithm for Constarained Optimization Problems. Computational Optimization and Application, 55, 155-172.

https://doi.org/10.1007/s10589-012-9504-9

[10] Lian, S.J. (2012) Smoothing Approximation to L1 Exact Penalty Function for Inequality Constrained Optimization. Applied Mathematics and Computation, 219, 3113-3121.

https://doi.org/10.1016/j.amc.2012.09.042

[11] Chen, C. and Mangasarian, O.L. (1995) Smoothing Methods for Convex Inequalities and Linear Complementarity Problems. Mathematical Programming, 71, 51-69.

https://doi.org/10.1007/BF01592244

[12] Wu, Z.Y., Lee, H.W.J., Bai, F.S. and Zhang, L.S. (2017) Quadratic Smoothing Approximation to L1 Exact Penalty Function in Global Optimization. Journal of Industrial and Management Optimization, 1, 533-547.

https://doi.org/10.3934/jimo.2005.1.533

[13] Meng, Z.Q., Dang, C.Y. and Yang, X.Q. (2006) On the Smoothing of the Square-Root Exact Penalty Function for Inequality Constrained Optimization. Computational Optimization and Applications, 35, 375-398.

https://doi.org/10.1007/s10589-006-8720-6

[14] Wu, Z.Y., Bai, F.S., Yang, X.Q. and Zhang, L.S. (2004) An Exact Lower Order Penalty Function and Its Smoothing in Nonlinear Programming. Optimization, 53, 51-68.

https://doi.org/10.1080/02331930410001662199

[15] Lian, S.J. (2016) Smoothing of the Lower-Order Exact Penalty Function for Inequality Constrained Optimization. Journal of Inequalities and Applications, 2016, Article No. 185.

https://doi.org/10.1186/s13660-016-1126-9

[16] Liu, B.Z. (2009) On Smoothing Exact Penalty Functions for Nonlinear Constrained Optimization Problems. Journal of Applied Mathematics and Computing, 30, 259-270.

https://doi.org/10.1007/s12190-008-0171-z

[17] Mangasarian, O.L. and Fromovitz, S. (1967) The Fritz John Necessary Optimality Conditions in the Presence of Equality and Inequality Constraints. Journal of Mathematical Analysis and Applications, 17, 37-47.

https://doi.org/10.1016/0022-247X(67)90163-1

[18] Hock, W. and Schittkowski, K. (1981) Test Examples for Nonlinear Programming Codes. Lecture Notes in Economics and Mathematical Systems, Vol. 187, Springer-Verlag, Berlin, Heidelberg, New York.

https://doi.org/10.1007/978-3-642-48320-2