Recently, the integrated optimal control and parameter estimation (IOCPE) algorithm has been proposed for solving nonlinear optimal control problems in both the discrete-time deterministic and stochastic cases. In essence, the IOCPE algorithm originates from the dynamic integrated system optimization and parameter estimation (DISOPE) algorithm. With the DISOPE algorithm, optimal control of deterministic dynamical systems, in both continuous time and discrete time, has been widely discussed. From this point of view, the applications of the DISOPE algorithm are well established. Dating back to the 1970s, the integrated system optimization and parameter estimation (ISOPE) algorithm was proposed for solving static optimization problems. Since then, the dynamic version of the ISOPE algorithm has developed rapidly.
In fact, the basic idea behind the ISOPE, DISOPE and IOCPE algorithms is the principle of model-reality differences. Because the structure of the nonlinear optimal control problem is complex and solving it is computationally demanding, a simplified model of the original optimal control problem is solved iteratively instead. By adding adjusted parameters to the model used, the differences between the model and the real plant can be measured. This measurement is repeated, in turn, to update the optimal solution of the model. Once convergence is achieved, the iterative solution approximates the true optimal solution of the original optimal control problem, in spite of model-reality differences. Besides, for solving the discrete-time nonlinear stochastic optimal control problem, Kalman filtering theory is combined with the principle of model-reality differences in order to carry out state estimation and system optimization.
By virtue of the evolution of these algorithms, a feedback optimal control law is available for solving nonlinear optimal control problems, and the effectiveness of these algorithms has been well confirmed. Nevertheless, the applicability of the open-loop optimal control law in these algorithms should be investigated so that the algorithms can be used more widely. This is because open-loop optimal control sequences can be generated by taking advantage of state-of-the-art nonlinear programming (NLP) solvers. In particular, as an efficient optimization technique, the conjugate gradient method has been used to solve optimal control problems over the last few decades. This motivates us to explore the conjugate gradient method within the IOCPE algorithm.
Hence, this paper discusses the application of the conjugate gradient method to solving the nonlinear optimal control problem, where model-reality differences are taken into account. First, the model-based optimal control problem, which is simplified from the nonlinear optimal control problem, is constructed. Following this, the Hamiltonian function is defined and the augmented cost function is obtained. Then, the set of necessary conditions for optimality is derived. Consequently, the modified model-based optimal control problem is converted into a nonlinear optimization problem. By applying the conjugate gradient approach, the nonlinear optimization problem is solved and the optimal control sequences are generated. With this open-loop control law, the dynamical system is optimized and the cost function is evaluated. For illustration, optimal control of an economic growth problem is discussed. The results obtained show the applicability of the algorithm proposed.
The paper is organized as follows. In Section 2, the problem statement is described briefly, and the simplified model of the nonlinear optimal control problem is discussed. In Section 3, system optimization with parameter estimation is discussed further. The use of the conjugate gradient approach in solving the model-based optimal control problem is presented, and the calculation procedure is summarized as an iterative algorithm. In Section 4, an economic growth problem is solved and the results are reported. Finally, concluding remarks are made.
2. Problem Statement
Consider a general discrete-time optimal control problem, given by

$$\min_{u(k)} J_0(u) = \varphi(x(N)) + \sum_{k=0}^{N-1} L(x(k), u(k), k) \quad \text{subject to} \quad x(k+1) = f(x(k), u(k), k), \quad x(0) = x_0 \tag{1}$$

where $u(k)$ and $x(k)$ are, respectively, the control sequences and the state sequences. Here, $f$ represents the real plant, $L$ is the cost under summation and $\varphi$ is the terminal cost, whereas $J_0$ is the scalar cost function and $x_0$ is the known initial state vector. It is assumed that all functions in Equation (1) are continuously differentiable with respect to their respective arguments.
This problem is regarded as the real optimal control problem and is referred to as Problem (P). Because the structure of Problem (P) is complex and nonlinear, solving it requires efficient computational techniques. From this point of view, a simplified model of Problem (P) is solved instead in order to approximate the true optimal solution of Problem (P). Therefore, let us define this simplified model-based optimal control problem as follows:
where $\gamma(k)$ and $\alpha(k)$ are introduced as the adjusted parameters, whereas A is an $n \times n$ transition matrix and B is an $n \times m$ control coefficient matrix. Besides, $S(N)$ and Q are positive semi-definite matrices, and R is a positive definite matrix. Here, $J_1$ is the scalar cost function.
This problem is referred to as Problem (M).
Notice that, because of the different structures and parameters, solving Problem (M) alone, without the adjusted parameters, would not yield the optimal solution of Problem (P). However, by adding the adjusted parameters to Problem (M), the differences between the real plant and the model used can be calculated. In this way, solving Problem (M) iteratively can give the correct optimal solution of Problem (P), in spite of model-reality differences.
3. System Optimization with Parameter Estimation
Now, introduce an expanded optimal control problem, which is referred to as Problem (E), given by
where and are introduced to separate the sequences of control and state in the optimization problem from the respective signals in the parameter estimation problem, and denotes the usual Euclidean norm. The terms and with are introduced to improve the convexity and to facilitate the convergence of the resulting iterative algorithm. Note that the algorithm is designed so that the constraints and are satisfied upon termination of the iterations, assuming that convergence is achieved. Moreover, the state constraint and the control constraint are used for the computation of the parameter estimation and matching scheme, while the corresponding state constraint and control constraint are reserved for optimizing the model-based optimal control problem. In this way, system optimization and parameter estimation are mutually integrated.
3.1. Necessary Conditions for Optimality
Define the Hamiltonian function for Problem (E) by
where , and are modifiers. Then, the augmented cost function becomes
where and are the appropriate multipliers to be determined later.
Applying the calculus of variations, the following necessary conditions for optimality are obtained:
1) Stationary condition:
2) Co-state equation:
3) State equation:
4) Boundary conditions:
5) Adjusted parameter equations:
6) Modifier equations:
with and .
7) Separable variables:
, , . (9)
Notice that the parameter estimation problem is defined by Equation (7) and the computation of the multipliers is given by Equation (8). Indeed, the necessary conditions defined by Equations (6a) to (6d) are the optimality conditions for the modified model-based optimal control problem.
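As a minimal sketch of the parameter estimation step, assume the adjusted parameter equations take the usual matching form: α(k) absorbs the mismatch between the plant f and the linear model, and γ(k) absorbs the mismatch between the stage cost L and the quadratic model cost. The plant `f`, the cost `L`, the matrices `A`, `B`, `Q`, `R`, and the trajectories `z`, `v` below are hypothetical placeholders chosen for illustration, not the example of this paper.

```python
import numpy as np

# Hypothetical nonlinear plant and stage cost (illustrative only).
def f(x, u):
    return np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * (u[0] - np.sin(x[0]))])

def L(x, u):
    return 0.5 * (x @ x) + 0.05 * (u @ u) + 0.1 * x[0] ** 4

# Assumed linear-quadratic model data.
A = np.array([[1.0, 0.1], [-0.1, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])

def estimate_parameters(z, v):
    """Return alpha(k), gamma(k) so that the model matches the plant along (z, v)."""
    alpha = np.array([f(zk, vk) - A @ zk - B @ vk for zk, vk in zip(z[:-1], v)])
    gamma = np.array([L(zk, vk) - 0.5 * (zk @ Q @ zk + vk @ R @ vk)
                      for zk, vk in zip(z[:-1], v)])
    return alpha, gamma

# Roll out the plant under an arbitrary control to obtain a trajectory (z, v).
N = 10
v = 0.5 * np.ones((N, 1))
z = np.zeros((N + 1, 2))
z[0] = [1.0, 0.0]
for k in range(N):
    z[k + 1] = f(z[k], v[k])

alpha, gamma = estimate_parameters(z, v)

# With alpha(k) added, the linear model reproduces the plant's next state.
model_next = np.array([A @ z[k] + B @ v[k] + alpha[k] for k in range(N)])
```

Under this assumed form, the adjusted parameters carry all of the model-reality difference along the current trajectory, which is why they must be recomputed whenever the trajectory changes.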
3.2. Modified Model-Based Optimal Control Problem
The modified model-based optimal control problem, which is referred to as Problem (MM), is given by
with the specified and , where the boundary conditions are given by and with the specified multiplier .
3.3. Open-Loop Optimal Control Law
For simplicity, define Problem (MM) as an equivalent nonlinear optimization problem with the initial control , given by
subject to for (11)
where the admissible control variable u is set to be
Refer to this problem as Problem (N). To proceed, note that Problem (N) can be solved once the state Equation (6c) is solved forward and the costate Equation (6b) is solved backward with the corresponding control sequences u. In addition, the gradient function for the objective function is evaluated from
which can be calculated from the Hamiltonian function (4) and the stationary condition (6a) once the necessary conditions for optimality, given by Equations (6) - (9), are satisfied.
Suppose the gradient function (12) is represented as
Then, for arbitrary initial control , the initial gradient and the initial direction are, respectively, given by
and . (14)
By using the line search equation, the control sequences can be generated from
where is determined from the one-dimensional search, that is,
Next, the gradient and the direction are updated as follows:
with the coefficient
where represents the iteration numbers.
Thus, the result on the optimal control law discussed above is stated as the following proposition:
Proposition 1. Consider Problem (N). The control sequences, which are defined in Equation (15) and are represented by
are generated through a set of direction vectors whose components are linearly independent. Moreover, the directions are conjugate.
Proof: Refer  .
Here, the conjugate gradient algorithm for obtaining the optimal control law is summarized below:
Algorithm 1: Conjugate gradient algorithm
Data Choose the arbitrary initial control . Compute the initial gradient and the initial direction from Equation (14). Set = 0.
Step 1 Solve the state Equation (6c) forward in time from = 0 to = with the initial condition (6d) to obtain .
Step 2 Solve the costate Equation (6b) backward in time from = to = 0 with the boundary condition (6d), where is the solution obtained.
Step 3 Calculate the value of the cost functional from Equation (10).
Step 4 Determine the step size from Equation (16).
Step 5 Update the control from Equation (15).
Step 6 Update the gradient from Equation (17). If the gradient , stop, else go to Step 7.
Step 7 Compute the coefficient from Equation (19).
Step 8 Update the direction from Equation (18). Set , go to Step 1.
1) The initial control can be any vector, including the zero vector.
2) The gradient function for Problem (N) defined by Equation (11) is calculated from the stationary condition (6a). This is the key step in using the conjugate gradient algorithm for solving Problem (M) defined by Equation (2) and Problem (MM) defined by Equation (10).
3) The optimal control sequences generated by the line search equation in Equation (15) are known as the open-loop control law.
4) The necessary conditions (6b) and (6c) must be satisfied in solving Problem (N) defined by Equation (11).
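The steps of Algorithm 1 can be sketched on a plain linear-quadratic problem, which is a simplification: the adjusted parameters and modifiers are omitted, the coefficient is assumed to be of Fletcher-Reeves type, and the step size uses the exact line search available for a quadratic cost. The function names and the numerical data are hypothetical and serve only to show the forward state sweep, the backward costate sweep, and the direction update.

```python
import numpy as np

def rollout(u, x0, A, B):
    """Solve the state equation forward in time for the control sequence u."""
    x = [x0]
    for uk in u:
        x.append(A @ x[-1] + B @ uk)
    return np.array(x)

def cost(u, x0, A, B, Q, R, S):
    """Quadratic cost: terminal term plus the sum of stage costs."""
    x = rollout(u, x0, A, B)
    stage = sum(0.5 * (xk @ Q @ xk + uk @ R @ uk) for xk, uk in zip(x[:-1], u))
    return 0.5 * x[-1] @ S @ x[-1] + stage

def gradient(u, x0, A, B, Q, R, S):
    """Gradient of the cost with respect to u from a backward costate sweep."""
    x = rollout(u, x0, A, B)
    p = S @ x[-1]                      # terminal costate p(N)
    g = np.zeros_like(u)
    for k in range(len(u) - 1, -1, -1):
        g[k] = R @ u[k] + B.T @ p      # stationary condition residual, uses p(k+1)
        p = Q @ x[k] + A.T @ p         # costate p(k)
    return g

def conjugate_gradient(u, x0, A, B, Q, R, S, tol=1e-8, max_iter=200):
    args = (x0, A, B, Q, R, S)
    g = gradient(u, *args)
    d = -g                             # initial direction
    for i in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        Hd = gradient(u + d, *args) - g          # Hessian-vector product (cost is quadratic)
        a = -np.sum(g * d) / np.sum(d * Hd)      # exact one-dimensional minimizer
        u = u + a * d                            # line search update of the control
        g_new = gradient(u, *args)
        beta = np.sum(g_new * g_new) / np.sum(g * g)   # Fletcher-Reeves coefficient (assumed)
        d = -g_new + beta * d
        g = g_new
    return u, i

# Hypothetical problem data: a discretized double integrator over 20 steps.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R, S = np.eye(2), np.array([[0.1]]), np.eye(2)
x0 = np.array([1.0, 0.0])
u0 = np.zeros((20, 1))                 # zero initial control, as allowed by remark 1

u_opt, iters = conjugate_gradient(u0, x0, A, B, Q, R, S)
J0 = cost(u0, x0, A, B, Q, R, S)
J_opt = cost(u_opt, x0, A, B, Q, R, S)
```

For a non-quadratic cost the exact step size above no longer applies, and a numerical one-dimensional search would take its place.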
3.4. Iterative Procedure
Accordingly, from the discussion above, a summary of the calculation procedure for the integrated system optimization and parameter estimation is made as follows:
Algorithm 2: Iterative procedure
Data . Note that A and B could be determined based on the linearization of at or from the linear terms of .
Step 0 Compute a nominal solution. Assume that and . Solve Problem (M) defined by Equation (2) to obtain and . Then, with and using from the data. Set , , and .
Step 1 Compute the parameters and from Equation (7). This is called the parameter estimation step.
Step 2 Compute the modifiers and from Equation (8). Notice that this step requires taking the derivatives of f and L with respect to and .
Step 3 With and , solve Problem (N) by using Algorithm 1. This is called the system optimization step.
a) Use Equation (15) to obtain the new control .
b) Use Equation (6c) to obtain the new state .
c) Use Equation (6b) to obtain the new costate .
Step 4 Test the convergence and update the optimal solution of Problem (P). In order to provide a mechanism for regulating convergence, a simple relaxation method is employed:
where are scalar gains. If and , within a given tolerance, stop; else set , and repeat the procedure starting from Step 1.
1) The variable is zero in Step 0. The calculated value of changes from iteration to iteration during the calculation procedure.
2) The conjugate gradient algorithm is applied to generate the control sequences for Problem (M) and Problem (MM), respectively.
3) Problem (P) need not be linear or have a quadratic cost function.
4) The conditions and are required to be satisfied for the converged optimal control sequence and the converged state sequence. The following averaged 2-norms are computed and then they are compared with a given tolerance to verify the convergence of and :
5) The convergence result on the conjugate gradient algorithm can be referred to  , and the convergence result for Algorithm 2 is presented in  and  .
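The relaxation and convergence test of Step 4 can be illustrated with a short sketch. It assumes that the relaxation moves each iterate a fraction (the scalar gain) toward the newly computed solution, and that the averaged 2-norm is the root of the time-averaged squared Euclidean norm. Both forms, and the fixed sequence `u_new` standing in for the output of Step 3, are assumptions made for illustration.

```python
import numpy as np

def relax(old, new, gain):
    """Move the current iterate a fraction `gain` toward the new solution."""
    return old + gain * (new - old)

def averaged_2norm(diff):
    """Root of the time-averaged squared Euclidean norm of a sequence."""
    return np.sqrt(np.mean(np.sum(diff ** 2, axis=1)))

# Hypothetical stand-in for Step 3: the optimizer always returns u_new.
u_new = np.linspace(1.0, 0.0, 11).reshape(-1, 1)
v = np.zeros_like(u_new)
gain, tol = 0.5, 1e-6

for iteration in range(1, 101):
    v_next = relax(v, u_new, gain)
    change = averaged_2norm(v_next - v)   # change between successive iterates
    v = v_next
    if change < tol:
        break
```

A gain of 1 accepts the new solution outright, while smaller gains damp the update and can help regulate convergence at the price of more iterations.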
4. Illustrative Example
Consider a basic economic growth model, which is a discrete-time minimization problem, given by
where the payoff function and dynamics system are, respectively, defined by
Here, x is the capital stock, is the control variable, is the discount factor, whereas is a production function with constants , . The difference between the output and the next period’s capital stock is the consumption.
Let us refer to this problem as Problem (P). In the literature, the exact solution of Problem (P) is known and is given by
The unique optimal equilibrium for Problem (P) is given by
By using the specified parameters , and , the optimal equilibrium is  .
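The equilibrium can be checked numerically with a small sketch. It assumes the standard logarithmic-utility growth model with production function A x^α and next-period capital equal to the control, for which the optimal policy has the closed form u = αβA x^α. The values A = 5, α = 0.34 and β = 0.95 are assumptions, chosen because they reproduce the steady state x = 2.0673 reported with the results.

```python
# Assumed parameter values (illustrative; chosen to match the reported steady state).
A, alpha, beta = 5.0, 0.34, 0.95

def policy(x):
    """Closed-form optimal policy of the log-utility growth model (assumed form)."""
    return alpha * beta * A * x ** alpha

# Fixed point of x = alpha * beta * A * x**alpha.
x_star = (alpha * beta * A) ** (1.0 / (1.0 - alpha))

# Iterating the policy from an arbitrary positive capital stock approaches x_star.
x = 0.5
for _ in range(60):
    x = policy(x)
```

Because the slope of the policy at the fixed point equals α, which is less than 1, the iteration contracts toward the equilibrium from any positive starting capital.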
In the following, we introduce a simplified model-based optimal control model, which is derived from Problem (P) and is referred to as Problem (M), given below:
Note that Problem (M) and Problem (P) differ in their structures and in the parameters used.
After running the algorithm proposed with a tolerance of $10^{-6}$, the result is shown in Table 1. The initial cost, 13.072, is the cost before system optimization with parameter estimation is taken into account. At the end of the algorithm, the final cost is 22.198. Convergence is reached after 41 iterations, taking 7.84 seconds.
The graphical results for this economic growth model illustrate the application of the algorithm proposed. Figure 1 shows the final control trajectory and Figure 2 shows the final state trajectory. With this final control solution, the final state is observed to tend toward the steady state at x = 2.0673. Figure 3 shows the final costate trajectory, while Figure 4 and Figure 5 show, respectively, the adjusted parameters γ(k) and α(k). Overall, these solutions are optimal, as verified by the satisfaction of the stationary condition shown in Figure 6.
Table 1. Simulation result.
Figure 1. Final control trajectory.
Figure 2. Final state trajectory (--) and state equilibrium (×××××).
Figure 3. Final costate trajectory.
Figure 4. Adjusted parameter γ(k).
Figure 5. Adjusted parameter α(k).
Figure 6. Stationary condition g.
5. Concluding Remarks
The use of the conjugate gradient approach in solving the nonlinear optimal control problem with model-reality differences was discussed in this paper. Essentially, a simplified model of the original optimal control problem, namely a linear optimal control problem augmented with adjusted parameters, is formulated. In solving this model-based optimal control problem, the conjugate gradient approach is employed to generate the open-loop control sequences such that the optimal solution is obtained. Here, the stationary condition serves as the gradient function in the conjugate gradient approach. Moreover, because the two problems have different structures, the differences between the real plant and the model used, which are measured repeatedly by the adjusted parameters, are taken into account during the iterative calculation procedure. At convergence, the optimal solution of the model used approximates the true optimal solution of the original optimal control problem, in spite of model-reality differences. For illustration, the application of the algorithm proposed to a basic economic growth model was discussed. The results obtained show the efficiency of the algorithm proposed. In conclusion, the use of the algorithm is highly recommended.
The authors would like to acknowledge the Universiti Tun Hussein Onn Malaysia (UTHM) and the Ministry of Higher Education (MOHE) for the financial support for this study under the research grant FRGS VOT. 1561.
Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.