Recently, interest in and commercialization of desalination systems have increased, driven by significant advances in membrane technology. The dynamics of a reverse osmosis (RO) desalination system are highly nonlinear, constrained, and subject to uncertainties such as membrane fouling and varying feed water quality. Designing a suitable controller for an RO desalination system is therefore a challenging task.
Several approaches exist for controlling nonlinear systems in general, such as the linear quadratic regulator (LQR), the proportional-integral-derivative (PID) controller, backstepping control, and sliding mode control (SMC). However, these techniques usually do not take the actual process constraints into account and consider only the control effects. Furthermore, the controller parameters are typically tuned heuristically, so the optimality of the closed-loop system cannot be guaranteed.
Model predictive control (MPC) has been applied to RO desalination processes. The performance of a model predictive controller depends largely on the quality of the predictive model, especially if the system is complex and highly nonlinear. Several techniques have been used for system identification for MPC, e.g., Kalman filtering and maximum likelihood estimation. However, the Kalman filter requires a mathematical model of the system, which is very difficult to obtain for highly complex processes such as RO desalination, with its several unknown disturbances and physical phenomena such as membrane fouling. Artificial Neural Networks (ANNs) have proven to be very good function approximators and need no mathematical model, only input-output data of the system. ANNs, especially the Multilayer Perceptron (MLP), have been applied in MPC. The MLP has limitations for time-variant systems, because what it learns is a static input-output map. Furthermore, the number of prediction steps of the MLP is limited.
Recurrent Neural Networks (RNNs) have been introduced into the structure of the MPC because they can capture the system dynamics and provide long-range predictions. It is well known that RNNs suffer from vanishing and exploding gradients, which can make their training difficult. We therefore propose to use a special form of RNN, the Long Short Term Memory (LSTM) network. The LSTM was designed to model chronological sequences and their long-range dependencies more precisely than conventional RNNs.
Although combining MPC with recurrent neural networks is not new, the application of an LSTM as the predictive model in MPC for desalination processes is hardly found in the literature. This motivated us to focus on system identification using an LSTM with a view towards closed-loop control of an RO desalination plant with MPC. The new contributions of this paper are the following:
• Introduction of LSTM as the predictive model in MPC to capture nonlinearities
• The combined structure of LSTM and MPC is new to RO desalination control
The remainder of the paper is organized as follows. Section 2 describes the model of the RO desalination plant and the scenarios used to assess the closed-loop performance of the control system. Section 3 presents the methods and materials: system identification using an LSTM and the formulation of the MPC problem with the identified LSTM as the prediction model. Finally, Section 4 presents and discusses the results of the system identification and the closed-loop control simulations.
2. RO Desalination Plant Model and Control Scenarios
In this section, the model of the RO desalination plant and some scenarios for assessing the performance of the control system in closed loop will be described.
2.1. RO Desalination Plant Model
The RO desalination plant shown in Figure 1 is the nonlinear plant to which the LSTM-based model predictive control algorithm is applied. The system includes two tanks: a feed tank and a permeate tank. The plant further includes a reverse osmosis unit and a high-pressure pump, which delivers the water from the feed tank into the RO unit at the feed pressure ( ). From the inflows and outflows of the feed tank, it follows that the total dissolved solids of the feed water (feed TDS), i.e., the feed water concentration ( ), change constantly: some TDS leave with the permeate ( ), some TDS are lost due to adhesion on the membrane surface, and some TDS ( ) enter the system with the filling water ( ) for the feed tank. The permeate concentration ( ), the brine concentration ( ), the total permeate quantity ( ), and the brine quantity ( ) at the outlet of the membrane module define the operating conditions of the RO unit itself, and they can be controlled by adjusting the feed pressure at the RO unit inlet.
From Figure 1, the mass and the salt balances for the feed tank are given by the following equations:
Figure 1. Reverse osmosis desalination plant, —Feedwater flow rate,x—Concentration [g/L], —Brine salinity [g/L], —Feed salinity [g/L], —Permeate salinity [g/L], and —Brine flow rate [L/min], —Permeate flow rate [L/min].
Expanding Equation (2) gives:
Substituting Equation (1) into Equation (3) yields Equation (4):
Finally, the feed tank can be described by Equations (1) and (5).
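The balance equations themselves are not reproduced in this text, so the following is only a minimal sketch of the same derivation pattern for a generic well-mixed tank with one inflow and one outflow. The function names, the single-inflow simplification, and the explicit Euler integration are assumptions made for illustration and are not the paper's Equations (1)-(5). The salt balance d(V·x)/dt = f_in·x_in − f_out·x is expanded with the product rule, and the volume balance dV/dt = f_in − f_out is substituted, which leaves V·dx/dt = f_in·(x_in − x).

```python
def tank_derivatives(volume, conc, f_in, x_in, f_out):
    """Well-mixed tank (illustrative assumption): returns (dV/dt, dx/dt)."""
    # Volume (mass) balance: accumulation = inflow - outflow.
    dv_dt = f_in - f_out
    # Salt balance d(V*x)/dt = f_in*x_in - f_out*x, expanded with the
    # product rule and with dV/dt substituted, gives V*dx/dt = f_in*(x_in - x).
    dx_dt = f_in * (x_in - conc) / volume
    return dv_dt, dx_dt


def simulate_tank(volume, conc, f_in, x_in, f_out, dt, steps):
    """Integrate the two balances with explicit Euler steps of size dt."""
    for _ in range(steps):
        dv_dt, dx_dt = tank_derivatives(volume, conc, f_in, x_in, f_out)
        volume += dt * dv_dt
        conc += dt * dx_dt
    return volume, conc
```

In this simplified form the outflow does not appear in the concentration equation, because liquid leaving a well-mixed tank carries the tank concentration with it.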
The same can be done to characterize the permeate tank to get the following two equations of mass and salt balances:
Substituting as before, the permeate tank is described by Equations (6) and (8).
The differential Equations (1), (5), (6) and (8), coupled with the relations of El-Dessouky and Ettouney in Equations (9)-(14), describe the RO system and give a complete characterization of the plant.
The salt passage rate through the reverse osmosis membrane can be expressed as in Equation (9).
where is the salt permeability coefficient at the reference temperature , the total membrane area is denoted by , is the concentration polarization factor, is the temperature correction factor for salt permeability, is the net concentration, and is the permeate concentration.
The permeate flow rate (water passage rate) required in Equation (1) is given by Equation (11) as a function of the membrane differential pressure and the net osmotic pressure .
where is the water permeability coefficient at the reference temperature , is the temperature correction factor for water permeability, and is the permeate density.
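As an illustration of the structure of such relations, the sketch below follows the textbook solution-diffusion form, in which water passage is driven by the difference between the membrane differential pressure and the net osmotic pressure, and salt passage by the concentration difference across the membrane. The exact form of Equations (9) and (11) is not reproduced in this text, so the function names, the argument names, the way the correction factors multiply, and the clamp at zero flow are assumptions; units are left to the caller.

```python
def permeate_flow(k_w, area, tcf_w, delta_p, delta_pi):
    """Water passage rate: permeability * area * temperature correction * net driving pressure.

    Assumed form; the flow is clamped at zero when the osmotic pressure
    exceeds the applied pressure (an illustrative choice).
    """
    driving_pressure = max(delta_p - delta_pi, 0.0)
    return k_w * area * tcf_w * driving_pressure


def salt_passage(k_s, area, tcf_s, cpf, c_net, c_p):
    """Salt passage rate driven by the concentration difference (assumed form)."""
    return k_s * area * tcf_s * cpf * (c_net - c_p)
```

Raising the feed pressure increases the driving pressure and hence the permeate flow, which is why the feed pressure serves as the manipulated variable.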
The concentrations of the permeate and the brine are given by Equations (12) and (14), respectively:
where is the permeate flux, is the mass transfer coefficient, is the seawater dynamic viscosity, is the cake layer resistance, and is the intrinsic membrane resistance.
2.2. Control Scenarios
Since this study focuses on process control, the system performance is evaluated in closed-loop control, where the system tracks a set-point. The objective of the controller is to bring the RO desalination system quickly and smoothly to the target set-point of the permeate flow rate and to keep the permeate concentration under by adjusting the feed pressure. Furthermore, the closed-loop performance of the LSTM-based MPC is compared against a classical nonlinear MPC controller that directly uses the true RO desalination plant model described in Equations (1)-(14).
3. Methods and Materials
The procedure for using an LSTM as the predictive model in the MPC comprises four steps: 1) generating a dataset by acquiring data from the system under perturbations of the manipulated variable, in our case the feed pressure; 2) dividing the dataset into training and validation sets, and training the LSTM on the training set while evaluating it on the validation set for early stopping. Some hyperparameters must be selected to obtain the best performance; this can be done manually, by training several network configurations and selecting the best-performing one, or automatically with Bayesian optimization; 3) integrating the best-performing LSTM into the MPC; and 4) running closed-loop simulations with the LSTM-based MPC to evaluate its control performance.
3.1. Internal Model Using Long Short Term Memory Network
The task of system identification is the main focus of this section and consists of approximating the RO desalination system described by Equations (1)-(14). The p-step ahead prediction problem is of vital interest for control using MPC. Deep neural networks are universal function approximators and can capture the nonlinear dynamics of systems. They are relatively simple to obtain and to evaluate in real time. Among them, LSTMs are better able to capture temporal dependencies in a dynamical system, which makes them particularly useful for predictive control. They can be used to make the required p-step ahead predictions of the state variables, because the prediction for time-step p depends solely on the current state and all control actions in time-step . The time-step predictions used in the time-step p prediction likewise depend on the current state and all control actions in time-step , etc.
Figure 2 shows the LSTM structure for the p-step ahead prediction problem. It is made up of repeating cells with four interacting components forming each layer. In our case each cell represents a time-step, so that the state of the cell representing time-step serves as the input for the cell representing time-step . Each cell contains a user-specified number N of hidden nodes that encode the state representation. The cells use several gating functions (the “forget”, “input” and “output” gates) that modulate the propagation of signals between cells. This cell structure mitigates the vanishing and exploding gradient problem.
The basic LSTM cell structure (Figure 2(b)) is fully mathematically described in the appendix of . It has three inputs denoted by , and and two outputs given by and . At any given time step k, is the hidden state, is the cell state, is the current input.
The first layer is a sigmoid layer which has two inputs and , where represents the hidden state of the previous cell. This is called the forget gate because its output decides which information from the previous cell state is retained.
Figure 2. (a) LSTM structure for the p-step ahead prediction problem and (b) LSTM internal model structure with the three gates—forget gate, input gate and output gate.
The second layer is also a sigmoid layer and represents the input gate that decides which new information is to be added to the cell. It takes two inputs and . A vector of the new candidate values is created by the tanh layer.
The two layers are then combined to determine the information to be stored in the cell state. The operator * denotes point-wise multiplication. The point-wise product of the input gate and the vector of new candidate values gives the information to be added to the LSTM cell state. This result is added to the point-wise product of the forget gate and the previous cell state to produce the current cell state .
Finally, the output of the LSTM cell is calculated using a sigmoid and a tanh layer: the sigmoid layer determines which part of the cell state will be present in the output, whereas the tanh layer maps the cell state to the range [−1, 1]. The results of the two layers are multiplied point-wise to produce the output of the cell.
where is the cell output which corresponds to the state vector prediction for time-step k. is initialised in this study by using .
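The gate computations described above can be summarized in a short NumPy sketch of one forward step of a standard LSTM cell. The parameter names (`W_f`, `b_f`, and so on), the concatenation order of hidden state and input, and the small random initialisation are assumptions for illustration; the notation of the original equations is not reproduced in this text.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_cell_step(x_k, h_prev, c_prev, params):
    """One forward step of a standard LSTM cell (illustrative parameter names)."""
    z = np.concatenate([h_prev, x_k])                     # previous hidden state and current input
    f = sigmoid(params["W_f"] @ z + params["b_f"])        # forget gate
    i = sigmoid(params["W_i"] @ z + params["b_i"])        # input gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # new candidate values
    c = f * c_prev + i * c_tilde                          # current cell state
    o = sigmoid(params["W_o"] @ z + params["b_o"])        # output gate
    h = o * np.tanh(c)                                    # cell output (hidden state)
    return h, c


def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases; a hypothetical initialisation."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "c", "o"):
        params["W_" + gate] = 0.1 * rng.standard_normal((n_hidden, n_hidden + n_in))
        params["b_" + gate] = np.zeros(n_hidden)
    return params
```

Calling `lstm_cell_step` repeatedly, feeding each returned `h` and `c` into the next call, corresponds to unrolling the cell over consecutive time-steps.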
The regressors required to predict are henceforth represented by , and they are introduced into the LSTM in a fashion illustrated in Figure 2(a). Equation (22) below serves as a shorthand to describe the LSTM:
An LSTM is characterized by the weights and biases of the different gates, i.e., the forget gate ( , ), the input gate ( , ), the output gate ( , ), , , and for all layers; these values constitute the set of parameters. The parameters are learnt from training data by minimizing the prediction error of the model on the training set, as measured by a user-specified loss function. Learning is performed with the back-propagation through time (BPTT) algorithm, which computes the gradient of the loss function with respect to the weights, together with an optimization algorithm that uses this gradient to adjust the weights. The adaptive moment estimation algorithm (Adam) is a widely used example of such an optimizer. In BPTT, the weights are initialized, the information is passed through the different gates, the output and the current cell state are calculated, the gradients at each time step k are calculated by back-propagating through time using the chain rule, and finally all gradients are used to update the weights associated with the input, output, and forget gates.
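To make the role of the optimizer concrete, the following minimal sketch applies the standard Adam update rule to a single scalar parameter. It illustrates only the weight-update step, not BPTT; the function name, the default learning rate, and the number of steps are assumptions chosen for the example.

```python
import math


def adam_minimize(grad, theta, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function given its gradient, using the Adam update rule."""
    m = 0.0  # first-moment (mean) estimate of the gradient
    v = 0.0  # second-moment (uncentered variance) estimate of the gradient
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the zero initialisation
        v_hat = v / (1.0 - beta2 ** t)
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta
```

For example, `adam_minimize(lambda th: 2.0 * (th - 3.0), 0.0)` minimizes the quadratic loss (θ − 3)² starting from θ = 0. In network training, the same update is applied element-wise to every weight, with the gradient supplied by BPTT.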
3.2. Data Acquisition
For training the LSTM, a dataset covering the whole operating range of the RO desalination plant was collected by perturbing the manipulated variable, the feed pressure, and recording the dynamic system response. A pre-defined sequence of the manipulated variable, , is introduced into the system and the dynamic response, , is recorded. Such a signal for the feed pressure is shown in Figure 3, and the dynamic responses of the permeate flow rate and permeate concentration , and of the total permeate quantity and total permeate tank concentration , are shown in Figure 4(a) and Figure 4(b), respectively. denotes the final time-step of the perturbation experiment. The perturbation is sampled at . , is the measured system output at time-step k after , has been applied to the system for a period of . In machine learning terminology, the outputs that correspond to the inputs are referred to as labels, and the dataset is constructed from the experimental sequences and their associated labels. For the p-step ahead prediction problem, each data point thus takes the form with the associated label , . data points can thus be extracted from each experimental sequence.
The feed water concentration, an input to the system, is uncertain. Therefore, Gaussian noise was added to its signal before it was used to excite the system (Figure 3).
Following common machine learning practice, the labeled dataset is split into three parts before training the LSTM: one for training (the data used to adapt the network weights), one for validation, and one for testing.
Figure 3. Perturbation of the manipulated variable, the feed pressure. The feed concentration is modeled as a disturbance.
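The construction of data points and labels for the p-step ahead prediction problem, followed by the three-way split, can be sketched as follows. The exact layout of a data point is not reproduced in this text, so the pairing used here (current output plus the next p inputs as features, the next p outputs as labels), the chronological split, and the split fractions are assumptions.

```python
def make_windows(u, y, p):
    """Build (features, labels) pairs for p-step ahead prediction.

    u: list of T inputs; y: list of T + 1 outputs, where y[k + 1] is the
    response measured after u[k] has been applied (assumed convention).
    """
    if len(y) != len(u) + 1:
        raise ValueError("expected one more output sample than input samples")
    samples = []
    for k in range(len(u) - p + 1):
        features = (y[k], tuple(u[k:k + p]))  # current output and next p inputs
        labels = tuple(y[k + 1:k + p + 1])    # next p outputs
        samples.append((features, labels))
    return samples


def chronological_split(samples, train_frac=0.7, val_frac=0.15):
    """Split into training, validation and test parts, preserving time order."""
    n_train = int(len(samples) * train_frac)
    n_val = int(len(samples) * val_frac)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test
```

With this convention, a sequence of T inputs yields T − p + 1 samples. A chronological split keeps the test data later in time than the training data, which is one common choice for time series.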
Figure 4. Dynamic system response to the perturbation signal in Figure 3. (a) The permeate flow rate and the permeate concentration; (b) The total permeate produced and the total permeate tank concentration.
3.3. Nonlinear Model Predictive Control Problem
The structure of the model predictive controller for an RO desalination system is shown in Figure 5. In brief, the model predictive controller (MPC) computes m future control moves, , that minimize an objective function over a finite prediction horizon of p steps by using the dynamic system predictions for those p steps, . Typically, the objective function is chosen to penalize large control effort, which means higher power consumption for the actuator, and deviations of the state vector from the set-point at each time instant. Constraints on inputs and outputs may also be included in the MPC formulation. Since MPC performance depends on the quality of the system predictions, a reasonably accurate model obtained through system identification is crucial.
Equations (23)-(26) below describe the MPC problem:
Figure 5. Schematic representation of a model predictive controller with full state feedback.
where is the prediction horizon, the control horizon, the prediction of the state vector for the discrete time-step k obtained from the LSTM, described in Equation (16), the set-point at time-step k, the manipulated vector for time-step k, the discrete-time rate of change of the manipulated vector which corresponds to the control action size at time-step k, symmetric positive semi-definite weight matrices, and the lower and upper limits for and the rate of change of at time-step k.
Within this formulation, no changes in actuator position are assumed beyond time-step , i.e., .
In general, this optimization problem is non-convex and has no special structure that would guarantee global optimality. It is a nonlinear programming (NLP) problem and can be solved with modern off-the-shelf solvers. At every time-step, the problem is solved to yield the optimal control sequence for that time-step, . The first element, , is applied to the system until the next sampling instant, when the problem is solved again to yield a new optimal control sequence. This process is repeated in the form of a moving horizon. The complete model predictive control procedure is shown in Table 1.
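The moving-horizon procedure can be sketched with `scipy.optimize.minimize` and the SLSQP method. In this sketch a hypothetical first-order scalar model stands in for the identified LSTM, and the horizons, weights, bounds, and rate limit are illustrative values, not the settings used in this study. The cost penalizes set-point deviations and control moves, and the input is held constant beyond the control horizon m.

```python
import numpy as np
from scipy.optimize import minimize


def predict(x0, u_seq, a=0.8, b=0.2):
    """Stand-in predictor: hypothetical first-order model, not an LSTM."""
    x, traj = x0, []
    for u in u_seq:
        x = a * x + b * u
        traj.append(x)
    return np.array(traj)


def mpc_step(x0, u_prev, setpoint, p=10, m=3, q=1.0, r=0.1,
             u_bounds=(0.0, 10.0), du_max=2.0):
    """Solve one finite-horizon problem and return the first control move."""
    def expand(u_moves):
        # Hold the last move constant beyond the control horizon m.
        return np.concatenate([u_moves, np.full(p - m, u_moves[-1])])

    def cost(u_moves):
        u_full = expand(u_moves)
        x_hat = predict(x0, u_full)
        du = np.diff(np.concatenate([[u_prev], u_full]))
        return q * np.sum((x_hat - setpoint) ** 2) + r * np.sum(du ** 2)

    def rate_margin(u_moves):
        # Each entry must be >= 0, which enforces |du| <= du_max.
        du = np.diff(np.concatenate([[u_prev], u_moves]))
        return np.concatenate([du_max - du, du_max + du])

    res = minimize(cost, np.full(m, u_prev), method="SLSQP",
                   bounds=[u_bounds] * m,
                   constraints=[{"type": "ineq", "fun": rate_margin}])
    return res.x[0]


def run_closed_loop(x0=0.0, setpoint=5.0, steps=30):
    """Apply only the first move at each step, then re-solve (moving horizon)."""
    x, u, history = x0, 0.0, []
    for _ in range(steps):
        u = mpc_step(x, u, setpoint)
        x = predict(x, [u])[0]  # the plant equals the model in this toy loop
        history.append(x)
    return history
```

Replacing `predict` with a call to a trained network would give the LSTM-based variant. The solver then differentiates through the network numerically unless gradients are supplied.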
4. Results and Discussions
The results are discussed in two parts: the first covers the system identification, and the second the closed-loop results for the MPC.
4.1. Model Identification Results
To measure the predictive capability of the LSTM model, we used the mean absolute error (MAE), the root mean square error (RMSE), and the correlation coefficient ρ. The model was implemented in Python on a PC with an Intel (R) Core (TM) E5-2620 CPU and 62 GB of memory. Training for 10 epochs took 1.45 s, and prediction for the test data of 1871 data points took about 0.02 s. Training did not show significant improvement after five epochs. Figures 6(a)-(d) show the validation MAE loss for the permeate flow rate, permeate concentration, total permeate flow, and total permeate concentration over 10 epochs.
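For reference, the three measures can be computed as in the following plain-Python sketch, which uses the standard definitions. Interpreting ρ as the Pearson correlation coefficient is an assumption, and the function names are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between measurements and predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)
```

A constant offset between prediction and measurement raises MAE and RMSE but leaves the correlation at 1, which is why the error measures and ρ are reported together.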
Figure 6. Loss functions of the (a) permeate flow rate ; (b) permeate concentration ; (c) total permeate flow quantity ; (d) total permeate concentration .
Table 1. NMPC algorithm.
The MAE drops sharply in the first few iterations. Training stopped after 10 epochs, with smallest validation MAE values for the permeate flow rate, permeate concentration, total permeate flow and total permeate concentration of 0.030, 0.0355, 0.0052 and 0.0039, respectively.
The hyperparameters of the prediction model, such as the learning rate, batch size, dropout, filter size, etc., need to be explored carefully to achieve the best prediction results. We use Bayesian optimization to search for these hyperparameters efficiently. The key parameters of the best LSTM found by the Bayesian optimization are shown in Table 2.
Table 3 summarizes the model performance of the proposed method. The LSTM achieves high predictive accuracy for the permeate flow rate, total permeate flow, permeate concentration and total permeate concentration. The smaller the RMSE of the model on the test data, the better its general predictive power.
Table 2. Key hyperparameters of the LSTM.
Table 3. Model performance on the test data based on correlation coefficient and root mean square error (RMSE).
Figure 7(a), Figure 8(a), Figure 9(a) and Figure 10(a) reveal a good fit of the LSTM to the training data for the permeate flow rate, total permeate flow, permeate concentration and the total permeate concentration, respectively, and testify to the model’s ability to reproduce highly dynamic outputs from highly dynamic training data. The predictive capability of the model was then validated on a different dataset: Figure 7(b), Figure 8(b), Figure 9(b) and Figure 10(b) show that the model captures the general trends of previously unseen test data for the permeate flow rate, total permeate flow, permeate concentration and the total permeate concentration, respectively.
4.2. LSTM-Based MPC Closed-Loop Control Performance
The MPC controller in this study was implemented in Python version 3.6.5 through the scipy.optimize.minimize function, with the sequential least squares programming (SLSQP) algorithm selected as the solver method.
The parameters of the MPC controller were set as shown in Table 4, and its main objective was to track a target set-point trajectory as quickly and smoothly as possible. The LSTM-based controller was compared with a controller that uses the true model of the RO desalination system; the results are discussed in the following.
Figure 7. System identification performance of the optimised LSTM for the permeate flow rate, . (a) Training performance of the optimised LSTM; (b) Validation performance of the optimised LSTM on test data.
Table 4. Parameter settings for the model predictive controller.
Figure 8. System identification performance of the optimised LSTM for the total permeate flow, . (a) Training performance of the optimised LSTM; (b) Validation performance of the optimised LSTM on test data.
The response graphs in Figure 11 show that the LSTM-based MPC strategy successfully tracks the set-point signal, which demonstrates the robustness and set-point tracking ability of the controller applied to the RO desalination system. To compare the performance of the two controllers quantitatively, we designed a performance metric I given in Equation (28). This metric indicates how good the LSTM-based MPC is compared to the MPC that uses the full RO desalination system model as the predictive model.
For the target set point trajectory shown in Figure 11, the performance metric I for the LSTM-based MPC was which shows slight deviations but a very good performance.
The permeate concentration results in Figure 12 show that the model predictive controller achieves close set-point tracking (Figure 11) while keeping the permeate concentration within the required constraint.
Figure 9. System identification performance of the optimised LSTM for the permeate concentration, . (a) Training performance of the optimised LSTM; (b) Validation performance of the optimised LSTM on test data.
5. Conclusions
A nonlinear model predictive controller for RO desalination systems has been presented. To take model uncertainties, constraints, and nonlinear dynamics into account, the controller uses an LSTM network as the predictive model. The LSTM can capture complex nonlinear dynamic behavior and provide long-range predictions even in the presence of disturbances. The main aim was to control the permeate flow rate while obeying the constraints on the permeate concentration by manipulating the feed pressure. The LSTM-based MPC was tested on reference signals that exhibit the possible nonlinear process dynamics occurring inside a real RO desalination plant. The response graphs show that the NMPC strategy successfully tracks the reference signal. These results illustrate the tracking ability of the LSTM-based MPC controller: almost offset-free and very close set-point tracking is obtained.
Figure 10. System identification performance of the optimised LSTM for the total permeate concentration, . (a) Training performance of the optimised LSTM; (b) Validation performance of the optimised LSTM on test data.
Figure 11. Closed loop results of the two model predictive controllers. Measured CV is the result of the MPC with the true model and measured CV-LSTM is the result of the MPC with the LSTM as predictive model.
Figure 12. Closed loop results for the permeate concentration, where the red line is the maximum allowed concentration.
The authors thank “ICON (International Cooperation and Networking)”, an internal funding program launched by the Fraunhofer-Gesellschaft, for sponsoring the ICON project “WASTEC” (water supply technologies for desalination and microbial control in food production for Africa).
 Liu, C.L., Pan, J. and Chang, Y.F. (2016) PID and LQR Trajectory Tracking Control for an Unmanned Quadrotor Helicopter: Experimental Studies. 2016 35th Chinese Control Conference, Chengdu, 10845-108503.
 Janghorban, I., Ifaei, P., Rashidi, Z. and Yoo, C.K. (2016) Control Performance Evaluation of Reverse Osmosis Desalination System Based on Model Predictive Control and PID Controllers. Desalination and Water Treatment, 57, 1-8.
 Lu, H., Liu, C., Coombes, M., Guo, L. and Chen, W.-H. (2016) Online Optimisation-Based Backstepping Control Design with Application to Quadrotor. IET Control Theory & Applications, 10, 1601-1611.
 L’Afflitto, A., Anderson, R.B. and Mohammadi, K. (2018) An Introduction to Nonlinear Robust Control for Unmanned Quadrotor Aircraft: How to Design Control Algorithms for Quadrotors Using Sliding Mode Control and Adaptive Control Techniques [Focus on Education]. IEEE Control Systems Magazine, 38, 102-121.
 Sassi, K. and Mujtaba, I. (2010) Simulation and Optimization of Full Scale Reverse Osmosis Desalination Plant. Computer Aided Chemical Engineering, 28, 895-900.
 Manenti, F., Nadezhdin, I.S., Goryunov, A.G., Kozin, K.A., Baydali, S.A., Papasidero, D., Rossi, F. and Potemin, R.V. (2015) Operational Optimization of Reverse Osmosis Plant Using MPC. Chemical Engineering Transactions, 45, 247-252.
 Kargar, M. and Mehrad, R. (2020) Robust Model Predictive Control for a Small Reverse Osmosis Desalination Unit Subject to Uncertainty and Actuator Fault. Water Supply, 20, 1229-1240.
 Feng, G., Lai, C. and Kar, N.C. (2017) Expectation-Maximization Particle-Filter- and Kalman-Filter-Based Permanent Magnet Temperature Estimation for PMSM Condition Monitoring Using High-Frequency Signal Injection. IEEE Transactions on Industrial Informatics, 13, 1261-1270.
 Zhao, J. and Mili, L. (2018) Sparse State Recovery versus Generalized Maximum-Likelihood Estimator of a Power System. IEEE Transactions on Power Systems, 33, 1104-1106.
 Thune, P. and Enzner, G. (2017) Maximum-Likelihood Approach with Bayesian Refinement for Multichannel-Wiener Postfiltering. IEEE Transactions on Signal Processing, 65, 3399-3413.
 Tang, J., Deng, C. and Huang, G.-B. (2016) Extreme Learning Machine for Multilayer Perceptron. IEEE Transactions on Neural Networks and Learning Systems, 27, 809-821.
 Antão, R., Antunes, J., Mota, A. and Escadas Martins, R. (2020) Model Predictive Control of Non-Linear Systems Using Tensor Flow-Based Models. Applied Sciences, 10, 3958.
 Fairbank, M., Li, S., Fu, X., Alonso, E. and Wunsch, D. (2014) An Adaptive Recurrent Neural-Network Controller Using a Stabilization Matrix and Predictive Inputs to Solve a Tracking Problem under Disturbances. Neural Networks, 49, 74-86.
 Li, S., He, J., Li, Y. and Rafique, M.U. (2017) Distributed Recurrent Neural Networks for Cooperative Control of Manipulators: A Game-Theoretic Perspective. IEEE Transactions on Neural Networks and Learning Systems, 28, 415-426.
 Wong, W., Chee, E., Li, J.L. and Wang, X.N. (2018) Recurrent Neural Network-Based Model Predictive Control for Continuous Pharmaceutical Manufacturing. Mathematics, 6, 242.
 Ling, Z.-H., Ai, Y., Gu, Y. and Dai, L.-R. (2018) Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26, 883-894.
 Pan, Y. and Wang, J. (2012) Model Predictive Control of Unknown Nonlinear Dynamical Systems Based on Recurrent Neural Networks. IEEE Transactions on Industrial Electronics, 59, 3089-3101.
 Cheng, L., Liu, W., Hou, Z.G., Yu, J. and Tan, M. (2015) Neural Network Based Nonlinear Model Predictive Control for Piezoelectric Actuators. IEEE Transactions on Industrial Electronics, 62, 7717-7727.
 Liu, W.C., Cheng, L., Zhou, C., Hou, Z.G. and Tan, M. (2016) Neural-Network Based Model Predictive Control for Piezoelectric-Actuated Stick-Slip Micro-Positioning Devices. IEEE International Conference on Advanced Intelligent Mechatronics, Banff, 1312-1317.