Estuaries act as a bridge between riverine and marine systems, transporting dissolved substances and suspended particles from rivers to coastal seas. In the past decades, coastal environments have become more stressed by pollutants introduced from the surrounding land. The fates of pollutants are strongly affected by physical processes, which interplay with biogeochemical processes in estuaries  . Salinity is an important indicator of the health of the aquatic ecosystem in tidal estuaries. The increase in salinity intrusion in a tidal estuary river may have an adverse effect on the drinking-water supply and aquatic environment  .
Numerical models have been successfully applied to the simulation of the hydrodynamic characteristics of estuaries. Although full three-dimensional models are available to predict salt water intrusion in an estuary, a large amount of observational data and a considerable amount of effort are often needed to calibrate and validate these numerical models for further applications  -  . Because of the existing difficulties and challenges in the prediction of salinity variations using hydrodynamic model, a relatively novel computational approach, artificial neural networks (ANNs), which was widely accepted in many disciplines, provides an alternative method for one-step-ahead understanding and management of hydrodynamic processes. ANNs are well-suited for this application because of their informative processing characteristics, such as nonlinearity, parallelism, noise tolerance, and learning capability   .
ANNs are a branch of artificial intelligence developed in the 1950s that aims at imitating the biological brain architecture. They are parallel-distributed systems made of many interconnected nonlinear processing elements, called neurons  . Renewed interest in ANNs has developed in the last decade, mainly because of the availability of suitable hardware (e.g., parallel computers, analogue/digital neural cards for personal computers) that has made them convenient for fast data analysis and information processing.
ANNs are now drawing interest from researchers in various fields because they are capable of providing a neurocomputing approach for solving complex problems that might otherwise not have a tractable solution. ANNs have been applied to a broad range of fields including nonlinear system modeling, function approximation, pattern recognition and synthesis, and prediction. An ANN is a nonlinear mathematical structure that is able to represent arbitrarily complex nonlinear processes that relate the inputs and outputs of any system. One needs to train the network through representative examples of the desired mapping. The ANN can then adapt itself to reproduce the desired output when presented with training input. This means that the network has the ability to learn (i.e., to extract import features) from the data and to generalize for future classification or prediction. Recently, ANNs have provided a significant number of promising results in the field of hydrology and water resources, in areas such as rainfall estimation  -  , groundwater modeling      , estuary water- stage forecasting     , and reservoir operation    . Although the neural networks applied to different hydrological processes have yielded many exciting results, this method is still rarely applied in modeling the salinity distributions of tidal estuaries  or compared with predictions of salinity obtained using a hydrodynamic model  .
The main aim of this paper was to implement two ANN models which are a back- propagation neural network (BPNN) and a radial basis function neural network (RBFNN) and a physically based three-dimensional hydrodynamic model to predict the time-series salinity variation in response to the input forcing functions of the water stage and freshwater discharge in the Danshui River estuary of northern Taiwan.
2. Description of Study Site
The Danshui River estuarine system (Figure 1) is the largest estuary in Taiwan. There are three main tributaries including the Dahan River, the Xindian River, and the Keelung River. Its drainage basin covers the capital city of Taipei. The watershed of the Danshui River encompasses 2726 km2 and has a combined length of 158.7 km. Annual precipitation in the region ranges between 1500 mm and 2500 mm, with the majority coming in late spring (May) to early fall (October). Long-term average annual river flow rate is 6.6 × 109 m3/y. Except during a flood event, the astronomical tide may reach as far upriver as the Chenglin Bridge on the Dahan River, the Xiulang Bridge on the Xindian River, and the Jiangbei Bridge on the Keelung River. Tidal propagation is the dominant mechanism controlling the water surface elevation and the ebb and flood flows. The M2 tide is the primary tide constituent at the river mouth, with a tidal range of 2.17 m at mean tide and up to 3 m at spring tide  . Seawater intrudes upriver as a result of tidal advection and two-layer estuarine circulation. Salinity varies on an intratidal time scale in response to the ebb and flood of the flows and in various longer time scales in response to freshwater inflow. The limit of salt intrusion may reach beyond 25 km in the Dahan River from the river mouth during the period of low flow.
The average river discharges at the upstream limits of the tide are 63.1 m3/s, 72.7
Figure 1. A map of the Danshui River estuarine system.
m3/s, and 26.1 m3/s, respectively, in the Dahan River, the Xindian River, and the Keelung River. Due to the shallowness of the estuary, it is well mixed in most areas with the exception for a short section near Guandu Bridge (Figure 1), where the channel depth reaches 13 m and two-layer circulation occurs  .
3. Materials and Methods
3.1. Artificial Neural Networks
A neural network consists of a large number of simple processing elements that are called neurons and nodes. Each neuron is connected to other neurons by means of direct communication links, each with an associated weight. The weights represent information being used by the net to solve a problem. An input layer, a hidden layer where the elaborations are performed, and an output layer constitute the most commonly used perceptron, called a single-hidden-layer perceptron. The single unit of the network, i.e., the model neuron, is shown in Figure 2. Neural networks can be classified into many types based on their structures. In the current study, a back-propagation neural network (BPNN) and a radial basis function neural network (RBFNN) were chosen to establish the present model.
3.2. Back-Propagation Neural Networks (BPNNs)
The BPNN proposed by Rumelhart et al.  is widely used among all of the ANNs. It is a multiple-layer network with nonlinear differentiable transfer functions. It can be used to solve many nonlinear problems. The input vectors and its corresponding target vectors are used to train a BPNN until it can approximate a specified minimum error or maximum epochs. A BPNN with weightings, biases, a sigmoid layer, and a linear output layer has the capacity to approximate any function with a finite number of discontinuities  .
Figure 2 presents the architecture of a BPNN. It shows that Pi is the input vector; IW and b1 are the weights and biases, respectively, between the input layer and the hidden layer; Trans. is the transfer function, and LW and b2 are the weights and biases, respectively, between the hidden layer and the output layer. We use sigmoid transfer functions in the hidden layer and linear transfer functions in the output layer to extrapolate beyond the range of the training data. Detailed equations to describe the connections between input and hidden-layer neurons and those between hidden and output-layer neurons can be found in Chen et al.  and will not be further described here.
Kisi  reported that the Levenberg-Marquardt technique  is more powerful than the conventional gradient descent techniques   to find the weight and the
Figure 2. The architecture of the back-propagation neural network.
bias matrices in each iteration. Therefore the Levenberg-Marquardt algorithm was adopted in BPNNs. The toolbox of neural network includes the Levenberg-Marquardt technique which is incorporated into the ANN model to be executed in the Matlab environment.
3.3. Radial Basis Function Neural Network (RBFNN)
RBFNN, which is multilayer and feedforward, is often used for strict interpolation in multidimensional space. The term “feedforward” means that the neurons are organized in the form of layers in a layered neural network  . In the present study, the architecture of RBFNN consists of three layers: an input layer, a single hidden layer, and an output layer. Figure 3 shows the architecture of the RBFNN. The input layer is composed of input nodes. The only hidden layer consists of locally tuned units, and each unit has a radial basis function acting as a hidden node. The output of the RBFNN can be calculated according to Equation (1).
where x is the input vector, is the center of the radial basis function for hidden layer n, are the weights in the output layer, and denotes the Euclidean distance between the center of the radial basis function and the input. Two functions, multiquadratic function and Gaussian activation function, are the most commonly used for radial basis function. A range of studies have indicated that the results are insensitive to different basis functions   . In this study, the radial basis function is chosen as the multiquadratic function and defined in Equation (2).
where σ is the standard deviation and c is the center of the multiquadratic function.
A main challenge in the design of RBFNN is the selection of centers. The simplest way to select the centers of RBFNN is random method  . However, the orthogonal least squares (OLS) algorithm can provide an optimum number of centers in RBFNN from the training patterns  . The classical Gram-Schmidt and modified Gram- Schmidt methods  can be used to deal with orthogonalization procedure  . The OLS method based on classical Gram-Schmidt offers a systematic way for center selection
Figure 3. The architecture of the radial basis function neural network.
and significantly reduces the size of the RBFNN.
The OLS learning algorithm is a procedure that iteratively selects the best regressors (radial basis unit) from a set of available regressors. This set is composed of a number of regressors equal to the number of available data, and each regressor is a radial unit centered on a data point. The OLS algorithm proceeds iteratively by selecting the next best regressor from a set by applying a Gram-Schmidt orthogonalization until reaching the tolerance, so the contribution of each vector of this new orthogonal base can be determined individually among the available regressors. A detailed description of the algorithm can be found in Ham & Kostanic  .
3.4. Three-Dimensional Hydrodynamic Model
The ELCRIC (Eulerian-Lagrangian CIRCulation) model is used for simulating salinity in a tidal estuary, because this model is cost efficiency to prevent the time step constraints and the unstructured grid can be used to fit the complex coastline and channel. The model solves the shallow water equations using a semi-implicit Eulerian-Lagran- gian finite volume/finite difference method that relies on horizontally unstructured grids and unstretched z-coordinates. ELCIRC allows for the use of state-of-the-art turbulence closure schemes, includes terms for the tidal potential and atmospheric pressure gradients, and provides a detailed description of air-water exchanges.
The model solves for the free surface elevation, three-dimensional water velocity, and salinity using hydrostatic equations based on the Boussinesq approximation, which represents mass conservation, momentum conservation, and conservation of salt:
where are the horizontal Cartesian coordinates; are the latitude and longitude; z is the vertical coordinate, positive upward; t is time; is the z-coordinate at the reference level (mean sea level); is the free-surface elevation; is the bathymetric depth;, and w are the velocities in the and z directions, respectively; f is the Coriolis force; g is the acceleration of gravity; p is pressure; is the tidal potential; α is the effective Earth elasticity factor (≈0.69); is the water density; is atmospheric pressure at the free surface; S is salinity; is the vertical eddy viscosity; is the vertical eddy diffusivity for salinity; and are the horizontal diffusion for the momentum equation in the x and y directions, respectively; and is the horizontal diffusion for the transport equation. By default, the reference value is set to 1025 kg/m3.
In ELCRIC, the barotropic pressure gradient in the momentum equation and the flux term in the continuity equation are treated semi-implicitly, with the implicitness factor. The vertical viscosity term and bottom boundary condition for the momentum equations are treated fully implicitly; all other terms are treated explicitly. This ensures both stability  and computational efficiency. The normal component of the horizontal momentum equations is solved simultaneously with the depth-integrated continuity equation, i.e., there is no mode splitting between these equations. The total derivatives of the normal velocity can be discretized using either Eulerian or Lagrangian backtracking method  . However the latter has the advantage of preventing advection by imposing stability constraints on the time step  . There are three conventional numerical approaches, including finite difference, finite volume, and finite element methods used to solve the continuity and momentum equations. To efficiently solve the governing equations, the vertical velocity is solved from the three-dimensional continuity equation using a finite-volume approach. The tangential component of the horizontal momentum equation is formally solved with finite differences. The transport equation for salinity is solved at both polygonal vertices (nodes) and the centers of element sides using finite differences. The solution requires backtracking along characteristic lines to account for the most recent flow field. After the salinity is found, the density is calculated from the equation of state and is fed back into the momentum equations at the next time step (i.e., the baroclinic term is treated fully explicitly). To solve the turbulent closure model, the eddy viscosity and diffusivity are computed at each time step prior to the solution of the momentum equation, using information from the previous time step. Details of the numerical discretization can be found in Zhang et al.  .
4. Results and Discussion
There are many indices to assess the model performance between model prediction and observational data. In this study, we used the commonly acceptable indices to evaluate the performances of the BPNN, RBFNN, and three-dimensional hydrodynamic models, including the root mean square error (RMSE), mean absolute error (MAE), and correlation coefficient (CC) to calculate the accuracy of the model prediction.
where and. N is the total number of data; Sp is the
simulated salinity; and So is the measured salinity.
4.1. BPNN Model Training and Verification
The measured data of hourly salinity at 0.5 m depth below the water surface at the Guandu Bridge were collected and adopted for training and verification phases with the ANN models (BPNN and RBFNN) and for model calibration and verification with the three-dimensional hydrodynamic model. The data sets collected from 18 to 22 May, 16 to 22 October, and 26 to 30 October 2002 (totaling 4320 data points) were used for BPNN and RBFNN model training and for hydrodynamic model calibration. The data sets collected from 30 May to 2 June and 11 to 15 November 2002 (totaling 2592 data points) were applied for BPNN and RBFNN model verification and for hydrodynamic model verification. Those historical data of 2002 were continuously measured data at the Guandu Bridge station to be taken in this study. Some measured data in the Danshui River estuarine system can not be adopted for model calibration and verification because the data sets are not continuous.
For a given input set, the BPNN produces an output, and this response is compared with the known desired response of each neuron. The weights of the neural network are then changed to correct or reduce the error between the output of the neuron and the desired response, and this process continues. The weights are continually changed until the total error of all training sets is reduced below the accepted error.
This study builds a BPNN model to simulate estuary salinity at the Guandu Bridge station. The BPNN used in this study is composed of three layers with nodes in adjacent layers fully connected. This is, only one hidden layer is employed. Because the focus is on single variable predicting, one output node is exclusively used in the output layer. The number of input nodes and the number of hidden nodes are the two major experimental factors.
Figure 4 illustrates the structure of the BPNN model for salinity prediction. The input units include time-series water stages and freshwater discharges at different stations. The hourly water stages at the observed stations include the Danshui River mouth, Taipei Bridge, and Dazhi Bridge, while the upstream reaches of the Dahan River, Xindian River, and Keelung River are specified with hourly freshwater discharges. Figure 5 and Figure 6 present the time-series freshwater discharges in the Dahan River, Xindian River, and Keelung River and the time-series hourly water stages at the Danshui River mouth, respectively.
With different combinations of the number of hidden layers and the number of hidden neurons, the BPNN model was trained until the best architecture was obtained. The Mean Square Error (MSE) criterion was used to evaluate the performance of the model during the training phase. Satisfactory results were obtained using only one hidden layer with seven neurons. To prevent overtraining the model, this study chose a learning rate of 0.01 and a momentum of 0.3. Therefore, the architecture of 6-7-1 (i.e., input neurons, hidden neurons, and output neurons) was chosen as the best BPNN
Figure 4. The structure of the BPNN model used for salinity prediction.
Figure 5. Upstream freshwater discharges in the Dahan River (solid line), Xindian River (dashed line), and Keelung River (dotted-dashed line) used for the (a) training (calibration) and (b) verification phases.
Figure 6. Time-series water stages at the Danshui River mouth used for the (a) train- ing (calibration) and (b) verification phases.
architecture, which was then tested using the data sets from the year 2002. The comparison between observed salinities and BPNN predicted salinities at the Guandu Bridge during the training and verification phases is presented in Figure 7. It should be noted that the errors between the predicted and observed salinities are also shown in the figure. The BPNN model during the training and verification phases was found to underestimate the peak during flood tide and to over predict minimal salinities during ebb tide. The RMSE, MAE, and CC during the training phase were found to be 3.88 ppt, 2.79 ppt, and 0.76, respectively, while during the verification phase they are 3.81 ppt, 2.96 ppt, and 0.71, as shown in Table 1.
4.2. RBFNN Model Training and Verification
The RBFNN methodology has the advantage that hidden weights or center vectors are optimized first before they are forwarded to the output layer. Hidden-output layer weights are further optimized using the Orthogonal Least Squares (OLS) method. Therefore, there was no backflow at the input-hidden layer, which reduces the simulating time. The best RBFNN architecture (i.e., number of center vectors in the hidden layers) was obtained by trial and error based on the mean square error in the training and verification phases. The architecture of 6-16-1 (i.e., input neurons, hidden neurons,
Figure 7. Comparison of the temporal variation of the observed and predicted salinities at the Guandu Bridge during the (a) training and (b) verification phases for the BPNN model. Note that circles are observations and the solid line represents the BPNN modeling results.
Table 1. The performances of BPNN, RBFNN and three-dimensional models in salinity simulation.
Note: RMSE represents root mean square error; MAE represents mean absolute error; CC represents correlation coefficient.
and output neurons) was chosen as the best RBFNN architecture. In this study, we set 0.9 as the tolerance of the OLS algorithm, and 16 center vectors were selected after the OLS algorithm was completed.
Figure 8 presents a comparison of the salinity predicted by the RBFNN model during the training and verification phases and that observed at the Guandu Bridge. During the training and verification phases, the RBFNN model underestimated the peak during flood tides and over predicted the minimal salinities during ebb tides. The RMSE, MAE, and CC during the training phase were found to be 3.76 ppt, 2.72 ppt, and 0.77, respectively, while during the verification phase, they were 3.63 ppt, 2.90 ppt, and 0.75, as shown in Table 1.
Figure 8. Comparison of the temporal variation of observed and predicted salinities at the Guandu Bridge during the (a) training and (b) verification phases for the RBFNN model. Note that circles are observation and the solid line represents the RBFNN modeling results.
4.3. Calibration and Verification of the Three-Dimensional Hydrodynamic Model
Bottom topography is an important factor that affects the flow properties in environmental modeling. The bottom topography data in the coastal sea and Danshui River estuarine system (Figure 9(a)) were obtained from the National Center for Ocean Research and Water Resources Agency, Taiwan. The greatest depth in the study area is 110 m (below mean sea level) near the northeast corner of the model in the coastal sea. The model mesh for the Danshui River estuarine system and its adjacent coastal sea
Figure 9. (a) Bathymetry contour map of the Danshui River estuarine system and (b) an unstructured grid representing the modeling domain.
consisted of 6969 polygons (Figure 9(b)). To save computational time, higher-resolu- tion grids were used in the Danshui River estuary, and coarser grids were used in the coastal sea. Because the model domain includes deep bathymetry in the coastal ocean and a shallow area in the channel, thirty-two levels varying in thickness from 1 to 10 m were used for vertical discretization in ELCIRC. The time step () was chosen to be 300 seconds for the consideration of model stability.
The model calibration and verification of salinity was conducted using daily freshwater discharge values from upriver regions of the Dahan River, the Xindian River, and the Keelung River in 2002 (shown in Figure 5). A five-constituent tide (i.e., M2, S2, N2, K1, and O1) was adopted as the forcing function at the coastal sea boundaries. The salinities at the open boundaries with the coastal sea were set to 35 ppt. Salinity values at the upstream boundaries of the three tributaries were set to zero due to freshwater discharge from the boundaries.
Before the model was calibrated and verified with measured salinity variations, the model was performed to compare with the measured tidal level and tidal current at the Guandu Bridge (shown in Figure 1). The results indicate that MAE and RMSE values between the measured hourly water surface elevations and the computed results are 0.21 m and 0.30 m, respectively, for model calibration, while MAE and RMSE values are 0.28 m and 0.34 m, respectively, for model verification. An intensive survey of the tidal current was conducted at transect of the Guandu Bridge on April 26, 2002. The current was measured by trained technicians on boats every half hour for a period of 13 daylight hours. The measured longitudinal velocities were used to validate the model. The MAE and RMSE values at the Guandu Bridge are 0.156 m/s and 0.188 m/s, respectively. We found that the model was satisfactorily validated with water surface elevation and tidal current. Finally, the constant bottom roughness height (zo = 0.005 m) was adopted in the model simulation.
The model was then performed to compare with the measured salinity variation at the Guandu Bridge. Figure 10 illustrates the model calibration and verification results for the salinity from the three-dimensional hydrodynamic model. The lower salinity conditions were over predicted, whereas the higher salinity values were underpredicted for the model calibration and verification. The RMSE, MAE, and CC for the model calibration were found to be 4.49 ppt, 3.80 ppt, and 0.79, respectively, while for the model verification, they were 5.09 ppt, 4.25 ppt, and 0.75, as shown in Table 1.
4.4. Comparison of Predicted Results Using ANN and Three-Dimensional Hydrodynamic Models
During the comparison of the results, terminologies such as training and verification of the ANN (BPNN and RBFNN) approaches were considered analogous to calibration and verification of the three-dimensional hydrodynamic model, respectively. According to the performance different approaches, one month of simulation using a three-dimen- sional hydrodynamic model requires approximately 2.5 hours of Central Processing Unit (CPU) time on an Intel Core I5 Personal Computer (PC), while the ANN (BPNN and RBFNN) models require only 3 minutes. To assess the relative performance of the models in predicting the salinity, values of performance indices obtained from
Figure 10. Comparison of the temporal variation of the observed and predicted salinities at the Guandu Bridge for the three-dimensional hydrodynamic model (a) calibration and (b) verification. Note that circles are observations and the solid line represents the modeling results.
the BPNN, RBFNN, and three-dimensional hydrodynamic models were compared. The performance indices of the BPNN and RBFNN models yielded during the training phase were compared with the corresponding performance indices yielded during the three-dimensional hydrodynamic model calibration. The RMSE, MAE, and CC obtained from the best BPNN and RBFNN architectures were compared with calibration results obtained using the three-dimensional hydrodynamic model. Similarly, the verification results of the BPNN and REFNN models were compared with the verification results of the three-dimensional hydrodynamic model.
Figure 11 presents the scatter plots of the comparisons between the measured and predicted salinity with BPNN, RBFNN, and three-dimensional hydrodynamic models. The BPNN and RBFNN predicted salinities were noted to be distributed uniformly
Figure 11. Scatter plots of observations versus predictions using the (a), (b) BPNN model, (c), (d) RBFNN model, and (e), (f) three-di- mensional hydrodynamic model.
about the best-fit line. The performance of the BPNN and RBFNN was better than that of the three-dimensional hydrodynamic model during the training and verification phases, as revealed by the values of the RMSE and MAE. There are small differences in the CC value with the three predicted models. However, all CC values are above 0.70, as shown in Table 1. According to the values of RMSE and MAE, the RBFNN was found to be better than the BPNN. However, the results show that the ANN (BPNN and RBFNN) approach is capable of reproducing the nonlinear relationship between the input and output. Although the ANN models may predict the salinity variations, the limitation of the ANN models is that they are black box models that fail to simulate the internal physical processes in a tidal estuary. The three-dimensional hydrodynamic model is a physically based model that can be adopted to simulate salinity variations at different layers (i.e. vertical profile) and locations.
The simulation of physical processes is crucial importance to environmental management. Engineering alterations to the estuaries may change the salinity distribution, resulting in negative effects on the estuarine ecosystems   . Three-dimensional hydrodynamic models have been widely implemented to provide detailed resolutions as a result of engineering alterations, however a large amount of observational data and a considerable amount of effort are necessary to calibrate and validate the models before these models are applied for prediction. Therefore, ANN models which are the data-driven approaches would be useful for engineers to make rapid evaluation of salinity changes due to the modifications in the tidal estuaries   .
Salinity is an important indicator of the health of the aquatic ecosystem in tidal estuaries. In the present study, estuarine salinities were predicted using ANN (BPNN and RBFNN) and three-dimensional hydrodynamic models. The driven forcing functions of the ANN models include the time-series water stages and the freshwater discharges at different stations, which served as input. Although the BPNN and RBFNN approaches are black-box models, the results revealed that the salinity values predicted by the BPNN and RBFNN models are better than those predicted by the three-dimensional hydrodynamic model. Because the ANN approach is a data-driven technique, the predictability could be further improved by providing an appropriate and large number of input-output data sets during training.
In the present study, we focus on the salinity prediction instead of forecasting. In future study, different lead-time forecast in salinity variations can be developed for environmental management. The soft computing techniques such as the combining fuzzy optimal model with a genetic programming  , neural network and genetic programming  , support vector machine   , dendrochronlogy  , hybrid neural network  , and binary-coded particle swarm optimization and extreme learning machines  can also be developed to improve the prediction of salinity distributions along the tidal estuary.
This study is financially supported, in part, by the Ministry of Science and Technology (MOST), Taiwan under grant No. 104-2625-M-239-002. The authors would like to appreciate the Taiwan Water Resources Agency and Industrial Technology Research Institute for providing the observational data.