ariable for these analyses was river discharge (disch), measured in cubic metres per second, while the independent variables were time (month and year) and space (loc), the latter being the gauge stations along the Black Volta River considered in this study, namely Lawra, Chache, Bui and Bamboi. The covariates included rainfall (rain) measured in millimetres, relative humidity (humid), elevation (elev) measured in metres, soil type (soil) and land use (luse), which was treated as a random effect. Interactions between some of these variables were also considered, especially the space-time interactions.

2.3. Models and Analyses

After checking the relationship between river discharge (disch) and all predictors, a separate model was constructed for each covariate to determine its effect on disch and, where that effect was significant, whether it was linear or nonlinear. A covariate that was significant as a sole predictor but not significant jointly with other predictors was removed. This was the case for elevation (elev), which had a significant parameter as a sole predictor but was eliminated from the models with several predictors because its effect was no longer significant. This preliminary analysis is omitted from the results for the sake of brevity.

The response variable ‘river discharge at gauge station i’ ($disc{h}_{i}$) is modelled using a generalized additive mixed model (GAMM), as shown in Table 1.

2.4. Parameter Estimation

The GAMMs in Table 1 can be expressed as generalized linear mixed models (GLMMs):

$\mathrm{log}\left({\mu }_{i}\right)={X}_{i}\theta$ (1)

where ${X}_{i}$ is a row of the model matrix containing all components of the model, that is, all explanatory variables of the fixed and random effects and all the basis functions evaluated at observation i. The parameter vector $\theta$ contains the coefficients of the fixed terms, the random land use effects and the coefficients of the basis functions. The smoothness parameters are estimated by maximum likelihood (ML), integrating out the part of $\theta$ in the log-likelihood function that is in the range space, as described in  .
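As a minimal sketch of Equation (1), the expected discharge can be recovered by inverting the log link, ${\mu }_{i}=\mathrm{exp}\left({X}_{i}\theta \right)$. The function name `expected_discharge` and the numeric values below are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math


def expected_discharge(x_row, theta):
    """Invert the log link of Equation (1): mu_i = exp(X_i . theta)."""
    if len(x_row) != len(theta):
        raise ValueError("x_row and theta must have the same length")
    # Linear predictor: inner product of one model-matrix row with theta.
    linear_predictor = sum(x * t for x, t in zip(x_row, theta))
    return math.exp(linear_predictor)


# Hypothetical row of the model matrix and coefficient vector
# (illustrative values only, not taken from the paper).
x_i = [1.0, 0.5, 2.0]      # intercept column plus two evaluated terms
theta = [2.0, 0.4, -0.1]   # made-up coefficients
mu_i = expected_discharge(x_i, theta)
print(round(mu_i, 4))
```

In a fitted GAMM the row ${X}_{i}$ would also hold the spline basis functions and random-effect indicators evaluated at observation i, so the vectors would be much longer than in this toy case.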

2.5. Model Selection and Validation

Model selection was based on the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the adjusted R-squared, the root mean squared prediction error (RMSPE) and the Nash-Sutcliffe efficiency (NSE). However, the key indicators of performance were the NSE and the RMSPE, the latter being independent of the likelihood. The RMSPE and NSE are calculated using Equations (2) and (3), respectively.

Table 1. Summary of models considered.

In Table 1, ${\mu }_{i}=E\left(disc{h}_{i}\right)$ and $\mathrm{log}\left(disc{h}_{i}\right)$ is assumed to follow the Gaussian distribution. ${\beta }_{0}$ is the intercept parameter, ${\beta }_{1}$ and ${\beta }_{2}$ are parameter estimates of fixed effects, and ${f}_{1},\dots ,{f}_{4}$ are smooth functions of the covariates, each represented using a cyclic cubic regression spline. $k\left(i\right)$ indexes the land use at the ith gauge station and $lus{e}_{k\left(i\right)}$ is a random land use effect. ${f}_{5}$ is a tensor product of cyclic cubic regression splines.

$\text{RMSPE}=\sqrt{\frac{{\sum }_{i=1}^{n}{\left({y}_{i}-{\hat{y}}_{i}\right)}^{2}}{n}}$ (2)

$\text{NSE}=1-\left[\frac{{\sum }_{i=1}^{n}{\left({y}_{i}-{\hat{y}}_{i}\right)}^{2}}{{\sum }_{i=1}^{n}{\left({y}_{i}-{y}^{mean}\right)}^{2}}\right]$ (3)

where ${y}_{i}$ and ${\hat{y}}_{i}$ are the observed and predicted river discharges over n months, and ${y}^{mean}$ is the mean of the observed discharges.
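Equations (2) and (3) can be computed directly from paired observed and predicted values. The following is a minimal sketch; the function names `rmspe` and `nse` and the four demonstration values are assumptions made for illustration and are not data from this study.

```python
import math


def rmspe(observed, predicted):
    """Root mean squared prediction error, Equation (2)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency, Equation (3)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    y_mean = sum(observed) / len(observed)
    # Numerator: squared prediction errors; denominator: spread of the observations.
    error_ss = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    total_ss = sum((y - y_mean) ** 2 for y in observed)
    if total_ss == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - error_ss / total_ss


# Hypothetical monthly discharges (illustrative only).
observed_demo = [10.0, 20.0, 30.0, 40.0]
predicted_demo = [12.0, 18.0, 33.0, 39.0]
print(rmspe(observed_demo, predicted_demo))
print(nse(observed_demo, predicted_demo))
```

An NSE of 1 corresponds to perfect prediction, while an NSE of 0 means the model predicts no better than the mean of the observations; the RMSPE is in the same units as the discharge.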

For model checking, and to investigate whether the final selected model has disentangled spatial and temporal correlation in the residuals, several diagnostics were used. The QQ-plot and histogram of residuals were used to check normality of the residuals. Homoscedasticity of the residuals was checked using a scatter plot of residuals versus predictors. In addition, the relationship between the response and the fitted values was checked, as a visual goodness-of-fit verification, using a scatter plot of response versus fitted values.

2.6. Software

Data analysis was done in the R programming environment, version 3.2.4, and models were fitted using the mgcv package.

3. Results and Discussion

3.1. Descriptive Statistics

In generalized regression models, it is necessary to study the distribution of the response variable (disch) in order to select both the response distribution and the link function. Boxplots for disch and log(disch) are reported in the upper panels of Figure 4, while the normal QQ-plots for disch and log(disch) are reported in the lower panels. We observe from the figure that log(disch) is well approximated by the normal distribution. Hence, the Gaussian distribution was taken as the underlying theoretical distribution of disch in the GAMMs, with a log link function.
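The visual comparison of the two QQ-plots can be complemented by a simple numeric summary: the correlation between the ordered sample and the corresponding standard normal quantiles, which is close to 1 when the QQ-plot is nearly straight. The sketch below is an illustrative assumption, not the procedure used in this study; the function name `qq_correlation` is hypothetical and the right-skewed sample is synthetic, constructed to be exactly log-normal.

```python
import math
from statistics import NormalDist, mean


def normal_quantiles(n):
    """Standard normal quantiles at plotting positions (k - 0.5) / n."""
    return [NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]


def qq_correlation(sample):
    """Correlation between the sorted sample and normal quantiles."""
    ordered = sorted(sample)
    theoretical = normal_quantiles(len(ordered))
    mx, my = mean(theoretical), mean(ordered)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, ordered))
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in ordered)
    return sxy / math.sqrt(sxx * syy)


# Synthetic right-skewed values (not study data): exp of normal quantiles.
synthetic_disch = [math.exp(3.0 + z) for z in normal_quantiles(50)]

r_raw = qq_correlation(synthetic_disch)
r_log = qq_correlation([math.log(v) for v in synthetic_disch])
print(r_raw, r_log)
```

For this synthetic sample the log-transformed values line up exactly with the normal quantiles, whereas the raw values do not, which mirrors the pattern described for Figure 4.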

The time series plots of river discharge at the various gauge stations are shown in Figure 5, which indicates an obvious seasonality in discharge at all four gauge stations. This suggests that smooth functions may be represented using cyclic cubic regression splines.

3.2. Model Selection

The first GAMM (Model 1) included the fixed effect of soil type, the random effect of land use, and smooth functions of rainfall and humidity, but excluded space and time effects; it explained only about 19.2% (R-sq. (adj) = 0.192) of the variability in river discharge, as shown in Table 2. The second GAMM (Model 2) added the main effects of space and time to Model 1 and explained about 40.1% of the variability in river discharge. The third GAMM (Model 3) added only the space-time interaction effect to Model 1 and explained about 72.4% of the variability in river discharge, while the final GAMM (Model 4) added both the main and interaction effects of space and time to Model 1 and explained about 82.1% of the variability in river discharge. This indicates the very significant role that space-time effects play in modelling river discharge, even though they are usually ignored.

Figure 4. In the upper panels: Boxplots for river discharge (on the left) and the log transformation of river discharge (on the right); in the lower panels: Normal QQ-plots for river discharge (on the left) and the log transformation of river discharge (on the right).

Figure 5. Time series plots for river discharge at the various gauge stations.

Table 2. Model selection criteria.

Furthermore, the RMSPE and NSE values in Table 2 indicate satisfactory performance for all GAMMs considered. However, a comparison among them using the AIC and BIC values clearly indicates a much better performance by the GAMM that included both the main and interaction effects of space and time (Model 4).
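For reference, the two information criteria used in this comparison have the standard forms AIC = 2k − 2 log L and BIC = k log(n) − 2 log L, where log L is the maximized log-likelihood, k the number of parameters (for a GAMM, typically the effective degrees of freedom) and n the number of observations. The sketch below uses made-up log-likelihoods and parameter counts for two hypothetical candidates; none of the numbers come from Table 2.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2.0 * k - 2.0 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return k * math.log(n) - 2.0 * log_likelihood


# Hypothetical candidates: (log-likelihood, number of parameters).
# These values are illustrative only and are not results from the paper.
candidates = {"small": (-120.0, 4), "large": (-100.0, 5)}
n_obs = 200

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(scores, best_by_aic, best_by_bic)
```

Lower values are preferred under both criteria; BIC penalizes additional parameters more heavily than AIC whenever log(n) exceeds 2.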

3.3. Parameter Estimates of the Selected Model

Parameter estimates of the selected GAMM are reported in Table 3 and Table 4. The parameter coefficients, for both smooth and non-smooth terms, were all significant at the 0.05 level.

3.4. Diagnostic Checks

Basic diagnostic plots of the selected GAMM are reported in Figure 6. The QQ-plot of residuals shows the residual quantiles lying close to the theoretical normal quantiles, and the histogram of residuals is nearly symmetric. The scatter plot of residuals versus the linear predictor indicates approximately constant residual variance (homoscedasticity), while the plot of the response versus fitted values shows independence of the residuals. All in all, the diagnostics of the selected GAMM are quite good.

Table 3. Parameter coefficients.

Table 4. Approximate significance of smooth terms.

Figure 6. Diagnostics for the selected GAMM. In the upper panels: QQ-plot of residuals (on the left) and plot of residuals vs. linear predictor (on the right); in the lower panels: Histogram of the residuals (on the left) and plot of the response vs. fitted values (on the right).

4. Conclusions

We have used GAMMs to model space-time river discharge data in this paper. GAMMs provide a flexible framework which allows for smooth effects of covariates and smooth effects of space and time. In other applications, such as repeated observations of weather station data, the use of spatio-temporal dynamic models or state-space models has been proposed. Four GAMMs were explored, two with space-time interactions and two without. A comparison of these models based on AIC and BIC suggests that, in this application, the models with space-time interactions perform better overall, and in particular for modelling variations in river discharge data. Further, a model with space and time main effects performed better than one without them.

Acknowledgements

This study was supported by the Akosombo Kpong Dams Reoperation and Reoptimization Study hosted by the Water Resources Commission (WRC) of Ghana and funded by the African Development Bank (ADB). We sincerely thank WRC and ADB for the support.

Cite this paper
Iddrisu, W. , Nokoe, K. , Luguterah, A. and Antwi, E. (2017) Generalized Additive Mixed Modelling of River Discharge in the Black Volta River. Open Journal of Statistics, 7, 621-632. doi: 10.4236/ojs.2017.74043.
References
   UNESCO (1985) Teaching Aids in Hydrology. Universitaires de France, Vendome.

   Ampadu, B., Chappell, N.A. and Kasei, R.A. (2013) Rainfall-Riverflow Modelling Approaches: Making a Choice of Data-based Mechanistic Modelling Approach for Data Limited Catchments: A Review. Canadian Journal of Pure & Applied Sciences, 7, 2571-2580.

   Xu, C. (2002) Textbook of Hydrologic Models. Uppsala University, Sweden.

   Getachew, H.E. and Melesse, A.M. (2012) The Impact of Land Use Change on the Hydrology of the Angereb Watershed, Ethiopia. International Journal of Water Sciences, 1, 1-7.

   Chow, V.T., Maidment, D.R. and Mays, L.W. (1988) Applied Hydrology.

   Shrestha, R.R. and Nestmann, F. (2009) Physically Based and Data-Driven Models and Propagation of Input Uncertainties in River Flood Prediction. Journal of Hydrologic Engineering, 14, 1309-1319.
https://doi.org/10.1061/(ASCE)HE.1943-5584.0000123

   Gosain, A.K., Mani, A. and Dwivedi, C. (2009) Hydrological Modelling-Literature Review. Advances in Fluid Mechanics, 339, 63-70.

   Nor, N.I., Harun, S. and Kassim, A.H. (2007) Radial Basis Function Modeling of Hourly Streamflow Hydrograph. Journal of Hydrologic Engineering, 12, 113-123.
https://doi.org/10.1061/(ASCE)1084-0699(2007)12:1(113)

   Jajarmizadeh, M., Harun, S.B. and Salarpour, M.M. (2011) A Concept of Classification for Hydrological Models. Proceedings of 1st Iranian Studies Scientific Conference. Malaysia, Universiti Putra Malaysia, Kuala Lumpur.

   Beven, K.J. and Kirkby, M.J. (1979) A Physically Based, Variable Contributing Area Model of Basin Hydrology. Hydrological Sciences Journal, 24, 43-69.
https://doi.org/10.1080/02626667909491834

   Vieux, B.E., Cui, Z. and Gaur, A. (2004) Evaluation of a Physics-Based Distributed Hydrologic Model for Flood Forecasting. Journal of Hydrology, 298, 155-177.

   Marsik, M. and Waylen, P. (2006) An Application of the Distributed Hydrologic Model CASC2D to a Tropical Montane Watershed. Journal of Hydrology, 330, 481-495.

   Shamseldin, A.Y. (2010) Artificial Neural Network Model for River Flow Forecasting in a Developing Country. Journal of Hydroinformatics, 12, 22-35.
https://doi.org/10.2166/hydro.2010.027

   Kisi, O. (2004) River Flow Modeling Using Artificial Neural Networks. Journal of Hydrologic Engineering, 9, 60-63.
https://doi.org/10.1061/(ASCE)1084-0699(2004)9:1(60)

   Cigizoglu, H.K. (2003) Estimation, Forecasting and Extrapolation of River Flows by Artificial Neural Networks. Hydrological Sciences Journal, 48, 349-361.
https://doi.org/10.1623/hysj.48.3.349.45288

   Taormina, R., Chau, K.-W. and Sethi, R. (2012) Artificial Neural Network Simulation of Hourly Groundwater Levels in a Coastal Aquifer System of the Venice Lagoon. Engineering Applications of Artificial Intelligence, 25, 1670-1676.

   Nayak, P.C., Sudheer, K.P. and Ramasastri, K.S. (2005) Fuzzy Computing Based Rainfall-Runoff Model for Real Time Flood Forecasting. Hydrological Processes, 19, 955-968.
https://doi.org/10.1002/hyp.5553

   Liong, S.-Y., Lim, W.-H., Kojiri, T. and Hori, T. (2000) Advance Flood Forecasting for Flood Stricken Bangladesh with a Fuzzy Reasoning Method. Hydrological Processes, 14, 431-448.
https://doi.org/10.1002/(SICI)1099-1085(20000228)14:3<431::AID-HYP947>3.0.CO;2-0

   McKerchar, A.I. and Delleur, J.W. (1974) Application of Seasonal Parametric Linear Stochastic Models to Monthly Flow Data. Water Resources Research, 10, 246-255.
https://doi.org/10.1029/WR010i002p00246

   Noakes, D.J., McLeod, A.I. and Hipel, K.W. (1985) Forecasting Monthly River Flow Time Series. International Journal of Forecasting, 1, 179-190.

   Rosenberg, E.A., Wood, A.W. and Steinemann, A.C. (2011) Statistical Applications of Physically Based Hydrologic Models to Seasonal Streamflow Forecasts: Statistical Applications of Physically Based Models. Water Resources Research, 47, n/a.

   Chau, K.W., Wu, C.L. and Li, Y.S. (2005) Comparison of Several Flood Forecasting Models in Yangtze River. Journal of Hydrologic Engineering, 10, 485-491.
https://doi.org/10.1061/(ASCE)1084-0699(2005)10:6(485)

   Veiga, V.B., Hassan, Q.K. and He, J. (2014) Development of Flow Forecasting Models in the Bow River at Calgary, Alberta, Canada. Water, 7, 99-115.
https://doi.org/10.3390/w7010099

   Sivakumar, B., Jayawardena, A.W. and Fernando, T. (2002) River Flow Forecasting: Use of Phase-Space Reconstruction and Artificial Neural Networks Approaches. Journal of Hydrology, 265, 225-245.

   Iddrisu, W.A., Nokoe, K.S., Osei, F.B. and Antwi, E.O. (2016) Spatial Bayesian Methods of Flow Forecasting in the Black Volta River. European Journal of Scientific Research, 137, 89-105.

   Lin, X. and Zhang, D. (1999) Inference in Generalized Additive Mixed Models by Using Smoothing Splines. Journal of the Royal Statistical Society, Series B, 61, 381-400.
https://doi.org/10.1111/1467-9868.00183

   Fahrmeir, L. and Lang, S. (2001) Bayesian Inference for Generalized Additive Mixed Models Based on Markov Random Field Priors. Journal of the Royal Statistical Society. Series C, Applied Statistics, 50, 201-220.
https://doi.org/10.1111/1467-9876.00229

   Wood, S. (2006) Generalized Additive Models: An Introduction with R. CRC Press, Boca Raton.

   Augustin, N.H., Trenkel, V.M., Wood, S.N. and Lorance, P. (2013) Space-Time Modelling of Blue Ling for Fisheries Stock Management. Environmetrics, 24, 109-119.
https://doi.org/10.1002/env.2196

   Allwaters Consult Limited. Diagnostic Study of the Black Volta Basin in Ghana. Global Water Initiative (GWI), CARE International, Catholic Relief Services (CRS), and the Regional Office for Central and West Africa of the International Union for Conservation of Nature (IUCN-PACO).

   Barry, B., Obuobie, E., Andreini, M., Andah, W. and Pluquet, M. (2005) Comprehensive Assessment of Water Management in Agriculture. Comparative Study of River Basin Development and Management. International Water Management Institute IWMI.

   Wood, S.N. (2011) Fast Stable Restricted Maximum Likelihood and Marginal Likelihood Estimation of Semiparametric Generalized Linear Models. Journal of the Royal Statistical Society, Series B, 73, 3-36.
https://doi.org/10.1111/j.1467-9868.2010.00749.x

   R Core Team (2016) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing.
