Rainfall-runoff models are essential for understanding hydrological processes and for providing practical remedies to water resource problems. However, data demands and a lack of parsimony (simplicity with minimum data requirements) in model parameters can still be a major constraint when models are applied to real-life problems. Appropriate stream flow forecasts can be used to protect property and to avoid loss of life. Such forecasts rely mainly on rainfall data and observed stream flows. These inputs are used by computerized hydrological models to simulate the amount of runoff generated in a specified watershed and the timing of peak runoff. Hydrologists usually evaluate and forecast stream flows at watershed outlets and at the sub-watershed level on a daily time scale for water resources studies and watershed management. The forecasts can be used to minimize flood damage through measures such as early warning systems.
In the Lake Tana Basin of Ethiopia, floods occur frequently, and the damage is greatest on the eastern and north-eastern sides of Lake Tana, in the Fogera and Dembia flood plains. The flooding is mostly attributed to rivers overflowing their banks and inundating the flood plains, although the rise of the lake level also has an influence. In the (sub-)humid monsoonal climate of the Lake Tana sub-basin, river flows build up from continuous and intense rainfall in the watersheds and, when coupled with local rainfall on the flood plains, result in severe flooding.
Several hydrological models that predict discharge from various inputs using a water balance approach have been tested in the Lake Tana basin. Examples include the Parameter Efficient semi-Distributed Watershed Model (PED-WM); the Hydrologiska Byråns Vattenbalansavdelning Integrated Hydrological Modeling System (HBV-IHMS), a daily time step watershed model; and the Soil and Water Assessment Tool (SWAT). However, none of these models has been tested for flood forecasting in the Lake Tana basin. Testing the applicability of flood forecasting models is vital for predicting floods in the flood-prone Lake Tana basin, both for mapping the extent of flooding and for developing flood early warning systems.
In the Lake Tana basin, it is important to test flood forecasting systems such as GFFMS, which offers several forecasting packages that can be selected according to data availability and catchment characteristics. Therefore, in this study the performance of the Galway River Flow Forecasting and Modeling System (GFFMS) was evaluated for stream flow forecasting with lead times of one to six days. In addition, three forecast updating methods, autoregressive (AR), linear transfer function (LTF) and neural network updating (NNU), were tested for their applicability in three major rivers (Gumara, Megech and Gilgel Abay) in the Upper Blue Nile basin of the sub-humid Ethiopian highlands. The study only compares the methods embedded in GFFMS for stream flow forecasting; it does not forecast stream flow quantiles for the three rivers.
2. Materials and Methods
2.1. Description of the Study Area
The Lake Tana Basin (Figure 1) is the major source of the head waters of the Blue Nile River. Its natural drainage area is about 15,101 km², taking the outlet near the Chara-Chara weir. Lake Tana is situated in the northern highlands at an altitude of approximately 1800 m a.s.l. More than 40 rivers feed the lake, of which four are major perennial rivers: Ribb, Gilgel Abay, Gumara and Megech. These rivers contribute more than 93% of the inflow to Lake Tana. The average temperature in the basin ranges from 21.5˚C to 22.5˚C. The mean annual rainfall ranges from 1200 to 2400 mm/year, based on the stations in each watershed in this study. The livelihood of the community in the area depends mainly on agriculture.
Precipitation and temperature data for the watersheds were obtained from the National Meteorological Agency of Ethiopia, Bahir Dar branch. Daily discharge data for the three watersheds were collected from the Ministry of Water, Irrigation and Electricity (MoWIE). Potential evapotranspiration (PET) was estimated for the watersheds using the temperature method.
Figure 1. Map of Ethiopia and the Lake Tana sub-basin and locations of the study watersheds.
2.3. GFFMS Model Description
The GFFMS software was developed at the Department of Engineering Hydrology, National University of Ireland, Galway, Ireland. Five models are embedded in the software: four system-theoretic models, namely the simple linear model (SLM), the linear perturbation model (LPM), the linearly varying gain factor model (LVGF) and the artificial neural network (ANN); and one conceptual model, the soil moisture accounting and routing (SMAR) model. Model parameters are calibrated using the ordinary least squares solution for SLM, LPM and LVGF; the conjugate gradient algorithm for ANN; and the Rosenbrock, simplex search and genetic optimization methods for SMAR.
Three model output combination techniques (simple average method, SAM; weighted average method, WAM; and neural network method, NNM) were evaluated for combining model simulation results. The performance of the three forecast updating methods embedded in the software, namely the autoregressive component (AR), the linear transfer function component (LTF) and neural network updating (NNU), was also evaluated. The Galway River Flow Forecasting and Modeling System (GFFMS) is used to forecast stream flow quantiles continuously over the long term, at the watershed scale and on a daily basis. The software requires only readily available hydro-meteorological data. The models are described below.
2.3.1. System-Theoretic Models
1) Simple Linear Model (SLM)
The SLM is a black-box, single-input single-output model that has both parametric and non-parametric forms. Its basic assumption is a linear time-invariant relationship between rainfall and discharge.
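As a rough illustration of the non-parametric form, the SLM can be viewed as a discrete convolution of rainfall with a set of pulse response ordinates scaled by a gain factor. The sketch below is a minimal Python example under that assumption; the function name `slm_simulate`, the toy rainfall series and the ordinates are hypothetical and are not taken from GFFMS.

```python
def slm_simulate(rainfall, pulse_response, gain=1.0):
    """Sketch of a simple linear model: Q_t = gain * sum_j h_j * R_(t-j+1).

    rainfall       -- list of rainfall depths, one per time step
    pulse_response -- list of pulse response ordinates h_1..h_m (assumed known)
    gain           -- gain factor (runoff coefficient), assumed constant
    """
    discharge = []
    for t in range(len(rainfall)):
        total = 0.0
        for j, h in enumerate(pulse_response):
            if t - j >= 0:  # ignore rainfall before the start of the record
                total += h * rainfall[t - j]
        discharge.append(gain * total)
    return discharge


# Hypothetical toy data: five days of rainfall and a three-ordinate response.
rain = [0.0, 10.0, 0.0, 0.0, 5.0]
h = [0.5, 0.3, 0.2]
q = slm_simulate(rain, h, gain=0.4)
```

Because the relationship is linear and time-invariant, doubling every rainfall value doubles every simulated discharge value, whatever the time of year.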
2) Linear Perturbation Model (LPM)
The LPM, like the SLM, is a black-box, single-input single-output model with both parametric and non-parametric forms. It uses the seasonal information inherent in the rainfall and discharge series. The model assumes that, in a year in which the rainfall is identical to its seasonal expectation, the corresponding discharge hydrograph is also identical to its seasonal expectation. In all other years, when the rainfall and discharge depart from their respective seasonal expectations, the two departure series are assumed to be related by a linear time-invariant system. The discrete non-parametric and parametric forms of the LPM are expressed in the same way as those of the SLM, with the input and output being the departures of rainfall and discharge from their seasonal expectations.
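The departure series on which the LPM operates can be sketched as follows. This is a minimal Python illustration that assumes the seasonal expectation is the plain mean for each position in the seasonal cycle; practical implementations may smooth the seasonal means, and the names `seasonal_mean` and `departures` and the toy series are hypothetical.

```python
def seasonal_mean(series, period):
    """Mean of the series at each position of the seasonal cycle."""
    sums = [0.0] * period
    counts = [0] * period
    for i, value in enumerate(series):
        sums[i % period] += value
        counts[i % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def departures(series, period):
    """Departure of each value from its seasonal expectation."""
    means = seasonal_mean(series, period)
    return [value - means[i % period] for i, value in enumerate(series)]


# Hypothetical toy record: two "years" of four time steps each.
flow = [1.0, 2.0, 3.0, 4.0, 3.0, 4.0, 5.0, 6.0]
flow_means = seasonal_mean(flow, period=4)
flow_dep = departures(flow, period=4)
```

The rainfall and discharge departure series obtained in this way would then be related by the same linear structure as in the SLM, and the seasonal expectation of discharge would be added back to obtain the simulated flow.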
3) Linearly Varying Gain Factor (LVGF) Model
This model involves the variation of the gain factor with a selected index of the prevailing watershed wetness (Zt). The output of the LVGF model is given by a discrete convolution summation based on the concept of a time-varying gain factor Gt.
The gain factor Gt is linearly related to an index of the soil moisture state Zt of the watershed, $G_t = a + b Z_t$, where a and b are constants.
The soil moisture state is obtained from the output of the SLM, which is used as an auxiliary model.
The quantities involved are the mean of the observed discharge, the gain factor Gt (runoff coefficient), the estimated gain factor of the SLM and the estimated pulse response ordinates of the SLM. The gain factor of the SLM is calculated as the ratio of the total output volume to the total input volume.
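For reference, the LVGF relations described above are often written in the following form. This is a sketch consistent with the definitions in the text rather than a reproduction of the original equations: the symbols $R_t$ (rainfall), $w_j$ (weights), $e_t$ (error term), $m$ (memory length), $\hat{G}$, $\hat{h}_j$ and $\bar{Q}$ are assumed notation, and the ordering may not match the original equation numbering.

```latex
\begin{align}
Q_t &= G_t \sum_{j=1}^{m} w_j R_{t-j+1} + e_t \\
G_t &= a + b\,Z_t \\
Z_t &= \frac{\hat{G}}{\bar{Q}} \sum_{j=1}^{m} \hat{h}_j R_{t-j+1} \\
\hat{G} &= \frac{\sum_{t} Q_t}{\sum_{t} R_t}
\end{align}
```

Here $\bar{Q}$ is the mean of the observed discharge, $\hat{G}$ is the estimated gain factor of the SLM (total output volume divided by total input volume) and $\hat{h}_j$ are the estimated pulse response ordinates of the SLM.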
4) Artificial Neural Network Model (ANN)
In GFFMS, the neural network model consists of three layers: an input layer, a hidden layer and an output layer. For a neuron in either the hidden or the output layer, the received inputs (Qi) are transformed into the neuron output (Qout) by a mathematical function, $Q_{out} = f\left(\sum_{i=1}^{m} w_i Q_i + w_o\right)$,
where f() is the transfer function, m is the total number of inputs (the total number of neurons in the preceding layer), wo is the neuron threshold (a baseline value independent of the input) and wi is the weight of input connection pathway i. The non-linear transfer function adopted for the neurons of the hidden and output layers is the widely used logistic (sigmoid) function, bounded in the range [0, 1]. The weights wi, the thresholds wo and σ are parameters of the network configuration and are determined by the conjugate gradient method. A schematic representation of this model is shown in Figure 3.
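A forward pass through such a three-layer network can be sketched in a few lines of Python. The example assumes the logistic function takes the form 1 / (1 + exp(-σx)); the function names and all weights and thresholds below are hypothetical values chosen for illustration, not calibrated GFFMS parameters.

```python
import math


def logistic(x, sigma=1.0):
    """Logistic (sigmoid) transfer function, bounded between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-sigma * x))


def neuron_output(inputs, weights, threshold, sigma=1.0):
    """Q_out = f(sum_i w_i * Q_i + w_o) for a single neuron."""
    activation = threshold + sum(w * q for w, q in zip(weights, inputs))
    return logistic(activation, sigma)


def forward(inputs, hidden_layer, output_neuron, sigma=1.0):
    """Pass inputs through one hidden layer and a single output neuron."""
    hidden = [neuron_output(inputs, w, w0, sigma) for w, w0 in hidden_layer]
    w_out, w0_out = output_neuron
    return neuron_output(hidden, w_out, w0_out, sigma)


# Hypothetical network: two inputs, two hidden neurons, one output neuron.
hidden_layer = [([0.5, -0.3], 0.1), ([0.8, 0.2], -0.4)]
output_neuron = ([1.0, -1.0], 0.0)
y = forward([0.2, 0.7], hidden_layer, output_neuron)
```

Because the output neuron is bounded between 0 and 1, the discharge series would have to be rescaled into that range before training and scaled back afterwards.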
2.3.2. Conceptual Rainfall-Runoff Model
1) Soil Moisture Accounting and Routing (SMAR) Model
The SMAR model is a development of the water balance "layers" conceptual rainfall-runoff model, its water balance component being based on the Nash model. The non-linear water balance (soil moisture accounting) component preserves the balance between the rainfall, the evaporation, the generated runoff and the changes in the various layers of soil moisture storage. The routing component simulates the attenuation and the diffusive effects of the watershed by routing the various generated runoff components through linear time-invariant storage elements. For each time step, the combined output of the two routing elements becomes the simulated discharge produced by the model.
Figure 2. Schematic diagram of SMAR model.
Figure 3. Transformation processes of inputs to outputs for ANN model (Shamseldin, et al., 1997).
Three alternative automatic optimization algorithms, namely the genetic algorithm, the Rosenbrock method and the simplex method, are available for calibrating the SMAR model. Each was applied individually, and the best method (in terms of the numerical efficiency criteria) was selected. The version of SMAR used in the present study has nine parameters: five control the overall operation of the water budget component and the remaining four control the operation of the routing component. A schematic diagram of the SMAR model is shown in Figure 2.
2) Model Output Combination Techniques (MOCTs)
In stream flow forecasting it is common to combine the forecast outputs of several models. For this purpose GFFMS provides three combination techniques: the Simple Average Method (SAM), the Weighted Average Method (WAM) and the Neural Network Method (NNM). The SAM is the simplest way of combining the outputs of individual models and was used to combine the outputs of models that gave nearly the same stream flow forecasts. The WAM, unlike the SAM, can give more weight to the outputs of better-performing models. The NNM was used when a non-linear function was needed to combine the outputs. In GFFMS, the combination is performed by a multi-layer feed-forward neural network consisting of an input layer, an output layer and a hidden layer between them.
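The two linear combination techniques can be sketched as below. This is a minimal Python illustration: the weights are simply supplied by the caller, whereas in practice they would be estimated from the calibration data, and the function names and toy forecasts are hypothetical.

```python
def simple_average(forecasts):
    """SAM sketch: equal-weight mean of the model forecasts at each time step.

    forecasts -- list of forecast series, one list per model, equal lengths
    """
    n_models = len(forecasts)
    return [sum(values) / n_models for values in zip(*forecasts)]


def weighted_average(forecasts, weights):
    """WAM sketch: weighted sum of the model forecasts at each time step."""
    return [
        sum(w * v for w, v in zip(weights, values))
        for values in zip(*forecasts)
    ]


# Hypothetical toy forecasts from two models over two time steps.
model_a = [1.0, 3.0]
model_b = [3.0, 5.0]
sam = simple_average([model_a, model_b])
wam = weighted_average([model_a, model_b], weights=[0.25, 0.75])
```

With equal weights the WAM reduces to the SAM, which is why the SAM is adequate when the individual models give nearly the same forecasts.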
2.3.3. GFFMS Methods in Updating Mode
Ideally, the simulation model should resemble the actual system so closely that the residuals in the calibration period form a series of unrelated quantities with zero expectation and small variance. However, in most actual fittings, persistence in the residuals is observed. This persistence results from inadequacies in the model structure, incorrect estimation of model parameters, errors in the data and the absence of any consistent relationship in the data. Observing the structure of this persistence can therefore provide the basis for an updating procedure in which the model output is modified before the forecast is issued.
Autoregressive Method (AR): this model forecasts the errors of the simulations from each model and then uses the forecast errors to update the simulated discharges.
Linear Transfer Function Method (LTF): the operation of a continuous linear time-invariant system can usually be defined by a general linear differential equation relating the output Y to the input X; the equation involves a differential operator and a gain factor.
The parameter B allows for a pure delay. For a linear system, the coefficients must be independent of X and Y and none of the powers of the derivatives can be greater than unity, although the order of the derivatives may be unlimited. When the input and output are observed at discrete intervals (in blocks of average intensity), the LTF model is defined by a linear difference equation.
The difference equation contains autoregressive (AR) parameters, moving average (MA) parameters and a pure time delay b restricted to integer values only; an error term (E) is added to this equation (Equation (8)) to give the model that is fitted.
The parameters of the model are estimated by the method of ordinary least squares (OLS). The order of the model (r, s), i.e., the numbers of AR and MA terms, and the extent of the pure lag must be pre-selected. To find the best values of r, s and b, the calculation was repeated with different selections of these parameters. Because past observed values of y are used as input, the model automatically provides an updating procedure: recently observed values of y are used in obtaining the new estimates of y. Forecast values of the input variables are needed to compute flows for lead times of one to six days. The forecast origin is the last value in the observed series.
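For reference, a linear transfer function of the kind described in this subsection is commonly written as follows. This is a sketch in assumed notation, not a reproduction of the original equations: the coefficient symbols $\Xi_i$, $H_j$, $\delta_i$ and $\omega_j$, the use of $e_t$ for the error term, and the indexing of the lagged input are assumptions.

```latex
\begin{align}
\left(1 + \Xi_1 D + \dots + \Xi_R D^R\right) Y(t)
  &= g \left(1 + H_1 D + \dots + H_S D^S\right) X(t - B) \\
y_t &= \sum_{i=1}^{r} \delta_i\, y_{t-i}
     + \sum_{j=1}^{s} \omega_j\, x_{t-b-j+1} + e_t
\end{align}
```

In the first (continuous) form, $D$ is the differential operator, $g$ is the gain factor and $B$ is the pure delay. In the second (discrete) form, $\delta_i$ are the $r$ AR parameters, $\omega_j$ are the $s$ MA parameters and $b$ is the integer pure time delay. Because the second form is linear in its parameters, it can be fitted by OLS once $r$, $s$ and $b$ have been chosen.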
Neural Network Updating (NNU): this is a non-linear input-output updating model that forecasts future values on the basis of the values of one or more exogenous input time series. The discharge time series produced in simulation mode by the simulation models constitutes the exogenous input, which is used together with the observed discharge to provide the updated discharge forecasts. The ANN may thus be used as a real-time discharge forecast updating technique: it operates on both the discharge forecasts and the recent observed discharge values to produce updated forecasts. The input discharge forecasts may be those of an individual rainfall-runoff model or those produced by a forecast combination method, which makes NNU an alternative to the AR model for forecast error updating.
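Of the three updating methods, the AR method is the simplest to illustrate. The sketch below is a minimal Python example that assumes a first-order autoregressive model of the simulation errors, fitted by least squares; the AR order used in practice may be higher, and the function names and toy error series are hypothetical.

```python
def fit_ar1(errors):
    """Least-squares estimate of phi in e_t = phi * e_(t-1) + noise."""
    num = sum(errors[t] * errors[t - 1] for t in range(1, len(errors)))
    den = sum(e * e for e in errors[:-1])
    return num / den if den else 0.0


def ar1_update(simulated_future, last_error, phi):
    """Add the forecast error phi**k * last_error to the k-step-ahead simulation."""
    return [
        q + (phi ** (k + 1)) * last_error
        for k, q in enumerate(simulated_future)
    ]


# Hypothetical past errors (observed minus simulated discharge).
past_errors = [4.0, 2.0, 1.0, 0.5]
phi = fit_ar1(past_errors)

# Update the simulated discharges for lead times of one and two days.
updated = ar1_update([10.0, 12.0], last_error=past_errors[-1], phi=phi)
```

When the absolute value of phi is below one, the forecast error correction decays towards zero as the lead time grows, so the benefit of this kind of updating is largest at short lead times.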
2.4. Evaluation of Model Performance
The Nash-Sutcliffe efficiency (NSE), Index of Agreement (IOA), Relative Volume Error (RVE) and Relative Error of the Peak (REP) objective functions were used to describe the predictive accuracy of the models wherever observed discharge was available. The NSE measures the efficiency of the model (the overall fit of the hydrographs). The IOA is used to overcome the insensitivity of the NSE to differences between the observed and forecast means and variances. The RVE is used to evaluate the agreement between the volumes of the forecast and observed discharges, and the REP is used to evaluate individual peak stream flows.
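The four criteria can be sketched in Python as follows. The NSE follows its standard definition; the IOA is assumed to be the Willmott index of agreement, and the RVE and REP are assumed to be signed relative differences in total volume and in the maximum flow. The exact definitions used in GFFMS may differ, and the function names and toy series are hypothetical.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def index_of_agreement(obs, sim):
    """Index of agreement (assumed Willmott form), between 0 and 1."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - mean_obs) + abs(o - mean_obs)) ** 2
              for o, s in zip(obs, sim))
    return 1.0 - num / den


def relative_volume_error(obs, sim):
    """Assumed form: (simulated volume - observed volume) / observed volume."""
    return (sum(sim) - sum(obs)) / sum(obs)


def relative_error_of_peak(obs, sim):
    """Assumed form: (simulated peak - observed peak) / observed peak."""
    return (max(sim) - max(obs)) / max(obs)


# Hypothetical toy series: the simulation flattens both the low and high flows.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 2.0, 3.0, 3.0]
scores = (
    nse(observed, simulated),
    index_of_agreement(observed, simulated),
    relative_volume_error(observed, simulated),
    relative_error_of_peak(observed, simulated),
)
```

In this toy case the volume error is zero even though the peak is underestimated by 25%, which illustrates how overestimated low flows can compensate for underestimated peaks in a volume-based criterion.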
3. Results and Discussion
In this section, the results are presented in the following order: simulation, updating and forecast mode. Each of the five basic models in the GFFMS software was applied to each of the three test watersheds. For the Gumara and Megech watersheds, the hydrological data were split into a calibration period (about two-thirds of the data, corresponding to 11 years) and a validation period (one-third of the data, corresponding to 5 years). The gauging site for Gilgel Abay, however, was moved to another location after December 2005 because of construction of the main road that joins Bahir Dar to Addis Ababa. Because a stage-discharge relationship has not been established for the new gauging site, 8 years of data were used for calibration and 4 years for validation for this watershed.
In general, graphical evaluation of the model results (Figure 4) shows that the LTF updating model performed better than the other models for lead times of one to six days, with slightly better performance for the one-day forecast than for the longer lead times in the wettest years. The AR updating method gave better estimates of the peak values.
1) Gilgel Abay Watershed
The selected models were calibrated for the Gilgel Abay watershed using concurrent hydro-meteorological data covering the period from January 1st, 1994 to December 31st, 2001. These years were selected for calibration because they include normal, wet and dry periods. The hydrographs (Figure 4) show the simulation and updating results for the wettest years: the ANN and LTF (one-day lead time) models for Gilgel Abay, the ANN and NNU (one-day lead time) models for Gumara, and the SMAR and LTF (one-day lead time) models for Megech. For the Gilgel Abay watershed, the models overestimated the base flow but underestimated the peak flows. As a result, the models estimated the overall volume (i.e., the mean of the observed flow) well because of error compensation: positive errors are canceled out by negative errors.
Figure 4. Rainfall and observed and simulated discharge hydrographs of the three case watersheds for calibration (simulation and updating modes, one-day lead time) for the wettest year of Gilgel Abay (ANN, LTF), Gumara (ANN, NNU) and Megech (SMAR, LTF).
Note that 1994 belongs to the calibration period and 2003 to the validation period of the Gilgel Abay watershed. The ANN model captured the overall pattern of the observed hydrographs of the selected years. However, it clearly underestimated some of the major peak stream flows in the wettest years, such as the peaks of August 11th, 14th and 12th in 1994 and those of August 5th, September 8th and July 20th in 2003. Overall, the result suggests that the model cannot provide good forecasts unless these errors in the peak flows are corrected. The models shown in Figure 4 are the best-performing models in simulation mode: ANN for both Gilgel Abay and Gumara, and SMAR for Megech. For updating the departures of these model outputs from the observed discharge, the figure shows the LTF (one-day lead time) for the Gilgel Abay and Megech watersheds and the NNU (one-day lead time) for the Gumara watershed, all for the calibration period.
2) Gumara Watershed
The selected models were calibrated for the Gumara watershed using concurrent hydro-meteorological data covering the period from January 1st, 1994 to July 31st, 2004. These years were selected for calibration because they include normal, wet and dry periods. The graphical comparison shows the simulation and updating results for the wettest years of the Gumara watershed for the ANN and NNU (one-day lead time) models. For this watershed the model overestimated the base flows and underestimated the peak flows. Note that 2004 belongs to the calibration period and 2005 to the validation period of the Gumara watershed. The ANN model captured the overall pattern of the observed hydrographs of the selected wettest years. However, it clearly has some limitations, as it underestimated some of the major peak stream flows of the wettest years, for instance those of August 1st, August 17th and July 25th in 2004 and those of September 13th, August 17th and September 15th in 2005. Overall, the result suggests that the model cannot provide good forecasts unless these errors in the peak flows are corrected. The visual (subjective) evaluation in updating mode showed that the NNU model performed better than the other models for lead times of one to six days, with slightly better performance for the one-day forecast than for the longer lead times. The AR updating component gave better estimates of the peak values.
3) Megech Watershed
The selected models were calibrated for the Megech watershed using concurrent hydro-meteorological data covering the period from January 1st, 1994 to December 31st, 2004. These years were selected for calibration because they include normal, wet and dry periods. The hydrograph in Figure 4 shows the simulation results of the SMAR model for the wettest years of the Megech watershed. For this watershed the model overestimated the base flow and underestimated the peak flows, because it reproduced the overall volume (i.e., the mean of the observed flow) better than the individual flows. Note that 1995 belongs to the calibration period and 2009 to the validation period of the Megech watershed. The SMAR model captured the overall pattern of the observed hydrographs of the selected years. However, the model clearly underestimated some of the major peak stream flows of the wettest years, for instance those of August 4th, 14th and 13th in 1995 and those of August 4th, August 5th and September 3rd in 2009. Overall, the result suggests that the model cannot provide good forecasts unless these errors in the peak flows are corrected. The model efficiency results in updating mode showed that the LTF model performed better than the other models for this watershed, with slightly better performance for the one-day forecast than for the longer lead times. The AR updating component gave better estimates of the peak values than the other models.
3.1.2. Evaluation In Terms of Objective Function
The outputs of model forecasts were evaluated using objective functions in addition to graphical comparisons and the results are presented below.
1) Gilgel Abay Watershed
In the previous section, the calibration and validation results for the selected models were discussed in terms of visual evaluation. For an objective evaluation, numerical performance criteria are used here. In terms of the NSE, the ANN model reproduced the pattern of the observed hydrograph of the Gilgel Abay watershed better than the other models, with NSE = 0.87. The parametric LPM (PLPM) is the next best, with NSE = 0.84. The poorest performance in terms of NSE is obtained for the two variants of the SLM, with the parametric variant giving the lowest NSE value of 0.66. Overall, the NSE values of the selected models indicate good to very good performance in capturing the pattern of the observed discharge data.
In terms of the IOA criterion, the ANN model performed slightly better than the other models, and the SLM was inferior to the others. Note, however, that the differences between the models are very small in terms of IOA. In terms of the index of volumetric fit (IVF), all models except the SLM performed well, with an IVF of 1.00 in simulation mode. The ANN captured the peaks slightly better than the other models. However, the performance of all seven models is unsatisfactory in terms of REP, as they resulted in REP values much greater than 0.3 for Gilgel Abay. Overall, the simulations from these models under-predicted the peak discharge quantiles.
2) Gumara Watershed
In terms of the NSE criterion, the ANN model reproduced the pattern of the observed hydrograph of the Gumara watershed better than the other models, with NSE = 0.90. This result suggests that the runoff pattern of this watershed is better reproduced by a non-linear transformation of the inputs to the output. The poorest performance in terms of NSE is obtained for the two variants of the SLM, with the parametric variant giving the lowest NSE value of 0.7. With NSE = 0.84 in simulation mode, the parametric LPM (PLPM) is the second best at capturing the pattern of the historical hydrograph for the watershed. The NSE value of the ANN model corresponds to an acceptable match between simulated and observed discharge, since the closer the model efficiency is to 1, the more accurate the model. Among the model output combination techniques, the WAM and NNM performed better than the SAM. For updating the model output error series, the NNU method performed best, followed by the LTF and then the AR.
3) Megech Watershed
In terms of the NSE criterion, the SMAR model reproduced the pattern of the observed hydrograph of the Megech watershed better than the other models, with NSE = 0.78. This result suggests that the runoff pattern of this watershed is better reproduced by a conceptual transformation of the inputs to the output. The other six models performed poorly in terms of NSE, with values of less than 0.5. This is because the Megech watershed is not adequately represented by the linear models, as its flow duration curve is not linear. The SMAR model resulted in an IOA value of 0.94 and therefore performs very well (Connor, 2000). The index of volumetric fit (IVF) has a value of 0.96 for the SMAR model in simulation mode. In terms of this criterion, all other models except the PSLM performed better, with IVF values between 1.0 and 1.07 in simulation mode. For updating the model output error series, the LTF method performed better than the NNU and AR in terms of NSE. The LTF model also performed best in terms of IVF, followed by the NNU and then the AR. In capturing peak discharge values, the AR performed better than the others.
The model inputs were prepared for validation, and the hydrographs of the three test watersheds for validation are presented in Figure 5.
1) Gilgel Abay Watershed
An independent data set for the period from January 1st, 2001 to December 31st, 2005 was used to check that the calibrated parameters perform reasonably well on data not used in calibration. The performance criteria showed that the predictive capability of the models is reasonable for the validation period. In terms of NSE, the ANN model performed best in simulation mode, the WAM performed best for model output combination and the LTF model performed best in updating mode for Gilgel Abay. The validation hydrographs for the wettest years, in both simulation and updating modes, are shown in Figure 5.
2) Gumara Watershed
An independent data set for the period from August 1st, 2004 to December 31st, 2009 was used to check that the calibrated parameters perform reasonably well on data not used in calibration. The performance criteria showed that the predictive capability of the models is reasonable for the validation period. The model efficiency criteria showed that the ANN model performed best in simulation mode, the NNM for model output combination and the NNU model in updating mode. The model efficiency results and the output hydrographs for validation, in both simulation and updating modes, differ from those for calibration; the reason lies in the different data used for calibration and validation.
Table 1. Results of performances of different models for calibration in simulation mode of the three case watersheds; note that the ranking is based on values of NSE.
Figure 5. Rainfall and observed and simulated discharge hydrographs of LTF for Gilgel Abay and Megech and of NNU for Gumara watershed for validation in simulation and updating modes with one-day lead time.
3) Megech Watershed
An independent data set for the period from August 1st, 2004 to December 31st, 2009 was used to check that the calibrated parameters perform reasonably well on data not used in calibration. The performance criteria showed that the predictive capability of the models is reasonable for the validation period. The SMAR model performed well in simulation mode and the LTF model in updating mode. The model efficiency results differ between calibration and validation because of the data considered, i.e., recent discharge data for validation but earlier discharge data for calibration.
4. Conclusions
In this study, alternative stream flow forecasting models were evaluated and alternative forecast updating methods were compared for selected rivers in the upper Blue Nile basin. For the Gumara and Gilgel Abay watersheds, the ANN model performed best, followed by the LPM. For the comparatively smaller watershed (Megech), the nine-parameter SMAR conceptual model performed better than the other models. The performance of the SLM was clearly inferior to that of the other models. The ANN model generally performed better than the simpler models, probably because its non-linear behavior matches the runoff generation behavior of large watersheds. The models in GFFMS can be used to forecast stream flows adequately in the rivers of the upper Blue Nile basin, with a performance efficiency of more than 78%, i.e., accounting for more than 78% of the initial variance, for a one-day lead time. The sensitivity analysis showed that the soil moisture infiltration capacity and rate are the most sensitive parameters in the three case watersheds, while the sensitivity of runoff to the remaining seven SMAR parameters differs from one watershed to another. Of the three model output combination methods, the weighted average method (WAM) performed well for Gilgel Abay and the neural network method (NNM) for Gumara. The AR updating component performed better in updating peak stream flows for all watersheds; however, the LTF for the Gilgel Abay and Megech watersheds and the NNU for the Gumara watershed performed better in updating the overall volume of the output hydrograph.
Acknowledgements
This research was sponsored in part by Bahir Dar University, Blue Nile Water Institute, which funded the data collection and provided a platform for research discussion forums. We would also like to thank the National Meteorological Agency of Ethiopia for providing the meteorological data. In addition, our special thanks go to the Ministry of Water Irrigation and Energy (MoWIE).
Conflict of Interest
The authors have declared no conflict of interest.
Author Contributions
Tesfaye A. Dessalegn contributed to the initial conception of the study, the modeling and the organization of the overall research work. Mamaru A. Moges helped with the write-up of the manuscript and provided advice during the initial research design. Dessalegn C. Dagnew helped with editing the article. Asegidew Gashaw helped with re-ordering, editing and improving the write-up and the presentation of the results.