JGIS  Vol.10 No.4 , August 2018
Comparison and Evaluation of GIS-Based Spatial Interpolation Methods for Estimation Groundwater Level in AL-Salman District—Southwest Iraq
The aim of the research is to compare spatial prediction methods: (RBF), (IDW), (OK), (UK), and (SK) for the production of the groundwater level map and the prediction error map in study area as well. Setting the foundations and criteria for choosing the most appropriate mathematical method for the construction of statistical surfaces in the representation of the level of groundwater in study area. These methods were used to predict the spatial distribution map of the groundwater level based on measured data from 764 wells in May 2016. The study reveals that comparing the spatial interpolation models and evaluating their accuracy, through some statistical indicators and cross-validation is the best way to choose the optimal model for the representation of data entered in any site. As a result of the statistical comparison between the five spatial interpolation models and validation of the results using (cross validation) it was found that the universal Kriging (UK) method is the best method to represent the level of groundwater in Salman district because this model has the lowest root mean square error (RMSE), the lowest mean error (ME), and the highest coefficient of determination (R2) value. The groundwater level and prediction standard error maps produced in the geographic information system (GIS) give additional data and information that describe the aquifer system in study area and will ultimately improve sustainable groundwater management.

1. Introduction

The good application of hydrological methods is the basis for the management and development of water resources. As a result of the natural increase in the population of Iraq and the decline of the discharge of the Tigris and Euphrates because of the control of the countries of the upper basin, this has led to increased demand for water and this requires thinking about developing methods of searching for water and finding alternative sources of surface water to meet this challenge in the future as well as the scarcity of usable water is one of the greatest challenges in the twenty-first century [1] . Groundwater is an important source of rural and urban needs for economic and social development throughout the world [2] [3] . It is also the main source of water in areas with low rainfall, as in the study area, which is characterized by the scarcity of surface water resources.

The groundwater aquifer in AL-Salman district (dry zone) is the only source of household and agricultural needs in the study area. Therefore, it is necessary to rationalize the water in these areas through proper planning, which depends on the use of mathematical models to help the decision-maker to take correct steps in the planning and investment optimization of water projects. Proper planning requires the analysis and study of the behavior of the aquifer and the development of accurate digital maps that show the level of groundwater and its development over time [4] . One of the problems facing hydrogeological studies is the estimation of the data values in a given location, either because the data are missing or the site does not have measurements or it is impossible to make measurements for the entire area studied because this work is costly materially or morally. The scientific method in this case is to take sporadic samples from that area and then to predict unknown points (areas where samples are not taken). These mathematical processes are called spatial interpolation [5] . Therefore, spatial interpolation models can be defined as a set of statistical methods used to predict the values of phenomena in sites where measurements are not available based on a limited number of measured points.

All the phenomena with the spatial extension occupy a size in the space. To realize this size, we must see the outer surface surrounding this size, which is called the statistical surface. The spatial differences of this surface can be represented on the maps using one spatial interpolation model. Therefore, spatial statistical techniques are the main means of creating these maps. Since there is no universally approved method that can be adopted, this study compares different spatial completion methods: radial basis functions (RBF), inverse distance weighting (IDW), ordinary Kriging (OK), universal Kriging (UK) and sample Kriging (SK). Based on a set of statistical criteria, the best spatial interpolation model can be determined to represent the reality of the groundwater level in the study area evaluation. Moreover, the research aims to:

1) Setting the foundations and criteria for choosing the most appropriate mathematical method for the construction of statistical surfaces in the representation of the level of groundwater in study area.

2) Comparison of the prediction methods: IDW, RFP, UK, OK, and SK for the production of the groundwater level map and the prediction standard error map in study area.

The importance of research comes from the importance of studying groundwater related projects based on models that are in line with the development of GIS applications.

2. Material and Methodology

To reach the research objectives, the work was divided into three stages, as in Figure 1.

1) Collection, input, processing and analysis of data using the ArcGIS 10.5 program.

2) Comparison and evaluation of spatial interpolation models and selection of the best ones.

3) Production of groundwater level map and prediction standard error map to present and generalization of interpolation results.

2.1. Study Area and Data Used

The study area is located in AL-Salman district/southwest Iraq (Figure 2). It extends between longitude (43˚50 - 45˚27) east and latitude (29˚12 - 31˚25) north. covering an area of (17,462 km2) and constitute (4%) of Iraq’s total area of (435,052 km2). The study area is a part of the southern desert in AL-Salman

Figure 1. Methodology process.

Figure 2. Study area and samples locations [9] [10] .

basin which is characterized by limestone and dolomite beds. The Salman basin, to depth (400 - 500) meters consists of three main aquifers are Dammam Carbonate aquifer, Umm Er-Radhuma aquifer and Tayarat aquifer. The study area is generally, flat rocky terrain associated with structure ridges and isolated hills and karts depression. It is structurally lies within the eastern part of the stable shelf of Nubio-Arabian platform, which is characterized by large horst like anticlines and grabeen like synclines with local structures of short extend. The regional strike of the beds trends northwest-southeast with estimated dip amount of (2˚ - 3˚) towards the north east. The exposed rocks, in the study area, belong to Paleocene, Lower Miocene, Pliocene-Pleistocene and Quaternary ages. The Eocene sediments are the most prevailing ones. The rock sequence, is generally composed of carbonates with marl intercalations and lest amount of elastics [6] . The elevation of the study area ranges between (448) meters above sea level in the far south and (5) meters above sea level in the north [7] .

The climate of the study area is characterized by a dry hot climate in the summer and mild in winter, temperatures range between 16˚C and 32˚C, the annual rate of rain is less than (80) mm, relative humidity (39%) and the total annual evaporation (3484) mm [8] . The groundwater in the study area, especially in its northern parts, is being depleted due to the excessive increase in the drilling of artesian wells and the inefficiency of irrigation methods, as well as extreme evaporation and low rainfall. The study was based on field measurements taken from 764 artesian wells during the month of May 2016 [9] .

2.2. Methodology

2.2.1. Spatial Interpolation Methods

The spatial interpolation methods assessed here include in ArcGIS10.5 program the following:

・ Inverse distance weighted (IDW)

Inverted way: the weighted distance to derive the statistical surface expressed by a diver from the measured values at a number of points belonging to this surface and then a network of points is completed and the values of the phenomenon are calculated at these points according to a mathematical equation (Equation (1)). The calculation of the value of the phenomenon at any point of the network (statistical surface) is calculated and predicted in a way that is inversely proportional to its distance from the measured points after giving a weight per point [11] . This method is closely related to distance as values decrease with distance in other words that the predicted values will not exceed the values of the specimens and the prediction will be limited to the known values [12] .

This method is based on spatial correlation, where measured data are used at specific points in the region to estimate data for points where no measurements are available [13] . This means that the data of each given point is significantly affected as close to the point where measurements are not available and less. Its impact whenever it departs from it [14] .

z j = i z i d i j n i 1 d i j n (1)

where z j : estimated value for the unknown point at location j. d i j n : distance between known point i and unknown point j. z i : value at known point i. n: user defined ExpoNet for weighting.

・ Radial basis functions (RBF)

This technique preserves the sample values so that it paints a flexible predictive surface that passes over all sample values. It is able to predict the values and values of the samples, but not to ignore them but to pass them. It is similar to the model of the hands, but predicts the values that lie above the maximum values is below the minimum [15] [16] . One of the advantages of this model is that it generates enough statistical surface from a few samples. The disadvantages of this method are to have high and low values that differ from the input data set and are highly sensitive to extreme values because of the inclusion of original data values in the sample [16] [17] . It is calculated by the following Equation (2):

( r ) = r 2 + c 2 (2)

where: r = the destance from sample to estimation, c = smoothing factor.

・ Kriging

The Kriging model is one of the most complex and robust methods, applying advanced statistical methods and needs to know spatial statistics because the data must be subjected to statistical examination before application. It depends on the distance and the relationship between the values known in predicting unknown values, and it is possible to predict values that exceed or less than known values but do not pass them as in the style of Spline. The Kriging method is the best procedure for nonlinear linear completion [16] . The Kriging method is a linear interpolation procedure that provides unbiased linear estimates of different values in space, an advanced geostatistical procedure that generates an estimated statistical surface from a scattered set of points with z values (Equation (3)) [18] [19] . In this model, the semivariogram plays a key role in the analysis and modeling of geostatistical data, and takes into account the autocorrelation between spatial data to construct mathematical models of spatial correlation structures expressed by variables [11] [12] . It is calculated by the following Equation (3):

z ( s o ) = i = 1 n γ z ( s i ) (3)

where ( s o ): prediction location, γ: unknown weight for the measured value at the i location, z ( s i ) : measured value at the i site, n: number of samples.

2.2.2. Assessment Methods

The accuracy of spatial interpolation methods was assessed on the basis of three statistical criteria: Root Mean Square Error (RMSE), Mean Error (ME), and coefficient of determination (R2). The model can be validated when (RMSE) is as low as possible, the (ME) is near zero, and the closer (R2) of the correct one is, the better the model.

1) RMSE is used as an important parameter that indicates the accuracy of spatial analysis in geographic information system and remote sensing [20] . Lower (RMSE) indicates an interpolate is likely to give most reliable estimates in the areas with no data. The minimum (RMSE) calculated by Cross Validation can be used to find the best spatial interpolation model control parameters [21] [22] . The root mean square error (RMSE) was calculated for each model prediction using the formula [23] :

RMSE = 1 n i = 1 n [ z ^ ( x i ) z ( x i ) ] 2 (4)

where: z ( x i ) is observed value at point x i , z ^ ( x i ) is predicted value at point x i , n is number of samples (sum of squared errors) observed-estimated (values) and n is the number of pairs (errors).

2) The mean error (ME), is used for determining the degree of bias in the estimates and its provides an absolute measure of the size of the error. The large values indicate larger discrepancies between predicted and observed values [24] . ME formula as below:

ME = 1 n i = 1 n [ z ^ ( x i ) z ( x i ) ] (5)

where: z ( x i ) is observed value at point x i , z ^ ( x i ) is predicted value at point x i , n number of samples.

3) Coefficient of determination: it is called the linear correlation coefficient square (R2). It is expressed by the ratio of total squares of regression divided by total squares. The value ranges between the correct one and zero and is calculated by the following equation [25] :

R 2 = [ i = 1 n ( P 1 P a v e ) ( Q 1 Q a v e ) ] 2 i = 1 n ( P 1 P a v e ) 2 i = 1 n ( Q 1 Q a v e ) 2 (6)

where: Q a v e is the mean of measured values, P a v e is the mean predicted values, n: number of sample used for predication.

3. Results and Discussion

3.1. Exploratory Analysis

Before applying the spatial interpolation models, the exploratory analysis of the data used should be performed using geostatistical techniques supported by GIS programs. Spatial models give more representative results if the data are distributed naturally and may result in distorted results if the data are abnormal. The spatial statistical extension contains a set of tools for the distribution of data such as the histogram, the trend analysis tool, Semivariogram/Covariance Cloud, and some statistical indicators. The natural distribution of the data that takes the shape of the natural curve (bell) in which the value of the measures of central tendency (Mean, Median and Mode) is characterized by this curve that the coefficient of skewness is equal to zero, and the coefficient of Kurtosis is equal to 3. Each skewness coefficient is close to zero and all Kurtosis coefficients close to the value of 3 indicate a normal distribution of data [26] [27] . In this study, the skewness coefficient = 1.21 and the Kurtosis coefficients = 2.9. In terms of the skewness coefficient, we observe that the distribution of the data is skewness to the right (positive skewness). This indicates that the majority of the values of the samples used have low values and in terms of the Kurtosis curve we find that the distribution of the data used in the form of the average flattening because the value of the coefficient is very close to 3.

(Figure 3) The structure of the data and extreme values of groundwater levels and trends are described in three dimensions (X, Y, Z). The blue curve shows the

Figure 3. Trend analysis of groundwater level in study area.

gradual rise of the groundwater level from south to north, and the green curve indicates the decline of data from the west to the east. This means that the groundwater flow in the study area from the south and southwest to the north and north-east, which is consistent with the general slop of topography of the study area.

Spatial autocorrelation are used to detect and measure the similarity of contiguous phenomena that depend on the comparison between the value of the phenomenon and the average value of the structure (statistical value). If the difference between the contiguous parameters is smaller than the difference between all the parameters, it indicates that the adjacent values are similar because of the similarity of the surrounding conditions. In this case, it can be said that there is a positive reciprocal spatial autocorrelation. However, if the values of adjacent phenomena differ, it can be said that there is a negative spatial autocorrelation, in other words, a lack of spatial autocorrelation. Moran’s Index is one of the important measures in detecting the spatial autocorrelation between the elements of the phenomenon studied and the pattern of the spread of the phenomenon is it dispersal, regular or random. The value of the Moran directory is between +1 and −1. If the directory value is close to (+1), this indicates the clustered pattern. If the value is close to (−1), this indicates the random pattern [24] . (Figure 4) The statistical analysis of the spatial autocorrelation of groundwater samples in the study area shows that the clustered correlation is shown by the Moran’s Index of increase (0.71) and Z = 11, at a significant level (0.01).

In addition, the spatial autocorrelation relationship is detected by the semivariogram/covariance cloud instrument as in Figure 5 where each red dot represents a pair of similar groundwater samples in values. Which confirms that the data in the areas close to each other tend to be more similar and the correlation

Figure 4. Spatial autocorrelation of groundwater level samples in the study area.

of those data diverged among them. Previous analyzes allow us to conclude that the data used in this study are spatially interrelated and do not contain anomalous values.

After confirming the validity of the natural distribution of groundwater data and not containing abnormal data, and before producing the spatial interpolation map of groundwater levels, it is necessary to conduct comparative and assessment of spatial interpolation methods and choose the best model for representing the groundwater level in the study area.

3.2. Comparison and Choice of Optimal Model

In this study 64 tests were conducted in order to find the nearest model representing the reality of the groundwater level in the study area. Then the best method was chosen for each of the five models: 1) inverse distance weighted (IDW), 2) radial basis functions (RBF), 3) ordinary Kriging (OK), 4) simple

Figure 5. Semivariogram cloud to groundwater level in study area.

Kriging (SK) and 5) universal Kriging (UK). Then the process of comparing the final optimal models and selecting the best model to represent the reality of the groundwater.

To compare the spatial interpolation models and the selection of the best, the cross validation technique provided by ArcGIS 10.5 was used to judge statistically the accuracy of these models in the representation of groundwater levels in the study area as shown in Figures 6-9. As a result of the statistical comparison between the five spatial interpolation models and validation of the results using (cross validation) it was found that the (UK) method is the best method to represent the level of groundwater in Salman district because this model has the lowest Root Mean Square Error (RMSE), the lowest mean error (ME), and the highest Coefficient of determination (R2) value as in Table 1 and Figure 10. This process is the first step to obtain high quality in the representation of groundwater data and produce a spatial prediction map of the groundwater level. The results also indicated that the radial basis functions (RBF) model is the lowest spatial interpolation models in the accuracy of the statistical indicators, and the other models in terms of preference in the following order: UK > IDW > OK > SK > RBF.

3.3. The Maps Groundwater Level and Prediction Standard Error

The geostatistical techniques allowed to find the best spatial interpolation method and to produce groundwater level map and the prediction standard error

Figure 6. Cross validation comparison universal Kriging (UK) with (IDW) model.

Figure 7. Cross validation comparison universal Kriging (UK) with (RBF) model.

Figure 8. Cross validation comparison universal Kriging (UK) with (OK) model.

Figure 9. Cross validation comparison universal Kriging (UK) with (SK) model.

Table 1. Comparison of optimal model of each method [28] .

Figure 10. Statistical accuracy indicators RMSE, ME, and R2 for every methods optimal model.

map. Figure 11 shows the map of Flow direction and groundwater level estimated by universal Kriging model. Its analysis appears that groundwater flow direction occurs generally from south and southwest to the north and northeast in study area. As well as, this hydrological map provides important information on the hydraulic gradient in all parts of the study area. Where groundwater levels vary between (283.4) meter above sea level in the south-west and (5.4) meter above sea level in the far north-east of the study area near the AL-Sulibat depression. The comparison of Figure 11 to Figure 12 shows several facts that can be summed up as follows:

1) Groundwater levels largely follow the surface topography in the study area, moving away from the surface and increasing in the highlands as in the southwestern part of the study area. The groundwater level approaches the surface in lowlands such as valleys and depressions.

2) The presence of a AL-Salman depression in the middle of the study area helped to approach the groundwater from the surface compared to the areas around the depression.

3) If the groundwater level intersects with the surface of the earth, the groundwater appears in the form of perfusion or springs as it is in the far north of the study area or near the AL-Sulibat depression.

4) The hydraulic gradient of the groundwater with the topographic slope is largely consistent in the study area from the south-southwest to the north and north-east.

5) The hydraulic gradient rate was 1.411 m/km, while the topographic slope

Figure 11. Map of groundwater level by using optimal chosen model.

Figure 12. Topography of the study area [7] .

was 2.248 m/km.

Figure 13 shows the spatial distribution of uncertainties to spatial predictions in groundwater level in study area as the uncertainty varies between (1.3 - 2.2) meters in the near areas of the measured point, while this error gradually increases to reach (11.79 - 18.87) meters in areas far from measured points.

These maps can be used as a basis in finding the best locations for drilling wells in the study area, through subtraction groundwater surface layer from digital elevation model by using ArcGIS and produce a map showing the depth of the groundwater, can be utilized in the planning, development and decision-making processes of the actors in Iraq.

4. Conclusions

The new thing that was discussed and reached in this research is to determine the best models of spatial interpolation in the representation of groundwater levels in the Salman area, after conducting comparisons and evaluation according

Figure 13. Prediction standard error map.

to several statistical criteria and then producing a digital map describing the spatial distribution of these levels and production prediction standard error map, where geostatistical techniques were applied in surfaces modeling and comparison in terms of accuracy. The study showed that the universal Kriging (UK) method is the best way to represent the groundwater level in the study area, because it gave the least root mean square error (RMSE) 10.64, mean error (ME) 5.36 and the highest coefficient of determination (R2) 0.98 between the measured values and the predicted values by other spatial interpolation models. This required more than 64 tests of inevitable and statistical interpolation methods. Cross-validation was used to compare and evaluate the accuracy of each model, as it was an effective tool in comparison and assessment that allowed access to the most accurate model to represent the groundwater level with the least time and effort. The study reveals that comparing the spatial interpolation models and evaluating their accuracy, through some statistical indicators and cross-validation is the best way to choose the optimal model for the representation of data entered in any site.

This optimal spatial interpolation method has generated groundwater levels map in study area and as well as the uncertainty map associated with her it. Use of these maps help to detect the surface of the groundwater or piezometric level and the direction of the hydraulic gradient in the study area can be utilized in the planning, development and decision-making processes of the actors in Iraq.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

Cite this paper
Njeban, H. (2018) Comparison and Evaluation of GIS-Based Spatial Interpolation Methods for Estimation Groundwater Level in AL-Salman District—Southwest Iraq. Journal of Geographic Information System, 10, 362-380. doi: 10.4236/jgis.2018.104019.
[1]   Simonovic, S.P. (2002) World Water Dynamics: Global Modeling of Water Resources. Journal of Environmental Management, 66, 249-267.

[2]   Smedema, L.K. and Shiati, K. (2002) Irrigation and Salinity: A Perspective Review of the Salinity Hazards of Irrigation Development in the Arid Zone. Irrigation and Drainage Systems, 16, 161-174.

[3]   Nayak, P.C., Rao, Y.S. and Sudheer, K.P. (2006) Groundwater Level Forecasting in a Shallow Aquifer Using Artificial Neural Network Approach. Water Resources Management, 20, 77-90.

[4]   Reghunath, R., Murthy, T.S. and Raghavan, B.R. (2005) Time Series Analysis to Monitor and Assess Water Resources: A Moving Average Approach. Environmental Monitoring and Assessment, 109, 65-72.

[5]   Azpurua, M.A. and Ramos, K.D. (2010) A Comparison of Spatial Interpolation Methods for Estimation of Average Electromagnetic Field Magnitude. Progress in Electromagnetics Research, 14, 135-145.

[6]   Jassim, S.Z. and Goff, J.C. (2006) Geology of Iraq. Geological Society of London, London.

[7]   Digital Elevation Model (DEM) with Resolution 30 m Download from Website of Aster GDEM.

[8]   Partow, H., Witt, R. Fosnight, G. and Singh, A. (2004) The Mesopotamian Marshlands: Demise of an Ecosystem. UNEP, DEWA/GRID-Geneva in Cooperation with GRID-Sioux Falls and the Regional Office for West Asia (ROWA).

[9]   Ministry of Water Resources in Iraq (2016) General Commission for Groundwater. Unpublished Data.

[10]   Esri, HERE, Garmin, USGS, Intermap, INCREMENTP, NRCan.

[11]   Kevin, J., Ver Hoef, J.M., Krivoruchko, K. and Lucas, N. (2003) Using ArcGIS Geostatistical Analyst. Environmental Systems Research Institute, Redlands.

[12]   Anderson, S. (2002) An Evaluation of Spatial Interpolation Methods on Air Temperature in Phoenix, AZ. Department of Geography, Arizona State University, Tempe.

[13]   Collins, F.C. (1995) A Comparison of Spatial Interpolation Techniques in Temperature Estimation. Doctoral Dissertation, Virginia Tech.

[14]   Balakrishnan, P., Saleem, A. and Mallikarjun, N.D. (2011) Groundwater Quality Mapping Using Geographic Information System (GIS): A Case Study of Gulbarga City, Karnataka, India. African Journal of Environmental Science and Technology, 5, 1069-1084.

[15]   Schloeder, C.A., Zimmerman, N.E. and Jacobs, M.J. (2001) Comparison of Methods for Interpolating Soil Properties Using Limited Data. Soil Science Society of America Journal, 65, 470-479.

[16]   Poshtmasari, H.K., Sarvestani, Z.T., Kamkar, B., Shataei, S. and Sadeghi, S. (2012) Comparison of Interpolation Methods for Estimating pH and EC in Agricultural Fields of Golestan Province (North of Iran). International Journal of Agriculture and Crop Sciences, 4, 157-167.

[17]   Webster, R. and Oliver, M.A. (2007) Geostatistics for Environmental Scientists. John Wiley & Sons, Hoboken.

[18]   Gundogdu, K.S. and Guney, I. (2007) Spatial Analyses of Groundwater Levels Using Universal Kriging. Journal of Earth System Science, 116, 49-55.

[19]   Uyan, M. and Cay, T. (2010) Geostatistical Methods for Mapping Groundwater Nitrate Concentrations. 3rd International Conference on Cartography and GIS, Nessebar, 15-20 June 2010, Vol. 1520, 732-741.

[20]   Siska, P.P. and Hung, I.K. (2001) Assessment of Kriging Accuracy in the GIS Environment. 21st Annual ESRI International Conference, San Diego, 378-386.

[21]   Samin, M., et al. (2012) Spatial Estimation of Groundwater Quality Parameters Based on Water Salinity Data Using Kriging and Cokriging Methods. Proceedings of International Conference on Transport, Environment and Civil Engineering, Kuala Lumpur, 25-26 August 2012, 1294-1311.

[22]   Mitásová, H. and Mitás, L. (1993) Interpolation by Regularized Spline with Tension: I. Theory and Implementation. Mathematical Geology, 25, 641-655.

[23]   Shahid, S.U., Iqbal, J. and Hasnain, G. (2014) Groundwater Quality Assessment and Its Correlation with Gastroenteritis Using GIS: A Case Study of Rawal Town, Rawalpindi, Pakistan. Environmental Monitoring and Assessment, 186, 7525-7537.

[24]   Kamińska., A. and Grzywna., A. (2014) Comparison of Deterministic Interpolation Methods for Groundwater Level. Journal of Ecological Engineering, 15, 55-60.

[25]   Khazaz, L., Oulidi, H.J., El Moutaki, S. and Ghafiri, A. (2015) Comparing and Evaluating Probabilistic and Deterministic Spatial Interpolation Methods for Groundwater Level of Haouz in Morocco. Journal of Geographic Information System, 7, 631.

[26]   Mardia, K.V. (1974) Applications of Some Measures of Multivariate Skewness and Kurtosis in Testing Normality and Robustness Studies. Sankhyā: The Indian Journal of Statistics, Series B, 36, 115-128.

[27]   Joanes, D.N. and Gill, C.A. (1998) Comparing Measures of Sample Skewness and Kurtosis. Journal of the Royal Statistical Society: Series D (The Statistician), 47, 183-189.

[28]   Based on Data Measured by Ministry of Water Resources in Iraq (2016) General Authority for Drilling of Groundwater Wells.