Accelerated urbanization is disrupting water quality in rivers and streams across watersheds and regions. This is partly due to city planning that pays insufficient attention to environmental sustainability, which can lead to serious deterioration of watershed- and river-based environmental and ecological services. Beyond noting elevated water pollution and reduced biological viability in natural waterways, surface-water quality evaluations are needed to provide quantitative information for sustainable water-resource management. Since these evaluations need to address a host of sampled water quality indicators, it is necessary to determine, in a predictive manner, how these indicators are patterned and vary across space and time, as advocated in previous studies. The results so obtained can then be transformed into useful decision-support tools to restore and protect water quality from local to regional scales. This article focuses on assessing water quality changes along a 30.4 km stretch of Rio de la Sabana as it flows east of Acapulco into the coastal Tres Palos lagoon (Figure 1). During the last two decades, the mid-low part of this sub-basin has experienced accelerated population growth, which has led:
1) to serious water pollution and environmental degradation of the river and of the coastal lagoons at Tres Palos and Puerto Marques;
2) to increased vulnerability to health risks across the new and flood-prone de la Sabana Valley settlements.
For these reasons, the Mexican government, with support from the Spanish government, initiated a potable water supply and wastewater sanitation project in 2012 to improve the quality of life and to promote social equity and environmental sustainability across the de la Sabana Valley. As such, the Rio de la Sabana-Tres Palos Lagoon area is now part of the hydrological priority region for preserving the biological importance of the coastal and marine Mitla-Chautengo Lagoon System. To understand and quantify how water quality varies along Rio de la Sabana from its headwaters to the coastal lagoon, dry and rainy season water samples were taken from seven locations spaced 3 to 6.5 km apart. These locations range from sparsely to densely populated areas. The samples were analyzed for 14 physicochemical and biological parameters (temperature, pH, electrical conductivity, dissolved oxygen, biochemical and chemical oxygen demand, ammonium, nitrate, nitrite, sulphate, phosphate, methylene blue active substances, total coliforms, fecal coliforms).
Figure 1. Locator map centered on the Rio de la Sabana-Tres Palos Lagoon sub-basin, north to east within the Acapulco Municipality in southwest Mexico, also showing the seven sampling locations along the river, with a development index from low to high, inversely related to the normalized difference vegetation index (NDVI).
2. Materials and Methods
Rio de la Sabana begins north of Acapulco, with its headwater channels reaching up to about 2000 meters above sea level. Its main flow channel is approximately 57 kilometers long, ending at its confluence with the Tres Palos coastal lagoon. The sub-basin area amounts to 466.3 km² with combined permanent flow channel lengths of 727 km, giving a drainage density of 1.55 km⁻¹. The upper part of the sub-basin comprises steep, sparsely populated terrain with many short flow channels fed by short-duration runoff. In contrast, the lower part widens into a broad floodplain, of which the eastern part is now densely populated.
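The drainage density quoted above is the combined channel length divided by the sub-basin area. The short sketch below reproduces that arithmetic from the two reported values; the function and variable names are illustrative only. The result is close to the reported 1.55 km⁻¹, with any small difference attributable to rounding of the inputs.

```python
# Values reported in the text for the Rio de la Sabana sub-basin.
channel_length_km = 727.0   # combined permanent flow channel length
basin_area_km2 = 466.3      # sub-basin area


def drainage_density(length_km: float, area_km2: float) -> float:
    """Return drainage density in km of channel per km^2 of basin (km^-1)."""
    if area_km2 <= 0:
        raise ValueError("basin area must be positive")
    return length_km / area_km2


dd = drainage_density(channel_length_km, basin_area_km2)
print(f"drainage density = {dd:.2f} km^-1")
```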
Water quality sampling was conducted in 2017 during the dry season (January) and rainy season (July) at 7 sites (Table 1) from the headwater to the mouth of Rio de la Sabana at the Tres Palos lagoon (Figure 1). These locations were selected based on Google Earth images and mapped land use, vegetation cover, hydrology and population density. Pre-sampling involved 12 sites located in areas with similar characteristics (land inclination, substrate type, water flow rate, and degree of shading) to analyze 7 physicochemical parameters (temperature, pH, electrical conductivity, dissolved suspended solids, residual chlorine, iron and magnesium). The resulting spatial and physicochemical analyses enabled the selection of 3 sites in the upper, sparsely populated section (reference zone) and 4 sites in the densely populated mid to low river sections (Figure 3). Within the impacted zone, there are numerous pollution sources arising from solid waste dumpsites and industrial and household wastewater discharge.
Table 1. Location of study sites for water-quality characterization.
Parameter monitoring and analytical methods.
Fourteen physicochemical, biochemical and bacteriological water quality parameters (Table 2) were measured in water samples from each of the study sites, following established protocols. Temperature, pH and electrical conductivity were measured in situ. The other parameters were analyzed at a laboratory accredited by the Mexican Organization of Accreditation. Temperature, pH, electrical conductivity, ammonium, nitrate, nitrite, methylene blue active substances, sulphate and phosphate were all measured in triplicate. Dissolved oxygen, biochemical oxygen demand, chemical oxygen demand, and total and fecal coliforms were determined once using certified testing procedures. The results were organized in a data matrix (sites × parameters), where the rows refer to the sampling locations and the columns refer to the analyzed parameters (Table 2). These results were subsequently plotted by sampling location and sampling season. The ordering among sample concentrations, locations and seasons was examined by way of cluster analysis (CA). A correlation matrix was established to determine how the parameters relate to each other, to sampling location, and to season by way of principal component analysis. This was followed by non-linear and linear least-squares multiple regression analyses to determine how each parameter varied quantitatively by sampling location and season. The non-linear analysis was based on Equation (1) and the linear multiple regression analysis on Equation (2), with y representing any of the 14 parameters, x representing the sampling locations S1, S2, …, S7, aL and aNL denoting the intercepts, and bL, cL, bNL, cNL and dNL denoting the regression coefficients. The resulting goodness of fit, indicated by the coefficient of determination (R²) and the root mean square of the residuals (RMSE), was improved for some of the parameters through log transformation. Equation (1) was applied to generate the best-fitted lines for each parameter by season, with sampling locations coded 1, 2, 3, 4, 5, 6, 7 according to their original order. Equation (2) was used to test the statistical significance of each parameter by location and season.
Table 2. Physicochemical and bacteriological parameters, units and methods used.
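As an illustration of the linear step, the sketch below fits a least-squares model with an intercept, a location term and a season term, and reports R² and RMSE. The additive form y = a + b·location + c·season is an assumption made here for illustration, since the exact form of Equation (2) is not reproduced in this text. The data are synthetic, and all function names are hypothetical.

```python
import math


def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def fit_location_season(location, season, y):
    """Least-squares fit of the assumed form y = a + b*location + c*season."""
    X = [[1.0, float(x), float(s)] for x, s in zip(location, season)]
    k = 3
    # Normal equations: (X'X) coef = X'y
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    coef = solve_linear_system(XtX, Xty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in X]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(y))
    return coef, r2, rmse


# Synthetic example (not the study data): 7 location codes x 2 seasons.
location = [1, 2, 3, 4, 5, 6, 7] * 2
season = [0] * 7 + [1] * 7          # 0 = rainy, 1 = dry
y = [1.0 + 0.5 * x + 2.0 * s for x, s in zip(location, season)]

coef, r2, rmse = fit_location_season(location, season, y)
print("a, b, c =", [round(c, 3) for c in coef])
print("R2 =", round(r2, 3), "RMSE =", round(rmse, 3))
```

For parameters that were log-transformed, the same routine would be applied to log10(y) rather than y.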
3. Results and Discussion
The data for the 14 water quality parameters are listed in Table 3 and are plotted in Figure 2 by sampling location and season, with the best-fitted Equation (1) results overlaid. As tabulated and shown, most parameters except pH and DO increased from S1 (representing the least populated area) towards the populated areas represented by S5 and S6, and dropped again at S7. In addition, most parameters increased from the rainy to the dry season due to lack of dilution, except that 1) DO decreased slightly (the rainy season having higher flow rates and better aeration), though not significantly so, likely due to low sampling density, and 2) water temperature remained more or less the same. The parameters that increased the most at S5 and S6 from the rainy to the dry season were EC, BOD5, COD, , , TC and FC. There were two notable outliers for pH during the dry sampling season at S4 (pH = 8.86) and S6 (pH = 9.4), somewhat parallel to high effluent concentrations pertaining to NH3, , MBAS, TC and FC. Also notable is the steep decline of DO at S5 and S6 for both the dry and rainy sampling seasons. This likely relates to elevated BOD and COD discharge at these locations, as also reported in earlier studies.
Table 3. Sampling results, by sampling location and season.
Figure 2. Scatter plots and best-fitted lines (Equation (1)) for each of the 14 parameters listed in Table 3, by sampling location (S1, S2, …, S7 along the x-axis) and by pollution zone, colour-differentiated for each parameter from left to right by pollution level: nearly pristine (Z1), transitional (Z2), most polluted (Z3), and somewhat diluted (Z4).
The best-fitted Equation (1) results for the parameters with non-linear S1 to S7 trends are listed in Table 4. The similarities among these trends are such that the cNL and dNL coefficients could be kept in common without significant R2 and RMSE differences. In detail:
1) aNL refers to the headwater values for each water parameter by season;
2) bNL quantifies the pollution extent for each water parameter by season;
3) cNL = 5.58 indicates that the maximum levels are associated with the S5 and S6 locations;
4) dNL = 2.60 quantifies the spatial pollution extent across the S1 to S7 sampling locations.
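The roles of the four coefficients can be visualized with a peak-shaped curve. The sketch below assumes a Gaussian-type form, a + b·exp(−((x − c)/d)²), purely for illustration; the actual form of Equation (1) is not reproduced in this text and may differ. The values of cNL and dNL are those reported above, whereas the a and b values are hypothetical.

```python
import math

C_NL = 5.58   # reported common peak position (between S5 and S6)
D_NL = 2.60   # reported common spatial-extent coefficient


def peak_curve(x, a_nl, b_nl, c_nl=C_NL, d_nl=D_NL):
    """Assumed Gaussian-shaped trend: a + b * exp(-((x - c) / d)**2)."""
    return a_nl + b_nl * math.exp(-((x - c_nl) / d_nl) ** 2)


# Hypothetical headwater level (a) and pollution extent (b), log10 scale.
a_nl, b_nl = 1.0, 2.0

# Sampling locations coded 1..7 in their original order (S1..S7).
profile = {f"S{i}": peak_curve(i, a_nl, b_nl) for i in range(1, 8)}
peak_site = max(profile, key=profile.get)

for site, value in profile.items():
    print(site, round(value, 2))
print("highest fitted value at", peak_site)
```

Under this assumed form, the fitted values stay near a at the headwaters, approach a + b near x = cNL, and decline again towards S7, which matches the qualitative pattern described for the non-linear trends.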
The results in Table 4 are used to estimate maximum and minimum water pollution levels along the river transect for the dry and rainy seasons (Table 5). In turn, the ratios of these numbers can be interpreted as spatial and seasonal pollution indicators. For example, the transition from the rainy to the dry season led to min/min and max/max fecal count multiplication factors of about 7600 at the headwater region and 11,500 at the maximum pollution locations. For the rainy and the dry season, the down-river fecal pollution count increased by a max/min multiplication factor of 60 and 90, respectively. For the other parameters, down-river changes in pollution were most severe for NH3, , , and MBAS, with max/min pollution effects stronger for and during the rainy season, and stronger for MBAS, NH3 and during the dry season.
Table 4. Best-fitted non-linear regression analysis results (Equation (1)); pH and DO not included.
cNL = 5.58 ± 0.23; dNL = 2.60 ± 0.49.
Table 5. Minimum and maximum pollution levels and the corresponding ratios for the 12 parameters listed in Table 4, by location and by season (pH and DO not included).
a min: aNL for T(˚C) and 10^aNL for all other parameters. b max: aNL + bNL for T(˚C) and 10^(aNL + bNL) for all other parameters.
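The minimum and maximum levels follow directly from the Table 5 footnote: for log-transformed parameters the minimum is 10^aNL and the maximum is 10^(aNL + bNL), so the max/min ratio reduces to 10^bNL. The sketch below shows this back-transformation. The coefficient values are hypothetical and chosen only so that the ratio equals the rainy-season fecal coliform factor of 60 quoted in the text.

```python
import math


def min_max_levels(a_nl, b_nl, log_transformed=True):
    """Return (minimum, maximum) levels from the fitted a and b coefficients.

    For log10-transformed parameters: min = 10**a, max = 10**(a + b).
    For untransformed parameters such as temperature: min = a, max = a + b.
    """
    if log_transformed:
        return 10 ** a_nl, 10 ** (a_nl + b_nl)
    return a_nl, a_nl + b_nl


# Hypothetical coefficients: b is set so that max/min equals 60.
a_nl = 2.0
b_nl = math.log10(60)

low, high = min_max_levels(a_nl, b_nl)
ratio = high / low
print(f"min = {low:.0f}, max = {high:.0f}, max/min = {ratio:.0f}")
```

Because the ratio depends only on bNL, it can be compared between seasons and parameters without reference to the headwater level aNL.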
The seasonal NO3− to NO2− and NH3 differences were likely related to the increasing extent to which added NO3− is converted to NO2− and NH3 as the river flow rate drops from the rainy to the dry season. In this regard, it can be determined from Table 3 that:
1) the total N concentrations within the river water (NO3−, NO2− and NH3 combined) decreased on average from the dry to the rainy season by a factor of 3.5, likely due to dilution;
2) at S4, S5, S6, and S7, the combined NO2− and NH3 concentrations amounted to 60% of total N during the dry season, and about 40% during the wet season (mostly NH3 only);
3) the least amount of water-dissolved N in the form of NO2− and NH3 occurred at S1, S2, and S3 during the dry season (<10%), and increased to about 20% during the wet season.
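The percentages above are shares of total dissolved inorganic N. A minimal sketch of that calculation follows, assuming that the two reduced species are nitrite and ammonia and that all three concentrations are expressed on a common N basis (for example mg N/L) before being summed. The input values are hypothetical and are not taken from Table 3.

```python
def reduced_n_share(nitrate_n, nitrite_n, ammonia_n):
    """Fraction of total inorganic N present as nitrite plus ammonia.

    All inputs must be on the same N basis (e.g. mg N/L); this is an
    assumption of the sketch, not a statement about the source data.
    """
    total_n = nitrate_n + nitrite_n + ammonia_n
    if total_n <= 0:
        raise ValueError("total N must be positive")
    return (nitrite_n + ammonia_n) / total_n


# Hypothetical dry-season sample from the lower river reach.
share = reduced_n_share(nitrate_n=4.0, nitrite_n=1.0, ammonia_n=5.0)
print(f"reduced N share = {share:.0%}")
```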
Cluster analysis. The CA-generated dendrograms are displayed by dry and rainy sampling season in Figure 3, which also presents the normalized values of the parameters listed in Table 2. These dendrograms show that locations S1, S2, S3 and S4 are grouped within Cluster A, thereby representing the upper part of the Rio de la Sabana watershed where industrial to residential discharge rates into the river are still low. Within this cluster, pollution levels increased in the order S1 < S2 < S3 < S4. The remaining locations (S5, S6 and S7) are grouped within Cluster B, where water quality:
1) recovered somewhat for both seasons after passing through the S5 and S6 locations, presumably due to the influx of less contaminated floodplain water from the eastern, less inhabited part of the Rio de la Sabana watershed;
2) was worst at S5 during the rainy season, where pollutant inputs are likely highest due to industrial and residential surface run-off;
3) was worst at S6 during the dry season, mainly due to accumulating up-river sewage discharge during low river flow rates.
Figure 3. Zoning the physicochemical and bacteriological sampling results of Rio de la Sabana by way of hierarchical cluster dendrograms. Also shown: normalized parameter scores overlaid on a value-coded background (from blue to yellow to dark red) by sampling season. Top: dry season; bottom: rainy season. Note the seasonal reversal of the S5 and S6 locations.
Based on location similarities, Clusters A and B were divided into four zones:
1) The Reference zone (S1, S2): located in the sub-basin’s upper part, meeting national and international standards established for aquatic life protection.
2) The Transition zone (S3, S4): a peri-urban zone located between the sub-basin’s upper and mid-low parts, where all parameters except pH and DO rose from S3 to S4 above their values at S1 and S2.
3) The Pollution zone (S5, S6): all parameters except pH and DO were well above their average values and above national and international standards for aquatic life protection.
4) The Recovery zone (S7).
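The grouping of sites into clusters can be sketched with a small agglomerative procedure. The example below standardizes each parameter and then merges sites by average linkage on Euclidean distance. The linkage rule and distance measure are assumptions made for illustration, since the settings used for Figure 3 are not stated in this text, and the three-parameter data set is hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column (parameter) to zero mean and unit variance."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in cols]
    return [[(v - m) / s if s > 0 else 0.0 for v, (m, s) in zip(row, stats)]
            for row in rows]


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(points, labels, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def dist(c1, c2):
        # Average pairwise distance between members of two clusters.
        return mean(euclidean(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        pairs = [(dist(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(labels[k] for k in c) for c in clusters]


# Hypothetical site-by-parameter matrix (columns: EC, BOD5, COD).
sites = ["S1", "S2", "S3", "S4", "S5", "S6", "S7"]
data = [
    [100, 2, 10],
    [120, 3, 12],
    [180, 5, 20],
    [250, 8, 30],
    [900, 40, 150],
    [950, 45, 160],
    [600, 25, 100],
]

clusters = average_linkage(zscore_columns(data), sites, n_clusters=2)
print(clusters)
```

Standardizing first keeps parameters with large numeric ranges, such as EC or coliform counts, from dominating the distance measure.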
Correlation matrix. All parameters other than DO and pH were highly positively correlated with one another during the rainy and the dry season, as shown in Table 6. Excluding the high pH values at S4 (pH = 8.86) and S6 (pH = 9.40) yielded a positive correlation between pH and DO for both seasons, with both remaining negatively related to all the other parameters. The generally gradual pH decline from S1 to S7 may be due to a greater presence of dissolved organic acids in response to a transition from upper-reach groundwater seepage to lower-reach surface run-off. The concurrent decrease in DO would be due to increased chemical and biological oxygen demand caused by oxygen-consuming, fecal-matter-containing effluents arriving at the S4 to S7 sampling locations. This would also include the discharge of surfactants (represented by MBAS) and detergents (NMX-AA-039-SCFI-2001). Consequently, effluent-associated parameters such as EC, BOD5, COD, NH3, , MBAS, and were all highly correlated to one another for both seasons. Their corresponding dry-to-rainy season reductions, as plotted in Figure 2, are most likely due to increased dilution at higher water flow.
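The entries of such a correlation matrix are pairwise Pearson coefficients. The sketch below computes them for three parameters using hypothetical values chosen to mimic the reported pattern: effluent-related parameters rising together while DO falls. The function names are illustrative, and the numbers are not taken from Table 3.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict of pairwise Pearson coefficients."""
    names = list(columns)
    return {p: {q: pearson_r(columns[p], columns[q]) for q in names}
            for p in names}


# Hypothetical values for sites S1..S7 (one season).
columns = {
    "EC":   [100, 120, 180, 250, 900, 950, 600],
    "BOD5": [2, 3, 5, 8, 40, 45, 25],
    "DO":   [8.0, 7.8, 7.5, 6.9, 2.1, 1.8, 4.5],
}

matrix = correlation_matrix(columns)
for p, row in matrix.items():
    print(p, {q: round(r, 2) for q, r in row.items()})
```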
Analyzing the parameter correlation matrix with sampling location and season as two additional variables produced the Factor 2 versus Factor 1 plot in Figure 4 by way of principal component analysis (PCA). Here, Factor 1 refers to the S1, S2, S3, S4, S7, S5, S6 sampling sequence, while Factor 2 refers to the rainy (coded 0) versus the dry (coded 1) season. In Figure 4, the parameters plotting towards the right are most positively related to sampling location (Factor 1), while the parameters plotting towards the top are most positively related to season (Factor 2). Both DO and pH were negatively influenced by sampling location (Factor 1), with pH positively related to the transition from the rainy to the dry season, while DO was not much influenced by this transition. The and T(˚C) parameters are shown to be closely and positively related to sampling location but not to sampling season.
Table 6. Correlation matrix for the 14 parameters in Table 2.
Figure 4. Scatter plot of Factor 2, mostly associated with dry versus rainy season sampling, versus Factor 1, mostly associated with sampling location sequenced from left to right as follows: S1, S2, S3, S4, S7, S5, S6.
Multiple regression analysis. The best-fitted and most significant intercepts (aL) and regression coefficients (bL, cL) for Equation (2) are listed in Table 7 for all 14 parameters (p < 0.1). Also entered are the corresponding R² and RMSE values, to indicate the extent of variation captured and the associated regression error for each parameter. As shown, all parameters were significantly related to sampling location. Only three parameters, namely T(˚C), and DO, were not significantly affected by season. For DO, the increase from the dry to the rainy season was not significant at p < 0.1, even after excluding the low DO results at S4 and S5 (dry season) and S5 (rainy season). For pH, the seasonal effect became significant after excluding the high dry-season pH values at S4 and S6 (pH = 8.86 and 9.4) from the analysis.
Altogether, the results in Table 7 confirm that the location-specific and season-specific effects on water pollution are highly significant and can now in part be interpolated for any location along Rio de la Sabana. This can be done by way of the best-fitted Equation (1) and Equation (2) regression results, and by ranking specific locations using existing conditions at S1, S2, …, S7 as a guide. To this effect, locations along the Tres Palos lagoon may eventually experience pollution levels similar to the S5 and S6 locations. However, improved pollution mitigation may lower the pollutant levels at the S4 to S7 locations through, e.g., biological denitrification, sulphate and phosphate removal by way of chemical and biological means, and effluent sterilization.
Table 7. Best-fitted Equation (2) regression results for each parameter listed in Table 3, with sampling location, season and Zone 4 (S7) as independent regression variables.
a Location: S1, S2, S3, S4, S7, S5, S6 coded 1, 2, 3, 4, 5, 6, 7 (note the change in order for S5, S6 and S7). b Seasons coded 0 (rainy), 1 (dry); NS: not significant. c pH outliers excluded.
While DO is negatively related to all the other water quality parameters as well as location, its overall variations are best captured by way of the following multivariate regression equation, and as plotted in Figure 5:
DO = (0.84 ± 0.03) pH − (0.0059 ± 0.0005) EC; R² = 0.923; RMSE = 0.59. (3)
In principle, this equation reflects the association of higher DO and pH and lower EC values in the upper river reach compared to the lower river reach. The lack of an inverse relationship between DO, BOD5 and COD is likely due to other DO-influencing factors such as the diurnal oxygen release from photosynthesizing plants and algae within the river.
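Equation (3) can be evaluated directly for any pH and EC pair. The sketch below uses the central coefficient estimates only and ignores their stated uncertainties. The input values are hypothetical, and the units (EC in µS/cm, DO in mg/L) are assumed here rather than taken from the text.

```python
def predict_do(ph, ec):
    """Equation (3) with central estimates: DO = 0.84*pH - 0.0059*EC."""
    return 0.84 * ph - 0.0059 * ec


# Hypothetical upper-reach and lower-reach conditions.
upstream = predict_do(ph=7.5, ec=200)
downstream = predict_do(ph=7.0, ec=900)

print(f"upstream DO estimate:   {upstream:.2f}")
print(f"downstream DO estimate: {downstream:.2f}")
```

With RMSE = 0.59, individual estimates of this kind carry an error of roughly that size, so small predicted values near the polluted sites should be read as indicating low DO rather than as precise concentrations.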
Figure 5. Plotting actual versus best-fitted S1 to S7 DO values generated with Equation (3). Differences by season are marked by dot outline: none for rainy season; black for dry season.
During the last decades, increased development on the east and west sides of the Rio de la Sabana floodplain has led to extensive water pollution, with higher pollution levels registered for the dry than for the rainy season. This study marks the S5 and S6 locations as currently the most polluted locations. Upriver, the S4 location may also become increasingly vulnerable to pollution. Downriver, there was a slight reduction in water pollution, likely due to dilution caused by water seepage and run-off from the as yet fairly undeveloped low-lying areas on the east and northern side of the river basin.
The analysis of this sampling effort revealed that all 14 water quality parameters were significantly related to sampling locations where current settlement densities and consequently effluent discharge rates would be highest. By induction, the above approach could prove useful: 1) in application at other settlement-affected locations, especially those along stream and effluent discharges towards the coastal Tres Palos and Puerto Marques lagoons, and 2) in encouraging initiatives towards intensifying wastewater treatment along Rio de la Sabana developments and elsewhere.
The work was supported by the Consejo Nacional de Ciencia y Tecnología (CONACYT; National Scholarships for Quality Postgraduate Programs), with field sampling and laboratory analyses done at Ingeniería en los Sistemas de Tratamientos de Aguas S.A. de C.V., and Protección Civil del Municipio de Acapulco.
Conflict of Interests
The authors have declared that no conflict of interests exists.