Urbanization occurs due to the migration of people from rural areas to cities and the increase in population. People migrate to cities in search of employment opportunities, better infrastructure facilities, lifestyle changes and so on. Unplanned urbanization leads to uneven distribution of natural resources, thereby affecting the quality of human life. Urban growth models (UGMs) have therefore become essential, so that the future urban growth pattern of a city can be predicted from its past and current scenarios, following the Markov chain principle. Recently, the availability of high resolution temporal satellite data, along with advancements in GIS based spatial data modeling techniques, has made urban growth predictions more realistic.
Cellular Automata (CA) based UGMs are widely used in the prediction of urban growth. CA models can handle spatio-temporal datasets and model urban growth effectively. A typical CA model consists of five elements: Cell Space, Cell State, Cell Neighbourhood, Transition Rule and Time. UGMs based on CA have been reported to predict urban growth with higher accuracy than other mathematical models. SLEUTH, a UGM based on CA, is also widely used to model urbanization and has been implemented to predict the urban growth of the city of Mashad, Iran. That study implemented the urban growth model with transportation data and highlighted the efficiency of the SLEUTH model when other socioeconomic data with high temporal accuracy were not available. SLEUTH was also used to predict the urban growth of Matara city, Sri Lanka. It was found that, out of 66 Grama Niladari Divisions (GNDs), 29 GNDs would be urbanized in 2030. These prediction results would help urban planners to devise further urban planning policies for Sri Lankan cities.
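The five CA elements named above can be made concrete with a minimal sketch. It assumes a binary cell state (1 = urban, 0 = non-urban), a 3 × 3 neighbourhood and a hypothetical transition rule in which a non-urban cell becomes urban once at least three of its neighbours are urban. The function names and the toy grid are illustrative assumptions and do not reproduce SLEUTH or any of the cited models.

```python
from typing import List

Grid = List[List[int]]  # cell space; each cell state is 1 (urban) or 0 (non-urban)


def urban_neighbours(grid: Grid, r: int, c: int) -> int:
    """Count urban cells in the 3 x 3 neighbourhood of (r, c), excluding the cell itself."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                total += grid[rr][cc]
    return total


def step(grid: Grid, threshold: int = 3) -> Grid:
    """Apply the (assumed) transition rule once, i.e., advance one time step.

    Urban cells stay urban; a non-urban cell turns urban when it has
    at least `threshold` urban neighbours.
    """
    return [
        [1 if cell == 1 or urban_neighbours(grid, r, c) >= threshold else 0
         for c, cell in enumerate(row)]
        for r, row in enumerate(grid)
    ]


# Toy cell space at time t1 (hypothetical data).
grid_t1 = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
grid_t2 = step(grid_t1)
```

In this toy grid only the cell at row 2, column 2 has three urban neighbours, so it is the only cell that changes state between t1 and t2.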
For an effective UGM, along with the temporal land use dataset, various agents of urbanization should also be involved in the modeling process. CA models along with agents of urbanization including population, transportation were used to model the urban growth of Bindura district, Zimbabwe  . Apart from the land cover maps of the study region, agents of urbanization including population density, elevation, distance to town center and rivers were used to predict the urbanization of the district in 2030. The study revealed that an efficient urban planning policy is needed for the study region as the future urbanization would affect the sustainable development of the district. Agents based CA model was used to predict the urban growth of Pearl River Delta region, one of the fastest growing regions of China  . The study implemented the urban prediction for the year 2052 using the land cover maps and agents of urbanization including population, elevation, transportation network data and the Master plan of the region. The prediction outputs can be employed to identify the potential areas of urbanization, which would be helpful for the urban planners to plan the future urban development of the region.
Nowadays, machine learning algorithms such as neural networks are integrated with CA and GIS to predict the urban growth of a city. Neural Network integrated CA (NN-CA) model overcomes the uncertainty of transition rule determination found associated with the traditional CA models  . Further, NN-CA model has the ability to model complex, non-linear, spatio-temporal data effectively through its adaptive learning process. NN-CA based urban prediction of 2019 was implemented for Dhaka, Bangladesh  . The study had revealed that 58% of the study area would be urbanized by 2019 while 46% of the study region had been reported as urbanized in 2009. NN-CA based urban model is also used to predict the urbanization of European cross-border region  . The study used industrial, commercial and transportation data along with the land cover maps of the study region to predict the urbanization of the year 2000. The study shows the efficiency of NN-CA model in handling big data and predicting the urban growth in cross border regions.
CA models based on deep belief or belief theory (DB-CA) are also being used for urban prediction. Unlike neural networks, DB-CA models calculate the conditional dependencies between the input datasets, i.e., the urban or land cover maps of t1 and t2. In the NN-CA model, data sampling is required for training and testing the model, and the choice of the activation function involves the user’s intervention. In the DB-CA model, however, the prior data is used as such for the prediction modeling, which is reported to give more accurate results. DB-CA based models were used for predicting the urban growth of Jiaxing City in 2015. The study showed the efficiency of the DB-CA model (k: 0.77) over the NN-CA model (k: 0.63) in predicting the urbanization. Further, urban growth prediction for 2015 was implemented using a logistic regression based CA model (LR-CA) and DB-CA for the Beijing, Tianjin and Hebei regions of China. The results show that the DB-CA model predicted the urban growth of the study area with higher accuracy (k: 0.83) than the LR-CA model (k: 0.81).
In the present study, the efficiency of NN-CA and DB-CA models on predicting the urban growth of Chennai Metropolitan Area (CMA) in the year 2017 was assessed. The model which provides higher accuracy was used to assess the influence of different neighbourhood configurations on the prediction output of 2017. Further, the distribution of urban sprawl for 2010, 2013 and 2017 of the study region was measured through Shannon’s Entropy.
2. Study Area
Chennai, the capital of the state of Tamil Nadu, is India’s fourth largest metropolitan city after Mumbai, New Delhi and Kolkata. In the current study, urban growth prediction of CMA in 2017 was implemented using NN-CA and DB-CA models. In India, each state is divided hierarchically into a number of districts, taluks and revenue villages for administrative purposes. CMA covers Chennai district and parts of Thiruvallur and Kancheepuram districts and is located on the Coromandel Coast (Figure 1). As per the Chennai Metropolitan Development Authority (CMDA), CMA comprises 176 km2 of Chennai district, 637 km2 of Thiruvallur district including Ambattur, Tiruvallur, Ponneri and Poonamallee taluks, and 376 km2 of Kancheepuram district comprising Tambaram, Sriperumbudur and Chengalpattu taluks (http://www.cmdachennai.gov.in/index.html). It extends over an area of 1189 km2. For better administration and efficient urban development planning, the Housing and Urban Development Department had planned to extend the area of CMA to 8878 km2, including the entire districts of Thiruvallur and Kancheepuram and the Arakkonam and Nemili taluks in Vellore district.
Chennai district has a coastal length of 19 km and CMA has a total coastal length of 46 km. Famous beaches in Chennai include Marina, Elliot’s or Besant Nagar, and Thiruvanmiyur beaches, of which Marina beach is the world’s second longest beach and attracts a large number of tourists every year (Figure 1). Chennai is well connected to other cities through its developed transportation networks. Chennai Central is the main terminus railway station and is adjacent to Chennai Egmore, another major railway junction. Tambaram railway station, one of the fastest growing railway hubs, is located in the southern region of the study area and has played a major role in the expansion of the urban sprawl of Chennai towards the south (Figure 1). In addition, the city commenced its Metro Rail service, a rapid transit system, in 2015. The city has an international airport located at Meenambakkam, about 14 km from the Chennai city center. Three major rivers, namely Kosasthalaiyar, Cooum and Adyar, pass through CMA, and major lakes including Sholavaram Lake, Red Hills Lake and Chembarambakkam Lake, which serve as the major sources of drinking water, are situated within the study region.
Figure 1. Study Area Map showing the expanded administrative boundaries of Chennai City.
Chennai is the most densely populated city in Tamil Nadu, with a density of 26,553 people per km2. Chennai is one of the largest commercial and industrial centers of South India as well as a cultural, economic, educational and Information Technology (IT) center. The majority of Chennai’s economy is based on the automobile sector, software services, hardware manufacturing, healthcare and financial services. With its thriving automotive industries, it is also known as the ‘Detroit of India’. Due to the boom in the industrial, automobile, electronic and entertainment sectors, the study region has experienced rapid urbanization over the past two decades (Figure 1).
3. Data and Methods
In the present study, the following datasets were used in the urban modeling of CMA.
1) Urban maps of 2010, 2013 and 2017 were derived from 15 m resolution (multispectral, PAN merged) satellite images: Landsat 7 ETM+ of 2nd June 2010 and Landsat 8 OLI of 17th May 2013 and 25th March 2017 respectively (https://earthexplorer.usgs.gov/).
2) Google Earth was used for the validation of urban maps of CMA during the study periods along with field knowledge.
3.1. Prediction Modeling
The urban growth prediction was implemented using NN-CA and DB-CA models (Figure 2) with the proximity map of existing built-up of 2013 and the neighbourhood map along with the constraints (discussed in Sections 3.2-3.4).
3.1.1. Using NN-CA Model
Multi-layer Perceptron (MLP) of Land Change Modeler (LCM), Terrset (http://clarklabs.org/terrset/land-change-modeler/), was used for the prediction of urban growth based on NN-CA for the study region in 2017. A typical MLP architecture includes one input layer, one or more hidden layers and an output layer. A number of hidden layers equal to 2n/3 is considered most appropriate in modeling complex scenarios (n: number of input layers). Here, n = 2. The urban maps of 2010 and 2013, along with the proximity map of existing built-up of 2013 and the 3 × 3 neighbourhood map, were fed into the NN-CA model, which was run for 10,000 iterations with a dynamic learning rate. An accuracy rate of 80% is considered acceptable and indicates that the model has learnt well from the input datasets. In our study, an accuracy rate of 92% was obtained.
Figure 2. Methodology adopted in the study for the prediction of urban growth of CMA using DB-CA and NN-CA models.
3.1.2. Using DB-CA Model
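In the case of the DB-CA model, the urban prediction of 2017 was implemented using Bayes’ rule. The urban growth in 2017 (t3) was predicted from the urbanization of 2013 (t2) and 2010 (t1) through the DB-CA model using Equation (1). Let $U_{t}$ denote urban pixels at time $t$ and $N_{t}$ denote urban neighbours ≥ 3 at time $t$. The conditional probability of finding urban pixels in t3 given urban neighbours ≥ 3 in t2 is given by
$$P(U_{t_3} \mid N_{t_2}) = \frac{P(N_{t_1} \mid U_{t_2})\, P(U_{t_2})}{P(N_{t_2})} \tag{1}$$
The probability of urban pixels in t2 is calculated as
$$P(U_{t_2}) = \frac{n_{U_{t_2}}}{n_{T}}$$
The conditional probability of urban neighbours ≥ 3 in t1 given urban pixels in t2 is given by
$$P(N_{t_1} \mid U_{t_2}) = \frac{n_{N_{t_1}}}{n_{U_{t_2}}}$$
The probability of urban neighbours ≥ 3 in t2 is calculated through
$$P(N_{t_2}) = \frac{n_{N_{t_2}}}{n_{T}}$$
where
$n_{N_{t_1}}$: Number of urban neighbours ≥ 3 in t1;
$n_{U_{t_2}}$: Number of urban pixels in t2;
$n_{N_{t_2}}$: Number of urban neighbours ≥ 3 in t2;
$n_{T}$: Total number of pixels in the study region.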
DB-CA model predicts the posterior probability based on the prior probability and the likelihood .
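As a worked illustration of the Bayes’ rule step, the sketch below computes a posterior from four pixel counts. The function name, the count-ratio estimate of the likelihood (pixels with ≥ 3 urban neighbours in t1 divided by urban pixels in t2) and the example counts are assumptions made for illustration; they are not taken from the study’s data.

```python
def dbca_posterior(n_nbr_t1: int, n_urban_t2: int, n_nbr_t2: int, n_total: int) -> float:
    """Posterior P(urban in t3 | >= 3 urban neighbours in t2) from pixel counts.

    n_nbr_t1   : pixels with >= 3 urban neighbours in t1
    n_urban_t2 : urban pixels in t2
    n_nbr_t2   : pixels with >= 3 urban neighbours in t2
    n_total    : total pixels in the study region
    """
    if min(n_nbr_t1, n_urban_t2, n_nbr_t2, n_total) <= 0:
        raise ValueError("all counts must be positive")
    prior = n_urban_t2 / n_total          # P(urban in t2)
    likelihood = n_nbr_t1 / n_urban_t2    # assumed estimate of P(nbrs >= 3 in t1 | urban in t2)
    evidence = n_nbr_t2 / n_total         # P(nbrs >= 3 in t2)
    return likelihood * prior / evidence


# Hypothetical counts: prior = 0.4, likelihood = 0.75, evidence = 0.5.
posterior = dbca_posterior(n_nbr_t1=300, n_urban_t2=400, n_nbr_t2=500, n_total=1000)
```

With these count-ratio estimates the posterior reduces algebraically to n_nbr_t1 / n_nbr_t2, so it stays at or below 1 only when the count for t1 does not exceed the count for t2.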
In the current study, it is assumed that the probability of built-up at time t1 being converted into non-built-up at time t2 is zero. Hence, the built-up pixels of 2013 are assumed to remain built-up in 2017 as well. Based on the DB-CA model, the expected number of non-built-up pixels to be converted to built-up in 2017 was calculated. The model was run in the ArcGIS environment until the expected number of non-built-up pixels had been converted to built-up.
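The conversion step can be sketched as follows. The study does not state the order in which pixels are converted, so this sketch assumes, purely for illustration, that candidate pixels are ranked by their number of urban neighbours and converted in that order until the expected number is reached. Built-up pixels never revert, and constrained pixels are skipped. The function name `allocate_growth` and the toy grids are hypothetical.

```python
from typing import List

Grid = List[List[int]]


def allocate_growth(urban: Grid, constraint: Grid, n_convert: int) -> Grid:
    """Convert up to n_convert non-built-up cells to built-up.

    Assumed rule: cells with more urban neighbours (3 x 3 window) go first;
    ties are broken by row, then column. Cells flagged in `constraint` and
    existing built-up cells are never candidates.
    """
    rows, cols = len(urban), len(urban[0])

    def nbrs(r: int, c: int) -> int:
        return sum(
            urban[rr][cc]
            for rr in range(max(0, r - 1), min(rows, r + 2))
            for cc in range(max(0, c - 1), min(cols, c + 2))
            if (rr, cc) != (r, c)
        )

    candidates = [
        (nbrs(r, c), r, c)
        for r in range(rows)
        for c in range(cols)
        if urban[r][c] == 0 and constraint[r][c] == 0
    ]
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))

    result = [row[:] for row in urban]  # built-up cells are copied unchanged
    for _, r, c in candidates[:n_convert]:
        result[r][c] = 1
    return result


urban_t2 = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
constraint = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],  # hypothetical protected cell
    [0, 0, 0, 0],
]
urban_t3 = allocate_growth(urban_t2, constraint, n_convert=2)
```

The protected cell has the most urban neighbours in this toy grid but is left unchanged, which is the intended effect of a constraint layer.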
3.2. Urban Cover
Support Vector Machine (SVM) technique was used for the preparation of land cover maps using the Landsat data of the years 2010, 2013 and 2017 (mentioned in Section 3) of the study region. SVM of supervised classification  was used to classify the land cover features of CMA into Built-Up, Vegetation, Waterbody and Openland. This current study aims at predicting only the urban growth of the study region. Hence the land cover maps were converted into binary maps with only two categories namely “Built-Up” and “Non-Built-Up”. Vegetation, Waterbody and Openland were combined into “Non-Built-Up” category. Thus, the urban maps of CMA for the years 2010, 2013 and 2017 were obtained.
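The reclassification into a binary urban map can be sketched in a few lines. The class labels follow the four categories named above; the function name and the toy raster are assumptions for illustration and are not the workflow used in the study.

```python
from typing import List

BUILT_UP = "Built-Up"


def to_binary(land_cover: List[List[str]]) -> List[List[int]]:
    """Map Built-Up to 1 and every other class (Non-Built-Up) to 0."""
    return [[1 if cls == BUILT_UP else 0 for cls in row] for row in land_cover]


# Hypothetical 2 x 3 classified raster.
land_cover = [
    ["Built-Up", "Vegetation", "Waterbody"],
    ["Openland", "Built-Up", "Built-Up"],
]
urban_map = to_binary(land_cover)
```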
3.3. Identification of Hotspot
In an urban prediction model, the state of a pixel, i.e., the land cover category, at a given time, changes based on its state and its neighbouring cells at the previous time step  . Every city has its own development activities and within a specific time interval, maybe 4 or 5 years, urbanization might happen spontaneously even in regions where there are no urban neighbours. In the present study, based on Government policy  areas of 500 m around OMR (Old Mahabalipuram Road) are found to be the most potential areas i.e., hotspots of urbanization. Hence, a buffer zone of 500 meters around OMR is included only in the urban map of 2013 as hotspot in the prediction modeling. Thus in our current study, for the prediction of urban growth of CMA in 2017, urban maps of 2010 along with the urban map of 2013 with hotspot were used (Figure 3).
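A raster version of the hotspot buffer can be sketched as below. It assumes that the road is available as a list of raster cells and that distance is measured between cell centres; at the 15 m resolution used here, 500 m corresponds to roughly 33 cells. The function names and the small example grid (which uses a coarser, hypothetical 250 m cell size so that the grid stays readable) are illustrative only.

```python
import math
from typing import List, Tuple


def buffer_mask(road_cells: List[Tuple[int, int]], rows: int, cols: int,
                cell_size_m: float = 15.0, buffer_m: float = 500.0) -> List[List[int]]:
    """Return a 0/1 mask of cells whose centre lies within buffer_m of any road cell."""
    radius = buffer_m / cell_size_m   # buffer radius expressed in cells
    reach = int(math.floor(radius))   # search window half-width
    mask = [[0] * cols for _ in range(rows)]
    for r0, c0 in road_cells:
        for r in range(max(0, r0 - reach), min(rows, r0 + reach + 1)):
            for c in range(max(0, c0 - reach), min(cols, c0 + reach + 1)):
                if math.hypot(r - r0, c - c0) <= radius:
                    mask[r][c] = 1
    return mask


def add_hotspot(urban: List[List[int]], mask: List[List[int]]) -> List[List[int]]:
    """Overlay the hotspot on an urban map: a cell is 1 if urban or inside the buffer."""
    return [[max(u, m) for u, m in zip(u_row, m_row)] for u_row, m_row in zip(urban, mask)]


# Hypothetical example: one road cell in the middle of a 5 x 5 grid of 250 m cells.
mask = buffer_mask([(2, 2)], rows=5, cols=5, cell_size_m=250.0, buffer_m=500.0)
urban_2013 = [[0] * 5 for _ in range(5)]
urban_2013_hotspot = add_hotspot(urban_2013, mask)
```

In practice a vector buffer tool in a GIS package would normally be used; the loop above only shows what the buffer means at the pixel level.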
3.4. Agents of Urbanization
Apart from the urban maps, various agents of urbanization play a major role in determining the state of a pixel in the prediction modeling. In regions, where the urban growth is compact, the urban map of the previous time step alone is sufficient to predict the urbanization of the next time step  . In the current study, for predicting the urban growth of CMA in 2017, along with the urban maps of 2010 and 2013, “Existing Built-Up of 2013” was used as the agent of urbanization. A 3 × 3 neighbourhood configuration was used in the prediction modeling. Based on  areas that had been prohibited for urban development had been identified and introduced into the models as “Constraints” where urban growth prediction was avoided during the modeling process. In the present study, Coastal Regulation Zone (CRZ)-I, major water bodies, areas around airport at Meenambakkam, areas of 100 m around the boundary of Indian Air Force station near Tambaram, Pallikaranai swamp area and green belt areas of 15 m along Poonamallee and Red Hills bye pass roads had been identified as constraints (Figure 4).
Figure 3. Land Cover Maps of CMA during the study periods. (a) June 2010; (b) May 2013; (c) March 2017.
Figure 4. Inputs of DB-CA and NN-CA models for the urban growth prediction of CMA in 2017 other than urban cover maps. (a) Proximity Map of Existing Built-Up of 2013; (b) 3 × 3 Urban Neighbourhood Map; (c) Constraints.
3.5. Influence of Neighbourhood Configuration on the Prediction Outputs
Cell Neighbourhood plays a major role in predicting urban growth using CA model  . But the choice of appropriate neighbourhood configuration always remains uncertain and it varies from region to region. Thus for the prediction of urbanization of CMA in 2017, DB-CA model was run using different neighbourhood configurations (Rectangular: 3 × 3, 5 × 5, 7 × 7 and Circular: 3 × 3) to identify the most appropriate one (Figure 5).
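The neighbourhood configurations can be expressed as lists of (row, column) offsets around the centre cell. The rectangular case is unambiguous. For the circular case, this sketch assumes that a cell belongs to the neighbourhood when its centre lies within a radius of half the window size (rounded down), which is only one possible reading of a circular window. Both function names are hypothetical.

```python
from typing import List, Tuple

Offset = Tuple[int, int]


def rectangular_offsets(size: int) -> List[Offset]:
    """Offsets of a size x size window around the centre cell, excluding the centre."""
    half = size // 2
    return [
        (dr, dc)
        for dr in range(-half, half + 1)
        for dc in range(-half, half + 1)
        if (dr, dc) != (0, 0)
    ]


def circular_offsets(size: int) -> List[Offset]:
    """Subset of the rectangular window within radius size // 2 of the centre (assumed definition)."""
    half = size // 2
    return [(dr, dc) for dr, dc in rectangular_offsets(size) if dr * dr + dc * dc <= half * half]


def count_urban(grid: List[List[int]], r: int, c: int, offsets: List[Offset]) -> int:
    """Count urban cells of `grid` at the given offsets from (r, c), ignoring out-of-range cells."""
    rows, cols = len(grid), len(grid[0])
    return sum(
        grid[r + dr][c + dc]
        for dr, dc in offsets
        if 0 <= r + dr < rows and 0 <= c + dc < cols
    )


sizes = {size: len(rectangular_offsets(size)) for size in (3, 5, 7)}
```

Under these definitions the rectangular 3 × 3, 5 × 5 and 7 × 7 windows contain 8, 24 and 48 neighbours, and the assumed circular 3 × 3 window contains 4.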
Validation is an important process which enables the users to understand the accuracy of the prediction model. The most widely adopted accuracy assessment technique is “Error Matrix” or “Contingency Table” through which Overall Accuracy (OA) and Kappa coefficient (k) are derived  . Also the areas of hits (correctly predicted), misses (under-predicted) and false alarms (over-predicted) were analyzed.
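The validation measures can be sketched for a binary (urban / non-urban) case. The code builds the 2 × 2 error matrix from an observed and a predicted map and derives OA and the standard Cohen’s kappa; the function name and the toy maps are assumptions, and the values it produces are not results of this study.

```python
from typing import Dict, List


def validate(observed: List[List[int]], predicted: List[List[int]]) -> Dict[str, float]:
    """Error matrix counts, Overall Accuracy and kappa for two binary maps of equal shape."""
    hits = misses = false_alarms = correct_rejections = 0
    for o_row, p_row in zip(observed, predicted):
        for o, p in zip(o_row, p_row):
            if o == 1 and p == 1:
                hits += 1                 # correctly predicted built-up
            elif o == 1 and p == 0:
                misses += 1               # under-prediction
            elif o == 0 and p == 1:
                false_alarms += 1         # over-prediction
            else:
                correct_rejections += 1   # correctly predicted non-built-up
    n = hits + misses + false_alarms + correct_rejections
    oa = (hits + correct_rejections) / n
    # Chance agreement from the row and column totals of the error matrix.
    pe = ((hits + misses) * (hits + false_alarms)
          + (false_alarms + correct_rejections) * (misses + correct_rejections)) / (n * n)
    kappa = (oa - pe) / (1 - pe)  # undefined when pe == 1 (both maps are a single class)
    return {"hits": hits, "misses": misses, "false_alarms": false_alarms,
            "correct_rejections": correct_rejections, "OA": oa, "kappa": kappa}


# Hypothetical 2 x 4 maps.
observed = [[1, 1, 0, 0], [1, 0, 0, 0]]
predicted = [[1, 0, 1, 0], [1, 0, 0, 0]]
result = validate(observed, predicted)
```

For these toy maps there are 2 hits, 1 miss, 1 false alarm and 4 correct rejections, giving OA = 0.75 and kappa = 7/15.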
4. Results and Discussions
4.1. Land Cover Maps
The land cover maps of the study region produced an OA of 92.23%, 87.87% and 87.57% and k values of 0.89, 0.8437 and 0.8412 for the years 2010, 2013 and 2017 respectively. The land cover maps (Figure 3) show a remarkable increase in built-up from 2010 to 2017. CMA comprises a total area of 1189 km2, of which 237.41 km2, 400.57 km2 and 572.11 km2 are mapped as urban cover in 2010, 2013 and 2017 respectively. Figure 3 shows that the urban cover in 2017 was more than double that of 2010, which is consistent with the population data. CMA had a population of 7.04 million in 2001, which increased to 8.87 million in 2011, and it is estimated to reach 11.19 million in 2021.
Figure 5. Types of neighbourhood configurations used in the prediction of urban growth of CMA in 2017. (a) Rectangular (3 × 3); (b) Rectangular (5 × 5); (c) Rectangular (7 × 7); (d) Circular (3 × 3).
This is because many IT centres, industries, educational institutions, banking sectors and so on had emerged in the study region, which attracted many rural people to migrate to CMA. The number of industries within the study region increased about five times, from 7782 in 2001 to 42,000 in 2017. The hotspot location was included in the urban map of 2013 as a linear strip of 500 m around OMR for the prediction of urban cover in 2017, in order to capture the urbanization arising especially from the emergence of IT centres in this region.
4.2. Assessing the Urban Sprawl through Shannon’s Entropy
From 2010 to 2017, the urban cover in CMA increased (Figure 6(a)). The study area was divided into five distance based zones, each 7 km wide, with the State Secretariat as the centre (Figure 6(b)). The entropy values of urbanization of 2010, 2013 and 2017 were found to be 0.8115, 0.9462 and 0.9377 respectively. The maximum entropy value with five zones is 1.6094 (log_e 5). The entropy value increased from 2010 to 2013 and decreased after 2013. This indicates that the urban sprawl was becoming more distributed until 2013, after which the urbanization started to get congested. In 2017, the concentrated type of urban growth was observed in the 7 km, 14 km and 21 km buffer zones. The Chennai Corporation falls within the 7 km and 14 km buffer zones. This shows that the Corporation boundary had become saturated with urban growth. Nungambakkam, Tondiarpet, Madhavaram and Guindy fall within these two zones. The 21 km buffer zone, comprising Meenambakkam and Red Hills, lies immediately outside the Chennai Corporation, i.e., at the fringe of the boundary. Since it lies close to the Corporation boundary, it also experienced congested urban growth in 2017. Because of the availability of employment opportunities, socio-economic factors including educational institutions, hospitals, religious centres and recreational centres, and the availability of transportation facilities, these three zones had become congested by 2017.
Figure 6. Urban Sprawl and Entropy of the study region. (a) Urban sprawl of the observed urbanization of CMA in 2010, 2013 and 2017; (b) Entropy values of urbanization of the study region for five distance based buffer zones from the State Secretariat.
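The entropy calculation can be sketched as follows, assuming the usual form H = −Σ p_i ln(p_i), where p_i is the share of the total built-up area that falls in zone i. The exact definition of p_i used in the study is not stated, so this definition, the function names and the example areas are assumptions for illustration.

```python
import math
from typing import Sequence


def shannon_entropy(zone_urban_areas: Sequence[float]) -> float:
    """Shannon's entropy (natural log) of the distribution of built-up area over zones."""
    total = sum(zone_urban_areas)
    if total <= 0:
        raise ValueError("total built-up area must be positive")
    # Zones without built-up contribute nothing (p * ln p tends to 0 as p tends to 0).
    return -sum((a / total) * math.log(a / total) for a in zone_urban_areas if a > 0)


def relative_entropy(zone_urban_areas: Sequence[float]) -> float:
    """Entropy scaled by its maximum ln(n), so that 1 means perfectly even dispersion."""
    return shannon_entropy(zone_urban_areas) / math.log(len(zone_urban_areas))


# Hypothetical built-up areas (km2) in five zones.
evenly_spread = [10.0, 10.0, 10.0, 10.0, 10.0]
fully_concentrated = [50.0, 0.0, 0.0, 0.0, 0.0]
h_even = shannon_entropy(evenly_spread)
h_conc = shannon_entropy(fully_concentrated)
```

With five zones, built-up spread evenly gives the maximum value ln 5 ≈ 1.6094 quoted above, and built-up confined to a single zone gives 0; values in between indicate intermediate degrees of dispersion.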
The 28 km and 35 km zones lie away from the Chennai Corporation and experience a distributed type of urban sprawl, as the socio-economic factors are comparatively fewer in these two zones. Kundrathur, Minjur, Vandalur and Thiruninravur are located within these zones. Thus, the entropy analysis of the study region indicates that a compact type of urban growth was found within a distance of 21 km from the State Secretariat, whereas the growth is distributed in the zones lying at 28 km and 35 km from the Secretariat.
Based on the urbanization in each of these zones, the 7 km zone is the most urbanized, with 81.1% of the zone area urbanized in 2017, followed by the 14 km, 21 km, 28 km and 35 km zones, which have 69%, 46.4%, 34.1% and 29.4% of urban cover respectively in 2017. The 7 km and 14 km zones alone account for 232.75 km2 of urbanization, whereas the urbanization of the whole study area in 2017 is 572.11 km2. This shows that about 41% of the urbanization of CMA lies within the Corporation extent. This further emphasises the fact that the Corporation boundary and its periphery (7 km, 14 km and 21 km) experience a compact type of urbanization, whereas urbanization is distributed in the 28 km and 35 km zones, which have started to experience urbanization since 2013.
4.3. Predicted Urban Growth
NN-CA and DB-CA models, along with the input datasets (Figure 4), predicted 606.34 km2 and 665.09 km2 respectively as built-up in 2017 (Figure 7), whereas the observed area of urbanization in 2017 is 572.11 km2. The accuracy of the predicted output of DB-CA (OA: 86.08%; k: 0.73) is found to be higher than that of the NN-CA model (OA: 85.51%; k: 0.71). On validating the predicted outputs with the observed urban map of 2017, the area of hits (correctly predicted built-up) was found to be higher for the DB-CA model (Hits: 524.14 km2) than for the NN-CA model (Hits: 502.42 km2).
Figure 7. Observed and Validated Urban Cover of CMA. Observed Urban Cover in (a) 2010; (b) 2013 with hotspot; (c) 2017. Validated Urban Cover of 2017 using (d) NN-CA model; (e) DB-CA model.
Misses represent under-prediction and false alarms indicate over-prediction. A UGM can be called efficient if it has smaller areas of misses and false alarms (reflected in k) in addition to larger areas of hits (reflected in OA). In this study, the area of misses was smaller in the DB-CA model (47.97 km2) than in the NN-CA model (69.7 km2). However, the area of false alarms was larger in the DB-CA model (140.95 km2) than in the NN-CA model (102.69 km2). When a UGM predicts larger areas of hits, the false alarms also tend to be larger, with smaller areas of misses. Thus, the areas of hits and false alarms are larger and the misses are smaller in the DB-CA model when compared to those of the NN-CA model. Further, to analyse the prediction outputs of the DB-CA and NN-CA models, the validation outputs were divided according to the five distance based zones of CMA.
4.3.1. Zone Wise Assessment of the Accuracy of the Prediction Outputs
Within 7 km from the Secretariat, the observed area of urbanization in 2017 is 68.46 km2. DB-CA and NN-CA models predicted 67.93 km2 and 67.26 km2 respectively as built-up in 2017. Misses were smaller in the DB-CA model (0.54 km2) than in the NN-CA model (1.2 km2). In this zone, both models captured the urbanization close to reality; however, the DB-CA model predicted with better accuracy than the NN-CA model, with misses of less than 1 km2. In the 14 km buffer zone, where the observed urban growth in 2017 is 164.29 km2, the DB-CA model predicted 157.82 km2 and the NN-CA model predicted 154.62 km2 of urban growth. Misses were smaller in the DB-CA model (6.47 km2) than in the NN-CA model (9.67 km2). These two zones fall within the Corporation boundary, where the DB-CA model reported only 7.01 km2 of misses and 225.75 km2 of hits, whereas the NN-CA model had misses of 10.87 km2 and hits of 221.88 km2. Urbanization in the 21 km buffer zone, which lies at the periphery of the Corporation boundary, was 184.37 km2 in 2017. The DB-CA model had hits of 166.73 km2, while the NN-CA model had only 158.94 km2 of hits. The DB-CA model also had a smaller area of misses (17.64 km2) than the NN-CA model (25.42 km2). In the 28 km and 35 km zones, where the urbanization is distributed, urbanization in 2017 was 155.15 km2. The DB-CA model produced hits of 131.85 km2 with misses of 23.3 km2, whereas the NN-CA model reported hits of 121.75 km2 and misses of 33.4 km2 (Figure 8). This shows the potential of the DB-CA model to predict urban growth better than the NN-CA model in both congested and distributed types of urban growth regions.
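To compare zones of different size, the reported misses can be normalised by the observed built-up area of each zone. The sketch below does this with the zone-wise figures quoted above; the miss-rate measure, the function name and the dictionary layout are illustrative assumptions rather than part of the study’s method.

```python
# Reported 2017 areas in km2 per zone: (observed built-up, DB-CA misses, NN-CA misses).
ZONES = {
    "7 km": (68.46, 0.54, 1.2),
    "14 km": (164.29, 6.47, 9.67),
    "21 km": (184.37, 17.64, 25.42),
    "28 km and 35 km": (155.15, 23.3, 33.4),
}


def miss_rate(observed_km2: float, misses_km2: float) -> float:
    """Share of the observed built-up area that the model failed to predict, in percent."""
    return 100.0 * misses_km2 / observed_km2


# Per-zone miss rates as (DB-CA, NN-CA) pairs.
rates = {
    zone: (miss_rate(obs, db_miss), miss_rate(obs, nn_miss))
    for zone, (obs, db_miss, nn_miss) in ZONES.items()
}

for zone, (db_rate, nn_rate) in rates.items():
    print(f"{zone}: DB-CA {db_rate:.1f}% missed, NN-CA {nn_rate:.1f}% missed")
```

Under this normalisation the DB-CA miss rate is lower than the NN-CA miss rate in every zone, and for both models the miss rate rises with distance from the Secretariat.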
Further, the area of misses reported within the Corporation boundary by the DB-CA model (7.01 km2) was smaller than that of the NN-CA model (10.87 km2). In a region where urban growth is congested, the number of urban neighbours of a given pixel is comparatively higher than in a region where urbanization is dispersed. Hence, an efficient UGM should be able to capture the urban growth close to reality in a congested urban growth region. In our study, the DB-CA model predicted the urban growth with smaller areas of misses than the NN-CA model, not only in the congested zones (7 km, 14 km and 21 km) but also in the zones where the urban sprawl is distributed (28 km and 35 km).
The zone based analysis of the prediction outputs reveals that the DB-CA model has the potential to predict the urbanization of the study region better than the NN-CA model. While the DB-CA model showed better prediction accuracy in the congested zones, its efficiency is more pronounced in the dispersed zones, where the NN-CA model had comparatively larger areas of misses.
4.3.2. Assessing the Influence of Neighbourhood Configuration on the Prediction Output
Since the neighbourhood plays a major role in urban prediction based on the CA model, the effect of neighbourhood configuration on the prediction output was assessed. The DB-CA prediction model was run with four different configurations: rectangular (3 × 3, 5 × 5, 7 × 7) and circular (3 × 3). Validation results of the prediction outputs with the different neighbourhood configurations are given in Table 1. The results suggest that a rectangular 3 × 3 neighbourhood is the most appropriate configuration for predicting the urbanization of CMA in 2017.
Figure 8. Zone wise analysis of the validation outputs of DB-CA and NN-CA models.
Table 1. Validation of prediction outputs based on DB-CA model with different neighbourhood configurations.
5. Conclusions
The study was conducted to model the urban growth of CMA using NN-CA and DB-CA models with a hotspot, i.e., a potential region for development based on the City Master Plan. The results revealed that the DB-CA model provided more accurate prediction output than the NN-CA model. The effects of different types of neighbourhood configurations on the prediction output of the DB-CA model were assessed, and the 3 × 3 rectangular neighbourhood was found to be the most appropriate one for modeling the urban growth of CMA. To understand the pattern of urban sprawl of the study region, entropy analysis was performed. The entropy results suggested that the Chennai Corporation had already become saturated with urban growth and that urbanization had begun to get congested in the periphery of the Corporation boundary, whereas areas away from the Corporation experience distributed growth. This study highlights the importance of the Government’s decision to expand the area of CMA, as it will enable urban planners and policy makers to devise appropriate planning actions so that the study area does not become further congested, at least within the Chennai Corporation, which is already brimming with urban growth.