Classification of satellite images is a significant part of remote sensing image analysis, object and pattern recognition, and the mapping and monitoring of forest cover and natural resources. The process is commonly used to generate thematic maps such as forest maps, land cover/use maps and spatial pattern maps. Classification of forest and land cover types from satellite data has been adopted extensively. Many supervised image classification algorithms have been developed and used for forest and land cover mapping, ranging from traditional classifiers to machine learning algorithms. Most of these algorithms have been reported to perform reasonably well and to improve classification accuracy. However, it is difficult to identify the image classification algorithm best suited to a particular environment, because numerous factors affect the results: the classification scheme, the satellite data in use, image pre-processing, the selection and collection of training and validation samples, the learning algorithm, post-processing approaches and validation techniques. For that reason, commonly applied machine learning algorithms need to be evaluated on the same satellite dataset and classification scheme to aid the selection of a suitable algorithm for a particular application. As remote sensing technology advances, new classification algorithms are being developed rapidly. Consequently, it is important to assess their performance in various kinds of environment using various types of remote sensing datasets.
The main objective of this study, therefore, is to evaluate the capability of widely applied parametric and non-parametric supervised machine learning algorithms for forest resource and land cover mapping in a tropical environment using SAR and optical datasets. Specifically, the study assesses which classification algorithm gives better results for forest resource and land cover mapping when using independent and integrated Landsat TM and ALOS PALSAR datasets.
2. Study Area
The satellite images used in this study cover the Bereko and Duru-Haitemba forest reserves in Babati, Tanzania, lying between latitudes 4˚15' and 4˚30' South and longitudes 35˚35' and 35˚50' East (Figure 1). The area is categorized into six main land cover/use types, including water (e.g. lakes), shrubs, natural dense forest and moderate forest (Figure 1).
3. Dataset and Methods
3.1. Dataset and Training Samples
Both optical and SAR satellite images were used: a Landsat 5 Thematic Mapper (TM) image with 30 m spatial resolution acquired on November 4th, 2009 and an ALOS PALSAR L-band image acquired on September 13th, 2009. The Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) at 90 m resolution was applied for image preprocessing. In addition, a set of Global Positioning System (GPS) points and knowledge-based information acquired in October 2009, the Normalized Difference Vegetation Index (NDVI) and Google Earth images were used for ground truthing.
Figure 1. Study area: (a) Duru-Haitemba and Bereko forest reserve classified image; (b) Landsat 5TM RGB:432.
Training and validation samples for all land cover classes (i.e. water, shrubs, natural dense forest and moderate forest) were selected based on ground truth data, GPS-based point locations and knowledge-based information acquired on site. The collected samples were divided into two groups: a training sample (70% of the collected samples) and a validation sample (30% of the collected samples).
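The 70/30 partition of the collected samples can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the per-class (stratified) split, the fixed random seed, the function name `split_samples` and the sample list are all assumptions.

```python
import random
from collections import defaultdict


def split_samples(samples, train_fraction=0.7, seed=0):
    """Split (class_label, sample_id) pairs into training and validation sets.

    The split is done per class (an assumption) so that every land cover
    class is represented in both sets.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for label, sample_id in samples:
        by_class[label].append(sample_id)

    train, validation = [], []
    for label, ids in sorted(by_class.items()):
        rng.shuffle(ids)
        cut = round(train_fraction * len(ids))
        train += [(label, i) for i in ids[:cut]]
        validation += [(label, i) for i in ids[cut:]]
    return train, validation


# Hypothetical sample list: 10 points for each of two classes.
samples = [("water", i) for i in range(10)] + [("shrubs", i) for i in range(10)]
train, validation = split_samples(samples)
print(len(train), len(validation))
```

Splitting within each class keeps rare classes from ending up entirely in one set, which matters when class-wise scores are reported.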
3.2. Data Processing
ALOS PALSAR HH and HV polarization images were acquired in slant range single look complex format. The images were transformed from slant range to ground range using a multi-looking procedure of 9 × 2 (i.e., nine looks in azimuth and two looks in range). The resulting images had a resolution of 29.9 m × 27.7 m in range and azimuth, respectively. This procedure improves the radiometric resolution and produces approximately square pixels in ground range geometry, comparable to the Landsat TM spatial resolution (i.e. 30 m). For speckle reduction, a refined Lee spatial filter with a 7 × 7 window was applied. Topographic effects on the ALOS PALSAR backscatter were accounted for by applying a radiometric terrain correction that converts backscatter in sigma-nought (σ˚) to the improved backscatter in gamma-nought (γ˚). Landsat 5 TM digital numbers were converted to surface reflectance (SR) and normalized using Atmospheric and Topographic Correction 3 (ATCOR-3).
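For orientation, the relation between sigma-nought and gamma-nought can be sketched with the simplest cosine normalisation, γ˚ = σ˚ / cos(θ), where θ is the local incidence angle. This is an assumption made for illustration: the terrain correction applied in the study may include further terms, and the function name and pixel values below are hypothetical.

```python
import math


def sigma0_to_gamma0_db(sigma0_db, incidence_angle_deg):
    """Convert sigma-nought (dB) to gamma-nought (dB) by cosine normalisation.

    In linear power units gamma0 = sigma0 / cos(theta); in decibels the
    division becomes a subtraction of 10 * log10(cos(theta)).
    """
    cos_theta = math.cos(math.radians(incidence_angle_deg))
    return sigma0_db - 10.0 * math.log10(cos_theta)


# Hypothetical pixel: -12 dB backscatter at a 60 degree local incidence angle.
gamma0_db = sigma0_to_gamma0_db(-12.0, 60.0)
print(round(gamma0_db, 2))
```

In a terrain-corrected workflow the local incidence angle would be derived per pixel from the DEM rather than supplied as a constant.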
3.2.1. Landsat TM and ALOS PALSAR Derivatives
Several ALOS PALSAR and Landsat TM derivatives were extracted, in particular vegetation indices (VI), Principal Component Analysis (PCA) components, the SAR quotient bands HH/HV and HV/HH, the Radar Forest Degradation Index (RFDI) and Grey-Level Co-Occurrence Matrix (GLCM) textural feature measures. The vegetation indices were the Normalized Difference Vegetation Index (NDVI), (IR − R)/(IR + R), and the Specific Leaf Area Vegetation Index (SLAVI) (Table 1).
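Several of these derivatives are simple per-pixel band combinations. The sketch below computes NDVI as given in the text, together with the HH/HV quotient and RFDI. The RFDI form (HH − HV)/(HH + HV) on linear power values is the commonly used definition and is assumed here, as are the function names and the pixel values.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (IR - R) / (IR + R)."""
    return (nir - red) / (nir + red)


def db_to_power(value_db):
    """Convert backscatter from decibels to linear power."""
    return 10.0 ** (value_db / 10.0)


def hh_hv_ratio(hh, hv):
    """SAR quotient band HH/HV on linear power values."""
    return hh / hv


def rfdi(hh, hv):
    """Radar index (HH - HV) / (HH + HV) on linear power values (assumed form)."""
    return (hh - hv) / (hh + hv)


# Hypothetical pixel values, not taken from the study data.
nir, red = 0.40, 0.08                       # surface reflectance
hh, hv = db_to_power(-7.0), db_to_power(-13.0)

print(round(ndvi(nir, red), 3))
print(round(hh_hv_ratio(hh, hv), 3))
print(round(rfdi(hh, hv), 3))
```

Pixels where a denominator is zero (for example masked or no-data pixels) would need to be handled separately in a real raster workflow.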
3.2.2. Integration of ALOS PALSAR and Landsat TM data
Various input bands were prepared for image classification. A multi-sensor integration (image fusion) approach was adopted. This approach combines n images as n different layers algorithmically, without creating a new set of images.
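Layer-based integration of this kind amounts to giving every pixel one feature vector with a value from each co-registered input layer. A minimal sketch, assuming small rasters stored as nested lists on the same grid; the function name `stack_layers` and the band values are hypothetical.

```python
def stack_layers(layers):
    """Combine n co-registered single-band rasters into per-pixel feature vectors.

    Each layer is a list of rows. The result has the same grid, with each
    cell holding a tuple of n values (one per input layer).
    """
    rows, cols = len(layers[0]), len(layers[0][0])
    for layer in layers:
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("all layers must share the same grid")
    return [
        [tuple(layer[r][c] for layer in layers) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2 x 2 layers: two optical bands and one SAR band.
red = [[0.08, 0.10], [0.12, 0.05]]
nir = [[0.40, 0.35], [0.30, 0.45]]
hv_db = [[-13.0, -14.0], [-18.0, -12.5]]

stacked = stack_layers([red, nir, hv_db])
print(stacked[0][0])
```

The original layers are left unchanged, which is what distinguishes this kind of integration from fusion methods that synthesize new image bands.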
3.2.3. Classifiers under Study
Three non-parametric classifiers and one parametric classifier were tested on their ability to classify forest cover and land cover from the independent bands, the integrated ALOS PALSAR/Landsat data and their derivatives. The non-parametric supervised classifiers were Random Forest (RF), Support Vector Machine (SVM) and Neural Network (NN). The parametric Maximum Likelihood classifier (MLC) was used for comparison. MLC is a conventional classification algorithm that assumes Gaussian-distributed class data. The technique is robust, well known for general classification problems and one of the most extensively used classifiers in the field. However, it may have difficulty classifying data from different sources, such as optical and SAR data.
Table 1. Data categorization for various classification set-ups adapted from Deus; SR = Surface Reflectance, AP = ALOS PALSAR; mea = mean, cor = correlation, var = variance, con = contrast, and sec = second moment are GLCM texture derivatives.
SVM is fundamentally a binary classification method based on machine learning that uses support vectors to separate the classes. Linear, polynomial, radial basis function and sigmoid are the four common kernels available in remote sensing packages. Careful parameter selection can improve the performance of SVM. The Gaussian radial basis function kernel and a penalty parameter of 100 were selected by trial and error. This kernel and penalty parameter are also the ones recommended for land cover classification.
The NN classifier can form arbitrary decision boundaries and adapts easily to various data types and input structures; it provides fuzzy output values and generalizes well when integrating multiple images. The classifier benefits from parallel computation, the capability to estimate the non-linear relationship between the input data and the desired outputs, and fast generalization. The NN parameters were set by trial and error: a logistic activation function, one hidden layer and 1000 training iterations.
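The Gaussian radial basis function kernel measures similarity between two pixel feature vectors as K(x, y) = exp(−γ‖x − y‖²). A minimal sketch follows. The kernel width γ is not reported in the text, so the value used here is an assumption, as are the function name and feature vectors.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-band feature vectors; gamma = 0.5 is an assumed value.
pixel_a = (0.0, 0.0)
pixel_b = (1.0, 1.0)
print(rbf_kernel(pixel_a, pixel_a, gamma=0.5))
print(round(rbf_kernel(pixel_a, pixel_b, gamma=0.5), 4))
```

Identical vectors give a similarity of 1, and the value decays towards 0 as the vectors move apart. The penalty parameter (100 in this study) is separate from the kernel and controls how strongly training errors are penalized.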
RF is an ensemble machine learning approach based on classification and regression trees and can be used for both image classification and regression analysis. It uses multiple self-learning decision trees to parameterize models and applies them to estimate categorical or continuous variables. The number of trees is a user-defined parameter. RF normally gives higher overall cross-validation accuracies than other classification approaches. Generally, non-parametric classifiers yield higher classification accuracy than parametric classifiers.
To run the classification and assess the potential of the parametric and non-parametric classifiers, the data were divided into three major groups, A-C (Table 1). Group A consists of surface reflectance bands from the Landsat 5 TM image and their derivatives (i.e. vegetation indices, GLCM textures and PCA). Group B comprises the individual ALOS PALSAR backscatter bands and their derivatives. Group C integrates surface reflectance, backscatter and their derivatives. To maximize the overall classification accuracy, the best combination of textures, indices and features was identified; the relevant integration bands were selected by trial-and-error classification.
3.2.4. Classification Accuracy Assessment
To test the capability of the parametric and non-parametric classifiers, a validation dataset was used for accuracy assessment. Three measures of classification accuracy were used: overall accuracy (OA), the kappa coefficient (κ) and the F1 score. The overall accuracy is the percentage of pixels in the validation dataset that have been classified correctly. The kappa coefficient compares the observed accuracy with the accuracy expected by chance. It is used not only to assess a single classifier, but also to compare classifiers with one another. The F1 score, which merges producer's and user's accuracy into a single quantity, was also computed (Equation (1)). Producer's accuracy relates to the omission error of a class: it is the probability that a reference site has been classified correctly. User's accuracy relates to the commission error: it is the probability that a pixel classified on the image represents the actual class on the ground. The F1 score enables a better evaluation of class-wise land cover accuracies. It varies between 0 and 1, where 0 signifies the worst result and 1 the best accuracy achievable.
To compare the capability of the four classifiers under study, a two-sample t-test was applied to the overall classification accuracies obtained with the different data categories (Table 1). The same test was used to assess the influence of surface reflectance and backscatter derivatives on classification accuracy. The two-sample t-test assesses whether the means of two independent samples differ; a significant difference in means indicates that the two samples are dissimilar. The test is normally applied when the sample sizes are small and the variances of the two normal distributions are unknown.
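All three accuracy measures can be derived from a confusion matrix of validation pixels. The sketch below uses the conventional definitions: OA is the share of correctly classified pixels, κ = (p_o − p_e)/(1 − p_e), and F1 = 2 · PA · UA/(PA + UA) for each class, where PA and UA are producer's and user's accuracy. The matrix layout (rows as reference classes, columns as classified classes), the function name and the counts are assumptions for illustration only.

```python
def accuracy_metrics(matrix):
    """Return overall accuracy, kappa and per-class F1 from a confusion matrix.

    Rows are reference classes and columns are classified classes
    (an assumed layout).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    correct = sum(matrix[i][i] for i in range(n))

    overall = correct / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    f1_scores = []
    for i in range(n):
        producers = matrix[i][i] / row_sums[i] if row_sums[i] else 0.0
        users = matrix[i][i] / col_sums[i] if col_sums[i] else 0.0
        denom = producers + users
        f1_scores.append(2 * producers * users / denom if denom else 0.0)
    return overall, kappa, f1_scores


# Hypothetical two-class validation counts (not data from this study).
matrix = [[45, 5],
          [10, 40]]
overall, kappa, f1_scores = accuracy_metrics(matrix)
print(round(overall, 2), round(kappa, 2), [round(v, 3) for v in f1_scores])
```

Because F1 is computed per class, it exposes weak classes that a high overall accuracy can hide.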
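The test statistic for comparing two sets of overall accuracies can be sketched as follows. The text does not state whether equal variances were assumed, so the pooled-variance (Student) form used here is an assumption, and the accuracy values are hypothetical.

```python
import math
from statistics import mean, variance


def two_sample_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / dof
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t_stat = (mean(sample_a) - mean(sample_b)) / standard_error
    return t_stat, dof


# Hypothetical overall accuracies (%) of two classifiers over three data groups.
classifier_a = [95.0, 96.0, 97.0]
classifier_b = [90.0, 91.0, 92.0]
t_stat, dof = two_sample_t(classifier_a, classifier_b)
print(round(t_stat, 3), dof)
```

The p-value is obtained by comparing the statistic with a t distribution with the returned degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_ind`) reports it directly.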
4.1. Classification Results Based on the Four Classifiers
The classification results for the different data groups (A-C) (Table 1) and the tested classifiers are presented in Figure 2 (overall accuracy), Figure 3 (kappa coefficients) and Table 2 (F1 score for every land cover type). With the maximum likelihood classifier, data group A (surface reflectance and derivatives) yields high overall classification accuracy (average OA = 93.35%) and high F1 scores (F1 = 0.95 - 1) for all land cover types. Group B (backscatter values and derivatives) yields lower overall classification accuracy (average OA = 53.92%) and low F1 scores (F1 = 0.18 - 0.53) for the dense forest, moderate forest and bare soil classes. Group C (integration of surface reflectance, backscatter and derivatives) provides good overall classification accuracy (average OA = 87.25%) and high F1 scores (F1 = 0.77 - 1) for all land cover classes (Figure 2 and Table 2). With the support vector machine, categories A and C both provide the best overall classification accuracy (average OA = 95.82% and 97.20%, respectively), with F1 scores between 0.94 and 1 for all land cover types. Category B yields poor overall classification accuracy (average OA = 57.9%) and low F1 scores, ranging from 0.07 to 0.68, for the dense forest, moderate forest and bare soil classes (Table 2).
Figure 2. Comparison of the overall classification accuracy achieved on different data categories (Table 1), for MLC, NN, SVM and RF classifiers based on the validation samples.
Table 2. F1 score accuracy comparison of land cover classification results for different data groups achieved for the tested classification algorithms; (a) SVM (b) RF (c) NN (d) MLC. Results are based on the validation dataset.
For Random Forest, data groups A and C both provide the best overall accuracy (average OA = 95.7% and 96.9%, respectively). High F1 scores, ranging between 0.94 and 1, are obtained for all land cover types (Table 2). For group B, RF yields higher overall classification accuracy than the other classifiers (average OA = 61.08%), with low F1 scores between 0.38 and 0.61 for the dense forest, moderate forest and bare soil classes. For the neural network classifier, as for the other classifiers, the best results are obtained with data groups A and C (average OA = 91.03% and 89.02%, respectively). Lower F1 scores, ranging from 0.1 to 0.56, are obtained for the dense forest, moderate forest and bare soil classes (Figure 2 and Table 2). For all classifiers, water is the only land cover type classified with very high F1 scores when using SAR data, followed by shrubs (Table 2).
4.2. Evaluation of RF, SVM, NN and MLC Classifiers
The non-parametric classifiers (RF, SVM and NN) are assessed together with the maximum likelihood classifier (MLC) on the different data subgroups. Figure 3 shows the performance of the four classifiers in terms of the kappa coefficient (KC). For data groups A and C (Landsat-based surface reflectance, PALSAR backscatter and derivatives), the three tested machine learning classifiers as well as MLC perform well (average KC = 0.89 and 0.91, respectively), and there is no statistically significant difference in their results at the 95% confidence level (Figure 3).
However, MLC provides the poorest accuracy compared with the machine learning classifiers. In these groups, SVM and RF perform better than NN and MLC at the 95% confidence level. For group B (SAR backscatter and derivatives), all classifiers performed more poorly (average KC = 0.50), though in most cases the machine learning algorithms performed better than MLC (Figure 3). In this group, SVM and RF perform about equally, with no substantial difference at the 95% confidence level. Generally, in all data categories, SVM and RF produce better classification results than NN at the 95% confidence level.
Figure 3. Comparison of Kappa coefficients for various classification results for different data categories (Table 1) and classifiers. The results are based on validation samples.
Table 3 presents the comparison of the four classification algorithms based on the two-sample t-test. The performances of all classifiers within each group are compared at the 5% significance level. The results indicate a statistically significant difference between the SVM and NN classifiers (p-value = 0.05) and no statistically significant difference between the SVM and RF classifiers (p-value = 0.834). Both SVM and RF differ significantly from MLC (p-value = 0.001). RF and NN are significantly different (p-value = 0.012), whereas NN and MLC are not (p-value = 0.622) (Table 3).
In this research, supervised learning algorithms were compared using independent and integrated Landsat TM and ALOS PALSAR data. The assessment of the four classifiers under study shows that both parametric and non-parametric classifiers perform well when using Landsat TM data (Figure 2 and Figure 3). Attarchi and Gloaguen obtained the same result, indicating that both parametric and non-parametric classifiers perform well for Landsat-based surface reflectance and derivatives. SAR data and derivatives were classified better by the RF and SVM classifiers than by MLC at the 95% confidence level. This is probably because SAR data and their derived parameters usually do not follow a Gaussian distribution, which is a basic assumption of several classification approaches.
Table 3. T-test statistic results for the comparison of the classifiers. The comparison was done based on the overall classification accuracy attained for each data category.
Notes: A p-value ≤ 0.05 indicates that the two samples are significantly different at the 5% significance level. A p-value greater than 0.05 implies that there is no significant difference between the two samples being compared.
When SAR and Landsat data are integrated, all classifiers perform well; however, SVM and RF perform best relative to NN and MLC at the 95% confidence level. Previous studies suggest that parametric classifiers such as MLC are not well suited to multi-source remote sensing data. The superior performance of SVM and RF compared with NN could be due to their ability to handle high-dimensional data.
Looking at classifier performance by data category, the results for category A (subgroups A1-A3, Landsat surface reflectance and its derivatives) indicate that the non-parametric classifiers (SVM, RF and NN) as well as MLC perform well (Figure 2 and Figure 3). However, NN shows the lowest performance for data group A2. In category B (subgroups B1-B5), MLC has the lowest performance. In this category, MLC and NN perform about equally for data groups B2 and B4, with no significant difference at the 5% significance level. Subgroups A3 and C1 are the only data categories where all four classifiers show nearly similar performance (Figure 2 and Figure 3). In all categories A-C, SVM and RF provide the best performance. In data category C, specifically for C2-C4, NN shows the lowest performance compared with SVM, RF and MLC (Figure 2 and Figure 3). Attarchi and Gloaguen also reported poorer performance of NN compared with SVM and RF. As depicted in Figure 3, SVM and RF appear to perform about equally. This is consistent with the observation that accuracy improvements in land cover mapping from new algorithms are hardly observable. Li et al. obtained similar classification accuracies for SVM and RF.
The performances of all classifiers within each group were compared at the 5% significance level. The two-sample t-test indicates no statistically significant difference between the SVM and RF classifiers. Both SVM and RF differ significantly from NN and MLC, whereas NN and MLC are not significantly different from each other (Table 3).
6. Conclusion and Recommendations
The potential of parametric and non-parametric classifiers has been examined based on the integration of Landsat TM and ALOS PALSAR data. All classifiers under study perform well in terms of overall accuracy when using Landsat TM and derivatives; however, SVM and RF are superior to the others. For SAR data, SVM, RF and NN perform well compared with MLC. When Landsat and PALSAR data are integrated, SVM and RF are more powerful than NN and MLC, especially when combining TM derivatives, backscatter and GLCM textures. Overall, the results indicate the robustness of SVM and RF at the 5% significance level for land cover classification in tropical areas. However, selecting a suitable classifier for a given task depends largely on trade-offs among classification accuracy, time consumption and computing resources. Based on these results, the author recommends that the performance of other classification algorithms, especially object-based classification, be tested in tropical and semi-arid environments. This will show their potential for differentiating forest resources and land cover. Additionally, since new classification algorithms are developed rapidly, it is essential to evaluate their performance and sensitivity in different environments using various types of remote sensing datasets and high-quality samples. A comprehensive assessment of algorithms across various kinds of environment would make it easier to select an algorithm for a specific remote sensing application.
The author thanks Dr. Veraldo Liesenberg for facilitating the acquisition of ALOS PALSAR L band data. The data was acquired under Cat.1-Proposal 6242 through the European Space Agency (ESA) Third Party Mission. The Landsat TM data was downloaded from the US Geological Survey (USGS) website.
 Attarchi, S. and Gloaguen, R. (2014) Classifying Complex Mountainous Forests with l-Band Sar and Landsat Data Integration: A Comparison among Different Machine Learning Methods in the Hyrcanian Forest. Remote Sensing, 6, 3624-3647.
 Li, C., Wang, J., Wang, L., Hu, L. and Gong, P. (2014) Comparison of Classification Algorithms and Training Sample Sizes in Urban Land Classification with Landsat Thematic Mapper Imagery. Remote Sensing, 6, 964-983.
 Lu, D.S. and Weng, Q.H. (2007) A Survey of Image Classification Methods and Techniques for Improving Classification Performance. International Journal of Remote Sensing, 28, 823-870.
 Rosenqvist, A., Shimada, M., Ito, N. and Watanabe, M. (2007) Alos/Palsar: A Pathfinder Mission for Global-Scale Monitoring of the Environment. IEEE Transactions on Geoscience and Remote Sensing, 45, 3307-3316.
 Rabus, B., Eineder, M., Roth, A. and Bamler, R. (2003) The Shuttle Radar Topography Mission—A New Class of Digital Elevation Models Acquired by Spaceborne Radar. ISPRS Journal of Photogrammetry and Remote Sensing, 57, 241-262.
 Lee, J.S., Hoppel, K.W., Mango, S.A. and Miller, A.R. (1994) Intensity and Phase Statistics of Multilook Polarimetric and Interferometric Sar Imagery. IEEE Transactions on Geoscience and Remote Sensing, 32, 1017-1028.
 Liesenberg, V. and Gloaguen, R. (2013) Evaluating SAR Polarization Modes at l-Band for Forest Classification Purposes in Eastern Amazon. International Journal of Applied Earth Observation and Geoinformation, 21, 122-135.
 Lee, J.S., Wen, J.H., Ainsworth, T.L., Chen, K.S. and Chen, A.J. (2009) Improved Sigma Filter for Speckle Filtering of SAR Imagery. IEEE Transactions on Geoscience and Remote Sensing, 47, 202-213.
 Castel, T., Beaudoin, A., Stach, N., Stussi, N., leToan, T. and Durand, P. (2001) Sensitivity of Space-Borne SAR Data to Forest Parameters over Sloping Terrain. Theory and Experiment. International Journal of Remote Sensing, 22, 2351-2376.
 Richter, R. and Schlapfer, D. (2012) Atmospheric/Topographic Correction for Satellite Imagery (Atcor-2/3 User Guide, Version 8.2 Beta). German Aerospace Center, Remote Sensing Data Center, Wessling, 203.
 Wijaya, A. and Gloaguen, R. (2009) Fusion of Alos Palsar and Landsat Etm Data for Land Cover Classification and Biomass Modeling using Non-Linear Methods. IEEE International on Geoscience and Remote Sensing Symposium, Cape Town, III-581.
 Mitchard, E.T.A., Saatchi, S.S., White, L.J.T., Abernethy, K.A., Jeffery, K.J., Lewis, S.L., Collins, M., Lefsky, M.A., Leal, M.E., Woodhouse, I.H., et al. (2012) Mapping Tropical Forest Biomass with Radar and Spaceborne Lidar in Lopé National Park, Gabon: Overcoming Problems of High Biomass and Persistent Cloud. Biogeosciences, 9, 179-191.
 Haralick, R.M., Shanmugam, K. and Dinstein, I. (1973) Textural Features for Image Classification. IEEE Transactions on Systems, Man, and Cybernetics, 3, 610-621.
 Furtado, L.S.A., Silva, T.S.F., Fernandes, P.J.F. and Novo, E.M.L.M. (2015) Land Cover Classification of Lago Grande de Curuai Floodplain (Amazon, Brazil) using Multi-Sensor and Image Fusion Techniques. Acta Amazonica, 45, 195-202.
 Amarsaikhan, D., Blotevogel, H.H., van Genderen, J.L., Ganzorig, M., Gantuya, R. and Nergui, B. (2010) Fusing High-Resolution SAR and Optical Imagery for Improved Urban Land Cover Study and Classification. International Journal of Image and Data Fusion, 1, 83-97.
 Pohl, C. and Van Genderen, J.L. (1998) Multisensor Image Fusion in Remote Sensing: Concepts, Methods and Applications. International Journal of Remote Sensing, 19, 823-854.
 Mountrakis, G., Im, J. and Ogole, C. (2011) Support Vector Machines in Remote Sensing: A Review. ISPRS Journal of Photogrammetry and Remote Sensing, 66, 247-259.
 Paola, J. and Schowengerdt, R. (1995) A Review and Analysis of Backpropagation Neural Networks for Classification of Remotely-Sensed Multi-Spectral Imagery. International Journal of Remote Sensing, 16, 3033-3058.
 Yuan, H., van der Wiele, C.F. and Khorram, S. (2009) An Automated Artificial Neural Network System for Land Use/Land Cover Classification from Landsat TM Imagery. Remote Sensing, 1, 243-265.
 Heinl, M., Walde, J., Tappeiner, G. and Tappeiner, U. (2009) Classifiers vs. Input Variables—The Drivers in Image Classification for Land Cover Mapping. International Journal of Applied Earth Observation and Geoinformation, 11, 423-430.
 Liesenberg, V., Roberto, D.S.F. and Gloaguen, R. (2016) Evaluating Moisture and Geometry Effects on l-Band SAR Classification Performance over a Tropical Rain Forest Environment. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9, 5357-5368.
 Mellor, A., Haywood, A., Stone, C. and Jones, S. (2013) The Performance of Random Forests in an Operational Setting for Large Area Sclerophyll Forest Classification. Remote Sensing, 5, 2838-2856.
 Schuster, C., Foerster, M. and Kleinschmit, B. (2012) Testing the Red Edge Channel for Improving Land-Use Classifications Based on High Resolution Multi-Spectral Satellite Data. International Journal of Remote Sensing, 33, 5583-5599.
 Wilkinson, G.G. (2005) Results and Implications of a Study of Fifteen Years of Satellite Image Classification Experiments. IEEE Transactions on Geoscience and Remote Sensing, 43, 433-440.