Recognition of Bangla Handwritten Number Using Combination of PCA and FIS with the Aid of DWT


1. Introduction

A large body of recent literature addresses object detection and recognition using machine learning. In this section, we review some works that apply DWT, FIS and PCA to image classification or identification. Nawaf Hazim Barnouti et al. discussed a combination of DWT and DCT for embedding and extracting copyright information by digital watermarking in [1]. The combined DWT + DCT method is applied to two-dimensional images and works in the frequency domain, which appears to be more robust according to the results reported there. Pooja B. Minajagi et al. proposed a method for segmentation of brain MRI images using fuzzy c-means clustering (FCM) and DWT in [2]. The paper provides level set segmentation using fuzzy c-means based on special features (SFCM) and segmentation of brain MRI images using the DWT algorithm. The performance is evaluated by computing the mean square error, peak signal-to-noise ratio (PSNR), maximum difference, absolute mean error, etc.

Another work on the combination of DWT and DCT for digital watermarking of color images is found in [3] by Ravinder Singh et al. The watermarking technique of that paper selects one color component of the RGB image and is suitable for embedded watermarking, since it requires only the red component, as discussed in [3]. T. Sridevi et al. implemented robust watermarking using a fuzzy logic approach based on the DWT and SVD algorithms in [4]. The fuzzy logic decides how much of the watermark is added to the cover image, based on the image properties, as shown in [4].

A classifier based on fuzzy if-then rules that allows the incorporation of weighted training patterns is proposed in [5]. The antecedent part of each fuzzy if-then rule is specified by partitioning each attribute into fuzzy sets, while the consequent class and the degree of certainty are determined from the compatibility and weights of the training patterns. A learning method that adjusts the degree of certainty to improve classification performance and reduce costs is also introduced in [5]. In [6], fuzzy logic-based image processing is used for accurate and noise-free edge detection, and Cellular Learning Automata (CLA) are used to enhance the previously detected edges with the help of the repeatable and neighborhood-considering nature of CLA. The results of different edge detection techniques are compared with the fuzzy-detected edges, and the resulting edges are enhanced using CLA. The authors of [7] apply fuzzy logic to the automatic analysis of X-ray images of industrial products for defect detection. A two-stage algorithm is presented, based on feature analysis of the radiographic images obtained from the inspected product.

In [8], the authors developed a semi-automated fuzzy inference system to detect the internal architecture of a mass transport complex (MTC) in seismic images. The characteristics of an MTC were expressed as fuzzy if-then rules consisting of linguistic values associated with fuzzy membership functions. Ashwini B Yaragall et al. in [9] discussed handwritten character recognition using deep learning, where a Convolutional Neural Network (CNN) is used to train a model and Long Short-Term Memory (LSTM) networks are used to construct bounding boxes for each character. Chandrika Saha et al. in [10] proposed a Deep Convolutional Neural Network (DCNN) to recognize Bangla handwritten digits. The authors used a seven-layer DCNN containing three convolution layers, three average pooling layers and one fully connected layer. In this paper we avoid CNNs to reduce the processing time of recognition. None of the above papers used the combination of DWT and FIS. The spectral components of images of Bangla digits obtained from the DWT appear randomly distributed on their scatterplot: many points corresponding to one class of image lie in the middle of the region of another class. The support vector machine (SVM) and k-means clustering therefore fail to identify an image from its spectral data. We found that the DWT + FIS combination, unlike SVM and k-means, achieves a reasonable recognition accuracy. Applications of PCA in object detection are also found in recent literature; for example, the upper part of the human body is detected by PCA in [11]. There, the performance of PCA is compared with HOG, BDPCA and Haar cascade, and no combined scheme is used. In [12], the authors claim that CNNs have an inherent over-fitting problem, and to overcome it they combined PCA with a CNN for object detection and recognition in a robot-aided visual system. Analyzing the previous works mentioned here, we found two research gaps. First, none of the above papers matches the DWT spectrum with FIS and PCA in object recognition. Second, as found in this paper, the DWT is poorly matched with k-means clustering and SVM in object recognition. This paper applies DWT + FIS + PCA for the first time to recognize handwritten Bangla digits, including image enhancement and morphological operations. We successfully recognize Bangla handwritten digits and obtain better recognition accuracy than some previous works, as reported at the end of the results section.

The rest of the paper is organized as follows: Section 2 presents the basic theory of DWT, FIS and PCA along with the experimental setup for object recognition, and Section 3 provides results based on the analysis of Section 2. Finally, Section 4 concludes the paper.

2. Methodology

In this paper, image recognition is performed using DWT, FIS and PCA. This section covers the basic theory of the wavelet transform, FIS and PCA, followed by the experimental setup that combines the three methods.

2.1. Wavelet Transform

A wavelet is an oscillatory function of effectively finite duration. If the sinusoidal wave $y\left(t\right)=A\mathrm{cos}{\omega}_{c}t$ is modulated by a smooth Gaussian window function $g\left(t\right)={\text{e}}^{-{t}^{2}}$, the modulated wave $\psi \left(t\right)=g\left(t\right)y\left(t\right)=A{\text{e}}^{-{t}^{2}}\mathrm{cos}\left({\omega}_{c}t\right)$ is a wavelet defined over the interval $\left(-\infty ,\infty \right)$, but almost all of its energy is confined within a small interval.
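The energy confinement of the modulated wave can be checked numerically. The following is a minimal sketch, assuming A = 1 and ω_c = 5 (illustrative values not given in the text) and using a simple midpoint rule; the function names are hypothetical.

```python
import math


def psi(t, amplitude=1.0, omega_c=5.0):
    """Gaussian-windowed cosine: A * exp(-t^2) * cos(omega_c * t)."""
    return amplitude * math.exp(-t * t) * math.cos(omega_c * t)


def energy(lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of psi(t)^2 over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(psi(lo + (i + 0.5) * h) ** 2 for i in range(steps)) * h


# [-10, 10] stands in for the whole real line (the tails beyond it are negligible).
total_energy = energy(-10.0, 10.0)
core_energy = energy(-2.0, 2.0)
fraction = core_energy / total_energy  # share of the energy inside [-2, 2]
```

With these assumed parameters, nearly all of the energy falls inside [-2, 2], which is the sense in which the wavelet has effectively finite duration.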

The continuous-time wavelet transform (CWT) of an integrable function y(t) is expressed as [13] [14],

$W\left(a,b\right)={\displaystyle {\int}_{-\infty}^{\infty}y\left(t\right)\frac{1}{\sqrt{\left|a\right|}}{\psi}^{*}\left(\frac{t-b}{a}\right)\text{d}t}$ (1)

where a and b are the real-valued scaling and shifting parameters, respectively, and * denotes complex conjugation.

If the scaling (a) and shifting (b) parameters are chosen as powers of two, the analysis becomes much more efficient while remaining comparably accurate to the CWT. This form of the wavelet transform is called the discrete wavelet transform (DWT), expressed as,

$d\left(m,n\right)=\frac{1}{{2}^{m}}{\displaystyle {\int}_{{2}^{m}n}^{{2}^{m}\left(n+1\right)}y\left(t\right)\psi \left({2}^{-m}t-n\right)\text{d}t}$ (2)

Here d(m, n) corresponds to the continuous wavelet transform W(a, b) evaluated at a = 2^{m} and b = n2^{m}.
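The correspondence can be made explicit by substituting the dyadic grid into Equation (1). The short derivation below is only a sketch under two assumptions that the text does not state: the wavelet is real-valued (so the conjugation can be dropped), and it is supported on the unit interval [0, 1), as for the Haar wavelet, which produces the finite integration limits.

$W\left({2}^{m},n{2}^{m}\right)=\frac{1}{\sqrt{{2}^{m}}}{\displaystyle {\int}_{-\infty}^{\infty}y\left(t\right)\psi \left(\frac{t-n{2}^{m}}{{2}^{m}}\right)\text{d}t}={2}^{-m/2}{\displaystyle {\int}_{{2}^{m}n}^{{2}^{m}\left(n+1\right)}y\left(t\right)\psi \left({2}^{-m}t-n\right)\text{d}t}$

The argument of ψ and the integration limits agree with Equation (2). The prefactor obtained here is $2^{-m/2}$, whereas Equation (2) uses $2^{-m}$; the two differ only by a constant that depends on the level m, so the relative sizes of the coefficients within a given level are unaffected.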

2.2. Fuzzy Inference System

An FIS is a nonlinear mapping, by means of fuzzy logic, from a given set of input values to one or more output values. It takes the inputs and processes them according to pre-specified rules to produce the expected outputs. Fuzzy rules and fuzzy arithmetic are used in the internal processing, while real (crisp) values are used at both the input and the output. The basic structure of a fuzzy inference system consists of a set of conceptual components, as shown in Figure 1 and described in [15] [16].
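The flow from crisp input through fuzzy rules to crisp output can be illustrated with a very small example. The sketch below assumes a single-input, zero-order Sugeno-type system with triangular membership functions; the fuzzy sets, the rule base and all names are hypothetical and are not the system used in this paper, which takes four inputs.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets for one normalized input in [0, 1].
FUZZY_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

# Hypothetical rule base: IF x is <set> THEN output is <constant>.
RULES = {"low": 0.0, "medium": 1.0, "high": 2.0}


def infer(x):
    """Fuzzify x, fire every rule, and defuzzify by weighted average."""
    strengths = {name: tri(x, *abc) for name, abc in FUZZY_SETS.items()}
    total = sum(strengths.values())
    if total == 0.0:
        raise ValueError("input lies outside the support of all fuzzy sets")
    return sum(strengths[name] * RULES[name] for name in RULES) / total
```

For instance, `infer(0.25)` fires the "low" and "medium" rules equally and returns the midpoint of their outputs, which shows how a crisp input is mapped smoothly to a crisp output.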

2.3. Principal Component Analysis

PCA is widely used in object recognition or detection to reduce the number of variables. In this paper, the feature vector derived from the DWT is passed to PCA to enhance the accuracy of object recognition. The steps of the PCA algorithm are given below, following [17] [18].

1) Let the feature vectors of the images derived from DWT be ${F}_{\text{1}},{F}_{\text{2}},{F}_{\text{3}},\cdots ,{F}_{M}$, each of size 1 × N.

2) The average of the feature vectors is $\Phi =\frac{1}{M}{\displaystyle \underset{i=1}{\overset{M}{\sum}}{F}_{i}}$ and the difference vectors are ${D}_{i}={F}_{i}-\Phi $ ; $i=1,2,3,\cdots ,M$.

3) The covariance matrix is evaluated as: $C=\frac{1}{M}{\displaystyle \underset{i=1}{\overset{M}{\sum}}{D}_{i}^{\text{T}}{D}_{i}}$.

4) The M orthogonal eigenvectors U_{k}, where ${U}_{k}^{\text{T}}{U}_{j}={\delta}_{k,j}$, and the corresponding eigenvalues λ_{k} are computed from the covariance matrix C; they indicate the principal components of the data.

Figure 1. Structure of fuzzy inference system.

5) Let us now select a new test image and determine its feature vector F_{t}. The projection of F_{t} onto the l-th eigenvector, ${\omega}_{l}={U}_{l}^{\text{T}}{\left({F}_{t}-\Phi \right)}^{\text{T}}={U}_{l}^{\text{T}}{D}_{t}^{\text{T}}$, is called the weight of the test image along eigenvector l. Let us define the weight vector ${\Omega}_{t}=\left[\begin{array}{cccc}{\omega}_{1}& {\omega}_{2}& \cdots & {\omega}_{k}\end{array}\right]$, where we consider the k eigenvectors corresponding to the k largest eigenvalues.

6) If ${\Omega}_{i}$ is the weight vector of the ith training image, the Euclidean distance ${\epsilon}_{i}=\Vert {\Omega}_{i}-{\Omega}_{t}\Vert $ is measured; if the distance is less than a threshold value θ, the test image is assigned to the category of the ith training image.

7) If the distance is greater than θ, check the other categories of objects by repeating the previous steps.
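The steps above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the implementation used in this paper: the eigenvectors are found by power iteration with deflation (a choice made here to avoid external libraries), the test image is assigned to the nearest training image if that distance is below θ, and the function names and the toy data in the usage note are hypothetical.

```python
import math


def pca_fit(F, k, iters=500):
    """Return the mean vector and the k leading eigenvectors for row vectors F."""
    M, N = len(F), len(F[0])
    phi = [sum(col) / M for col in zip(*F)]                    # step 2: mean
    D = [[f[p] - phi[p] for p in range(N)] for f in F]         # step 2: differences
    C = [[sum(d[p] * d[q] for d in D) / M for q in range(N)]   # step 3: covariance
         for p in range(N)]
    U = []
    for j in range(k):                                         # step 4: eigenvectors
        v = [1.0 / (p + j + 1) for p in range(N)]              # arbitrary start vector
        for _ in range(iters):                                 # power iteration
            w = [sum(C[p][q] * v[q] for q in range(N)) for p in range(N)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                raise ValueError("fewer than k non-zero eigenvalues")
            v = [x / norm for x in w]
        lam = sum(v[p] * C[p][q] * v[q] for p in range(N) for q in range(N))
        U.append(v)
        for p in range(N):                                     # deflate found component
            for q in range(N):
                C[p][q] -= lam * v[p] * v[q]
    return phi, U


def weights(f, phi, U):
    """Step 5: project the mean-removed vector onto each eigenvector."""
    d = [a - b for a, b in zip(f, phi)]
    return [sum(u_p * d_p for u_p, d_p in zip(u, d)) for u in U]


def classify(f_t, F, labels, phi, U, theta):
    """Steps 6-7: label of the nearest training vector, or None if beyond theta."""
    omega_t = weights(f_t, phi, U)
    best_label, best_dist = None, float("inf")
    for f_i, label in zip(F, labels):
        omega_i = weights(f_i, phi, U)
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(omega_i, omega_t)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist < theta else None
```

A usage example with made-up four-element vectors: fitting with `pca_fit(train, k=2)` on two samples of a class "A" and two of a class "B", then calling `classify` on a vector close to the "A" samples, returns "A", while a vector far from all training samples returns `None`.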

2.4. Implementation

The preprocessing of an image consists of RGB to grayscale conversion, de-noising with a filter, enhancement and finally a thinning scheme, as shown in Figure 2. The signal vector of the image is extracted using row-wise and column-wise DWT; the corresponding algorithm is given in subsection 2.5. Each row of the preprocessed image is applied to the filter bank of Figure 3, which consists of a lowpass (LP) filter of impulse response h(n) and a highpass (HP) filter of impulse response g(n), as in [19]. Each filtered signal is downsampled by a factor of two, hence the signal vector at the output of the sampler is half the length of its input. The HP filter generates the detail component and the LP filter provides the approximate component. The approximate component is further split into approximate and detail components.
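The filter-bank decomposition described above can be sketched as follows. The example assumes the Haar filter pair (the shortest Daubechies filter), which is a choice made here for brevity; the paper does not fix a particular wavelet at this point, and the function names are hypothetical.

```python
import math


def analysis_step(x):
    """One level of a two-channel Haar filter bank with downsampling by two.

    Returns (approximate, detail): the lowpass and highpass outputs, each half
    the length of the input.
    """
    if len(x) % 2:
        raise ValueError("input length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (x[2 * k] + x[2 * k + 1]) for k in range(len(x) // 2)]
    detail = [s * (x[2 * k] - x[2 * k + 1]) for k in range(len(x) // 2)]
    return approx, detail


def decompose(x, levels):
    """Repeatedly split the approximate component, collecting the details."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = analysis_step(approx)
        details.append(d)
    return approx, details
```

For the input `[4.0, 6.0, 10.0, 12.0]`, two levels leave a single approximate coefficient of 16 and a second-level detail coefficient of -6; the sum of squares of all output coefficients equals that of the input, as expected for an orthogonal filter bank.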

One-dimensional DWT is applied on each row of the preprocessed image until it is reduced to one element. The single elements from all rows form a signal vector, to which one-dimensional DWT is applied again until we obtain a column matrix V = [a b c d]^{T} of size 4 × 1, which contains the low-frequency components of the image. The numerical values of the elements of V are shown in Table 1 for nine samples of “0”, “1” and “2”. The scatterplots of a-b and c-d are shown in Figure 4(a) and Figure 4(b). The regions of digits “0”, “1” and “2” on a-b and c-d using k-means clustering are shown in Figure 4(c) and Figure 4(d). The corresponding scatterplots using SVM are shown in Figure 4(e) and Figure 4(f).

Figure 2. Experimental setup of image recognition.

Figure 3. Decomposition of signal under filter bank.

2.5. Proposed Algorithm of Extracting Four Lowest Spectral Components

M is the binary image of size N × N after preprocessing

Select an orthogonal LP and HP filter bank (for example, a Daubechies wavelet filter)

for i = 0:N-1,

{ y = M(i, :); % ith row of the image matrix, M

while length(y) > 1

{ u = approximate component of the DWT of y, i.e. the output of the LP filter

y = u downsampled by a factor of two, as in Figure 3 }

r(i) = y; % single element derived from the ith row

}

Apply DWT on the vector r = [r(0) r(1) r(2) … r(N-1)], m times

Plot the resultant vector of length N/2^{m}

Table 1. Co-efficients of DWT.
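The procedure of this subsection can be sketched in standard-library Python as below. It is a hedged illustration, not the authors' Matlab code: it assumes the Haar lowpass filter as the orthogonal filter (the text only gives a Daubechies filter as an example), an N × N image with N a power of two, and hypothetical function names.

```python
import math


def haar_approx(y):
    """Lowpass (approximate) output of one Haar DWT level, downsampled by two."""
    s = 1.0 / math.sqrt(2.0)
    return [s * (y[2 * k] + y[2 * k + 1]) for k in range(len(y) // 2)]


def row_to_scalar(row):
    """Apply the DWT to a row repeatedly until a single element remains."""
    y = list(row)
    while len(y) > 1:
        y = haar_approx(y)
    return y[0]


def extract_features(image, m):
    """Reduce each row to one element, then apply the DWT m times to the column.

    `image` is a list of N rows of N numbers; the result has length N / 2**m.
    """
    n = len(image)
    if n == 0 or n & (n - 1) or any(len(row) != n for row in image):
        raise ValueError("image must be N x N with N a power of two")
    if 2 ** m > n:
        raise ValueError("m is too large for this image size")
    r = [row_to_scalar(row) for row in image]  # one element per row
    for _ in range(m):                         # column-wise DWT, m levels
        r = haar_approx(r)
    return r
```

For example, an 8 × 8 image with m = 1 yields a vector of length 4, matching the length N/2^{m} stated in the last step of the algorithm.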

Both the k-means clustering and SVM algorithms fail to separate the digits into three distinct regions, i.e. the algorithms fail to match the profile of the spectral components because of the random location of the points, as visualized in Figures 4(c)-(f). We obtain better matching using FIS and PCA, as highlighted in the next section.

3. Results

A few images of Bangla handwritten numerals (before preprocessing) are shown in Figure 5, taken from a benchmark Indian database (Character Databases of Indic Scripts). The URL of the database is https://www.isical.ac.in/~ujjwal/download/database.html (accessed on 30 December 2018). The original image, the enhanced image, the image after the thinning scheme and the result of the proposed algorithm are shown in Figure 6 for four images of each character taken randomly from the database. Here we resize each grayscale image to 256 × 256 and apply DWT on each row of the image until we obtain a single value for each row. The output of the DWT is then a column vector of size 256 × 1. Next we apply DWT on this column vector 3 times, so the size of the feature vector becomes 256/2^{3} = 32, as mentioned in subsection 2.5.

Figure 4. Scatterplot of spectral components. (a) Scatterplot of a - b; (b) Scatterplot of c - d; (c) Region of digits on a - b using K-means; (d) Region of digits on c - d using K-means; (e) Region of a - b using SVM; (f) Region of c - d using SVM.

Figure 5. A few numerical characters before preprocessing.

Figure 6. Output of proposed algorithm.

Taking the one-dimensional DWT coefficient vector of length 16, we get the profiles shown in Figure 7. Here we consider only five images of the digits 1, 2, 3, 4 and 5. Each digit reveals a distinct feature. Reducing the vector to length four, we get the data of Table 1 for digits 1, 2 and 3. Here each vector is presented as V = [a b c d]. We next apply FIS to the data of Table 1. The basic diagram of the FIS, the signal flow diagram and a few fuzzy rules are shown in Figures 8(a)-(c), respectively, as in [20]. The variation of the variables a, b, c and d of the input vector V = [a b c d] is shown in Figure 9. Verification of the input-output relation of the FIS is shown in Figure 10 for four examples: (V = [0.941 1 0.996 0.93], output = 2), (V = [0.803 1 0.726 0.849], output = 1), (V = [0.976 0.849 1 0.99], output = 0) and (V = [0.9428 1 0.7336 0.8893], output = 2).

Figure 7. Profile of coefficient vectors of DWT.


Figure 8. FIS model of digit recognition. (a) Basic FIS; (b) Signal flow; (c) Fuzzy rules of the system.

Figure 9. Surface plot of coefficients of DWT.

Figure 10. Verification of Fuzzy input and output of the FIS. (a) V = [0.941 1 0.996 0.93]; (b) V = [0.803 1 0.726 0.849]; (c) V = [0.976 0.849 1 0.99]; (d) V = [0.9428 1 0.7336 0.8893].

The combination of DWT and PCA is also well matched for object detection, as found in this paper. Taking the DWT coefficients of Table 1 for three digits, 0 (object-1), 1 (object-2) and 3 (object-3), we determine four principal components of each object, as shown in Figures 11(a)-(d) separately. Each of the four principal components is widely separated across the objects and shows better separation than Figure 7; hence the combination of DWT and PCA works better than DWT alone.

The impact of the size of the DWT vector V and the size of the preprocessed image on recognition accuracy is shown in Table 2. The recognition accuracy for ten objects (Bangla digits) is determined by four techniques: DWT of [21], PCA + DWT under the concept of [22] [23], FIS + DWT using the technique of [24], and FIS + PCA + DWT as the proposed method. The accuracy increases with the size of vector V and of the image in all four cases. PCA + DWT shows better results than FIS + DWT for larger V or image size. The combination of the three schemes outperforms the other three cases of Table 2. The three techniques of object recognition are combined using the entropy-based algorithm of [25]. We worked on a machine with an Intel(R) Core(TM) i7-8550U processor and 8.00 GB of RAM, and used Matlab 18. For an image size of 64 × 64, the process time per object is 500 ms for DWT, 1.8 s for PCA + DWT, 2.7 s for FIS + DWT and 4.6 s for FIS + PCA + DWT.

Figure 11. Profile of PCA of three objects taking spectrum of DWT as the input. (a) First principal components; (b) Second principal components; (c) Third principal components; (d) Fourth principal components.

Table 2. Comparison of recognition accuracy.

4. Conclusion

In this paper, we recognize Bangla handwritten digits using a combination of PCA and FIS, taking the DWT feature vector as input. We compare the results of the proposed method with some previous works on object recognition and obtain better recognition accuracy. One limitation of the paper is that we did not compare the process time or complexity of the algorithms. In future, we will combine more object recognition algorithms to recognize Bangla vowels, consonants and digits all together. The concept of the paper is applicable to any kind of object detection/recognition, although the recognition accuracy may vary with the type of object and the quality of the image. Inclusion of DWT reduces the memory needed to store the database of images. There is still scope to use other object recognition algorithms such as PCA, LDA, SURF, HOG and CNN for comparison in terms of recognition accuracy and process time, so that an appropriate algorithm can be selected for real-time computer vision.

References

[1] Barnouti, N.H., Sabri, Z.S. and Hameed, K.L. (2018) Digital Watermarking Based on DWT (Discrete Wavelet Transform) and DCT (Discrete Cosine Transform). International Journal of Engineering & Technology, 7, 4825-4829.

[2] Minajagi, P.B. and Goudar, R.H. (2016) Segmentation of Brain MRI Images Using Fuzzy C-Means and DWT. International Journal of Science Technology & Engineering, 2, 370-378.

[3] Singh, R., Mathuria, M., Rathore, K. and Kumar, S. (2014) A Robust Color Image Watermarking Using Combination of DWT and DCT. IJCA Proceedings on 4th International IT Summit Confluence 2013 the Next Generation Information Technology Summit, Confluence 2013, January 2014, Vol. 1, 11-14.

[4] Sridevi, T. and Fatima, S.S. (2013) Digital Image Watermarking Using Fuzzy Logic Approach Based on DWT and SVD. International Journal of Computer Applications, 74, 16-20.

https://doi.org/10.5120/12945-0014

[5] Nakashima, T., Schaefer, G., Yokota, Y. and Ishibuchi, H. (2007) A Weighted Fuzzy Classifier and Its Application to Image Processing Tasks. Fuzzy Sets and Systems, 158, 284-294.

https://doi.org/10.1016/j.fss.2006.10.011

[6] Patel, D.K. and More, S.A. (2013) Edge Detection Technique by Fuzzy Logic and Cellular Learning Automata Using Fuzzy Image Processing. International Conference on Computer Communication and Informatics, Coimbatore, 4-6 January 2013.

https://doi.org/10.1109/ICCCI.2013.6466130

[7] Amza, C.G. and Cicic, D.T. (2014) Industrial Image Processing Using Fuzzy-Logic. 25th DAAAM International Symposium on Intelligent Manufacturing and Automation, Vienna, 26-29 November 2014, 492-498.

https://doi.org/10.1016/j.proeng.2015.01.404

[8] Orozco-del-Castillo, M.G., Ortiz-Aleman, C., Urrutia-Fucugauchi, J. and Castellanos, A.R. (2011) Fuzzy Logic and Image Processing Techniques for the Interpretation of Seismic Data. Journal of Geophysics and Engineering, 8, 185-194.

https://doi.org/10.1088/1742-2132/8/2/006

[9] Yaragall, A.B., Bhoomika, N., Krithika, M.S., Reddy, N.T. and Rekha, M.M. (2019) Handwritten Character Recognition. International Research Journal of Computer Science, 6, 126-128.

[10] Saha, C., Faisal, R.H. and Rahman, M.M. (2019) Bangla Handwritten Digit Recognition Using an Improved Deep Convolutional Neural Network Architecture. International Conference on Electrical, Computer and Communication Engineering, Cox’s Bazar, 7-9 February 2019, 1-6.

https://doi.org/10.1109/ECACE.2019.8679309

[11] Nguyen, B.T.H., Tran, H.V. and Bui, N.D. (2018) Human Upper Body Detection in Still Images Based on Extended PCA. 2018 Joint 7th International Conference on Informatics, Electronics & Vision and 2018 2nd International Conference on Imaging, Vision & Pattern Recognition, Kitakyushu, 25-29 June 2018, 14-18.

https://doi.org/10.1109/ICIEV.2018.8641014

[12] Xia, C., Zhang, Y., Zhang, P., Zheng, C.Q.R. and Liu, S. (2017) Multi-RPN Fusion-Based Sparse PCA-CNN Approach to Object Detection and Recognition for Robot-Aided Visual System. The 7th Annual IEEE International Conference on Cyber Technology in Automation, Control and Intelligent Systems, 31 July-4 August 2017, 394-399.

https://doi.org/10.1109/CYBER.2017.8446491

[13] Kalia, S., Joshi, A. and Agrawal, A. (2019) PAPR Analysis of IFFT and DWT Based OFDM-IM System. International Conference on Signal Processing and Communication, Noida, 7-9 March 2019.

https://doi.org/10.1109/ICSC45622.2019.8938217

[14] Feng, D. and Chen, L. (2019) A Blind Image Information Hiding Algorithm in the HSI Color Space Based on BEMD and DWT. 2019 International Conference on Communications, Information System and Computer Engineering, Haikou, 5-7 July 2019.

https://doi.org/10.1109/CISCE.2019.00080

[15] Ontiveros-Robles, E., Melin, P. and Castillo, O. (2019) Relevance of Polynomial Order in Takagi-Sugeno Fuzzy Inference Systems Applied in Diagnosis Problems. 2019 IEEE International Conference on Fuzzy Systems, New Orleans, 23-26 June 2019.

https://doi.org/10.1109/FUZZ-IEEE.2019.8859028

[16] Priyadarshi, H., Padmanaban, S., Holm-Nielsen, J.B., Ramachandaramurthy, V. and Bhaskar, M.S. (2019) An Adaptive Neuro-Fuzzy Inference System Employed Cuk Converter for PV Applications. 2019 IEEE 13th International Conference on Compatibility, Power Electronics and Power Engineering, Sonderborg, 23-25 April 2019.

https://doi.org/10.1109/CPE.2019.8862398

[17] Panna, M.B. and Islam, M.I. (2019) Human Face Detection Based on Combination of Linear Regression, PCA and Fuzzy C-Means Clustering. International Journal of Computer Science and Information Security, 17, 57-62.

[18] Ahammad, B., Rozario, R.J., Majumder, A. and Islam, M.I. (2018) Combination of SVM, LDA, PCA and Linear Regression under Fuzzy System in Human Face Recognition. International Journal of Engineering & Technology, 7, 6970-6976.

[19] Swamy, K.V., Sravani, N. and Radhika, V. (2019) CBIR Using Multi-Level DWT. 2019 IEEE International Conference on Electrical, Computer and Communication Technologies, Coimbatore, 20-22 February 2019.

https://doi.org/10.1109/ICECCT.2019.8869505

[20] Madrid-Herrera, L., Chacon-Murguia, M.I., Posada-Urrutia, D.A. and Ramirez-Quintana, J.A. (2019) Human Image Complexity Analysis Using a Fuzzy Inference System. 2019 IEEE International Conference on Fuzzy Systems, New Orleans, 23-26 June 2019.

https://doi.org/10.1109/FUZZ-IEEE.2019.8858966

[21] Elakkiya, S. and Audithan, S. (2014) Feature Based Object Recognition Using Discrete Wavelet Transform. Second International Conference on Current Trends in Engineering and Technology, Coimbatore, 8 July 2014, 393-396.

https://doi.org/10.1109/ICCTET.2014.6966323

[22] Patil, J.P., Nayak, C. and Jain, M. (2015) Palmprint Recognition Using DWT, DCT and PCA Techniques. 2015 IEEE International Conference on Computational Intelligence and Computing Research, Madurai, 10-12 December 2015, 1-5.

https://doi.org/10.1109/ICCIC.2015.7435677

[23] Mahajan, A.S.B. and Karande, K.J. (2015) PCA and DWT Based Multimodal Biometric Recognition System. 2015 International Conference on Pervasive Computing, Pune, 8-10 January 2015, 1-4.

https://doi.org/10.1109/PERVASIVE.2015.7087185

[24] Kuspijani, K., Watiasih, R. and Prihastono, P. (2020) Fault Classification of Induction Motor Using Discrete Wavelet Transform and Fuzzy Inference System. 2020 International Conference on Smart Technology and Applications, Surabaya, 20 February 2020, 1-6.

https://doi.org/10.1109/ICoSTA48221.2020.1570615773

[25] Tabassum, F., Islam, M.I., Khan, R.T. and Amin, M.R. (2020) Human Face Recognition with Combination of DWT and Machine Learning. Journal of King Saud University—Computer and Information Sciences.

https://doi.org/10.1016/j.jksuci.2020.02.002