Image Super-Resolution Reconstruction Based on MCA and PCA Dimension Reduction


1. Introduction

With the improvement of living standards, the demand for high-quality images has become increasingly urgent. For a digital image, spatial resolution and signal-to-noise ratio are important measures of quality: high resolution and a low noise level are the two basic requirements of a high-quality image. In practical applications, because of the limitations of imaging devices (such as cameras), the acquired images are often of poor quality, for example low-resolution or noisy, so obtaining high-resolution images from low-quality ones has become a research focus. High-resolution images are widely used in computer vision, medical imaging, video surveillance and satellite imaging [1]. Since Tsai and Huang [2] first posed the super-resolution reconstruction [3] problem in 1984, many super-resolution reconstruction methods have been proposed. They fall mainly into three categories: interpolation-based, reconstruction-based and learning-based approaches [4]. Compared with the first two, learning-based methods not only emphasize understanding of the content and structure of the image but also exploit prior knowledge about the image.

Yang et al. [5] proposed a learning-based algorithm built on compressed sensing [6], which learns a pair of high- and low-resolution dictionaries D_{h} and D_{l} directly and exploits the learned relationship between them to recover high-resolution images. The image quality of this algorithm is good, but it is strongly affected by the training samples: training is slow, the reconstruction quality depends heavily on the choice of training samples, and the characteristics of the input image are not considered. On the basis of Yang's work, this paper proposes a new image super-resolution reconstruction algorithm based on MCA [7] [8] and dictionary learning [9] [10]. First, the MCA method decomposes the low-resolution image into a structure part and a texture part, and an over-complete dictionary is trained on the low-resolution texture images. Because texture images are complex, the texture part is reconstructed with a super-resolution method based on sparse representation [11] [12]. In the feature extraction step of the dictionary training phase, the second derivative is combined with the gradient direction; in the dimension reduction step, two-dimensional principal component analysis (2DPCA) [13] [14] is used, and the K-SVD [15] [16] algorithm trains the over-complete dictionary used to reconstruct the texture image. The structure part of the image is relatively flat, so Bicubic interpolation can restore the high-resolution edge information well. Finally, the reconstructed texture and structure images are superimposed to obtain the final high-resolution image. The experimental results show that, compared with the traditional method and Yang's method, the proposed algorithm not only improves the convergence speed and robustness but also improves the quality of the reconstructed image.

2. MCA Image Decomposition Algorithm

The main idea of MCA is to use the morphological diversity of the different features of the image to obtain the optimal sparse representation of each morphology.

Let the image signal X to be processed contain M different signals, that is, M distinct layers {X_{i}}, $i=1,2,\cdots ,M$, with $X={X}_{1}+{X}_{2}+\cdots +{X}_{M}$: the M layers are superimposed to form the original image X. The decomposition effect of MCA is shown in Figure 1.

MCA assumes that the morphologies of the layers differ: each layer can be sparsely represented by its corresponding dictionary T_{i} but not by the dictionaries T_{j} (j ≠ i) of the other layers, which makes separation of the image possible. For a low-resolution image X with R pixels, X is a linear combination of two different parts, the texture part X_{t} and the structure part X_{s}, where X_{t} corresponds to the high-frequency detail and X_{s} to the low-frequency smooth content,

$X={X}_{t}+{X}_{s}$ (1)

To separate the low-resolution image X into the texture part X_{t} and the structure part X_{s}, MCA theory assumes that each part has a sparse representation over a given dictionary ${T}_{t},{T}_{s}\in {M}^{R\times L}$, so that:

${X}_{t}={T}_{t}{\alpha}_{t}$ (2)

${X}_{s}={T}_{s}{\alpha}_{s}$ (3)

Figure 1. The result of MCA decomposition. (a) The original image; (b) the texture part of the original image; (c) the structure part of the original image.

where ${\alpha}_{t}$ and ${\alpha}_{s}$ are the sparse representation coefficients in the corresponding dictionaries. Since the low-resolution image X contains both texture and structure, a joint dictionary and an optimal sparse representation are required. The optimal sparse representation of X in the joint dictionary is:

$\{{\alpha}_{t}^{opt},{\alpha}_{s}^{opt}\}=\mathrm{arg}\underset{\left\{{\alpha}_{t},{\alpha}_{s}\right\}}{\mathrm{min}}{\Vert {\alpha}_{t}\Vert}_{1}+{\Vert {\alpha}_{s}\Vert}_{1}$ s.t. $X={T}_{t}{\alpha}_{t}+{T}_{s}{\alpha}_{s}$ (4)

An image containing noise cannot be cleanly divided into sparse texture and structure layers; the part of the image content that cannot be sparsely represented, the residual, must be handled in another way. The usual practice is to use a different norm for different types of noise: if the residual resembles zero-mean white Gaussian noise, the ${l}_{2}$ norm is chosen for the error term, while uniform noise is handled with the infinity norm. With this treatment, the decomposition takes the effect of noise on the image into account, and the optimization becomes:

$\{{\alpha}_{t}^{opt},{\alpha}_{s}^{opt}\}=\mathrm{arg}\underset{\left\{{\alpha}_{t},{\alpha}_{s}\right\}}{\mathrm{min}}{\Vert {\alpha}_{t}\Vert}_{1}+{\Vert {\alpha}_{s}\Vert}_{1}+\lambda {\Vert X-{T}_{t}{\alpha}_{t}-{T}_{s}{\alpha}_{s}\Vert}_{2}^{2}$ (5)
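Problems of the form of Equation (5) can be solved with simple proximal-gradient iterations. The following is a minimal ISTA sketch in NumPy, assuming the texture and structure dictionaries are stacked into one matrix D whose coefficient vector concatenates α_t and α_s; the function names and step-size choice are illustrative, not the paper's implementation.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm: shrink each entry toward zero by t
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, D, lam, n_iter=200):
    """Minimize lam*||X - D a||_2^2 + ||a||_1 by iterative
    shrinkage-thresholding (the form of Equation (5) with the
    two dictionaries stacked into one D)."""
    L = 2.0 * lam * np.linalg.norm(D, 2) ** 2   # Lipschitz constant of the smooth term
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = -2.0 * lam * D.T @ (X - D @ a)   # gradient of the data-fit term
        a = soft_threshold(a - grad / L, 1.0 / L)
    return a
```

Splitting the recovered coefficient vector back into its α_t and α_s halves then gives the texture and structure estimates T_t α_t and T_s α_s.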

In addition, the total variation (TV) method can also be used to assist the sparsity-based decomposition. TV works well for recovering smooth targets with significant edges, such as the structure layer. After introducing the TV term, the MCA decomposition is optimized as:

$\{{\alpha}_{t}^{opt},{\alpha}_{s}^{opt}\}=\mathrm{arg}\underset{\left\{{\alpha}_{t},{\alpha}_{s}\right\}}{\mathrm{min}}{\Vert {\alpha}_{t}\Vert}_{1}+{\Vert {\alpha}_{s}\Vert}_{1}+\lambda {\Vert X-{T}_{t}{\alpha}_{t}-{T}_{s}{\alpha}_{s}\Vert}_{2}^{2}+\lambda TV\left\{{T}_{s}{\alpha}_{s}\right\}$ (6)

For an image I, the TV of I is the ${l}_{1}$ norm of the image gradient:

$TV\left(I\right)={\displaystyle \underset{x,y}{\sum}\left|gradient\left(I\left(x,y\right)\right)\right|}$ (7)
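Equation (7) amounts to summing absolute finite differences over the image. A minimal NumPy sketch of one common discrete (anisotropic) choice, with an illustrative function name:

```python
import numpy as np

def total_variation(I):
    """Anisotropic discrete TV: l1 norm of the image gradient (Equation (7))."""
    dx = np.diff(I, axis=1)   # horizontal finite differences
    dy = np.diff(I, axis=0)   # vertical finite differences
    return np.abs(dx).sum() + np.abs(dy).sum()
```

A flat image has TV 0, while edges contribute in proportion to their height, which is why penalizing TV favors piecewise-smooth structure layers.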

3. Dictionary Training

Dictionary training is one of the most important steps in a sparse-representation-based image super-resolution reconstruction algorithm. It selects a training library and trains the corresponding high- and low-resolution dictionaries. First, in the feature extraction step, the second-order derivative and the gradient direction are combined to produce a new descent direction, and an algorithm is designed around it; this method converges quickly and extracts features well. Then, in the dimension reduction step, two-dimensional principal component analysis (2DPCA) reduces the dimension and removes the correlation between rows and columns. Finally, K-SVD completes the training.

4. Using 2DPCA to Reduce the Feature Dimension

Dimension reduction saves computation in the subsequent training and super-resolution stages. The final step before dictionary learning is to reduce the dimension of the low-resolution image patch vectors: 2DPCA is applied to these vectors and is expected to retain a subspace containing 99% of the average information content of the patches. Let the image matrix A have size m × n, and let $X\in {R}^{n\times d}\left(n\ge d\right)$ be a matrix whose column vectors are mutually orthogonal. Through the linear transformation $Y=AX$, A is projected onto X, producing an m × d projected feature matrix Y. Taking the total scatter of the samples as the criterion function, the optimal projection matrix X can be found:

$J\left(X\right)=tr\left({S}_{X}\right)$ (8)

where ${S}_{X}$ is the covariance matrix of the projected feature matrix Y, and $tr\left(\cdot \right)$ denotes the trace of a matrix.

$\begin{array}{c}J\left(X\right)=tr\left\{E\left[\left(AX-E\left(AX\right)\right){\left(AX-E\left(AX\right)\right)}^{\text{T}}\right]\right\}\\ =tr\left\{{X}^{\text{T}}E\left[{\left(A-EA\right)}^{\text{T}}\left(A-EA\right)\right]X\right\}\end{array}$ (9)

Image covariance matrix is defined as

$G=E\left[{\left(A-EA\right)}^{\text{T}}\left(A-EA\right)\right]$ (10)

Suppose the number of training samples is M and the training image matrices are ${A}_{i}\left(i=1,2,\cdots ,M\right)$ ; the mean image is:

$\stackrel{\xaf}{A}=\frac{1}{M}{\displaystyle \underset{i=1}{\overset{M}{\sum}}{A}_{i}}$ (11)

Then G is estimated as follows:

$G=\frac{1}{M}{\displaystyle \underset{i=1}{\overset{M}{\sum}}{\left({A}_{i}-\stackrel{\xaf}{A}\right)}^{\text{T}}}\left({A}_{i}-\stackrel{\xaf}{A}\right)$ (12)

Let ${X}_{opt}=\left\{{X}_{1},{X}_{2},\cdots ,{X}_{d}\right\}$ be the optimal solution, i.e. the orthonormal eigenvectors of G corresponding to the d largest eigenvalues. Once ${X}_{opt}$ has been obtained, features are extracted for a given image A as:

${Y}_{m}=A{X}_{m}\text{\hspace{0.17em}}\left(m=1,2,\cdots ,d\right)$ (13)

This yields a set of projected feature vectors ${Y}_{1},{Y}_{2},\cdots ,{Y}_{d}$ , called the principal component vectors of the image A. This paper uses 2DPCA to reduce the feature dimension from 324 to 16.
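The 2DPCA steps of Equations (11)-(13) can be sketched in NumPy as follows; the d projection axes are the top eigenvectors of the image covariance matrix G, and the function names are illustrative:

```python
import numpy as np

def train_2dpca(images, d):
    """2DPCA: the optimal projection axes X_opt are the eigenvectors of
    the image covariance matrix G (Equation (12)) with the d largest
    eigenvalues. `images` has shape (M, m, n)."""
    A_bar = images.mean(axis=0)                 # mean image, Equation (11)
    n = images.shape[2]
    G = np.zeros((n, n))
    for A in images:
        C = A - A_bar
        G += C.T @ C                            # accumulate (A_i - A_bar)^T (A_i - A_bar)
    G /= len(images)                            # Equation (12)
    w, V = np.linalg.eigh(G)                    # eigenvalues in ascending order
    return V[:, ::-1][:, :d]                    # top-d axes, one per column

def project(A, X_opt):
    """Y = A X (Equation (13)): m x n image -> m x d feature matrix."""
    return A @ X_opt
```

Because 2DPCA works on the image matrix directly, G is only n × n, which keeps the eigendecomposition cheap compared with vectorized PCA.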

5. K-SVD Dictionary Training

The advantages of the K-SVD [17] method are its speed and good noise robustness, and it is well suited to problems with hundreds of samples; in this paper it meets the dictionary training requirements of super-resolution reconstruction of the texture sub-image. Figure 2 shows the high- and low-resolution dictionary pair used for reconstruction: (a) the low-resolution dictionary, (b) the high-resolution dictionary.

K-SVD dictionary training steps:

1) The high-resolution image database is down-sampled to obtain the corresponding low-resolution image database.


Figure 2. LR and HR dictionary. (a) Low-resolution dictionary; (b) High-resolution dictionary.

2) Extract features of the low-resolution images. Each low-resolution image is divided into image blocks of size M, and the features of each block are extracted. The specific method uses four one-dimensional filters:

${f}_{1}=\left[-1,0,1\right],\text{\hspace{0.17em}}\text{\hspace{0.17em}}{f}_{2}={f}_{1}^{\text{T}},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{f}_{3}=\left[1,0,-2,0,1\right],\text{\hspace{0.17em}}\text{\hspace{0.17em}}{f}_{4}={f}_{3}^{\text{T}}$ (14)

where T denotes transpose. Applying the four one-dimensional filters to a low-resolution image yields four feature maps per image block, which are concatenated as the feature representation of that block. Before feature extraction, the low-resolution image is pre-processed by high-pass filtering, and the gradient-based optimization method is improved: when $\frac{{\partial}^{2}f}{\partial {x}^{2}}\ne 0$ , the second derivative and the gradient direction are combined to produce a new descent direction $d=\left[1+\frac{\delta}{\frac{{\partial}^{2}f}{\partial {x}^{2}}}\right]\left(\frac{\partial f}{\partial x}\right)$ . This method converges rapidly and gives better feature extraction results.
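The filter bank of step 2 can be sketched as below, applying each 1-D filter of Equation (14) along rows or columns with `scipy.ndimage.correlate1d`; the function name is illustrative, and the second-derivative filter is taken as [1, 0, -2, 0, 1] (an assumption, as printings of this filter vary in sign):

```python
import numpy as np
from scipy.ndimage import correlate1d

def lowres_features(patch):
    """Concatenate the responses of the four 1-D filters of Equation (14):
    first derivatives (f1, f2) and second derivatives (f3, f4), applied
    along rows and columns respectively."""
    f1 = np.array([-1.0, 0.0, 1.0])             # gradient filter
    f3 = np.array([1.0, 0.0, -2.0, 0.0, 1.0])   # second-derivative filter (assumed sign)
    responses = [
        correlate1d(patch, f1, axis=1),         # f1: horizontal gradient
        correlate1d(patch, f1, axis=0),         # f2 = f1^T: vertical gradient
        correlate1d(patch, f3, axis=1),         # f3: horizontal second derivative
        correlate1d(patch, f3, axis=0),         # f4 = f3^T: vertical second derivative
    ]
    return np.concatenate([r.ravel() for r in responses])
```

Each block thus contributes four stacked feature maps, flattened into a single feature vector for dictionary training.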

3) Use 2DPCA to reduce the dimension of the low-resolution image features, then use the K-SVD algorithm to train those features into a low-resolution dictionary.

4) Obtain the structure part of the interpolated image set. Each low-resolution training image is interpolated to the same size as the corresponding high-resolution training image, and MCA decomposition is carried out to obtain the structure part of the interpolated image.

5) Extract features of the high-resolution images. The interpolated image of each low-resolution training image is subtracted from the corresponding high-resolution image, and the remainder r serves as the texture part of the high-resolution image. The texture part is divided into RN × RN image blocks that are concatenated into vectors, which serve as the feature vectors of the high-resolution image blocks.

6) Calculate the high-resolution dictionary. Assuming that corresponding high- and low-resolution image blocks share the same sparse representation coefficients, the high-resolution dictionary can be computed by minimizing the approximation error:

${D}_{h}=\mathrm{arg}\underset{{D}_{h}}{\mathrm{min}}{\Vert {X}_{h}-{D}_{h}\alpha \Vert}_{F}^{2}$ (15)

This has a closed-form solution via the pseudo-inverse:

${D}_{h}={X}_{h}{\alpha}^{+}={X}_{h}{\alpha}^{\text{T}}{\left(\alpha {\alpha}^{\text{T}}\right)}^{-1}$ (16)

where + denotes the pseudo-inverse.
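Equation (16) is an ordinary least-squares problem, so a sketch needs only a few lines of NumPy; `lstsq` is used here instead of forming the explicit inverse for numerical stability, and the function name is illustrative:

```python
import numpy as np

def high_res_dictionary(X_h, alpha):
    """Equation (16): D_h = X_h alpha^+ = X_h alpha^T (alpha alpha^T)^{-1}.
    X_h holds high-resolution training blocks as columns; alpha holds
    their shared sparse coefficients as columns."""
    # alpha^T @ D_h^T ~= X_h^T, so each column of D_h^T is a least-squares solve
    return np.linalg.lstsq(alpha.T, X_h.T, rcond=None)[0].T
```

This recovers the unique minimizer of Equation (15) whenever the coefficient matrix α has full row rank.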

6. Image Reconstruction

Compared with the texture part, the structure part of the image holds its smooth regions, and the human eye is less sensitive to it than to the texture. For the structure part, this paper therefore uses the Bicubic interpolation algorithm for super-resolution reconstruction. For the texture part, this paper uses the super-resolution reconstruction algorithm based on sparse representation: using the trained D_{l} and D_{h}, the low-resolution texture image can be reconstructed at high resolution. The low-resolution image is divided into blocks of size $n\times n$ with adjacent blocks overlapping by one pixel, so that the corresponding adjacent high-resolution blocks splice together more smoothly. For each block the optimal sparse representation α is calculated, so that the high-resolution image block can be represented as D_{h}α. This sparse representation is the solution of:

$\underset{\alpha}{\mathrm{min}}{\Vert \stackrel{\u02dc}{D}\alpha -\stackrel{\u02dc}{y}\Vert}_{2}^{2}+\lambda {\Vert \alpha \Vert}_{1}$ (17)

where $\stackrel{\u02dc}{D}=\left[\begin{array}{c}{D}_{l}\\ P{D}_{h}\end{array}\right]$ , $\stackrel{\u02dc}{y}=\left[\begin{array}{c}y\\ w\end{array}\right]$ , λ is the regularization coefficient, P extracts the region of the current high-resolution estimate that overlaps the adjacent regions already estimated, and w is the estimated value of the high-resolution feature block in the overlapping region. Solving for the sparse representation of each block gives the corresponding high-resolution image block, and combining all the high-resolution blocks yields the final high-resolution texture image.
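The final stitching step can be sketched as a simple overlap-add that averages the overlapping pixels; the names and the averaging choice are illustrative, since the paper only requires that overlapping blocks splice smoothly:

```python
import numpy as np

def reconstruct_texture(patches_hr, positions, out_shape):
    """Overlap-add of reconstructed high-resolution patches (each D_h @ alpha):
    pixels covered by several patches are averaged to smooth the seams."""
    acc = np.zeros(out_shape)   # sum of patch values per pixel
    cnt = np.zeros(out_shape)   # number of patches covering each pixel
    p = patches_hr.shape[1]     # patch side length
    for patch, (r, c) in zip(patches_hr, positions):
        acc[r:r + p, c:c + p] += patch
        cnt[r:r + p, c:c + p] += 1
    return acc / np.maximum(cnt, 1)
```

With a one-pixel overlap between adjacent blocks, only the shared border columns and rows are averaged, which suppresses visible block boundaries.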

7. Experiment and Result Analysis

To balance computational efficiency and reconstruction quality in the experiments, the number of dictionary atoms is fixed at 1024, the down-sampling factor is 2, the regularization parameter is 5, and the image block size is 6 × 6. We select 45 natural high-resolution images as the training base and about 75,000 training image blocks to train the high- and low-resolution dictionary pair. We first reconstruct an image without noise and compare the result with those of the traditional Bicubic method and Yang's method. The results are shown in Figure 3: (a) original image; (b) low-resolution image; (c) Bicubic reconstruction; (d) local enlargement of the Bicubic result; (e) reconstruction using Yang's method; (f) local enlargement of Yang's result; (g) the proposed method; (h) local enlargement of the proposed method's result.

Figure 3. Super-resolution reconstruction result of Lena under different methods.

To verify the universality of the algorithm, its effect on the gray-level Head image is also analyzed. The experimental results are shown in Figure 4: (a) original image; (b) low-resolution image; (c) Bicubic reconstruction; (d) reconstruction using Yang's method; (e) the proposed method. The reconstruction results of the several algorithms are good for both color and gray images, but it can be seen intuitively that the proposed algorithm recovers details better. As Table 1 and Table 2 show, the super-resolution reconstruction algorithm using MCA decomposition achieves a certain improvement in PSNR and SSIM over both the Bicubic method and Yang's method.


Figure 4. Super-resolution reconstruction result of Head under different methods.

Table 1. PSNR and SSIM values of different algorithms for Lena.

Table 2. PSNR and SSIM values of different algorithms for Head.


Figure 5. Image reconstruction results with standard deviation of 5.

As can be seen from the tables, the super-resolution reconstruction algorithm using MCA decomposition achieves a certain improvement in PSNR and SSIM over the traditional interpolation method and Yang's algorithm, and Figure 4 also shows that the proposed algorithm recovers details better.

In practical applications, low-resolution images are often noisy, so the reconstruction performance of the proposed algorithm under noise must also be tested. The image magnification is 2. In the experiment, we first add zero-mean white Gaussian noise with standard deviation 5 to the low-resolution image, then carry out the reconstruction. The results are shown in Figure 5: (a) the Bicubic algorithm; (b) Yang's algorithm; (c) the proposed algorithm. The image super-resolution reconstruction algorithm based on MCA and dictionary learning separates the noise, and its result is better than those of the traditional Bicubic method and Yang's algorithm. However, it can be seen from Figure 6 that with

Figure 6. The PSNR comparison of image reconstruction results under different intensity noise.

Figure 7. Comparison of algorithm reconstruction time.

the increase of the noise standard deviation, the PSNR of the reconstructed image decreases and the reconstruction quality gradually degrades.

Figure 7 shows the reconstruction time of each algorithm. Since the reconstruction time of Bicubic interpolation is only on the order of ${10}^{-3}$ to ${10}^{-2}$ of the others', only Yang's algorithm and the proposed algorithm are compared. As can be seen from Figure 7, the proposed algorithm improves efficiency to a certain extent.

8. Conclusion

The MCA decomposition method is applied to sparse-representation-based image super-resolution reconstruction; the feature extraction and dimension reduction steps of the dictionary training phase are improved, which speeds up the convergence of the algorithm. The texture part is reconstructed with the sparse-representation learning method and the structure part with Bicubic interpolation. This not only improves robustness and better preserves image detail, but also improves the quality of the reconstructed image and achieves a better reconstruction effect. However, the complexity of the algorithm is higher and the decomposition is slower, which increases the time of dictionary training and image reconstruction. Future research will aim at lower complexity, either by finding a better decomposition algorithm or by improving the MCA algorithm to preserve the decomposition quality while reducing complexity.

References

[1] Chen, X.X. and Qi, C. (2014) Nonlinear Neighbor Embedding for Single Image Super-Resolution via Kernel Mapping. Signal Process, No. 94, 6-12.

https://doi.org/10.1016/j.sigpro.2013.06.016

[2] Tsai, R.Y. and Huang, T.S. (1984) Multiframe Image Restoration and Registration. Advances in Computer Vision and Image Processing, 317-339.

[3] Park, S.C., Park, M.K. and Kang, M.G. (2003) Superresolution Image Reconstruction: A Technical Overview. IEEE Signal Processing Magazine, 20, 21-36.

https://doi.org/10.1109/MSP.2003.1203207

[4] Song, H.H. (2011) Algorithm of Image Super-Resolution Reconstruction Based on Sparse Representation. Chinese Scientific and Technical University, Beijing.

[5] Yang, J.C., Wright, J., Huang, T., et al. (2008) Image Super Resolution as Sparse Representation of Raw Image Patches. IEEE, Piscataway, 1-8.

[6] Yang, J., Wright, J., Huang, T.S. and Ma, Y. (2010) Image Super-Resolution via Sparse Representation. IEEE Trans. Image Processing, 19, 2861-2873.

https://doi.org/10.1109/TIP.2010.2050625

[7] Jing, G.D., Shi, Y.H. and Lu, B. (2010) Single-Image Super-Resolution Based on Decomposition and Sparse Representation. IEEE, Piscataway, 127-130.

https://doi.org/10.1109/MEDIACOM.2010.73

[8] Michael, E. (2005) Simultaneous Cartoon and Texture Image Inpainting Using Morphological Component Analysis (MCA). Applied and Computational Harmonic Analysis, 19, 340-358.

https://doi.org/10.1016/j.acha.2005.03.005

[9] Jia, K., Tang, X. and Wang, X. (2013) Image Transformation Based on Learning Dictionaries across Image Spaces. The IEEE Transactions on Pattern Analysis and Machine Intelligence, 35, 367-380.

https://doi.org/10.1109/TPAMI.2012.95

[10] Yang, J., et al. (2010) Coupled Dictionary Training for Image Super-Resolution. IEEE Transactions on Image Processing, 21, 3467-3478.

https://doi.org/10.1109/TIP.2012.2192127

[11] Yang, J., et al. (2010) Image Super-Resolution via Sparse Representation. IEEE Transactions on Image Processing, 19, 2861-2873.

https://doi.org/10.1109/TIP.2010.2050625

[12] Kulkarni, N., et al. (2012) Understanding Compressive Sensing and Sparse Representation-Based Super-Resolution. IEEE Transactions on Circuits and Systems for Video Technology, 22, 778-789.

https://doi.org/10.1109/TCSVT.2011.2180773

[13] Yang, J., Zhang, D., Frangi, A.F. and Yang, J. (2004) Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 131-137.

[14] Wang, S., Ye, J. and Yang, D. (2013) Research of 2DPCA Principal Component Uncertainty in Face Recognition. 8th International Conference on Computer Science & Education, Colombo, 26-28 April 2013, 159-162.

[15] Aharon, M., Elad, M. and Bruckstein, A. (2006) K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation. IEEE Transactions on Signal Processing, 54, 4311-4322.

https://doi.org/10.1109/TSP.2006.881199

[16] Rubinstein, R., Zibulevsky, M. and Elad, M. (2008) Efficient Implementation of the K-SVD Algorithm Using Batch Orthogonal Matching Pursuit.

[17] Xu, J., Qi, C. and Chang, Z. (2014) Coupled K-SVD Dictionary Training for Super-Resolution. International Conference on Image Processing, Paris, 27-30 October 2014, 3190-3914.