Enhanced Adaptive Approach of Video Coding at Very Low Bit Rate Using MSPIHT Algorithm

Received 20 February 2016; accepted 16 March 2016; published 7 June 2016

1. Introduction

Video resolution keeps increasing, even on mobile phones, and video compression is what makes these higher resolutions practical. In computer science and information theory, data compression (also called source coding or bit-rate reduction) encodes information using fewer bits than the original representation. It reduces the size of a data file, and it can be either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy, so no information is lost. Lossy compression reduces bits by identifying unnecessary information and removing it. In both cases, the advantage is reduced usage of resources such as data storage space or transmission capacity.

Data compression is subject to a space-time complexity trade-off. For example, a compression scheme for video may require expensive hardware to decompress the video fast enough for it to be viewed while it is being decompressed, whereas decompressing the video in full before viewing requires additional storage space. The design of a data compression scheme therefore involves trade-offs among several factors: the degree of compression, the amount of distortion introduced (e.g., when using lossy data compression), and the computational resources required to compress and decompress the data.

The volume of multimedia images is increasing in general. Storage and transmission become difficult, so devices with high storage capacity and networks with large bandwidth are needed. In three-dimensional (3-D) video compression, the wavelet decomposition is applied along the temporal axis as well as the spatial axes. A large number of video frames must be buffered for the temporal decomposition to take place. In many applications, the main aim is transparent coding of audio and speech signals in multimedia workstations, together with high-quality audio transmission and storage.

The wavelet transform decomposes a video frame into a set of sub-frames with a variety of resolutions and different frequency bands. The global motion structure of the video signal is obtained from the multiresolution frames at different scales. The motion activities differ from frame to frame, but they are highly correlated because there are specific motion structures at different scales. Emerging methods partition the wavelet coefficients into spatio-temporal (s-t) blocks to obtain higher error resilience and to support error concealment. Rather than grouping adjacent coefficients, these methods group coefficients at fixed intervals in the lowest sub-band into interleaved trees.

Each group of sub-blocks represents the whole image sequence, since every interleaved tree group contains coefficients dispersed over the entire group of frames. The stream is then separated into fixed-length packets, and each packet is encoded with a channel code. The reported experiments show higher error resilience in noisy channels, because the coefficients affected by an early decoding error are dispersed over the frame group and lost coefficients are concealed using the neighboring coefficients, even when sub-streams are missing.

Motion estimation is used to capture the spatio-temporal correlation in video signals. Its importance is evident from the pervasive presence of motion estimation/compensation steps in video coding standards, from MPEG-1 to H.264. Motion estimation is difficult because of problems such as the uncertainty of the motion trajectory (i.e., the aperture problem), illumination variations, and noise in the video sequences. In many practical applications, there is always a trade-off between the accuracy of the motion vectors and the computational complexity.

This paper presents a new framework for video processing based on a recently proposed wavelet transform combined with a modified SPIHT algorithm. The wavelet transform is used in place of an explicit motion estimation step to provide a motion-selective sub-band decomposition of video signals. Several researchers have proposed video processing techniques using motion-selective 3-D transforms. An added advantage is that the directional resolution can be refined by invoking more levels of decomposition. In practice, 192 or more directional sub-bands are chosen at the finer scales of the wavelet transform with SPIHT, which is suitable for coding and decoding. A key property of the coding method is that the ordering data is not explicitly transmitted.

The rest of the paper is organized as follows. Section 2 surveys the literature on encoding approaches and related techniques. Section 3 explains the proposed video encoding and enhancement approach. Section 4 analyzes the simulation performance and compares the proposed approach with existing ones. Finally, Section 5 concludes the work.

2. Literature Survey

In this section, related work on the EWT, video coding and image enhancement is discussed. Compression is needed most for higher resolutions, and the current draft standard of HEVC is already meeting or exceeding its targets. The architecture and building blocks of HEVC are reviewed; they were selected with regard to compression capability versus complexity and to enable parallelism for the signal processing operations [1] .

Transform Domain Wyner-Ziv (TDWZ) video coding is an efficient approach to distributed video coding. A new method is proposed in which the discrete cosine transform, followed by compression using Set Partitioning in Hierarchical Trees (SPIHT), is used instead of the wavelet transform. The SPIHT algorithm is a fast and efficient compression technique [2] .

Conventional intra-frame coding is a two-step process. First, a block of pixels is predicted by copying previously reconstructed neighboring pixels of the block along an angular direction inside the block. A recursive prediction approach is used to improve intra prediction performance, and the transforms are derived from 2-D Markov processes [3] .

An empirical wavelet transform (EWT)-based time-frequency technique is applied to estimate power-quality indices (PQIs). The method estimates the frequency components present in the distorted signal, computes the boundaries, and then filters the signal based on the computed boundaries [4] .

To achieve higher coding efficiency, the latest video compression standard, HEVC or H.265, adopts an in-loop filter called sample adaptive offset (SAO), which is examined and measured [5] . Although its basic architecture is built on the hybrid block-based approach of combining prediction with transform coding, HEVC contains a number of coding tools with greatly enhanced coding efficiency relative to prior video coding standards [6] .

A hardwired embedded compression engine is designed that mainly targets reducing the transmission bandwidth of full high-definition (HD) video over the network. An adaptive Golomb-Rice coding scheme in conjunction with a context modeling technique is used in lieu of an adaptive arithmetic coder to reduce the complexity. A decoder synthesizes a chosen virtual view via depth-image-based rendering. Shaped sub-block motion prediction can lead to very small prediction residuals, but it incurs an overhead for transmitting the dividing boundaries needed to identify the sub-blocks at the decoder [7] .

In region-of-interest (ROI)-based video coding, the ROI parts of the frame are encoded with higher quality than the non-ROI parts. The main aim is to reduce salient coding artifacts in the non-ROI parts of the frame so that the user's attention stays on the ROI [8] . Wavelet transforms represent signals with a high degree of sparsity. NeighShrink is an efficient image de-noising algorithm based on the decimated wavelet transform (DWT). In NeighShrink, the optimal threshold and neighborhood window size are adjusted in all sub-bands, and the necessary information is recovered from the removed coefficients by using the neighborhood window size and the optimal threshold [9] .

Quality of experience and quality of service are critical to the reliability of a health care service. Medical video has emerged as an integral part of the medical data communication system, and quality metrics are used to assess it [10] .

A hybrid pattern-matching and transform-based compression method is used for scanned documents. Regular video interframe prediction serves as the pattern-matching algorithm. It generates residual data that are compressed efficiently by a transform-based encoder [11] .

The empirical mode decomposition (EMD) method decomposes a signal according to the information it contains. The main aim is to extract the different modes of a signal by adopting an appropriate wavelet filter bank [12] .

Compression of 3-D content is an important factor for smooth transmission over networks with constrained bandwidth in 3-D multimedia applications. A new compression framework for dynamic 3-D facial expressions is proposed, which takes advantage of the near-isometric property of human facial expressions and parameterizes the dynamic 3-D faces into an expression-invariant canonical domain [13] [14] .

When a depth video is compressed by conventional video coding standards, the abrupt signal changes at object boundaries lead to coding artifacts. These artifacts are suppressed by an efficient post-processing method based on weighted mode filtering, which is employed as an in-loop filter. Down/up-sampling coding approaches for the spatial resolution and the dynamic range are used together with the filter to reduce the bit rate [15] .

Digital videos are increasingly exposed to manipulation owing to the popularity and ease of use of video editing software. Markov-based features are adopted for detecting double compression artifacts [16] . A tone-mapping scheme is used to convert high-bit-depth videos to eight-bit videos in bit-depth scalable video coding. An appropriate choice of tone-mapping operator improves the coding efficiency of bit-depth scalable encoders [17] .

A transportation video coding and wireless transmission system is presented that is tailored to automated vehicle tracking applications. By taking the video characteristics and the lossy nature of the channel into account, video pre-processing and error control approaches are applied to improve tracking performance while conserving bandwidth resources and computational power [18] .

With an appropriate transform, an image or a motion-compensated residual can be represented accurately using a small fraction of the transform coefficients. Two algorithms are developed for selecting the transform. The first is simple and obtains a locally optimal solution. The second is computationally intensive and obtains a globally optimal solution [19] .

Motion estimation is a common component of state-of-the-art video processing algorithms. A new framework for video processing is explored based on the wavelet transform with the SPIHT algorithm. The wavelet transform provides a motion-selective sub-band decomposition of video signals in place of an explicit motion estimation step [20] . In the ridgelet transform, a wavelet pyramid whose wavelets have compact support is applied to the Radon transform. The comparison includes thresholding of decimated or undecimated wavelet transforms as well as tree-based Bayesian posterior mean methods [21] .

3. Proposed Work

This section describes the proposed system and its implementation. The wavelet transform in the system compresses the frames and provides reliable processing at a very low bit rate. The frames are separated into high-frequency and low-frequency components. The empirical wavelet transform is used to decompose the video sequences. The high-frequency frames of the transform are encoded with the proposed MSPIHT approach, and the low-frequency frames are encoded with the standard H.264/AVC codec. An enhancement step is then applied to improve the quality of the reconstructed frames.

In the proposed system, the video stream is decomposed by the empirical wavelet transform into frames according to their frequency levels. The H.264 codec encodes the sequence of low-frequency components produced by the empirical wavelet transform at a low bit rate. The low-frequency components have a smaller dimension and are quantized with more bits.
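To make the idea of a low/high frequency split concrete, the sketch below applies one level of a 2-D Haar transform to a single frame. This is not the empirical wavelet transform used in the proposed system: the EWT builds its filters adaptively from the signal's spectrum, whereas the fixed Haar filter is a much simpler stand-in chosen only for illustration. The function name `haar_split`, the averaging normalisation and the HL/LH naming are assumptions.

```python
def haar_split(frame):
    """One-level 2-D Haar split of a frame (list of rows) with even height and width.

    Returns four half-size sub-bands (LL, HL, LH, HH). The averaging
    normalisation and the HL/LH naming are conventions assumed here.
    """
    height, width = len(frame), len(frame[0])
    if height % 2 or width % 2:
        raise ValueError("frame height and width must be even")
    LL, HL, LH, HH = [], [], [], []
    for r in range(0, height, 2):
        ll, hl, lh, hh = [], [], [], []
        for c in range(0, width, 2):
            a, b = frame[r][c], frame[r][c + 1]          # top-left, top-right
            d, e = frame[r + 1][c], frame[r + 1][c + 1]  # bottom-left, bottom-right
            ll.append((a + b + d + e) / 4)  # local average (low frequency)
            hl.append((a - b + d - e) / 4)  # horizontal detail
            lh.append((a + b - d - e) / 4)  # vertical detail
            hh.append((a - b - d + e) / 4)  # diagonal detail
        LL.append(ll)
        HL.append(hl)
        LH.append(lh)
        HH.append(hh)
    return LL, HL, LH, HH


frame = [[1, 3], [5, 7]]
LL, HL, LH, HH = haar_split(frame)
print(LL, HL, LH, HH)  # [[4.0]] [[-1.0]] [[-2.0]] [[0.0]]
```

The LL band is a half-size approximation of the frame, which is why the low-frequency part has a smaller dimension than the original; the other three bands carry the detail.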

The high-frequency frames are encoded by the MSPIHT approach. In the decomposition process, a threshold is applied in which small coefficients are set to zero and thereby neglected. Applying a threshold value (T) increases the compression rate. The threshold is applied to the high-frequency coefficients of the transform, and the flow of the system proceeds as shown in the block diagram of the proposed system.
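The thresholding step can be sketched as a hard threshold on a block of high-frequency coefficients. The paper does not state how T is chosen or whether a hard or soft rule is used, so the hard rule, the function names and the sample values below are assumptions made for illustration.

```python
def hard_threshold(coeffs, T):
    """Set every coefficient whose magnitude is below T to zero."""
    return [[c if abs(c) >= T else 0.0 for c in row] for row in coeffs]


def zero_fraction(coeffs):
    """Fraction of coefficients that are exactly zero."""
    flat = [c for row in coeffs for c in row]
    return sum(1 for c in flat if c == 0) / len(flat)


# Hypothetical high-frequency coefficients, not taken from the paper.
band = [[0.4, -7.0, 1.2], [12.5, -0.3, 3.0]]
kept = hard_threshold(band, T=2.0)
print(kept)                 # [[0.0, -7.0, 0.0], [12.5, 0.0, 3.0]]
print(zero_fraction(kept))  # 0.5
```

A larger T zeroes more coefficients, which produces longer runs of zeros for the encoder at the cost of more distortion.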

The proposed modified SPIHT (MSPIHT) is used to encode and decode the high-frequency components. The wavelet-decomposed data are organized by arranging the coefficients into a tree structure. The spatial orientation tree has four levels, with sub-band coefficients of low and high frequency (LL, HL, LH, HH). The set partitioning in the MSPIHT algorithm is expressed with the following sets of coordinates (Figure 1).

The coefficient location is denoted as (i, j) to represent the column and row indices respectively. Here, the set of all spatial orientation tree roots is denoted H, the set of offspring coefficients is denoted O(i, j), and the set of all descendant coefficients is denoted D(i, j).

Figure 1. Block diagram of proposed approach.

$$O(i, j) = \{(2i, 2j),\ (2i, 2j+1),\ (2i+1, 2j),\ (2i+1, 2j+1)\} \tag{1}$$

The above definition holds except when (i, j) is in the LL subband. If (i, j) is in the LL subband, then O(i, j) is as given below.

(2)

where w_{LL} and h_{LL} are the width and height of the LL subband, respectively.

. (3)

The significance function S_{n}(τ) indicates whether a set of coordinates τ is significant with respect to the threshold 2^{n}, as given below, where W_{i,j} denotes the wavelet coefficient at (i, j).

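$$S_{n}(\tau) = \begin{cases} 1, & \max\limits_{(i, j) \in \tau} |W_{i,j}| \ge 2^{n} \\ 0, & \text{otherwise} \end{cases} \tag{4}$$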
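The significance test can be illustrated with a short sketch. It uses the standard SPIHT offspring rule for a node outside the LL subband and the usual SPIHT starting bit plane, floor(log2(max |W|)); both are taken from the standard algorithm rather than from details stated in this paper. The function names, the `W[i][j]` indexing and the sample coefficients are hypothetical.

```python
import math


def offspring(i, j):
    """Offspring of node (i, j) outside the LL subband (standard SPIHT rule)."""
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]


def is_significant(W, coords, n):
    """S_n(tau): 1 if any |W[i][j]| over the coordinate set reaches 2**n, else 0."""
    return int(any(abs(W[i][j]) >= 2 ** n for (i, j) in coords))


def initial_bitplane(W):
    """Starting bit plane used by standard SPIHT: floor(log2(max |W|))."""
    largest = max(abs(c) for row in W for c in row)
    if largest == 0:
        raise ValueError("all coefficients are zero")
    return math.floor(math.log2(largest))


# Hypothetical 4x4 coefficient array, indexed as W[i][j] (an assumption).
W = [[63, -34, 49, 10],
     [-31, 23, 14, -13],
     [15, 14, 3, -12],
     [-9, -7, -14, 8]]

print(initial_bitplane(W))  # 5
coords = offspring(1, 1)    # [(2, 2), (2, 3), (3, 2), (3, 3)]
print(is_significant(W, coords, 4), is_significant(W, coords, 3))  # 0 1
```

The set becomes significant only once the threshold drops to 2^3 = 8, because its largest magnitude is 14; until then the whole set is represented by a single 0 bit.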

During set partitioning, the significance information is stored in three ordered lists: the list of insignificant sets (LIS), the list of insignificant pixels (LIP) and the list of significant pixels (LSP). Although the lists are named in terms of pixels, their entries are actually the wavelet coefficients of the transformed data.

In the proposed MSPIHT approach, the output bit stream of the encoder contains a large number of consecutive "0" bits. Statistically, the pattern 000 appears with the highest probability, usually about 1/4. Therefore, the proposed algorithm divides the binary output stream into groups of 3 bits and records each group as a symbol. The eight possible symbols are then encoded with arithmetic coding according to their statistical probabilities.

The encoding and decoding procedure is as follows.

1) Apply the SPIHT encoding technique to the high-frequency components of the empirical wavelet transform.

2) Divide the binary output stream of the SPIHT encoder into groups of 3 bits.

3) After grouping, 0, 1 or 2 bits may remain that cannot form a group. For consistency, record the number of remaining bits and output them directly at the end.

4) Map each group of bits to the index value of its symbol.

5) Encode the index value of each symbol using arithmetic encoding.

6) At the decoder, first decode the encoded bits using arithmetic decoding and replace the decoded index values with the equivalent symbols.

7) Finally, convert the grouped bits back into individual bits and decode them using the SPIHT decoding technique.
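The grouping and ungrouping steps (2, 3, 4 and 7) can be sketched as follows. The SPIHT coder and the arithmetic coder are omitted; the sketch only shows how a bit stream is turned into 3-bit symbols with leftover bits, how the symbol probabilities that an arithmetic coder's model would use can be estimated, and how the stream is restored. The function names and the sample bit string are assumptions.

```python
from collections import Counter


def group_bits(bits, size=3):
    """Split a bit string into size-bit symbols; return (symbols, leftover bits)."""
    usable = len(bits) - len(bits) % size
    symbols = [int(bits[k:k + size], 2) for k in range(0, usable, size)]
    return symbols, bits[usable:]


def ungroup_bits(symbols, leftover, size=3):
    """Inverse of group_bits: expand symbols to bits and append the leftover bits."""
    return "".join(format(s, f"0{size}b") for s in symbols) + leftover


def symbol_probabilities(symbols):
    """Empirical probability of each symbol index that occurs."""
    counts = Counter(symbols)
    return {s: counts[s] / len(symbols) for s in sorted(counts)}


# Hypothetical encoder output, not taken from the paper.
bits = "00000010100011"
symbols, leftover = group_bits(bits)
print(symbols, repr(leftover))         # [0, 0, 5, 0] '11'
print(symbol_probabilities(symbols))   # {0: 0.75, 5: 0.25}
print(ungroup_bits(symbols, leftover) == bits)  # True
```

Because the leftover bits are passed through unchanged, the round trip is exact for any stream length.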

In the final stage, when frames are reconstructed, an enhancement approach called HBLPCE (Histogram Based Locality Preserving Contrast Enhancement) is applied to improve the quality of the reconstructed data for the final output. Histogram equalization (HE) is a contrast enhancement method that tends to over-enhance and produces unnatural artifacts in images whose histograms have high peaks. A histogram-based CE method is proposed to overcome this issue of HE.
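For reference, the baseline that HBLPCE aims to improve on can be sketched in a few lines. The code below is textbook histogram equalization, which maps each intensity level to the scaled cumulative sum of the PMF; it is not the HBLPCE method itself. The function name, the rounding rule and the tiny test image are assumptions.

```python
def equalize_histogram(image, levels=256):
    """Textbook histogram equalization of an integer image (list of rows).

    Each level k is mapped to round((levels - 1) * CDF(k)), where CDF is
    the cumulative sum of the image's probability mass function.
    """
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    mapping, cumulative = [], 0
    for count in hist:
        cumulative += count
        mapping.append(round((levels - 1) * cumulative / len(flat)))
    return [[mapping[p] for p in row] for row in image]


# Hypothetical dark 2x2 image with 8 intensity levels.
image = [[0, 1], [1, 2]]
print(equalize_histogram(image, levels=8))  # [[2, 5], [5, 7]]
```

In this small example the three occupied levels 0, 1 and 2 are spread to 2, 5 and 7. A level that holds a large share of the pixels causes a large jump in the mapping, which is the over-enhancement that appears when the histogram has high peaks.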

In the image histogram, the locality condition is determined from the intensity levels, and locality is used to define the local contrast enhancement (CE). The global CE is formed by combining the local CE with the locality-preserving property. It is defined in terms of the probability mass function (PMF). Here, the PMF sums to one, and the input intensity vector and the transformed intensity vector are specified. The intensity level and histogram equalization are represented as given below.

(5)

. (6)

By subtracting x_{i} from x_{i+1}, the recursive function is,

. (7)

When the input histogram has zero PMF, the locality condition is considered as given below.

. (8)

The intensity conditions form an optimization problem over the entire intensity range. The objective function is expressed in matrix form in terms of the transformed intensity vector. The optimization problem is defined below, where Q is a tridiagonal coefficient matrix.

(9)

In the transformed histogram, the differences between consecutive intensity levels are used as the variables of the optimization problem. The problem is solved using a quadratic programming method, and the function is given below.

(10)

4. Simulation Results

This section presents the analysis and evaluation of the simulation results of the proposed system. The performance analysis is based on PSNR, execution time and speed. Compared with the existing system, the proposed system requires a very low bit rate and provides better computational performance. Two video sequences are used for the comparison.
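PSNR is computed from the mean squared error between an original and a reconstructed frame as 10 log10(peak^2 / MSE). The sketch below assumes 8-bit frames (peak value 255) stored as lists of rows, and averages the per-frame PSNR over a sequence; the function names and the sample frames are hypothetical and the numbers are not results from this paper.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_errors = [(a - b) ** 2
                      for row_a, row_b in zip(original, reconstructed)
                      for a, b in zip(row_a, row_b)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


def average_psnr(originals, reconstructions, peak=255.0):
    """Mean of the per-frame PSNR values over a sequence of frames."""
    values = [psnr(a, b, peak) for a, b in zip(originals, reconstructions)]
    return sum(values) / len(values)


# Hypothetical 2x2 frames: two pixels differ by one grey level, so MSE = 0.5.
original = [[100, 100], [100, 100]]
reconstructed = [[101, 99], [100, 100]]
print(round(psnr(original, reconstructed), 2))  # 51.14
```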

Table 1 lists the average PSNR values obtained when coding the Nviptraffic video sequence and compares the proposed system with the existing system; Table 2 lists the corresponding execution time for coding.

Table 3 compares the PSNR values for the Miss America video sequence, and Table 4 compares the execution time of the proposed system with that of the existing system for the same sequence.

Table 5 shows the input, reconstructed and enhanced frames of the Nviptraffic and Miss America video sequences. Table 6 compares the enhancement performance for the reconstructed video sequences from the decoding part and shows the improvement achieved by the proposed system.

Table 1. Comparison of average PSNR value for Nviptraffic Video Sequence.

Table 2. Comparison of execution time for Nviptraffic Video Sequence.

Table 3. Comparison of average PSNR value for Miss America Video Sequence.

Table 4. Comparison of execution time for Miss America Video Sequence.

Table 5. Video sequence frame.

Table 6. Comparison of enhancement of reconstructed video sequence.

5. Conclusion

In this paper, a novel video coding approach is proposed that combines decomposition by the empirical wavelet transform with the H.264/AVC standard and MSPIHT coding. To improve the quality of the frames, the HBLPCE approach is applied at the decoder to enhance the reconstructed frames. The proposed system provides better accuracy and quality and operates at a much lower bit rate than the existing system. Therefore, it is more reliable and preferable for video coding in real-time applications. With the proposed system, transmission and reception are carried out accurately and quickly, with less execution time and without significant loss of video quality.

References

[1] Ohm, J. and Sullivan, G.J. (2013) High Efficiency Video Coding: The Next Frontier in Video Compression [Standards in a Nutshell]. IEEE Signal Processing Magazine, 30, 152-158.

http://dx.doi.org/10.1109/MSP.2012.2219672

[2] Amritha, K.M. and Nithin, S.S. (2015) Adaptive Encoding & Decoding of Compressed Video Using SPIHT Algorithm. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 4, No. 5.

[3] Kamisli, F. (2015) Block-Based Spatial Prediction and Transforms Based on 2D Markov Processes for Image and Video Compression. IEEE Transactions on Image Processing, 24, 1247-1260.

[4] Thirumala, K., Umarikar, A.C. and Jain, T. (2015) Estimation of Single-Phase and Three-Phase Power-Quality Indices Using Empirical Wavelet Transform. IEEE Transactions on Power Delivery, 30, 445-454.

[5] Choi, Y. and Joo, J. (2015) Exploration of Practical HEVC/H.265 Sample Adaptive Offset Encoding Policies. IEEE Signal Processing Letters, 22, 465-468.

[6] Nguyen, T., Helle, P., Winken, M., Bross, B., Marpe, D., Schwarz, H. and Wiegand, T. (2013) Transform Coding Techniques in HEVC. IEEE Journal of Selected Topics in Signal Processing, 7, 978-989.

[7] Hwang, Y.-T., Lyu, M.-W. and Lin, C.-C. (2015) A Low-Complexity Embedded Compression Codec Design with Rate Control for High-Definition Video. IEEE Transactions on Circuits and Systems for Video Technology, 25, 674-687.

[8] Daribo, I., Florencio, D. and Cheung, G. (2014) Arbitrarily Shaped Motion Prediction for Depth Video Compression Using Arithmetic Edge Coding. IEEE Transactions on Image Processing, 23, 4696-4708.

[9] Hadizadeh, H. and Bajic, I.V. (2014) Saliency-Aware Video Compression. IEEE Transactions on Image Processing, 23, 19-33.

[10] Neelima, M. and Mahaboob Pasha, Md. (2014) Wavelet Transform Based on Image Denoising Using Thresholding Techniques. International Journal of Advanced Research in Computer and Communication Engineering, 3, No. 9.

[11] Razaak, M., Martini, M.G. and Savino, K. (2014) A Study on Quality Assessment for Medical Ultrasound Video Compressed via HEVC. IEEE Journal of Biomedical and Health Informatics, 18, 1552-1559.

[12] Zaghetto, A. and de Queiroz, R.L. (2013) Scanned Document Compression Using Block-Based Hybrid Video Codec. IEEE Transactions on Image Processing, 22, 2420-2428.

[13] Gilles, J. (2013) Empirical Wavelet Transform. IEEE Transactions on Signal Processing, 61, 3999-4010.

http://dx.doi.org/10.1109/TSP.2013.2265222

[14] Hou, J.H., Chau, L.-P., He, Y., Zhang, M.Q. and Magnenat-Thalmann, N. (2013) Rate-Distortion Model Based Bit Allocation for 3-D Facial Compression Using Geometry Video. IEEE Transactions on Circuits and Systems for Video Technology, 23, 1537-1541.

[15] Nguyen, V.-A., Min, D.B. and Do, M.N. (2013) Efficient Techniques for Depth Video Compression Using Weighted Mode Filtering. IEEE Transactions on Circuits and Systems for Video Technology, 23, 189-202.

[16] Jiang, X.H., Wang, W., Sun, T.F., Shi, Y.Q. and Wang, S.L. (2013) Detection of Double Compression in MPEG-4 Videos Based on Markov Statistics. IEEE Signal Processing Letters, 20, 447-450.

[17] Mai, Z.C., Mansour, H., Nasiopoulos, P. and Ward, R.K. (2013) Visually Favorable Tone-Mapping with High Compression Performance in Bit-Depth Scalable Video Coding. IEEE Transactions on Multimedia, 15, 1503-1518.

[18] Chen, Z.F., Tsaftaris, S.A., Soyak, E. and Katsaggelos, A.K. (2013) Application-Aware Approach to Compression and Transmission of H.264 Encoded Video for Automated and Centralized Transportation Surveillance. IEEE Transactions on Intelligent Transportation Systems, 14, 2002-2007.

[19] Cai, X. and Lim, J.S. (2012) Algorithms for Transform Selection in Multiple-Transform Video Compression. 2012 19th IEEE International Conference on Image Processing (ICIP), 30 September-3 October 2012, 2481-2484.

[20] Sao, C. (2011) Spiht Based Video Compression. International Journal of Image Processing and Applications, 2, 111-117.

[21] Starck, J.-L., Candes, E.J. and Donoho, D.L. (2002) The Curvelet Transform for Image Denoising. IEEE Transactions on Image Processing, 11, 670-684.
