OPJ Vol. 3 No. 2B, June 2013
System-on-a-Chip (SoC) Based Hardware Acceleration for Video Codec
Abstract: Digital video surveillance systems are now widely deployed, from home monitoring to large-scale airport security. Such systems usually require streaming video processing capabilities. H.264, an advanced video coding method, was introduced to reduce the volume of video data dramatically (usually by 70X or more). However, encoding and decoding H.264 video incurs substantial computational overhead. In this paper, a System-on-a-Chip (SoC) based hardware acceleration solution for video codecs is proposed, which can also be used for other software applications. The characteristics of the video codec are analyzed using a profiling tool. The Hadamard function, which is the bottleneck of H.264, is identified not only by its execution time but also by two further attributes: cycles per loop and loop rounds. A co-processor approach is applied to accelerate the Hadamard function by moving it into hardware. Performance improvement, resource cost, and energy consumption are compared and analyzed. Experimental results indicate that a 76.5% energy reduction and an 8.09X speedup can be reached after balancing these three key factors.
Cite this paper: X. Niu and J. Fan, "System-on-a-Chip (SoC) Based Hardware Acceleration for Video Codec," Optics and Photonics Journal, Vol. 3 No. 2, 2013, pp. 112-117. doi: 10.4236/opj.2013.32B028.
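For readers unfamiliar with the kernel the abstract refers to, the following is a minimal software sketch of a 4x4 Hadamard transform used to compute a sum of absolute transformed differences (SATD), a common use of the Hadamard transform in H.264 encoders. It is not taken from the paper: the function names `hadamard4x4` and `satd4x4`, the row-major block layout, and the omission of any normalization factor are all assumptions made for illustration. The nested loops of additions and subtractions give a sense of why such a function may dominate execution time and why it maps naturally onto hardware.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: in-place 4x4 Hadamard transform on a row-major block.
 * Each 4-point pass uses two butterfly stages (additions/subtractions only). */
void hadamard4x4(int32_t d[16])
{
    /* Horizontal pass: transform each row. */
    for (int i = 0; i < 4; i++) {
        int32_t *r = d + 4 * i;
        int32_t s0 = r[0] + r[1], s1 = r[0] - r[1];
        int32_t s2 = r[2] + r[3], s3 = r[2] - r[3];
        r[0] = s0 + s2;
        r[1] = s1 + s3;
        r[2] = s0 - s2;
        r[3] = s1 - s3;
    }
    /* Vertical pass: transform each column. */
    for (int j = 0; j < 4; j++) {
        int32_t s0 = d[j] + d[4 + j],      s1 = d[j] - d[4 + j];
        int32_t s2 = d[8 + j] + d[12 + j], s3 = d[8 + j] - d[12 + j];
        d[j]      = s0 + s2;
        d[4 + j]  = s1 + s3;
        d[8 + j]  = s0 - s2;
        d[12 + j] = s1 - s3;
    }
}

/* Hypothetical helper: SATD of two 4x4 pixel blocks (row-major).
 * Assumption: the raw sum is returned; encoders often scale it afterwards. */
int32_t satd4x4(const uint8_t cur[16], const uint8_t ref[16])
{
    int32_t diff[16];
    int32_t sum = 0;

    for (int i = 0; i < 16; i++)
        diff[i] = (int32_t)cur[i] - (int32_t)ref[i];

    hadamard4x4(diff);

    for (int i = 0; i < 16; i++)
        sum += abs(diff[i]);

    return sum;
}
```

In this sketch every output coefficient depends only on additions and subtractions of the inputs, so the two passes could in principle be unrolled into parallel adders in a co-processor; how the paper's hardware is actually organized is not stated in the abstract.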
