j-lanes hashing is a tree mode that splits an input message to j slices, computes j independent digests of each slice, and outputs the hash value of their concatenation. We demonstrate the performance advantage of j-lanes hashing on SIMD architectures, by coding a 4-lanes-SHA-256 implementation and measuring its performance on the latest 3rd Generation IntelR CoreTM. For messages whose lengths range from 2 KB to 132 KB, we show that the 4-lanes SHA-256 is between 1.5 to 1.97 times faster than the fastest publicly available implementation that we are aware of, and between ~2 to ~2.5 times faster than the OpenSSL 1.0.1c implementation. For long messages, there is no significant performance difference between different choices of j. We show that the 4-lanes SHA-256 is faster than the two SHA3 finalists (BLAKE and Keccak) that have a published tree mode implementation. Finally, we explain why j-lanes hashing will be faster on the coming AVX2 architecture that facilitates using 256 bits registers. These results suggest that standardizing a tree mode for hash functions (SHA-256 in particular) could be useful for performance hungry applications.
Cite this paper
S. Gueron, "A j
-Lanes Tree Hashing Mode and j
-Lanes SHA-256," Journal of Information Security
, Vol. 4 No. 1, 2013, pp. 7-11. doi: 10.4236/jis.2013.41002
 National Institute of Standards and Technology, “Cryptographic Hash Algorithm Competition,” 2005.
 S. Gueron, V. Krasnov, “Simultaneous Hashing of Multiple Messages”, Journal of Information Security, Vol. 3, 2012, pp. 319-325. doi:10.4236/jis.2012.34039
 Virtual Applications and Implementations Research Lab, “SUPERCOP,” 2012.
 G. Bertoni, J. Daemen, M. Peeters, G. Van Assche, “Keccak Sponge Function Family Main Document,” 2009. http://cuda-keccak.googlecode.com/svn/trunk/docs/Keccak-main-2.1.pdf
 G. Bertoni, J. Daemen, M. Peeters, G. Van Assche, “Sufficient Conditions for Sound Tree and Sequential Hashing Modes,” 2009. http://eprint.iacr.org/2009/210
 Y. Dodis, L. Reyzin, R. L. Rivest and E. Shen, “Indifferentiability of Permutation-Based Compression Functions and Tree-Based Modes of Operation, with Applications to MD6,” Proceedings of FSE 2009, Lecture Notes in Computer Science, 22-25 February 2009, Leuven, pp. 104-121.
 R. C. Merkle, “A Certified Digital Signature,” Advances in Cryptology—Proceedings of CRYPTO’89, Lecture Notes in Computer Science, Santa Barbara, 20-24 August 1989, pp. 218-238.
 P. Sarkar, P. J. Schellenberg, “A Parallelizable Design Principle for Cryptographic Hash Functions,” 2002.
 S. Gueron, V. Krasnov, “Parallelizing Message Schedules to Accelerate the Computations of Hash Functions,” Journal of Cryptographic Engineering, Vol. 4, No. 4, 2012, pp. 241-253.
 S. Gueron, V. Krasnov, “[PATCH] Efficient Implementations of SHA256 and SHA512, Using the Simultaneous Message Scheduling method,” 2012.
 M. Buxton, “Haswell New Instruction Descriptions Now Available!” 2011.