Monday, January 3, 2011

Range0 : new version (v0.7)

Several learnings from the Huff0 new version were generic enough to become applicable to Range0. More specifically, the interleaved counter and the call to memcpy(), resulted in significant improvements, as displayed below :

                       Range0 v0.6      Range0 v0.7
                       R    C    D      R    C    D 
Not compressible 
enwik8.7z            1.000 870 1400   1.000 885 1930 
Hardly compressible 
audio1               1.071 174   83   1.071 174   83 
Distributed 
enwik8-lz4-lit       1.370 155   76   1.370 155   76 
Lightly ordered 
enwik8-lz4-offh      1.191 138   76   1.191 138   76 
Ordered 
enwik8-lz4-ml        2.946 155   83   2.946 160   83
Squeezed 
enwik8-lz4-run       4.577 163  116   4.577 180  116 
Ultra compressible
loong                1430. 362  427   1430. 555  427  


The memcpy() makes wonder at improving speed for incompressible segments.
More importantly, interleaved counter speed up squeezed distribution compression.

The ultra-compressible corner case is not so important, in spite of the huge performance benefit, but the squeezed distribution benefit is, since this is Range0 most likely use. At 180 MB/s, it makes it almost as fast as standard Huffman coders, which basically means extra compression performance for free.

The bad point is, and will remain, the decoding speed, which cannot beat Huffman, due to the presence of a division into the main loop. However, the decoding speed is not too bad for the specific "squeezed" situation we want Range0. Indeed, should the distribution become even more squeezed, decoding speed would become even better.

Evolving the benchmark corpus in order to consider LZ4HC output, instead of Fast LZ4, is likely to make this statement more relevant.

For the time being, Range0 can be downloaded here.

No comments:

Post a Comment