RealTime Data Compression: Range0 : new version (v0.7)

Monday, January 3, 2011

Range0 : new version (v0.7)

Several learnings from the Huff0 new version were generic enough to become applicable to Range0. More specifically, the interleaved counter and the call to memcpy(), resulted in significant improvements, as displayed below :

Range0 v0.6 Range0 v0.7

R C D R C D

Not compressible

enwik8.7z 1.000 870 1400 1.000 885 1930

Hardly compressible

audio1 1.071 174 83 1.071 174 83

Distributed

enwik8-lz4-lit 1.370 155 76 1.370 155 76

Lightly ordered

enwik8-lz4-offh 1.191 138 76 1.191 138 76

Ordered

enwik8-lz4-ml 2.946 155 83 2.946 160 83

Squeezed

enwik8-lz4-run 4.577 163 116 4.577 180 116

Ultra compressible

loong 1430. 362 427 1430. 555 427

The memcpy() makes wonder at improving speed for incompressible segments.
More importantly, interleaved counter speed up squeezed distribution compression.

The ultra-compressible corner case is not so important, in spite of the huge performance benefit, but the squeezed distribution benefit is, since this is Range0 most likely use. At 180 MB/s, it makes it almost as fast as standard Huffman coders, which basically means extra compression performance for free.

The bad point is, and will remain, the decoding speed, which cannot beat Huffman, due to the presence of a division into the main loop. However, the decoding speed is not too bad for the specific "squeezed" situation we want Range0. Indeed, should the distribution become even more squeezed, decoding speed would become even better.

Evolving the benchmark corpus in order to consider LZ4HC output, instead of Fast LZ4, is likely to make this statement more relevant.

For the time being, Range0 can be downloaded here.

RealTime Data Compression

Monday, January 3, 2011

Range0 : new version (v0.7)

No comments:

Post a Comment