Range0 v0.6 Range0 v0.7
R C D R C D
enwik8.7z 1.000 870 1400 1.000 885 1930
audio1 1.071 174 83 1.071 174 83
enwik8-lz4-lit 1.370 155 76 1.370 155 76
enwik8-lz4-offh 1.191 138 76 1.191 138 76
enwik8-lz4-ml 2.946 155 83 2.946 160 83
enwik8-lz4-run 4.577 163 116 4.577 180 116
loong 1430. 362 427 1430. 555 427
The memcpy() makes wonder at improving speed for incompressible segments.
More importantly, interleaved counter speed up squeezed distribution compression.
The ultra-compressible corner case is not so important, in spite of the huge performance benefit, but the squeezed distribution benefit is, since this is Range0 most likely use. At 180 MB/s, it makes it almost as fast as standard Huffman coders, which basically means extra compression performance for free.
The bad point is, and will remain, the decoding speed, which cannot beat Huffman, due to the presence of a division into the main loop. However, the decoding speed is not too bad for the specific "squeezed" situation we want Range0. Indeed, should the distribution become even more squeezed, decoding speed would become even better.
Evolving the benchmark corpus in order to consider LZ4HC output, instead of Fast LZ4, is likely to make this statement more relevant.
For the time being, Range0 can be downloaded here.