RealTime Data Compression: 5/15/11

Saturday, May 21, 2011

ZhuffMax parsing improvements

As a small attempt to improve parsing, i've made the lazy matcher of ZhuffMax a bit deeper. It resulted in some small gains, but also new problems.

With a deeper sequence to look for potentially better matches, we indeed increase our chances to find them.
However, in some circumstances, we end up with a serie of matches of increasing lengthes. At some point, the amount of bytes skipped by this recursive search may become so large that we could have used the first match to begin with.

In order to counterbalance this side-effect, we have to keep track of an history of matches, and try to re-use them if we have skipped too many bytes in the search process. It's not complex, and just requires a bit of local memory.

With both mechanisms active, we end up with a small but measurable compression gain. Nothing dramatic, but still, since ZhuffMax is about learning advanced parsing techniques, it's the right place to implement this improvement.
Deeper searches means slower speed though, so the gain comes at some cost.

Benchmark platform : Core 2 Duo E8400 (3GHz), Window Seven 32-bits

	version	threads	Compression Ratio	Speed	Decoding
ZhuffMax	0.2	-t1	2.825	14.1 MB/s	325 MB/s
ZhuffMax	0.2	-t2	2.825	28.0 MB/s	640 MB/s

You can get the new version of ZhuffMax at Zhuff's Homepage.