Tuesday, March 29, 2011

Ross William's LZRW

Once upon a time...
Well, okay, it's not that long ago, but still, the early 90's seem like a prehistoric era for personal computing by now.
Ross Williams created a fairly interesting family of LZ compressors, featuring excellent speed and memory properties while still providing a good compression ratio (by the standards of the day), which he modestly called LZRW, for Lempel-Ziv-Ross Williams.

It has since become a classic among early compression algorithms, and remains competitive even by today's standards. Indeed, it is also the basis of several modern compressors, such as QuickLZ.

The main idea behind LZRW is this : instead of providing an offset to the LZ decoder, just provide it with a position in a table.

It looks complex or crazy at first, but it is quite logical when following Ross Williams' reasoning through the successive versions of LZRW :
Back in those days, computers featuring some 64KB of RAM were still commonplace. Better ones could reach an impressive 256KB. Since table sizes were therefore seriously limited, the capability to track all positions in the buffer to find the best match was, well, mostly out of reach.

LZRW does not target the best possible compression, but a fast and practical (i.e. memory-efficient) one. By accepting scarcity of references, it creates "holes" in the addressable space, which translate into compression inefficiency.
Hence the main idea : by using the table position instead of the offset, there are no longer any holes in the addressable space. As long as the table is completely filled, all positions have, roughly, an equivalent probability of appearing (well, roughly enough...). Furthermore, all matches, even long-range ones, use exactly the same number of bits to be described, making them more efficient.

The downside is that the decoder must reference the data exactly as the compressor did, making it more complex (and slower) than a more standard LZSS decoder. Still, this trade-off is quite acceptable, since the speed penalty is not too large.
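The slot-based lookup can be sketched roughly as follows in C. The hash function, the SLOT_BITS value and the helper names are illustrative assumptions, not Ross Williams' original code; the point is only that the encoder emits the slot index h (a fixed number of bits, whatever the match distance) rather than an offset, and that the decoder must perform the exact same table update to stay synchronized.

```c
#include <stdint.h>
#include <stddef.h>

#define SLOT_BITS 12                      /* table-index width sent to the decoder */
#define TABLE_SIZE (1u << SLOT_BITS)

/* Hypothetical 3-byte hash; the exact mixer does not matter, as long as
   encoder and decoder use the same one. */
static uint32_t hash3(const uint8_t *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
    return (v * 2654435761u) >> (32 - SLOT_BITS);
}

/* Encoder side: look up the candidate, then overwrite the slot with the
   current position. If the candidate actually matches, we emit the slot
   index h (SLOT_BITS bits) instead of an offset. */
static const uint8_t *probe_and_update(const uint8_t *table[], const uint8_t *ip,
                                       uint32_t *slot_out)
{
    uint32_t h = hash3(ip);
    const uint8_t *candidate = table[h];
    table[h] = ip;        /* the decoder performs this exact same update */
    *slot_out = h;
    return candidate;     /* NULL or stale entries simply fail the match test */
}
```

Note that a stale or colliding slot is harmless on the encoder side (the match test rejects it), which is exactly the "scarcity of references" trade-off described above.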

The LZRW scheme features a "best spot" for optimal compression ratio : the larger the table, the better its match-finding capability, hence a better compression ratio. However, the larger the table, the more bits are required to describe a table slot to the decoder, hence a worse compression ratio.

The "best spot" is found where both forces collide.

Or is it ?
One problem is that the optimal number of bits is not the same from one file to another.
For example, with "enwik8", this optimal number is 14 bits.
For "open office", it is 19 bits.
For very small files, this number can go down to 10 or less.
Therefore, the question is : how to find the optimal table size, in order to get the "optimal compression ratio" for any given input ?

Alas, that question has no easy answer, and I have yet to find one myself...
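For offline compression, one pragmatic (if brute-force) workaround, not discussed above, is simply to compress with several table widths and keep the one producing the smallest output. The compress_fn signature and the toy_cost model below are hypothetical placeholders standing in for a real compressor:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical compressor signature: compresses src using a 2^bits slot
   table and returns the compressed size in bytes. */
typedef size_t (*compress_fn)(const uint8_t *src, size_t len, unsigned bits);

/* Try every candidate table width and keep the one with the smallest output. */
static unsigned best_slot_bits(compress_fn compress, const uint8_t *src,
                               size_t len, unsigned min_bits, unsigned max_bits)
{
    unsigned best = min_bits;
    size_t best_size = (size_t)-1;
    for (unsigned b = min_bits; b <= max_bits; b++) {
        size_t sz = compress(src, len, b);
        if (sz < best_size) { best_size = sz; best = b; }
    }
    return best;
}

/* Toy cost model for illustration only: pretends the optimum is 14 bits,
   as observed for enwik8 in the article. */
static size_t toy_cost(const uint8_t *src, size_t len, unsigned bits)
{
    (void)src; (void)len;
    unsigned d = bits > 14 ? bits - 14 : 14 - bits;
    return 100 + d;
}
```

This costs one extra compression pass per candidate width, which is why it only suits scenarios where compression time is cheap; it does not answer the question of predicting the optimum from the input alone.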


  1. It's nice to see LZRW mentioned. While it was not the best at any one job, it is a nice mix for people who want to implement their own and learn about compression. I code (as a hobby) for a 1981-era 8088 PC, and LZRW is a nice place to start writing your own for that platform. (I replaced the hash function with the one from LZP, as it is better and faster to compute; I also added RLE for long runs of a single symbol.)

    In fact, I don't know of any faster decompression system for an 808x platform -- everything that compresses better using entropy encoding takes too long to decode (the goal is to beat pkunzip's decompression speed on that platform, not match it). If you know of one or could suggest one, I'd love to hear your thoughts.

  2. @trixter : sure, indeed, LZRW is very simple, and provides a nice compression boost for memory-limited systems. However, it is unlikely to be the fastest alternative for decompression speed.

    If decompression speed is the objective, then a more standard LZSS scheme will prove faster.
    You can probably keep most of your existing code as it is; you will simply need to provide an offset instead of a hash value.

    The reason is that while LZRW requires the decoder to update the table exactly as the compressor did, in order to remain synchronized, there is no table at all with LZSS.
    This translates into a much faster decoding loop.

    An example of this is LZ4 : http://fastcompression.blogspot.com/p/lz4.html

    Note however that, everything else remaining equal, you will probably lose a bit of compression ratio in the process.
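    To illustrate why the absence of a table helps, here is a minimal LZSS-style decode loop in C. The token format (a control byte, then either a literal byte or a one-byte offset/length pair) is invented for this sketch and is not LZ4's actual format; the point is that matches are copied straight from the already-decoded output, with no table to keep in sync.

```c
#include <stdint.h>
#include <stddef.h>

/* Decodes a toy LZSS-style stream into dst. Token layout (illustrative):
   0x00, byte          -> one literal byte
   0x01, offset, len   -> copy len bytes from dst[op - offset]
   Returns the decoded size, or 0 on malformed input. */
static size_t lzss_decode(const uint8_t *src, size_t src_len,
                          uint8_t *dst, size_t dst_cap)
{
    size_t ip = 0, op = 0;
    while (ip < src_len) {
        if (src[ip] == 0) {                          /* literal token */
            if (ip + 1 >= src_len || op >= dst_cap) return 0;
            dst[op++] = src[ip + 1];
            ip += 2;
        } else {                                     /* match token */
            if (ip + 2 >= src_len) return 0;
            size_t off = src[ip + 1], len = src[ip + 2];
            if (off == 0 || off > op || op + len > dst_cap) return 0;
            for (size_t i = 0; i < len; i++)         /* byte copy allows overlap */
                dst[op + i] = dst[op + i - off];
            op += len;
            ip += 3;
        }
    }
    return op;
}
```

    Note there is no hash computation and no table write anywhere in the loop, which is exactly what makes this approach attractive on a register-starved CPU like the 808x.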

  3. Is there any source available for LZ4? Or for your compressors?

    I have actually written an LZSS decoder in assembly for my target platform that uses all the registers to the point where the only memory reads are from the symbol sources and target buffer, and the only writes are to the target buffer -- it's really efficient code. My problem is that in order to make the most of it, I would need to write an encoder that does optimal parsing (all compression is offline so compression speed doesn't matter)... but no matter how many times I read S&S's paper or other explanations of optimal parsing, I don't understand it :-(

  4. Unfortunately, LZ4 source code is not available for download. Well, at least, not yet ... ;)

    Your LZSS implementation seems powerful enough as you describe it.

    Regarding Optimal Parsing, you are actually tackling a pretty tough topic. I would recommend this pleasant read from Charles Bloom :

    He's not really keen on the S&S methodology either, and aims not at "absolute maximal" but at "good enough" compression improvement. And I tend to like his views and explanations.