Quite unusually, we have started this series on FSE by focusing on the decompression part. It may seem strange to try to decode data we have not even started to encode. That's because it's much easier to understand the logic this way, the decoder being much more "straightforward".
But we are still left, now, with the problem of properly encoding our source data. For an overall view of ANS theory, I will point to the updated paper by Jarek Duda, and for an in-depth description of ANS properties, let's mention the excellent 12-part series by Charles Bloom. With this context in mind, this blog will now concentrate on actual programming concepts, allowing you to build your own FSE encoder / decoder.
One key concept to keep in mind when dealing with ANS streams is the requirement to encode and decode in opposing directions. This is unusual : with well-known entropy codes (such as Huffman or Arithmetic, but also Golomb, Shannon, Elias, Tunstall, etc.), we keep the intuitive property of encoding and decoding forward. No longer with ANS.
This property is important to understand what we have to do to make compression work : we will simply reverse the decoder's operations.
Let's have a look at these operations. Taken from the previous article, here is a snippet from the decoding loop :
// Persistent variables across the loops : state and bitStream
BYTE symbol = decodeTable[state].symbol;
int nbBits = decodeTable[state].nbBits;
int rest = getBits (bitStream, nbBits);
state = decodeTable[state].newStateBaseline + rest;
If we roll back in time, we start by inverting the last operation :
newState = decodeTable[oldState].newStateBaseline + rest;
/* rest => */ int rest = getBits (bitStream, nbBits);
rest is basically what the decoder will read from the bitStream. So the inverse operation is to write those bits into the bitStream.
rest represents the lower part of newState. More precisely, it is the nbBits lower bits of newState.
Which leads us to the next question : how to determine nbBits ?
Well, in contrast with the decoder, this time, the encoder knows which symbol it has to encode. This is the critical piece of information we will use.
Remember that we emulate fractional bits by using sometimes n bits, sometimes n+1 bits, resulting in an average of n.fractional bits. Therefore, knowing the symbol, we can already tell that nbBits is going to be n or n+1.
But in fact, we know a bit more than that. Remember that, by construction, all sub-ranges must fill the entire range of state values. Remember also that smaller sub-ranges are present at the beginning, while larger ones occupy the rest. It gives the following example (for a probability of 5/4096) :
In this example : we know that if newState is >= 1024, then we are supposed to use 10 bits. If newState < 1024, then we will use only 9 bits.
You obviously don't need to store a full table per symbol, just knowing the threshold is enough.
And that's basically what FSE is doing : start by calculating nbBits :
int nbBitsOut = symbolTT[symbol].minBitsOut; // n
if (state >= symbolTT[symbol].threshold) nbBitsOut++; // or n+1
Note that in a later version, this process was changed for a different formula giving the same result :
U32 nbBitsOut = (U32)((statePtr->value + symbolTT.deltaNbBits) >> 16);
In this new version, there is no branch, nor cmov. It is replaced by a fixed right shift. The result is controlled by symbolTT.deltaNbBits, which ensures the result is n or n+1 depending on state value.
Knowing nbBits is key, since it allows us to determine rest, and therefore to flush this part into the bitStream.
int rest = newState & mask[nbBits];
writeBits(bitStream, rest, nbBits);
With that part solved, we are left with :
intermediateState = newState - rest = decodeTable[oldState].newStateBaseline;
which brings us one step closer to determining oldState, the final objective of the encoding stage.
to be continued...