In this series explaining the behavior of Finite State Entropy, I find it interesting to make a quick comparison with Huffman coding. The equivalence is a direct result of the description of FSE's decoding principles, but I believe it's useful to establish it clearly, since it will prove illustrative for future articles.

To better grasp the similarities, let's look at a simple example. We'll use the following probabilities, suitable for a Huffman tree :

A : 50% ; B : 25% ; C : 12.5% ; D : 12.5%

This tree only needs 8 values to be properly represented, and gives the following linear representation :

This representation uses the "high bit first" convention, which means we start our tree traversal from the highest bit of the register.

If it's **0**, then we have an **A**. We stop there, shift the register left by one bit, and load a new bit at the bottom. If the highest bit is **1**, we know we are not done, and need to read the second bit. If this second bit is **0**, we know we have a **B**. If not, we continue reading. And so on.

The decision tree looks as follows :

But we could have done it differently : we could have used "low bit first". In this case, the linear representation of the tree looks like this :

It seems a lot messier, but it's not : we have simply reversed the bit order of the decision tree :

The two decision trees are therefore completely equivalent.

Now, look again at the "low bit first" linear representation. It's important to state that *this is what FSE would produce*, but it also embeds a few additional properties :

- The "newStateBaseline" does not need to be calculated and stored into the decoding table : it can be automatically determined from the state itself : newStateBaseline = currentState & ~mask(nbBits);
- nbBits is stable for a given character : for **A**, it's always **1** ; for **B**, it's always **2** ; etc. This simplifies the decoding table, which does not need to store this information.

So basically, a Huffman low-bit-first table looks the same as an FSE table : it guarantees a stable number of bits per character, and newStateBaseline is automatically deduced from the remaining bits.

There is, however, a small but important difference.

A Huffman low-bit-first decoder would *shift right* the current state, and load the next bit(s) at the top. In contrast, the FSE decoder *keeps the remaining bits in place*, and only reloads the low bits. This creates a dependency on the future state : basically, once you've reached state **2** with FSE, for example, you can already guarantee that you won't see a **D** before you see a **C**. Which brings the question : how can the encoder select the right state, since it *depends on future characters yet to be decoded*?

The answer : the encoder works in the *backward* direction, encoding characters from last to first, solving dependencies along the way.

To be continued...