In the end, I finally completed a new version of SecFull, an aggressive implementation which tries to promote every character it meets as it travels across the different levels. The changes required the ability to "move back" by one level, since in some cases chains were broken and needed repair.
A much simpler version, called Sec1, was produced in the process. It only tries to promote the last seen character, and only when a valid gateway is confirmed. This keeps the changes minimal and well localized, leaving all the rest of the code unmodified.
The difference in complexity between Sec1 and SecFull is quite large, so it is not obvious which one will bring the better benefit.
Anyway, it is now possible to compare. All search algorithms generate exactly the same references, resulting in the same output. Therefore, all changes affect only the speed of searches.
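To illustrate what "same references, different number of comparisons" means, here is a minimal sketch. It does not reproduce NoSec, Sec1 or SecFull: it uses two generic stand-ins (a brute-force window scan and a simple prefix-keyed chain), and every name in it (`MINMATCH`, `best_match`, `compare_strategies`) is hypothetical. It only shows the measurement principle: check that both searches return identical matches, then compare how many candidates each one had to test.

```python
MINMATCH = 4  # shortest match worth reporting (an assumption for this sketch)


def match_length(data, cand, pos):
    """Number of identical bytes starting at cand and pos (cand < pos)."""
    n = 0
    while pos + n < len(data) and data[cand + n] == data[pos + n]:
        n += 1
    return n


def best_match(data, pos, candidates):
    """Longest match among candidates (nearest first), plus comparison count."""
    best_len, best_pos, comparisons = 0, -1, 0
    for cand in candidates:
        comparisons += 1
        n = match_length(data, cand, pos)
        if n >= MINMATCH and n > best_len:
            best_len, best_pos = n, cand
    return (best_len, best_pos), comparisons


def compare_strategies(data, window):
    chains = {}  # MINMATCH-byte prefix -> earlier positions with that prefix
    refs_brute, refs_chain = [], []
    cmp_brute = cmp_chain = 0
    for pos in range(len(data)):
        lowest = max(pos - window, 0)
        # Strategy A: test every position in the window, nearest first.
        ref, n = best_match(data, pos, range(pos - 1, lowest - 1, -1))
        refs_brute.append(ref)
        cmp_brute += n
        # Strategy B: test only positions sharing the same MINMATCH-byte prefix.
        key = data[pos:pos + MINMATCH]
        chain = [c for c in reversed(chains.get(key, [])) if c >= lowest]
        ref, n = best_match(data, pos, chain)
        refs_chain.append(ref)
        cmp_chain += n
        if len(key) == MINMATCH:
            chains.setdefault(key, []).append(pos)
    return refs_brute, refs_chain, cmp_brute, cmp_chain


data = b"abracadabra abracadabra abracadabra" * 4
refs_a, refs_b, cmp_a, cmp_b = compare_strategies(data, window=64)
print("same references:", refs_a == refs_b)
print("comparisons:", cmp_a, "vs", cmp_b)
```

Since both strategies pick the longest match and break ties in favor of the nearest position, their outputs are identical, and only the comparison counters differ.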
As usual, let's start with a simple count of comparisons:
Number of comparisons - Window Search 64K:
That's for comparisons. Now, what about speed?
Surely, the fewer the comparisons, the better the speed?
Well, yes, approximately; but we also have to take code complexity into consideration. If each search is "more costly", then the total may end up being slower.
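A toy cost model makes this trade-off concrete. The figures below are entirely hypothetical and the function `total_cost` is introduced only for illustration; it simply charges a fixed price per comparison plus a fixed bookkeeping overhead per search, in arbitrary units.

```python
def total_cost(searches, comparisons_per_search, cost_per_comparison,
               overhead_per_search):
    """Total cost in arbitrary units: comparison work plus per-search overhead."""
    comparison_work = searches * comparisons_per_search * cost_per_comparison
    bookkeeping = searches * overhead_per_search
    return comparison_work + bookkeeping


# Hypothetical plain search: 20 comparisons per search, no extra bookkeeping.
plain = total_cost(searches=1000, comparisons_per_search=20,
                   cost_per_comparison=1, overhead_per_search=0)

# Hypothetical promoting search: 40% fewer comparisons, but each search
# pays 10 units of extra bookkeeping.
promoting = total_cost(searches=1000, comparisons_per_search=12,
                       cost_per_comparison=1, overhead_per_search=10)

print(plain, promoting)  # the promoting variant costs more in this scenario
```

With these made-up numbers the variant that performs fewer comparisons still ends up more expensive overall, which is exactly the risk described above.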
As can be guessed, Sec1 is more complex than NoSec, and SecFull is the most complex of all. So, does it pay off? Here are some results:
Speed - Window Search 64K:
|File||NoSec||Sec1||SecFull|
|Calgary||32.8 MB/s||31.5 MB/s||32.0 MB/s|
|Firefox||40.5 MB/s||38.9 MB/s||40.0 MB/s|
|Enwik8||30.8 MB/s||29.9 MB/s||30.1 MB/s|
Speed - Window Search 512K:
|File||NoSec||Sec1||SecFull|
|Calgary||16.6 MB/s||16.9 MB/s||17.9 MB/s|
|Firefox||19.4 MB/s||18.6 MB/s||19.9 MB/s|
|Enwik8||13.0 MB/s||13.2 MB/s||14.7 MB/s|
Speed - larger Window Search:
|File||NoSec||Sec1||SecFull|
|Calgary||6.5 MB/s||6.8 MB/s||8.1 MB/s|
|Firefox||2.3 MB/s||2.4 MB/s||2.7 MB/s|
|Enwik8||1.9 MB/s||2.4 MB/s||3.0 MB/s|
Time for some serious thinking.
Even with fewer comparisons, the new algorithms are not guaranteed to be faster. This is especially true at 64K, where all Secondary algorithms fail to produce any gain, even though the number of comparisons is well reduced.
We need a larger search buffer to make Secondary promotions worthwhile. That point is reached as soon as 512K, and the gain then keeps growing with search size.
In fact, SecFull beats Sec1 in each and every circumstance, so it can be preferred.
But to be fair, the results are a bit disappointing. The very large difference in comparison counts would lead us to expect a larger benefit. In the end, it is still relatively mild.
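To put numbers on "relatively mild", the relative gain of the third column over the first can be computed directly from the speed figures above. This is a small sketch under two assumptions read off the surrounding text: the columns are taken to be NoSec, Sec1 and SecFull in that order, and the rows are taken to be grouped by increasing window size (64K, then 512K, then a larger size whose label is not given, called "larger" here).

```python
# Speeds in MB/s, copied from the speed figures above.
# Assumed column order: (NoSec, Sec1, SecFull).
# Assumed row grouping: 64K, 512K, then an unlabelled larger window.
SPEEDS = {
    "64K": {
        "Calgary": (32.8, 31.5, 32.0),
        "Firefox": (40.5, 38.9, 40.0),
        "Enwik8": (30.8, 29.9, 30.1),
    },
    "512K": {
        "Calgary": (16.6, 16.9, 17.9),
        "Firefox": (19.4, 18.6, 19.9),
        "Enwik8": (13.0, 13.2, 14.7),
    },
    "larger": {
        "Calgary": (6.5, 6.8, 8.1),
        "Firefox": (2.3, 2.4, 2.7),
        "Enwik8": (1.9, 2.4, 3.0),
    },
}


def gain_percent(baseline, variant):
    """Speed gain of variant over baseline, in percent (negative = slower)."""
    return (variant / baseline - 1.0) * 100.0


def secfull_gains(speeds):
    """Gain of the third column over the first, per window and per file."""
    return {
        window: {name: round(gain_percent(row[0], row[2]), 1)
                 for name, row in files.items()}
        for window, files in speeds.items()
    }


for window, gains in secfull_gains(SPEEDS).items():
    print(window, gains)
```

Under these assumptions the gain is slightly negative in the first group, positive in the second, and largest in the third, which matches the trend described above.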
No ground-breaking progress this time...