CompressionRatings benchmark were good, I was nonetheless surprised to notice that decoding speed did not improve when increasing the number of threads.
The only explanation I could come up with is that the LZ4 decoder decompresses data faster than the RAM Drive can deliver it. That is likely correct, considering the in-memory benchmark results (LZ4 can decompress at an average of 800 MB/s per core, while the RAM Drive can only deliver between 400 and 500 MB/s).
However, it would not explain why compressing with 4 threads was faster than decoding...
Here too, there is a plausible explanation: writing to the RAM Drive may be slower than reading from it. In that case, compression has an advantage, since it writes less data.
Now, let's put that hypothesis to the test. I built a quick benchmark to measure the read and write speed of a RAM Drive installed on a Windows XP box. I'm using Gavotte's Ramdisk for this test. Using another ramdisk might result in different speeds, but is unlikely to dramatically change the conclusions. A different OS may also change the results, but since CompressionRatings runs on Windows XP, I'm mostly interested in mimicking the same conditions.
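For readers who want to try something similar, here is a minimal sketch of such a benchmark in Python. It is not the tool used for the measurements in this post; the function name `measure_throughput`, the block size and the file size are all assumptions chosen for illustration. It writes one file sequentially, reads it back, and reports both speeds in MB/s.

```python
import os
import time

def measure_throughput(path, size_mb=64, block_kb=1024):
    """Sequentially write then read one file; return (write_MBps, read_MBps)."""
    block_bytes = block_kb * 1024
    block = os.urandom(block_bytes)              # incompressible payload
    n_blocks = max(1, (size_mb * 1024) // block_kb)
    total_mb = n_blocks * block_bytes / (1024 * 1024)

    # Write pass: flush and fsync so the data actually reaches the drive.
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    write_seconds = time.perf_counter() - start

    # Read pass: pull the file back block by block.
    start = time.perf_counter()
    read_bytes = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_bytes)
            if not chunk:
                break
            read_bytes += len(chunk)
    read_seconds = time.perf_counter() - start

    os.remove(path)                              # clean up the scratch file
    read_mb = read_bytes / (1024 * 1024)
    return total_mb / write_seconds, read_mb / read_seconds

# Hypothetical usage, assuming the RAM drive is mounted as R: on Windows:
# write_speed, read_speed = measure_throughput(r"R:\bench.tmp", size_mb=256)
```

On an ordinary disk, the read pass of a sketch like this mostly measures the OS file cache rather than the device, so the numbers are only meaningful when the target path really lives on the RAM drive.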
On running the benchmark, I witnessed a very stable 1190 MB/s for read operations.
On the other hand, write operations were limited to "only" 770 MB/s.
So now, that's confirmed: writing is slower than reading.
This is consistent with the CompressionRatings results. Extrapolating from these figures:
when compressing at a ratio of 2:1, the RAM drive's combined read+write speed is limited to 525 MB/s.
When decoding the same data (1:2), the speed limit drops to 455 MB/s.
Hey, that's about the same ratio as the difference between LZ4 compression and decompression speeds...
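The two limits quoted above can be reproduced with a small calculation. The sketch below is a reconstruction, not a formula stated in the text: the 525 and 455 MB/s figures happen to match a byte-weighted average of the measured read and write speeds, halved. The function names are made up for illustration. For comparison, it also computes a simple serial model (read time plus write time, with no overlap), which is another assumption. That model gives different absolute numbers but exactly the same ratio between the two cases.

```python
READ_MBPS = 1190.0    # measured read speed quoted above
WRITE_MBPS = 770.0    # measured write speed quoted above

def weighted_limit(read_parts, write_parts):
    """Byte-weighted mean of read and write speed, halved (reconstructed)."""
    total = read_parts + write_parts
    mean = (read_parts * READ_MBPS + write_parts * WRITE_MBPS) / total
    return mean / 2

def serial_limit(read_parts, write_parts):
    """Uncompressed-side MB/s if the read and the write happen one after the other."""
    larger = max(read_parts, write_parts)         # the uncompressed side
    seconds = read_parts / READ_MBPS + write_parts / WRITE_MBPS
    return larger / seconds

# Compression at 2:1 reads 2 parts and writes 1; decoding does the opposite.
comp, dec = weighted_limit(2, 1), weighted_limit(1, 2)
print(f"weighted: {comp:.0f} / {dec:.0f} MB/s, ratio {comp / dec:.2f}")
# prints "weighted: 525 / 455 MB/s, ratio 1.15"

comp_s, dec_s = serial_limit(2, 1), serial_limit(1, 2)
print(f"serial:   {comp_s:.0f} / {dec_s:.0f} MB/s, ratio {comp_s / dec_s:.2f}")
# prints "serial:   671 / 582 MB/s, ratio 1.15"
```

Whichever way the limit is estimated, the ratio only depends on the read/write speed gap and on the compression ratio, which is why it comes out at about 1.15 in both cases.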