Thursday, June 26, 2014

Debunking the LZ4 "20 years old bug" myth

[Edit] The below post was redacted in the heat, right after the initial publication. Should you wish a summarized, and arguably more neutral tone, view of the situation, please consider reading the follow up.

 A recent post on a security blog has made a wonderful job at sexing up a subtle bug affecting LZ4, claiming that it would result in remote code execution. Given the current spread of use of LZ4, from every modern Linux distro, including critically Android, to modern file systems such as ZFS, shipped with FreeBSD and Illumos, through widely deployed databases such as MySQL, or big data nodes powered by Hadoop, the implication is that it must be a pretty big deal.

The detailed article then makes a minimal job at describing the strong conditions required to trigger that risk, but these explanations are lost, scattered within the content of an overly long and boring technical article, ensuring most readers will stop at the dramatic headline. The present article has the objective to counterweight those "high stakes" claims.

I won't spend too much time underlining the vast difference between reporting a bug for correction to the relevant team, and creating a dramatic communication for maximum coverage without even bothering if a fix is available for the exploit. The first behavior is valuable, helpful and welcomed. The second one is at best irresponsible, if not downright harmful. A security auditor is expected to know this simple common sense, and behave accordingly.

But perhaps more importantly, this bug wasn't unknown, which contradicts the statement that it was found through careful code study. The risk was identified and openly published on the LZ4 public issue board by Ludwig Strigeus, the creator of ĀµTorrent.  Instead of trying to make a headline, he even proposed a solution for it (unfortunately with some side effects). Ludwig made a fine job at describing the risk and opening a discussion.The bug was classified minor (rightly or wrongly) for reasons which will be detailed below. Thus, it was tracked for correction, but with no urgency, trying to design a fix without the initial side effects. After multiple partial improvements, a change of code related to the new streaming mode finally allowed such a fix to be created, closing the issue (by chance, merely a few days before DonB second "disclosure").

Let's now get into the technical details.
In order to use the vulnerability, a number of conditions must be met.
A first minor one is : it is necessary to work within a 32-bits environment. 64-bits is totally unaffected. That basically put most server-side applications outside of the risk zone.

Second, the attacker need to forge a special compressed block to overflow 32-bits address space. This is possible ... if the compressed block is something like 16 MB.

There is just a little problem with that : the legacy LZ4 file format is limited to 8 MB blocks. Any value larger than that just stops the decoding process. 8MB is not enough to trigger a problem, since 32-bits application are restricted to low address ranges. The newer streaming format is even stricter, with a hard limit at 4 MB. As a consequence, it's not possible to exploit that vulnerability using the documented LZ4 file/streaming format.

Well, but what about programs which use their own, undocumented, format ?
Indeed, same condition applies. To be exposed to this risk, very large blocks of 16MB or larger must be read by the decoder.
Does that happen ?
Let's have a look at several high-profile programs using LZ4. ZFS ? Max block size is 128 KB. Lucene ? Typical index size is 16 KB. It could be a bit more, say 64 KB, that's still far short of the objective. zram maybe ? nope, 4 KB memory segments. Linux kernel ? The boot section has to decompress a multi-megabytes kernel into memory, surely it can match the limit ? Yes, it does, but it uses the LZ4 Legacy File format, which limits each block to 8 MB maximum. OK, maybe AAA games such as Guild Wars 2 ? nope, real-time protocol is the realm of short messages, the protocol itself doesn't allow anything beyond a few KB. And so on, and on.

At the end of the day, none of the known implementation of LZ4 is exposed to this risk.
Sure, someone creating some private program at home could get into this pitfall, but this situation is not exactly a "security risk". Only widely deployed programs actually matter, since they are the potential targets by hackers or surveillance organizations.
Basically, most user programs employ LZ4 for small data packet structure, way below the critical limit. Programs which generate and distribute large compressed blocks (notably the lz4c pos-x compression utility, distributed within Linux Distro) use the documented framing format, which limits block size to 4 or 8 MB. Remove also from the list programs which never take "externally provided" data as input, they can't be targeted either.

So sorry, this is not a "new heartbleed" situation the author seems to dream for.

Sure, since a fix is currently available, it's nonetheless a good move to close this risk. But I wouldn't flag that action as "critical", since most applications stand outside of the "custom compression format using large blocks of > 8 MB on 32-bits system, and receiving data from untrusted external sources" scenario.

It's one thing to tell there is a potential vulnerability that should be fixed, to ensure it does not become exploitable in the future. It's a totally different thing to pretend there is a dangerous RCE (Remote Code Execution) exploit currently active on the Internet, which is scary. DonB article cleverly conflates the two, implying the second to create a flashy headline, while disseminating some minor disclaimer elements throughout the long article to pretend, whenever necessary, having said the first. That's clever, but conflating gravity level to grab some free ad is not a respectful behavior from a security firm.

The world is already full of real security threats, from which heartbleed is just one of the most recent examples. Triggering too many alarms to grab a bit of fame is a good way to weaken the power of future alarms. You can guess that when every headline claim a "new heartbleed" situation, no one will pay attention to the real next one which will matter. And that's dangerous.

The long list of "credits" at the end of the article is another reason for caution. It happens I know some the influential names listed there, and they told me they barely heard indirectly about the guy. So why are they listed there ? Fame by association ? Sure, please thank Linus Torvald for "coordinating and advising" on the issue. This is becoming plain ridiculous.

Anyway, should you feel nonetheless at risk now, please don't hesitate to update your LZ4 version. It's a good thing to do anyway, and as stated previously, the vulnerability was already patched.


  1. I did a LZW implementation for Ticketmaster in 1992. I got the algorithm from a magazine pseudocode. I used the same algorithm in TempleOS. You have no idea what you are talking about.

    1. Care to elaborate? What points has the article gotten wrong?

    2. Maybe he's being sarcastic ?

    3. Just a heads up: the guy you are responding to is quite (in)famous on Reddit. He's a really smart guy but he sadly has schizophrenia and thanks to the American medical system being a joke he has no access to proper treatment. You might want to take his response with a pinch of salt.

  2. Unfortunately, the poster above, Terry Davis, suffers from what many people believe to be schizophrenia. He is a person of very high intelligence, but often says very inappropriate things. :(

    1. Yeah, I guess it happens, this is Internet after all. ;-) No worry

    2. And you are one of those guys with compulsive-obsessive disorder that always, ALWAYS have to point out Davis' condition whenever his name is written somewhere.


    1. For the record: this comment was meant as a reply to Terry A Davis

      The article makes perfect sense :)

  4. Thank you very much for making that clear !

  5. Nice analysis. Thank you!

  6. but but... it's writing to memory here:


    1. The example linked to seems to confuse "kernel space" from "user space". It's not possible to simply call a "kernel function", as he does within his example by copy/pasting the code into his sample. Kernel functions can only be called from within the kernel.

      Besides, his example use the unsafe version "LZ4_decompress_fast" (named lz4_uncompress within Linux kernel), which is, by definition, unprotected from malicious input. The protected version is "LZ4_decompress_safe" (named lz4_uncompress_unknownoutputsize within Linux Kernel)

      LZ4 kernel version is currently used by the BootLoader, zRam, SquashFS and BTRFS. None of them is providing the decoder with data blocks large enough to risk being exposed. That's why current Linux kernels are safe.

      Now, it could that, in the future, a yet unknown program within Linux Kernel may do just that, use blocks of 16MB and beyond. It's unlikely but it *could* happen. That's why the Kernel version must be patched (and is being patched, I'm in contact with Greg for that).

      But there is no risk *righ now*. No external program can exploit it using any of the entry points offered by the kernel so far.

    2. Well, there are reports telling libav/ffmpeg could be at risk. They both deal with ton of custom data formats and can use blocks over 16M.

      Basically you never know how people would use your code. So leaving landmines like this in lib is really wrong idea, especially if you're aware of them. And speaking for myself, I think "unprotected"/non-validated functions like LZ4_decompress_fast just should not exist on kernel side. This gives user mode heck a lot chances to try to exploit kernel and elevate rights.

    3. Well, here is your comment, dug back from the grave.

      As far as I know, libav/ffmpeg don't use LZ4, so I can't comment. Furthermore, they seem to use a "custom version" of LZO, so they are basically managing their own risk, which is also limited to their application.

      Regarding LZ4_decompress, the reference implementation clearly promotes the _safe() variant, which is at the top of the interface, while the _fast() variant must be found later on, within the "advanced section".

      It's also my recommendation that LZ4_decompress_safe() should be used under most circumstances.
      There are programmers though which are willing to use the _fast variant, because they are working in a closed environment, with no external interaction. It's typically the case when the program compresses and decompresses its own data, directly from memory (without intermediate storage).

    4. Still, it is better save than sorry, so it would be nice if such generic and widespread things like LZ4 would not cause unexpected surprises, even in "uncommon" scenarios. I think I should agree that fortunately most usecases do not suffer from bug, so it wouldn't go as bad as it could be. However, there could be usecases which could be affected and it is really good idea to update LZ4 and LZO libs before blackhats find them and start exploiting it to gain unauthorised access, etc.

      As for _fast, got it. Though from my experience, "closed" environment usually happens to be not "as closed as expected", leading to funny attack vectors. But I would agree on that. And after re-checking Linux kernel, I can tell they were smartass enough not to use _fast in their hazardous area.

      P.S. and please accept my apologies for over-reacting on comment removal. Since it turned out to be spam protection, it has been really wrong idea to aggravate on post removal.

    5. While closed environments are often not as closed as expected, C provides much easier ways of shooting yourself in the foot than explicitly choosing an unsafe decompressor.

    6. C programmers are usually aware of common pitfalls. And searching for unknown bug in unknown program isn't very easy. OTOH decompressor already does half of job attacker needs (accesses memory under control of external data in predictable ways). So attempt to try to fool decompressor looks like promising idea if you're about to break into "closed" system. But I would agree its not compression lib author fault if someone uses unsafe but faster algos and then fails to ensure data were valid.

  7. Interesting. LZ4 author seems to be coward and deletes posts admitting that libav/ffmpeg seems to be affected by this vulnerability due to custom formats. Should I tell it is irresponsible attitude to downplay security issues? And it is also quite irresponsible to fix security problems one year after getting aware of them.

    1. Apparently, a commenter is courageous enough to send a libellous comment in Anonymous mode

      I don't "hide" posts, they are all displayed here automatically, without my approval. Now if you make a habit to send comment in anonymous mode, it may happen sometimes that one comment would automatically be stored into "spam" folder for reasons outside of my control. It will then require me to make a bit of research to find it and promote it.

    2. Hmm, if we're about "courage", its simple: I just do not have any of mentioned "profiles". Hence, posting as anonymous user has been just most convenient and fast option. And if it has been anti-spam measures what removed my comment, then I'm really sorry for being too harsh on that.