Advisory: Impact of Log4j 2 CVE-2021-44228 on Snowplow components

On 9 December 2021, a vulnerability was identified in Apache Log4j 2, the popular Java logging library, which shows up as a dependency in a number of Snowplow applications.

The exploit can give attackers Remote Code Execution (RCE) when an application logs a specially crafted string.

The vulnerability has been published as CVE-2021-44228.

Snowplow applications are not affected

We have reproduced the exploit using log4j. However, in order to do so, we needed to use log4j directly, which does not happen anywhere in our code estate. Instead, in applications like stream-collector, enrich, snowplow-s3-loader and snowplow-elasticsearch-loader, as well as the snowplow-java-tracker, we use a library called slf4j.

Each slf4j distribution binds to a single logging backend at compile time, which in our case is slf4j-simple and not log4j. But we have also not been able to reproduce the exploit when using log4j via slf4j – only when using it directly.

Currently we believe no Snowplow components are affected by this vulnerability. However, if you believe you have identified a vulnerable component, please contact us via our HackerOne program.


For a more detailed description of the exploit, a good resource is:

For more information on how slf4j binds to logging backends, see here.


Providing an update to the above in response to the coverage to date and also taking time to reflect on the extent of our testing in response to the log4j security vulnerability.

Testing summary
Our tests were two-fold: 1) we went through the code bases of key components to evaluate our logging strategy for externally provided strings and our usage of log4j, and 2) we ran a set of tests against our estate in an attempt to trigger the vulnerability.

The outcome of these tests gives us confidence that at least the latest (and likely, but not guaranteed, previous) versions of Snowplow components are not exploitable by this vulnerability. However, our dependencies may still be. We continually monitor our dependencies (and their dependencies, and so on) in both our components and containers, and we will of course respond promptly to vulnerability reports and fixes. But the analysis of the vulnerability suggests the reports and fixes may take some time to surface.

Our testing yesterday was performed on the configurations we run ourselves and deploy for our customers, which have the latest versions of components and late (if not latest) versions of the runtime. It is worth noting that these tests would not indicate that your pipeline is safe from this exploit if you are running an older version of Snowplow and/or an affected version of Java.

Anyone consuming our recent Docker images (the most common way to deploy Snowplow OS) should be using an unaffected version of Java. If you’re deploying our jar files, you may be at higher risk.

As a reminder, this vulnerability is expected to have widespread impact. While we’re doing what we can to protect the assets the Snowplow community is responsible for, there may be issues discovered in the platforms our components run on, depend on and integrate with.

Our Recommendation
Our strong recommendation to our community is to follow the guidance regarding checking and bumping Java versions where you can, and applying the config change where you can’t - ideally both. And as standard, we would also recommend ensuring that you are on the latest versions of our components. The Internet is being actively scanned for this exploit and it’s a trivial test to identify it.

Snowplow Customers in our community reading this can be assured that our confidence around your pipelines remains very high given our commitment to keeping the components and runtime up to date.

This is a gnarly issue that seems like it’s going to run for some time. We wish all of the community the best with this and hope we can support each other - and the wider tech community - through what could be a tough time.


Hi @dilyan / @stevecs - is this statement relevant regarding the interaction of log4j2 and slf4j?

http://slf4j.org/log4shell.html

How about the SLF4J API?

The SLF4J API is just an API which lets message data go through. As such, using log4j 2.x even via SLF4J does not mitigate the vulnerability.

However, as mentioned previously, log4j 1.x is safe with respect to CVE-2021-44228. Thus, if your SLF4J provider/binding is slf4j-log4j12.jar, you are safe regarding CVE-2021-44228.

If you are using log4j-over-slf4j.jar with SLF4J API, you are safe unless the underlying implementation is log4j 2.x.

Thanks!


Hi @timolin - yes it is certainly relevant, both for Snowplow pipeline applications, and for user applications that use the Snowplow java tracker. Thank you for linking to the statement.

The statement explains that it is the logging implementation that needs to be considered, rather than just the slf4j API. Slf4j comes with several logging implementations but from what I can work out none of them depend on version 2 of log4j. There is a binding for log4j version 1.2, but that version does not suffer from CVE-2021-44228. Edit: log4j-slf4j-impl is a slf4j binding using log4j version 2, which therefore would be vulnerable to CVE-2021-44228.

Users of the Snowplow java tracker need to check that the slf4j logging implementation they use is safe. The statement is helpful in its advice: if your slf4j implementation is based on log4j then make sure your log4j configuration file is read-only, and be aware that there are known security vulnerabilities in many of the slf4j implementations.

As for our pipeline applications (collector, enrich, loaders etc), we use slf4j-simple as the logging implementation, which does not depend on log4j at all.
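One way to check a deployment for yourself is to look inside the jar files you actually run. The sketch below is only an illustration, not an official Snowplow tool: it walks a directory and reports any jar that bundles the `JndiLookup` class from log4j-core 2.x, which is the class the exploit relies on. The function name and the directory layout are assumptions made for the example, and it does not look inside jars nested within other jars.

```python
import zipfile
from pathlib import Path

# Class file shipped by log4j-core 2.x that performs the JNDI lookup.
JNDI_LOOKUP_CLASS = "org/apache/logging/log4j/core/lookup/JndiLookup.class"


def find_jars_with_jndi_lookup(directory):
    """Return the paths of jars under `directory` that bundle JndiLookup.

    Hypothetical helper for illustration. Only top-level entries of each
    jar are inspected; jars nested inside other jars are not unpacked.
    """
    flagged = []
    for jar_path in sorted(Path(directory).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar_path) as jar:
                if JNDI_LOOKUP_CLASS in jar.namelist():
                    flagged.append(str(jar_path))
        except zipfile.BadZipFile:
            # Not a readable zip archive; skip it rather than abort the scan.
            continue
    return flagged
```

An empty result for a fat jar suggests log4j-core 2.x is not bundled in it, but it says nothing about other jars on the classpath or about the platforms the component runs on.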


Providing an update on this. The scoping to Java versions has been dropped from the CVE, and examples of exploits performed on later Java versions are circulating. While we have still not identified anywhere in the pipeline estate that logs plain-text user-submitted content to log4j, we would highly recommend applying the recommended configuration change.
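The thread does not spell out the configuration change, so the sketch below assumes it means the widely published mitigation of setting the `log4j2.formatMsgNoLookups` system property (or the `LOG4J_FORMAT_MSG_NO_LOOKUPS` environment variable) to `true`, which only has an effect on log4j 2.10 and later. The function name is hypothetical; it simply checks whether a JVM command line or environment carries that setting.

```python
import shlex

# Assumed mitigation settings; the thread itself does not name them.
SYSTEM_PROPERTY_FLAG = "-Dlog4j2.formatMsgNoLookups"
ENV_VAR = "LOG4J_FORMAT_MSG_NO_LOOKUPS"


def lookups_disabled(jvm_args, env):
    """Return True if message lookups are switched off.

    `jvm_args` is the JVM command line as a string; `env` is a dict of
    environment variables. JAVA_TOOL_OPTIONS is checked as well, since
    the JVM picks up flags from it.
    """
    tokens = shlex.split(jvm_args) + shlex.split(env.get("JAVA_TOOL_OPTIONS", ""))
    for token in tokens:
        key, _, value = token.partition("=")
        if key == SYSTEM_PROPERTY_FLAG and value.lower() == "true":
            return True
    return env.get(ENV_VAR, "").lower() == "true"
```

For example, `lookups_disabled("-Xmx1g -Dlog4j2.formatMsgNoLookups=true -jar app.jar", {})` reports the flag as present, while a command line without it and an empty environment does not.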


Update 2021-12-16:

We are recommending users of Snowplow Mini update to at least 0.13.2 which includes additional mitigations for the log4shell vulnerability.

Out of an abundance of caution, we have also published updates to a variety of Snowplow components. We now recommend running each of these components at the following versions:

Stream Collector: v2.4.3
Enrich: v2.0.4
S3 Loader: v2.1.2
Elasticsearch Loader: v1.0.3 and v2.0.2
GCS Loader: v0.3.2
Snowplow Mini: v0.13.2
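To compare what you have deployed against the list above, a small check like the following may help. It is a sketch under stated assumptions: the version numbers are copied from the list, the function names are made up for the example, and versions are assumed to be plain `X.Y.Z` strings with an optional leading `v` (no pre-release suffixes).

```python
# Recommended versions from the list above. Elasticsearch Loader has one
# recommended version per major release line.
RECOMMENDED = {
    "Stream Collector": ["2.4.3"],
    "Enrich": ["2.0.4"],
    "S3 Loader": ["2.1.2"],
    "Elasticsearch Loader": ["1.0.3", "2.0.2"],
    "GCS Loader": ["0.3.2"],
    "Snowplow Mini": ["0.13.2"],
}


def parse_version(version):
    """Turn 'v2.4.3' or '2.4.3' into the tuple (2, 4, 3)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def meets_recommendation(component, deployed):
    """Return True if `deployed` is at or above the recommended version."""
    deployed_v = parse_version(deployed)
    floors = [parse_version(v) for v in RECOMMENDED[component]]
    # Prefer the recommendation on the same major release line, if any.
    same_major = [floor for floor in floors if floor[0] == deployed_v[0]]
    if same_major:
        return deployed_v >= same_major[0]
    # Otherwise compare against the highest recommended version.
    return deployed_v >= max(floors)
```

With these assumptions, `meets_recommendation("Elasticsearch Loader", "v1.0.3")` is true, while `meets_recommendation("S3 Loader", "1.0.0")` is false because the only recommended S3 Loader version is on the v2 line.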


Hi Paul,

Thank you and the whole team for all your work on this.

One question regarding the update recommendation: what is the best way to upgrade components such as Collector, Enrich and S3 Loader to their latest versions in a Snowplow Quick Start environment?

Should we wait for a new Quick Start release (latest is from 11 Oct 2021)? Similar to Snowplow Mini. Or is there a quicker way to upgrade components (preferably via terraform) in a Quick Start environment?

Thanks!

Hi @alexv
Thanks for asking. We’re currently working through the terraform updates to make it easier to upgrade your Quick Start environments. We’ve already published new modules for Enrich. The collector is next, then we will do the S3 Loader (which is a major version bump, v1 to v2, so we need to take a little extra caution here).

https://registry.terraform.io/modules/snowplow-devops/enrich-kinesis-ec2/aws/latest

https://registry.terraform.io/modules/snowplow-devops/enrich-pubsub-ce/google/latest

Once we’ve bumped all the modules, we’ll update the quickstart-examples too.


Hi @alexv

The Quickstart Examples have now been bumped to use the latest terraform module components.


Awesome Paul! Thank you for the update.
