Stream Collector 2.2.0 released

UPDATE: This release contains an issue with Anonymous Tracking. For details, see: Anonymous Tracking issues in Stream Collector versions 2.1.1, 2.1.2 and 2.2.0 - #2

We have released version 2.2.0 of the Snowplow Stream Collector which adds a new module that allows SQS to be configured as a sink.

SQS-only collector

In version 2.0.0 we added the capability to configure an SQS buffer to which traffic can be routed in case Kinesis is unavailable. The new module now gives you the option to use SQS as the primary sink.

It is distributed as a separate Docker image and configured in a similar way to the Kinesis collector. You can find more details on how to install and configure the stream collector on our docs site: Setup the Snowplow collector - Snowplow Docs

At the moment, there is no direct way for Enrich to consume data collected in SQS. You can use the sqs2kinesis app to move the data over to Kinesis, which can then serve as the source for Enrich in the regular way. Currently, we only intend the SQS sink to be used as a fallback for deployment in an alternative region in case of a region-wide AWS outage in the main location of the Snowplow pipeline.
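To illustrate the SQS-to-Kinesis hop described above, here is a minimal Python sketch of the general idea. It is not the sqs2kinesis implementation. It assumes boto3-style clients (for example `boto3.client("sqs")` and `boto3.client("kinesis")`) are passed in, that each message body is a base64-encoded payload, and that the partition key travels in a message attribute named `kinesisKey`. The function name `forward_batch` and both encoding details are assumptions made for illustration only.

```python
import base64
from typing import Any

# Assumption for this sketch: the partition key is carried in this attribute.
PARTITION_KEY_ATTRIBUTE = "kinesisKey"


def forward_batch(sqs: Any, kinesis: Any, queue_url: str, stream_name: str) -> int:
    """Move one batch of messages from an SQS queue to a Kinesis stream.

    `sqs` and `kinesis` are expected to behave like boto3 clients.
    Returns the number of messages forwarded.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,          # SQS returns at most 10 per call
        WaitTimeSeconds=10,              # long polling
        MessageAttributeNames=["All"],
    )

    forwarded = 0
    for message in response.get("Messages", []):
        attributes = message.get("MessageAttributes", {})
        # Fall back to the message id if no partition key attribute is present.
        partition_key = attributes.get(PARTITION_KEY_ATTRIBUTE, {}).get(
            "StringValue", message["MessageId"]
        )
        kinesis.put_record(
            StreamName=stream_name,
            Data=base64.b64decode(message["Body"]),  # assumed base64 body
            PartitionKey=partition_key,
        )
        # Delete only after the write to Kinesis has returned without raising.
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        forwarded += 1
    return forwarded
```

Deleting a message only after the Kinesis write returns gives at-least-once delivery: if the write fails, the message becomes visible in the queue again and is retried. For a real deployment, use the sqs2kinesis app rather than a script like this.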

Static robots.txt

We have also added a new endpoint that serves a static robots.txt file to crawlers. This should reduce requests by well-behaved robots to the other collector endpoints, and may also result in fewer adapter_failure bad rows.
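As a rough illustration of why this helps, the Python sketch below shows how a well-behaved crawler would interpret a robots.txt file before requesting any collector path. The file body used here (a blanket disallow for all user agents) is an assumption about what the endpoint serves, and `collector.example.com` and `ExampleBot` are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt body: disallow everything for every user agent.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder host; the paths are examples of tracking endpoints.
    base = "https://collector.example.com"
    for path in ("/i", "/com.snowplowanalytics.snowplow/tp2"):
        allowed = crawler_may_fetch(ROBOTS_TXT, "ExampleBot", base + path)
        print(path, "allowed" if allowed else "disallowed")
```

A crawler that honours the file skips the tracking endpoints entirely, so it never sends the malformed requests that would otherwise end up as bad rows. Crawlers that ignore robots.txt are unaffected.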

Many thanks to Mike Robins from Poplin Data for contributing this change.
