We’re pleased to announce we’ve released RDB Loader version 5.3.2.
This patch release brings a few features and bug fixes that improve the stability and observability of RDB Loader:
Caching and reusing tokens now makes all Loaders (Redshift, Databricks, Snowflake) less prone to authentication failures.
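As a rough illustration of the idea (not the Loader's actual implementation), a token cache can be sketched as follows. The `TokenCache` class, the `fetch` callback and the `refresh_margin` parameter are all hypothetical names chosen for this example; the assumption is that the token provider returns a token together with its lifetime in seconds.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches a token and reuses it until shortly before it expires.

    Hypothetical sketch: `fetch` stands in for whatever call obtains a
    fresh token and is assumed to return (token, lifetime_in_seconds).
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        refresh_margin: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._margin = refresh_margin  # refresh this many seconds before expiry
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Fetch only when there is no token yet or the cached one is about to expire.
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Reusing a token this way means fewer round trips to the authentication service, so there are fewer opportunities for a request to fail.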
Before 5.3.2, if a schema was referenced within the same batch of events both as a context schema and as a self-describing event schema, the Redshift Loader would execute two COPY statements. This resulted in duplicates in the warehouse, because the data was loaded twice from the same source path. Starting from 5.3.2, only one COPY statement is executed for each unique schema.
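The fix boils down to deduplicating by schema before any load is scheduled. Here is a minimal sketch of that idea, assuming a batch is described as a list of (schema key, kind) pairs; the `plan_loads` function, the schema keys and the bucket path are made up for illustration and do not reflect the Loader's internals.

```python
from typing import Iterable, List, Tuple

# (schema key, how the batch referenced it); both are illustrative values.
SchemaRef = Tuple[str, str]


def plan_loads(base_path: str, refs: Iterable[SchemaRef]) -> List[Tuple[str, str]]:
    """Return one (schema key, source path) pair per unique schema."""
    seen = set()
    plan = []
    for schema_key, _kind in refs:
        if schema_key in seen:
            continue  # already scheduled: a second load would duplicate rows
        seen.add(schema_key)
        plan.append((schema_key, f"{base_path}/{schema_key}"))
    return plan


# The same schema appears as a context and as a self-describing event.
batch = [
    ("com.acme/checkout/1", "context"),
    ("com.acme/checkout/1", "self_describing_event"),
    ("com.acme/page_view/1", "context"),
]
plan = plan_loads("s3://example-bucket/run=1", batch)
print(len(plan))  # prints 2
```

Because both references resolve to the same source path, scheduling the load once per schema key is enough to avoid loading the same data twice.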
Improved folder monitoring
To simplify troubleshooting, we added a short message reporting that unloaded folders exist (without superfluous details about specific folders).
The service responsible for receiving alerts can react to the new summary message or keep relying on the messages for specific folders, which are still sent and remain unchanged.
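To make the relationship between the two kinds of message concrete, here is a small sketch. The message wording and the `folder_alerts` function are hypothetical; the real alert payloads are not reproduced here.

```python
from typing import List


def folder_alerts(unloaded_folders: List[str]) -> List[str]:
    """Build one summary alert plus one alert per unloaded folder."""
    if not unloaded_folders:
        return []  # nothing to report
    # The summary carries no folder names, only the fact that some exist.
    summary = f"Unloaded folders detected: {len(unloaded_folders)}"
    # Per-folder alerts are still produced alongside the summary.
    per_folder = [f"Unloaded folder: {folder}" for folder in unloaded_folders]
    return [summary] + per_folder
```

A receiving service that only needs to know that something is wrong could match on the summary alone and ignore the rest.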
- Added configurable Databricks JDBC driver logging, helpful for troubleshooting issues in the Databricks Loader.
- Added scanning of Docker images with Snyk on GitHub Actions. This allows us to detect Docker container security vulnerabilities in CI.
- Added timeouts on rollbacks to prevent RDB Loader from getting stuck indefinitely.
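The rollback timeout from the last item can be pictured with a generic sketch like the one below. It is not the Loader's code: `rollback_with_timeout` and `slow_rollback` are hypothetical names, and the approach shown (running the rollback in a worker thread and bounding the wait) is just one common way to stop a caller from blocking forever.

```python
import concurrent.futures
import time
from typing import Callable


def rollback_with_timeout(rollback: Callable[[], None], timeout_s: float) -> bool:
    """Run `rollback` in a worker thread; return False if it did not finish in time."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(rollback)
    try:
        future.result(timeout=timeout_s)  # re-raises any error from the rollback
        return True
    except concurrent.futures.TimeoutError:
        return False
    finally:
        # Do not block on a slow rollback. Python cannot kill the worker
        # thread, so it keeps running in the background until it returns.
        pool.shutdown(wait=False)


def slow_rollback() -> None:
    """Hypothetical stand-in for a rollback that takes too long."""
    time.sleep(0.5)
```

Calling `rollback_with_timeout(slow_rollback, 0.05)` returns `False` after roughly 50 ms instead of waiting for the rollback to complete, which lets the caller log the problem and move on.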
Full changelog available here.
If you are already using a recent version of RDB Loader (3.0.0 or higher), then upgrading to 5.3.2 is as simple as pulling the newest Docker images. No changes to your configuration files are needed.
docker pull snowplow/transformer-pubsub:5.3.2
docker pull snowplow/transformer-kinesis:5.3.2
docker pull snowplow/rdb-loader-redshift:5.3.2
docker pull snowplow/rdb-loader-snowflake:5.3.2
docker pull snowplow/rdb-loader-databricks:5.3.2
The Snowplow docs site has a full guide for running the RDB Loader.