Version 5.7.1 of Snowplow’s RDB Loader is now released.
Skipping schemas in the transformer
We added a new optional config field to the Spark batch transformer:
"skipSchemas": [
"iglu:com.example/my_schema_1/jsonschema/1-0-0",
"iglu:com.example/my_schema_2/jsonschema/1-*-*",
]
If you supply the skipSchemas config option, then the transformer’s output files will omit any columns that use one of the listed schemas. This works for both self-describing events and entities.
This feature can be helpful when recovering from an edge-case schema which, for some reason, cannot be loaded into the table. If you follow the usual rules of schema evolution then hopefully you will never need this feature.
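To make the wildcard entry in the example above more concrete, here is a minimal Python sketch of how a list like skipSchemas could be matched against schema URIs. It is an illustration only and not the transformer's actual code: the helper names (matches_criterion, is_skipped) are hypothetical, and it assumes that a * in the version simply stands for any model, revision or addition number.

```python
import re

# Hypothetical illustration, not part of RDB Loader.
# Assumes entries look like iglu:vendor/name/format/MODEL-REVISION-ADDITION,
# where each version part is either a number or the wildcard "*".
_IGLU_URI = re.compile(
    r"^iglu:(?P<vendor>[^/]+)/(?P<name>[^/]+)/(?P<format>[^/]+)/"
    r"(?P<model>\d+|\*)-(?P<revision>\d+|\*)-(?P<addition>\d+|\*)$"
)

SKIP_SCHEMAS = [
    "iglu:com.example/my_schema_1/jsonschema/1-0-0",
    "iglu:com.example/my_schema_2/jsonschema/1-*-*",
]


def matches_criterion(schema_uri: str, criterion: str) -> bool:
    """Return True if schema_uri is covered by a single skipSchemas entry."""
    schema = _IGLU_URI.match(schema_uri)
    crit = _IGLU_URI.match(criterion)
    if schema is None or crit is None:
        return False  # malformed input never matches
    # Vendor, name and format must be identical.
    for part in ("vendor", "name", "format"):
        if schema[part] != crit[part]:
            return False
    # Each version part must be equal unless the criterion uses a wildcard.
    for part in ("model", "revision", "addition"):
        if crit[part] != "*" and schema[part] != crit[part]:
            return False
    return True


def is_skipped(schema_uri: str, skip_schemas: list) -> bool:
    """Return True if any entry in skip_schemas covers schema_uri."""
    return any(matches_criterion(schema_uri, c) for c in skip_schemas)
```

Under these assumptions, the first entry only covers version 1-0-0 of my_schema_1, while the second covers every 1-x-y version of my_schema_2 but not 2-0-0.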
Other changes
We also made a few under-the-hood changes, including bumping some dependencies to newer versions, adding some tests, and removing some unhelpful logging lines.
Upgrading
If you are already using a recent version of RDB Loader then upgrading to 5.7.1 is as simple as pulling the newest Docker images. There are no required changes to your config file.
docker pull snowplow/rdb-loader-redshift:5.7.1
docker pull snowplow/rdb-loader-snowflake:5.7.1
docker pull snowplow/rdb-loader-databricks:5.7.1
docker pull snowplow/transformer-pubsub:5.7.1
docker pull snowplow/transformer-kinesis:5.7.1
docker pull snowplow/transformer-kafka:5.7.1