We’re pleased to announce release 0.6.0 of Snowplow Snowflake Loader, which adds string truncation at the loading step.
String truncation
If you have ever encountered a Loader error similar to the following:
time="2019-11-25T17:06:06Z" level=info msg="Error during rt/enriched/good/run=2019-12-25-11-05-03/ load. SQL execution internal error:
Processing aborted due to error 300010:4064119700; incident 8941141.
Chances are that your tracking SDKs are sending tracker protocol fields that exceed the expected lengths, which can break loading at the casting stage. Previously, if you ran into this problem, you had to find the offending row and delete it, which is a very tedious operation.
We significantly reduced the chance of this error in 0.4.2 by introducing truncation at the transformation step. Unfortunately, it did not work in 100% of cases: very long strings with special characters are escaped after truncation, which makes the string longer again.
In 0.6.0 we started truncating all columns whose values are likely to be escaped:
- refr_term
- mkt_clickid
- all VARCHAR columns longer than 1000 characters
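To see why truncating before escaping is not enough, here is a minimal Python sketch. The `escape` function and the 10-character `LIMIT` are hypothetical stand-ins chosen for illustration. They are not the Loader's actual escaping rules or column widths, and the sketch does not show how the Loader itself performs truncation.

```python
def escape(value: str) -> str:
    """Hypothetical escaping step: double backslashes, escape double quotes."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def truncate(value: str, limit: int) -> str:
    """Cut the value down to at most `limit` characters."""
    return value[:limit]


LIMIT = 10  # illustrative width only, not a real Snowplow column limit

raw = 'say "hi" to "everyone" here'

# Truncate first, escape afterwards: each escaped character adds one more
# character, so the result can exceed the limit again.
truncated_then_escaped = escape(truncate(raw, LIMIT))

# Truncate as the very last step: the result always fits the limit.
escaped_then_truncated = truncate(escape(raw), LIMIT)

print(len(truncated_then_escaped), len(escaped_then_truncated))
```

In this naive version, truncating last may cut an escape sequence in half. The sketch only illustrates the length problem, not a complete fix.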
Dry run
Snowplow Snowflake Loader has supported the --dry-run flag since 0.3.0. With this flag enabled, the Loader prints all SQL statements to stdout instead of executing them. Previously this flag was available only for the load subcommand. Since 0.6.0 it is also available for the setup and migrate subcommands.
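The general dry-run pattern can be sketched in a few lines of Python. This is an illustration of the idea only, not the Loader's implementation: `run_statements`, the `execute` callback and the sample statements are all hypothetical.

```python
from typing import Callable, List


def run_statements(
    statements: List[str],
    execute: Callable[[str], None],
    dry_run: bool = False,
) -> List[str]:
    """Execute each SQL statement, or only print it when dry_run is set.

    Returns the statements that were printed rather than executed.
    """
    printed = []
    for sql in statements:
        if dry_run:
            print(sql)  # goes to stdout, nothing touches the database
            printed.append(sql)
        else:
            execute(sql)
    return printed


# Hypothetical statements; a real run would pass a database cursor's
# execute method instead of list.append.
executed: List[str] = []
sample = ["CREATE SCHEMA IF NOT EXISTS example_schema", "SELECT 1"]
skipped = run_statements(sample, executed.append, dry_run=True)
```

With `dry_run=True`, `executed` stays empty and `skipped` holds both statements, which makes it possible to review the SQL before it runs.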
Other changes
Due to an internal refactoring, we had to change the CLI API of the setup command. To skip several steps, you now need to pass the skip option once per step, e.g. --skip step1 --skip step2 instead of the previous --skip step1,step2.
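The difference between a repeated option and a comma-separated value can be illustrated with Python's argparse. This is only an analogy, assuming a parser that collects repeated options into a list. It is not the Loader's own argument parser, and `setup-sketch` is a made-up program name.

```python
import argparse

parser = argparse.ArgumentParser(prog="setup-sketch")
# action="append" collects every occurrence of --skip into a list.
parser.add_argument("--skip", action="append", default=[], metavar="STEP")

# New style: one --skip per step.
new_style = parser.parse_args(["--skip", "step1", "--skip", "step2"])
print(new_style.skip)  # ['step1', 'step2']

# Old style: a parser like this sees a single value, not two steps.
old_style = parser.parse_args(["--skip", "step1,step2"])
print(old_style.skip)  # ['step1,step2']
```

If you invoke setup from scripts, it is worth checking them for the comma-separated form when upgrading.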
Upgrade
To make use of the new versions of the Snowflake Transformer and Loader, you will need to update your Dataflow Runner configurations to use the following jar files.
playbook.json
s3://snowplow-hosted-assets/4-storage/snowflake-loader/snowplow-snowflake-transformer-0.6.0.jar
s3://snowplow-hosted-assets/4-storage/snowflake-loader/snowplow-snowflake-loader-0.6.0.jar