I’ve just added link click tracking to a tracker instance, and I’m getting an ETL failure with the following:
Error during unstruct_event_com_snowplowanalytics_snowplow_link_click_1 column creation, while loading enriched/run=2020-03-04-15-15-20/. SQL compilation error:
column 'UNSTRUCT_EVENT_COM_SNOWPLOWANALYTICS_SNOWPLOW_LINK_CLICK_1' already exists
No new columns were added, safe to rerun. Trying to rollback and exit
It does exist in the database ( we have used this event in the past). How do I get the loader to stop from trying to re-create the column?
Loader takes the current state from DynamoDB manifest table and it seems that you ran into an invalid state. You need to edit the manifest table manually in order to trick Loader. Here’s a short description of table structure: https://github.com/snowplow-incubator/snowplow-snowflake-loader/wiki/Snowflake-Manifest. Loader tries to create new columns that don’t yet known from ShredTypes columns.
What you will need to do is to add unstruct_event_com_snowplowanalytics_snowplow_link_click_1 to any previous item.
However, I’m wondering how did you end up with invalid state. Did you re-create the manifest table by any chance? Asking because I afraid that if you add this shredded type - you eventually run into a similar problem with next one.
I then re-ran the snowflake import and it’s run successfully. I can then re-add the data from the column using an UPDATE based on event_id from the cloned table.
As to how it got that way - that’s my fault, I did something hacky when setting up the table originally and copied an existing database table structure because I was having problems with the snowflake loader setup at the time. I’ll know not to do that in future!
Could you please elaborate on what you need to edit in the manifest table in order to trick the Loader so it knows the column already exists? I read through the wiki and couldn’t figure it out, so reverted to using the method @iain used (dropped the already existing column)
I checked the record in event-manifest and it looked like this:
The “SavedTo” value didn’t exist at the time when I re-ran loader and got the same “column already exists” error, but everything else existed. So is it perhaps a combination of the Shredtypes existing as well as “SavedTo” value existing?
@Ryan_Newsome, if SavedTo wasn’t filled that means the transdformation step hasn’t completed - that is the status was not PROCESSED. Typically, you wouldn’t want to start loading data to Snowflake DB as it is likely the data transformation hasn’t completed.
If you meant to amend manifest table by adding an extra shredded type you should do it to the one of the previous run ID.