Redshift RDB Loader Error (mobile_context)

We have recently had an incident where the Redshift RDB Loader stopped working.
We’ve noticed (after the investigation) that the mobile_context from iOS is sent with the structure of schema version 1-0-3 while the schema version field was 1-0-2.
This led the transformer to have both different data structures in the same files and didn’t let the data go through to Redshift throwing the following error
│ ERROR Loader: Loading of s3://clue-data-prod-snowplow/snowplow-kinesis-transformer-output/run=2023-05-03-18-00-00-28dfcd01-8ef3-41a1-a1ce-51b63b9cb0b8/ has failed. Not adding into retry queue. [Amazon](500310) Invalid operation: Load into table 'com_snowplowanalytics_snowplow_mobile_context_1' failed. Check 'stl_load_errors' system table for details.; - SqlState: XX000

in the stl_load_errors we had the following error: delimiter not found

When looking at the files, we could see clearly that we have 2 different data structures, one was fitting mobile_context schema version 1-0-2

while the other was fitting the mobile_context schema version 1-0-3

Could someone please advise on why/how could this happen and what would be a preventive measure for this incident?

As a workaround we pushed the 1-0-2 mobile_context schema version to our local Iglu server to have the 1-0-3 properties

@Sharon_Lavie , the version of the mobile context depends on the version of the tracker. If you upgraded the tracker in your application then that introduced a new stricture of mobile_context. In practice, it means additional properties captured by the tracker.

In other words, the corresponding table has to have the new columns added. The error “delimiter not found” is likely an indication that the table has not evolved to accommodate the new properties.

Are you using the new RDB Loader with tables auto-migration feature or you still rely on JSONPaths? In the latter case, you would need to add the columns manually ensuring they are in the right order.

This is interesting because the 1-0-3 mobile_context schema has only been released recently on Iglu and we haven’t yet published any tracker release using that schema – the latest 5.0.1 Android and iOS trackers are still using the 1-0-2 schema (we are currently working on the 5.1.0 release that will use the new schema).

So I don’t really understand what is happening here. How do you know that your events are tracked with the 1-0-3 schema? Do you have any examples of the payload?

Hi Sharon, after discussing this issue in the team, we think it is related to the following bug in the Redshift Loader: Redshift Loader: Handle missing columns in CSV · Issue #1065 · snowplow/snowplow-rdb-loader · GitHub

Unfortunately, we don’t have a solution or a workaround for the bug yet.

Hi Matus and Ihor,

Thank you guys for the quick response,
It definitely is this bug in the RDB loader that created this issue.

What I’m still not sure of is why it started on 3.5.2023 while we didn’t the SDK upgrade on 21.4.2023.

We did indeed have another release on iOS that didn’t have any specific change related to Snowplow.

At least we know now what is the root cause.

Best,
Sharon.

@Sharon_Lavie , would you be able to look into the event that has mobile context 1-0-3 and check the fields that correspond to your application and tracker itself. For example, the field v_tracker would point out at the actual tracker version that produced that event payload. Again, it is the tracker that has the mobile context version hardcoded unless your tracking code uses the new JSON schema with custom tracking. As I can see the JSON schema itself was added to the Central Iglu on April 27.