Is there a definitive source outlining which tables should be created in Redshift?
I was doing some load testing and the load failed because the com_snowplowanalytics_snowplow_mobile_context_1 table wasn’t present. Presumably Avalanche uses a tracker which sets this context?
I found this post with some information, which is helpful.
Also, has there been any discussion around a more graceful way to fail when a Redshift table isn’t present?
We use COMMIT when loading data: we deliberately load all the tables in Redshift in a single transaction. This is important because it means the load either succeeds completely or fails completely. In the event of failure, recovery is straightforward. If a load were only partly successful, it would be complicated to recover without the risk of introducing duplicates.
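To illustrate the all-or-nothing behaviour, here is a minimal sketch. It is not the actual loader code: it uses Python's built-in `sqlite3` in place of Redshift so it runs anywhere, and `load_all` and the `event_id` column are hypothetical names made up for the example. The point is only that when one target table is missing, the rows already written to the other tables in the same transaction are rolled back too.

```python
import sqlite3


def load_all(conn, rows_by_table):
    """Insert rows into several tables inside one transaction.

    Returns True if every insert succeeded and the transaction was committed,
    False if any insert failed and the whole transaction was rolled back.
    """
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            for table, rows in rows_by_table.items():
                conn.executemany(
                    f"INSERT INTO {table} (event_id) VALUES (?)", rows
                )
        return True
    except sqlite3.OperationalError:
        # Raised here when a target table does not exist.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT)")
conn.commit()

# The context table was never created, so the second insert fails.
ok = load_all(
    conn,
    {
        "events": [("e1",)],
        "com_snowplowanalytics_snowplow_mobile_context_1": [("e1",)],
    },
)
events_loaded = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

After this runs, `ok` is `False` and `events_loaded` is `0`: the row inserted into `events` did not survive, so the load can simply be retried once the missing table exists.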
The post you found is the only “definitive source” available for now. Do let us know if something is missing from that mapping.
I was just surprised to see that the Avalanche events didn’t load. Do you happen to know if it needs any additional tables beyond the one I mentioned above?
It looks like Avalanche sends page view and structured events, and both carry the mobile_context and client_session contexts, so you’ll want to make sure you have both the com_snowplowanalytics_snowplow_mobile_context_1 and com_snowplowanalytics_snowplow_client_session_1 tables.
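If you want to check for missing tables before kicking off a load, something like the sketch below might help. It rests on a few assumptions: the table name is built from the schema vendor, schema name and major version (which matches the two table names above), the schema versions in the example URIs are illustrative, and `table_name_for` and `missing_tables` are hypothetical helpers, not part of any Snowplow tool. It also does not handle camelCase schema names. In practice you would fetch the set of existing table names from Redshift yourself.

```python
import re

# Matches Iglu-style schema URIs: iglu:<vendor>/<name>/jsonschema/<model>-<rev>-<add>
SCHEMA_URI = re.compile(r"iglu:([^/]+)/([^/]+)/jsonschema/(\d+)-\d+-\d+")


def table_name_for(schema_uri):
    """Derive the expected table name from a schema URI (assumed convention)."""
    match = SCHEMA_URI.fullmatch(schema_uri)
    if match is None:
        raise ValueError(f"Not a recognised schema URI: {schema_uri}")
    vendor, name, model = match.groups()

    def normalise(part):
        # Replace dots, dashes and other separators with underscores.
        return re.sub(r"[^A-Za-z0-9]+", "_", part).lower()

    return f"{normalise(vendor)}_{normalise(name)}_{model}"


def missing_tables(schema_uris, existing_tables):
    """Return the sorted list of expected tables that are not in existing_tables."""
    expected = {table_name_for(uri) for uri in schema_uris}
    return sorted(expected - set(existing_tables))


# Contexts Avalanche attaches, per this thread (versions are illustrative).
required = [
    "iglu:com.snowplowanalytics.snowplow/mobile_context/jsonschema/1-0-0",
    "iglu:com.snowplowanalytics.snowplow/client_session/jsonschema/1-0-0",
]
# Hypothetical result of listing the tables currently in the target schema.
existing = {"events", "com_snowplowanalytics_snowplow_client_session_1"}

missing = missing_tables(required, existing)
```

Here `missing` comes out as `["com_snowplowanalytics_snowplow_mobile_context_1"]`, which is the table that caused the failed load in the original question.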