Snowbridge to multiple GTM containers (destinations)

Hi all,

we have an organisational setup with multiple applications, and we would like to set up event forwarding to GTM SS based on app_id. The app_id is set per application, and we would like events from each application to land in a separate GTM container so that we can manage access control (Person A only has access to container A, which receives events from application A, and is unaware of the events in container B).

Does anyone have a best-practice solution for this? From what we have seen, Snowbridge is by design a single-source, single-sink application. It would be interesting to know whether it’s possible to dynamically swap the destination URL based on app_id:

  • app_id = A → destination URL A
  • app_id = B → destination URL B

We found that it is possible to filter events with an existing transformation:

transform {
  use "spEnrichedFilter" {
    # Atomic (canonical) event field to match on
    atomic_field = "app_id"
    # Anchoring the pattern avoids also keeping app_ids that merely contain "A"
    regex = "^A$"
    filter_action = "keep"
  }
}

However, we haven’t found a way to build a “switch” that would direct events to the relevant destination. It seems one option is to run a Snowbridge instance per application, each with a filter transformation that keeps only the relevant events. We would be grateful for any tips!

Hey @daria - there’s no such ‘switch’ functionality in Snowbridge unfortunately.

To satisfy this use case you would need to set up one Snowbridge per app ID. It’s a relatively cheap app to run, so you might find that this is actually not as problematic as one might initially suspect.
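
For illustration, here’s a minimal sketch of what one of those instances could look like, assuming the standard HTTP target pointed at a GTM SS container (the URL is a placeholder, and the source block plus any other transformations you already use, e.g. spEnrichedToJson, are omitted):

# Instance for application A: keep only app_id "A" and forward to container A.
transform {
  use "spEnrichedFilter" {
    atomic_field = "app_id"
    regex = "^A$"
    filter_action = "keep"
  }
}

target {
  use "http" {
    # Placeholder URL for the GTM SS container serving application A
    url = "https://gtm-a.example.com/com.snowplowanalytics.snowplow/enriched"
  }
}

The instance for application B would be identical apart from the regex and the URL.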

Alternatively, you could put a load balancer in front of GTM SS and set up forwarding rules there.
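
If you go that route, here’s a rough sketch of the idea, assuming HAProxy (which can buffer and inspect the request body); the backend hosts are placeholders, and the substring match on app_id is an assumption about what your enriched payloads look like:

defaults
  mode http
  timeout connect 5s
  timeout client 30s
  timeout server 30s

frontend gtm_ss_in
  bind :8080
  # Buffer the request body so the ACLs below can inspect it
  option http-buffer-request
  acl is_app_a req.body -m sub "\"app_id\":\"A\""
  acl is_app_b req.body -m sub "\"app_id\":\"B\""
  use_backend gtm_ss_a if is_app_a
  use_backend gtm_ss_b if is_app_b

backend gtm_ss_a
  server a gtm-ss-a.internal:8080

backend gtm_ss_b
  server b gtm-ss-b.internal:8080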

Hope that helps!

Hey @Colm many thanks for the response, much appreciated!

And costs will scale with the number of incoming events, correct? If you can share any cost figures you have for running an instance of Snowbridge (on any cloud provider), I would be grateful as well. It’s most likely not a big contributor to the overall pipeline costs, but it would still be interesting to know.

There are three main factors in the cost of running it:

  1. Data transfer costs

When sending data from the cloud to outside the cloud, you will incur data transfer costs. These are determined by the total volume of data that gets sent through the network. So whether you have one Snowbridge with no filters sending 2k events per second, or three of them with filters sending 2k/s in total across the three, this cost is the same.

Pricing for this is complicated, but you can find the pricing page for each cloud (e.g. here’s the GCP one) and estimate it based on your event volumes. I use 1.5 KB as a rough estimate of the size of a single event in JSON format.

I calculated this a long time ago for an estimate, and worked out that at a volume of 2k events/second, data transfer on AWS cost ~$450 per month.
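
As a rough back-of-the-envelope version of that calculation (the per-GB rate is an assumption; AWS internet egress list price starts around $0.09/GB and tiers down with volume):

2,000 events/s × 1.5 KB/event ≈ 3 MB/s
3 MB/s × 86,400 s/day × 30 days ≈ 7.8 TB/month
~7,800 GB × $0.06-0.09/GB ≈ $470-700/month

so the ~$450 figure is in the right ballpark once tiered rates and any discounts kick in.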

An important detail for this category of cost is how the cloud provider charges for it: if you’re not sending the data out of the network, or out of the region, it’s free. So for example on AWS, if you self-host GTM SS in the same VPC and the same region/zone as your Snowbridge (I can’t remember the details, but apparently there’s some counterintuitive rule), the cost for this is 0.

  2. Cost to run the app

Snowbridge doesn’t do a huge pile of work and is very efficient. When you filter data it does even less work. Costs obviously depend on how you run things, but if processing 2k events/s takes at most 3 instances, then e.g. on ECS that’ll ballpark to about $50-60 per month. If you have 3 deployments reading the same data, each keeping only ~a third of the events, each will likely need only one instance, and so won’t cost much (if any) more than that.
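
For a rough sense of where that $50-60 comes from, assuming ECS Fargate list prices (~$0.04048 per vCPU-hour, ~$0.004445 per GB-hour) and a small 0.5 vCPU / 1 GB task size, both of which are assumptions:

0.5 vCPU × $0.04048/hr + 1 GB × $0.004445/hr ≈ $0.025/hour per task
$0.025/hour × 730 hours ≈ $18/month per task
3 tasks × $18/month ≈ $54/month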

If you have more transformations it’ll take more resources to run, but in general you’re dealing in spare change compared to the data transfer element.

  3. Other costs, like the source stream

Reading data from the stream needs to be taken into account where relevant, of course. For Kinesis in particular, because there are limits on the number of consumers per shard, you reach a point where adding Snowbridge instances reading the data necessitates adding shards, which can get expensive. Other platforms and streaming technologies don’t have this problem.
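
To make that concrete, using Kinesis’s published per-shard limits (1 MB/s or 1,000 records/s write, 2 MB/s read shared across all standard consumers) and the 2k events/s, 1.5 KB/event figures from above:

2,000 events/s × 1.5 KB ≈ 3 MB/s written → at least 3 shards just for ingress
3 deployments each reading the full stream → 3 × 3 MB/s = 9 MB/s of read demand
3 shards × 2 MB/s shared = 6 MB/s available → you need more shards, or enhanced fan-out (which is billed per consumer)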

If this works out to be very expensive then other architectures should indeed be considered! For example, I have encountered use cases where the solution was to set up one Snowbridge to stream the data to Kafka, and then point further Snowbridges and/or consumers at that.
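
A minimal sketch of that first hop, assuming Snowbridge’s Kafka target with placeholder broker and topic names:

target {
  use "kafka" {
    # Placeholder brokers/topic for the intermediate fan-out stream
    brokers = "kafka-1.internal:9092,kafka-2.internal:9092"
    topic_name = "enriched-fanout"
  }
}

The per-app Snowbridges would then read from that Kafka topic as their source, where consumer groups fan out without the per-shard consumer limits described above.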

Hope that helps!