There are three main factors in the cost of running it:
- Data transfer costs
When sending data from the cloud to outside the cloud, you incur data transfer costs, determined by the total volume of data sent over the network. So whether you have one Snowbridge with no filters sending 2k events a second, or three of them, each with filters, sending 2k/s in total across the three, this cost is the same.
Pricing is complicated, but you can find each cloud's pricing page (eg. here’s the GCP one) and estimate it based on your event volumes. I use 1.5KB as a rough estimate of the size of a single event in JSON format.
I calculated this a while ago for an estimate, and worked out that at a volume of 2k events/second, data transfer cost ~$450 per month on AWS.
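If you want to run the numbers yourself, a minimal sketch of the estimate looks like this. The 1.5KB event size matches the rough figure above; the per-GB rate is an assumption (real pricing is tiered and varies by cloud and region), so plug in the rate from your cloud's pricing page.

```python
# Rough data-transfer cost estimator. The per-GB rate is an assumption -
# real cloud pricing is tiered, so check the relevant pricing page.

def monthly_transfer_gb(events_per_second, event_size_kb=1.5):
    """Approximate GB sent out per 30-day month."""
    seconds_per_month = 60 * 60 * 24 * 30
    kb_per_month = events_per_second * event_size_kb * seconds_per_month
    return kb_per_month / (1024 * 1024)

def monthly_transfer_cost(events_per_second, usd_per_gb, event_size_kb=1.5):
    return monthly_transfer_gb(events_per_second, event_size_kb) * usd_per_gb

# 2k events/s at 1.5KB each is roughly 7.4 TB/month:
print(round(monthly_transfer_gb(2000)))
print(round(monthly_transfer_cost(2000, 0.09), 2))  # at a hypothetical $0.09/GB
```

A flat rate overestimates slightly versus tiered pricing, which is one reason a real bill can come in lower than this kind of back-of-envelope number.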
Importantly for this category of cost, it depends on how the cloud provider charges for it - if you’re not sending the data out of the network, or out of the region, it’s free. So for example on AWS, if you self-host GTM SS in the same VPC & same region/zone as your Snowbridge (I can’t remember the details, but apparently there’s some counterintuitive rule), the cost for this is 0.
- Cost to run the app
Snowbridge doesn’t do a huge amount of work and is very efficient. When you filter data it does even less. Costs depend on how you run things, obviously - but if processing 2k events/s takes at most 3 instances, then eg. on ECS that’ll ballpark to about $50-60 per month. If you have 3 deployments reading the same data, each filtering out 33%, each will likely only need one instance - and so won’t cost much (if any) more than that.
If you have more transformations it’ll take more resources to run, but in general you’re dealing in spare change compared to the data transfer element.
- Other costs, like the source stream
Reading data from the stream needs to be taken into account where relevant, of course. Kinesis in particular has limits on the number of consumers per shard, so you reach a point where adding Snowbridge instances reading the data necessitates adding shards - which can get expensive. Other platforms and streaming technologies don’t have this problem.
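The Kinesis constraint can be sketched numerically. Each shard allows roughly 1 MB/s of writes and 2 MB/s of reads, with the read limit shared across all standard (polling) consumers - so every extra consumer reading the full stream eats into the same per-shard read budget:

```python
import math

# Per-shard Kinesis limits for standard consumers (enhanced fan-out differs):
SHARD_WRITE_MBPS = 1.0  # ingest limit per shard
SHARD_READ_MBPS = 2.0   # read limit per shard, shared across consumers

def shards_needed(consumers, stream_mbps):
    # Every standard consumer reads the full stream, so total read demand
    # is consumers * stream_mbps, spread across the shards. Writes of the
    # full stream must also fit; take whichever constraint is tighter.
    read_shards = math.ceil(consumers * stream_mbps / SHARD_READ_MBPS)
    write_shards = math.ceil(stream_mbps / SHARD_WRITE_MBPS)
    return max(read_shards, write_shards)

# 2k events/s * 1.5KB is ~3 MB/s of data:
print(shards_needed(1, 3.0))  # 3 shards for one consumer
print(shards_needed(3, 3.0))  # 5 shards once three Snowbridges read it
```

This is why the "multiple deployments reading the same stream" pattern that's cheap on the compute side can get expensive on the Kinesis side.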
If this works out to be very expensive, then other architectures should indeed be considered! For example, I have encountered use cases where the solution was to set up one Snowbridge to stream data to Kafka, and then set up further Snowbridges and/or consumers on that.
Hope that helps!