Can the Snowplow Collector, Enricher and Lambdas run in containers?

Hi Snowplow team

Is there any reason all the Snowplow docs online have the enricher and collector running exclusively on EC2 instances? Could I easily use container instances for these components? Our team is also considering moving the Lambda pieces into containers too.

Can you foresee any issues with either approach? Or do you know of others who have successfully done this?

EC2 tends to make provisioning easy for a lot of customers, but there’s absolutely nothing to stop you using containers - that’s in fact what we use for Snowplow BDP.

Re: Lambdas - unless you are operating at very low volumes, AWS Lambdas tend to be significantly more expensive, with less desirable latency characteristics, than just running something as a container (e.g., Fargate, k8s).
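
To make the volume argument concrete, here is a rough sketch of the break-even comparison between per-request billing and a flat monthly container cost. The function names and the prices in the usage lines are hypothetical placeholders for illustration, not real AWS pricing, and the sketch ignores latency entirely; plug in your own figures.

```python
def break_even_requests(container_cost_per_month: float,
                        lambda_cost_per_request: float) -> float:
    """Monthly request count at which both options cost the same.

    Above this volume the flat-cost container is the cheaper option.
    """
    if lambda_cost_per_request <= 0:
        raise ValueError("lambda_cost_per_request must be positive")
    return container_cost_per_month / lambda_cost_per_request


def cheaper_option(requests_per_month: int,
                   container_cost_per_month: float,
                   lambda_cost_per_request: float) -> str:
    """Return 'lambda' or 'container', whichever costs less at this volume."""
    lambda_total = requests_per_month * lambda_cost_per_request
    return "lambda" if lambda_total < container_cost_per_month else "container"


if __name__ == "__main__":
    # Hypothetical placeholder prices, NOT real AWS pricing.
    container_monthly = 30.0      # assumed flat cost of one small container
    lambda_per_request = 0.000002  # assumed all-in cost per invocation

    print(break_even_requests(container_monthly, lambda_per_request))
    print(cheaper_option(1_000_000, container_monthly, lambda_per_request))
    print(cheaper_option(100_000_000, container_monthly, lambda_per_request))
```

The point of the sketch is only that per-request cost grows linearly with volume while the container cost stays flat, so there is always some volume past which the container wins.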

Thanks Mike !

Do you have a repo link to the deployment of the enricher and collector docker images?

The community edition will give you the Terraform templates to do this (the Docker images in turn are hosted on Docker Hub).

Thanks Mike but I can’t see docker referencing any of those docker images?

I looked at this one too and it referenced an instance template, but there was no sign of it deploying a Docker image anywhere.
https://github.com/snowplow-devops/terraform-google-service-ce

Do you know where I might find a deployment that references one of those Snowplow images on Docker Hub?

*can’t see terraform referencing any of the docker images

The quick start guide will reference each underlying resource - here’s an example of the Docker image being deployed as part of the collector.

That link you sent is for an EC2 collector. I did a search for docker and saw no reference. Could you link me to the line you’re referring to?

Sure - here’s the line for the Docker command that starts the collector container.

Wait, why would you use an EC2 instance to run a container? Wouldn’t it be easier to just use ECS or EKS?

Our quickstart guides are designed to get folks up and running quickly and generally the easiest way to do that is a simple deployment onto EC2. Our production deployments generally use EKS / Fargate.

Ok thanks, would you have a link to those prod deployments that deploy the collector or enricher into EKS/Fargate?

The production deployments are part of our commercial Snowplow BDP product. The community edition follows the quick start approach rather than the fully fledged HA deployments that are part of BDP.
